Data trustworthiness and user reputation as indicators of VGI ...

22
Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=tgsi20 Geo-spatial Information Science ISSN: 1009-5020 (Print) 1993-5153 (Online) Journal homepage: http://www.tandfonline.com/loi/tgsi20 Data trustworthiness and user reputation as indicators of VGI quality Paolo Fogliaroni, Fausto D’Antonio & Eliseo Clementini To cite this article: Paolo Fogliaroni, Fausto D’Antonio & Eliseo Clementini (2018) Data trustworthiness and user reputation as indicators of VGI quality, Geo-spatial Information Science, 21:3, 213-233, DOI: 10.1080/10095020.2018.1496556 To link to this article: https://doi.org/10.1080/10095020.2018.1496556 © 2018 Wuhan University. Published by Informa UK Limited, trading as Taylor & Francis Group. Published online: 21 Sep 2018. Submit your article to this journal Article views: 22 View Crossmark data

Transcript of Data trustworthiness and user reputation as indicators of VGI ...

Full Terms & Conditions of access and use can be found athttp://www.tandfonline.com/action/journalInformation?journalCode=tgsi20

Geo-spatial Information Science

ISSN: 1009-5020 (Print) 1993-5153 (Online) Journal homepage: http://www.tandfonline.com/loi/tgsi20

Data trustworthiness and user reputation asindicators of VGI quality

Paolo Fogliaroni, Fausto D’Antonio & Eliseo Clementini

To cite this article: Paolo Fogliaroni, Fausto D’Antonio & Eliseo Clementini (2018) Datatrustworthiness and user reputation as indicators of VGI quality, Geo-spatial Information Science,21:3, 213-233, DOI: 10.1080/10095020.2018.1496556

To link to this article: https://doi.org/10.1080/10095020.2018.1496556

© 2018 Wuhan University. Published byInforma UK Limited, trading as Taylor &Francis Group.

Published online: 21 Sep 2018.

Submit your article to this journal

Article views: 22

View Crossmark data

Data trustworthiness and user reputation as indicators of VGI qualityPaolo Fogliaroni a, Fausto D’Antoniob and Eliseo Clementini b

aDepartment for Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria; bDepartment of Industrial andInformation Engineering and Economics, University of L’Aquila, L’Aquila, Italy

ABSTRACTVolunteered geographic information (VGI) has entered a phase where there are both asubstantial amount of crowdsourced information available and a big interest in using it byorganizations. But the issue of deciding the quality of VGI without resorting to a comparisonwith authoritative data remains an open challenge. This article first formulates the problem ofquality assessment of VGI data. Then presents a model to measure trustworthiness ofinformation and reputation of contributors by analyzing geometric, qualitative, and semanticaspects of edits over time. An implementation of the model is running on a small data-set fora preliminary empirical validation. The results indicate that the computed trustworthinessprovides a valid approximation of VGI quality.

ARTICLE HISTORYReceived 21 March 2018Accepted 4 June 2018

KEYWORDSVolunteered geographicinformation; data quality;trustworthiness; reputation

1. Introduction

Volunteered geographic information (VGI)(Goodchild 2007) is a form of user-generated content(UGC) that is primarily concerned with the survey,collection, and dissemination of spatial data. Recently,VGI projects gained increasing attention and, as aconsequence, their user basin increased notably.Several studies showed that while VGI coverage andquality is approaching that of authoritative data-sets(in areas with large number of volunteers), VGI qualityis not homogeneously distributed in space (Haklay2010; Mooney, Corcoran, and Winstanley 2010).Accordingly, the conception and implementation ofnew methods to assess VGI quality is highly needed(Elwood, Goodchild, and Sui 2013).

In general, data quality can be regarded as the level offitness between single pieces of data and the parts ofreality that these represent – i.e. the more precisely datacorresponds to reality, the higher its quality.

Thus, the only way to obtain the “true” qualityvalue of data would be to compare it to reality, butthis is impracticable because data and reality belongto two different spaces (conceptual and physical,respectively). In practice, data quality is typicallyevaluated with respect to four main dimensions:accuracy, completeness, consistency, and timeliness(Wand and Wang 1996).

There are two main approaches to assess VGI dataquality. The first one assumes that authoritative dataare quality data. Thus, it consists in comparingvolunteered against authoritative data-sets andaddresses three of the four dimensions mentionedabove: accuracy, completeness, and consistency. One

main drawback of this approach is that it requires theaccess to authoritative data, which might not be pos-sible because of limited data availability, licensingrestrictions, or high procurement costs (Antoniouand Skopeliti 2015). Moreover, while authoritativedata are regarded as quality data, in fact there is noabsolute guarantee about its correctness and consis-tency, especially considering the discrepanciesbetween data and reality that arise over time becauseof the slow update rate of authoritative data-sets. Thesecond approach aims at assessing the quality of VGIby analyzing the evolution of the data itself, i.e. itshistory or provenance. This approach overcomes thelimitations of the previous one as it does not requireexternal data sources and can also take into accountthe timeliness dimension. However, this approachprovides an approximation of the quality, ratherthan an accurate measurement. In other words, itprovides a proxy measure for data quality that isreferred to as trustworthiness (Dai et al. 2008).

This article presents a system to compute a trust-worthiness score for each version of a geographicfeature by analyzing its provenance and evolution.Typically, in the literature, provenance is defined asthe historical evolution of data over time, but in thescope of this work we distinguish between data pro-venance and data evolution. Fixed a point in time, byprovenance we intend the sequence of edits that afeature undergoes over time until the given timeinstant; by evolution we intend the edits that thefeature will undergo in the future. The edit sequenceof a feature over time is assumed to be implementedthrough versioning. Both provenance and evolution

CONTACT Eliseo Clementini [email protected]

GEO-SPATIAL INFORMATION SCIENCE2018, VOL. 21, NO. 3, 213–233https://doi.org/10.1080/10095020.2018.1496556

© 2018 Wuhan University. Published by Informa UK Limited, trading as Taylor & Francis Group.This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permitsunrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

include authoring information. Accordingly, our sys-tem computes two scores: (1) reputation is a scorethat is associated to a volunteer and that denotes herreliability; (2) trustworthiness is a score that is asso-ciated to each version of a geographic feature andthat depends on both the sequence of edits that thefeature undergoes over time and the reputation of itsauthor.

The main underlying idea is that, according to themany eyes principle (Raymond 1999), the quality of afeature will improve over time and editing: the morevolunteers contribute information about a feature, thehigher the probability that errors and inconsistencies arespotted and corrected, and, thus, the higher the prob-ability that the feature is correctly mapped in the VGIsystem. Such a concept is captured by the notion ofconfirmation. Approaches to encourage crowdsourcingtake into account people’s intrinsicmotivation, such as inMartella, Kray, and Clementini (2015).

We distinguish three main effects making up aconfirmation: direct, indirect, and temporal. Thedirect effect corresponds to the case where a newversion of a feature is contributed that leavesuntouched a subset of the attributes of the currentfeature version. We assume that this is an indica-tion that the attribute values were already correct,thus the overall trustworthiness of the featuremust increase. The indirect effect covers thecases where a feature is edited that is close enoughto the feature being confirmed. Since the volunteeronly modified the nearby feature, this can be con-sidered as an indication of the quality of thefeature being confirmed. Finally, the temporaleffect addresses the persistency of a feature versionover time, i.e. the longer a version staysuntouched, the higher the probability that it iscorrectly mapped.

Direct and indirect effects are finer divided intothree components, to more precisely account for thedifferent aspects characterizing the editing prefer-ences of contributors: thematic, geometric, and quali-tative. The first two components have been includedin order to model more precisely the skills and thepreferences of the volunteers. For example, a volun-teer with little or no surveying skills may be moreinclined to contribute information only about thethematic component of a geographic feature andleave untouched the geometric part. This means thatconfirmations generated by these kinds of peoplemust be weighted differently for the geometric andthe thematic aspects.

The qualitative aspect has been introduced toaddress properties of human spatial cognition.Findings in cognitive science (Mark 1993) haveshown that humans conceptualize space in a relativeand qualitative manner, i.e. mental representations ofspace are typically geometrically distorted, but some

qualitative relations among objects (e.g. order, direc-tion, and topology) are interiorized correctly. Thismeans that while a volunteer may be unable to noticea small geometric imprecision in a piece of data, she/he may notice much more easily a wrong qualitativespatial relation (e.g. two disconnected buildings thathave been mapped as touching each other).

The rest of the article is structured as follows. InSection 2, we discuss more in detail background workon the subject. Section 3 is the core of our model: wediscuss the types of changes that affect VGI data, wedefine trustworthiness and reputation as the mainaspects that influence VGI data quality, and we dis-tinguish among direct, indirect, and temporal effects.In Section 4, we describe a system architecture thatimplements our model: the resulting framework,called TandR (for Trustworthiness and Reputation),makes use of domain ontologies for data representa-tion. In Section 5, we carried out an experiment witha selected portion of OpenStreetMap (OSM) data(http://www.openstreetmap.org), whose various fea-ture versions were compared to ground truth datato explore how trustworthiness and reputationindexes vary over time following data editing.Section 6 draws short conclusions and highlightsfuture improvements and extensions.

2. State of the art

While VGI is a very successful means to collect geo-graphic information at low cost (Goodchild 2007), itsuffers a major drawback: VGI comes with no assur-ance of quality (Goodchild and Li 2012). The issue isinherently related to the very nature of VGI, that isprovided by volunteers in a relatively unconstrainedmanner. Indeed, most of the VGI systems currentlyimplemented do not impose any enforcement ontheir contributors, except for some guidelines tostreamline the input data. So, according to the main-stream philosophy of VGI, there is no control inplace that guarantees the goodness of the contributedinformation.

Recent surveys (Antoniou and Skopeliti 2015;Senaratne et al. 2017; Degrossi et al. 2018; Fonteet al. 2017) perform a review of quality measuresand indicators for VGI. Specifically, Senaratne et al.(2017) discriminate among three different forms ofVGI systems: map-based (e.g. OSM), image-based(e.g. Flickr), and text-based (e.g. geo-blogs). Theyconclude that the first form is by far the most wide-spread. As quality measures for map-based VGI, theyidentify completeness, consistency, positional accu-racy, temporal accuracy, and thematic accuracy.These quality measures are defined in ISO 19157.As for quality indicators, they find trustworthiness,credibility, and reputation, among others.

214 P. FOGLIARONI ET AL.

Accordingly, there exists two main methods toassess VGI quality. The first compares VGI data-setsagainst professionally surveyed ground truth data-sets. In this case, it is possible to assess the aforemen-tioned quality measures. The second aims at derivingsome quality indicators only by analyzing the VGIdata-sets themselves, without a comparison withexternal sources.

Several studies have carried out in the last years toassess quality of map-based VGI (mainly OSM) bycomparison against professionally surveyed data-sets.Most studies focused on quality assessment of roadnetworks in England (Haklay 2010), Germany(Zielstra and Zipf 2010; Neis, Zielstra, and Zipf2012), and France (Girres and Touya 2010), amongother countries. Other work focused on general fea-tures rather than street networks only (Helbich et al.2012; Mooney, Corcoran, and Winstanley 2010; Fanet al. 2014). All such contributions came to similarconclusions: (1) coverage and accuracy of map-basedVGI data-sets are approaching those of professionallysurveyed data-sets and (2) VGI quality is not homo-geneously distributed, with peaks in the most popu-lated areas.

Goodchild and Li (2012) suggest threeapproaches to assure VGI quality without resort-ing to external data sources. The first is termed“crowdsourcing” and assumes that qualityincreases with the number of contributors.Basically, this is an adaptation of the so-calledLinus’s Law in honor of Linus Torwalds.Speaking about open-source software, Torwaldsstated that “given enough eyeballs, all bugs areshallow” (Raymond 1999). When adapted tocrowdsourcing, Linus’s Law is sometimes referredto as the many eyes principle: “If something isvisible to many people then, collectively, they aremore likely to find errors in it. Publishing opendata can therefore be a way to improve its accu-racy and data quality, especially where a goodinterface for reporting errors is provided”(http://opendatahandbook.org/glossary/en/terms/many-eyes-principle/). Therefore, Goodchild andLi (2012) conclude that the crowdsourcingapproach does not properly apply to less knownfacts, such as geographic features located in asparsely populated area. Indeed, in this case, themany eyes needed to assure the quality would bemissing. The second approach mentioned byGoodchild and Li (2012) was named “social”. Itrelies on the construction of a hierarchy of mod-erators, i.e. individuals that are reputed trust-worthy in the community because of the qualityof their contributions. The third approach istermed “geographic” and is the one that is bestsuited for full or semiautomatization: it decideson the quality of a feature by comparing it against

geographical laws – e.g. a shoreline should have afractal shape.

The many eyes principle comes in support ofcollective intelligence. According to this theory, agroup of individuals performs better than its bestmembers. Spielman (2014) analyzes the issue ofVGI quality from this perspective. He argues thatthere are two approaches to quality assessment:validation-by-accuracy and validation-by-credibil-ity. The former corresponds to measuring VGIquality by comparison against ground truth data-sets. The latter aims at deriving VGI quality indi-cators based on the credibility of the volunteercontributing a feature that, in turn, depends onreputation, trustworthiness, and motivation of thevolunteer (Bishr and Mantelas 2008). Spielman(2014) concludes that VGI systems should bedesigned to foster collective intelligence.

2.1. VGI quality indicators: trustworthiness andreputation

Bishr and Janowicz (2010), among others, suggest touse data trustworthiness (aka reliability) as a proxymeasure for VGI quality. Arguably, the reliability ofan object, a person, or an action corresponds to itsgrade of predictability. Reliability reflects this ideaand corresponds to “a bet about the future contingentactions of others” (Sztompka 1999). Mezzetti (2004)provides a compatible definition of trustworthinessby asserting that an entity is trustworthy, within agiven context, if it actually justifies reliance on thedependability of its behavior within that context.Additionally, Bishr and Janowicz (2010) stress thenotion of people–object transitivity: the degree ofreliability associated to a person propagates to theentities that are somehow connected to her/him.Moreover, they argue that the reputation of a personindicates how much this person is considered reliablewithin a community.

Reliability indicators relating to spatial featurescontributed in a VGI system (trustworthiness) andto their contributors (reputation) seem to be validapproximations of VGI data quality. Lodigiani andMelchiori (2016) propose an adaptation of thePageRank algorithm for web pages to derive contri-butors reputation. Barron, Neis, and Zipf (2014)introduce a framework that incorporates 25 qualityindicators and corresponding methods for OSM dataquality. Finally, Keßler, Trame, and Kauppinen(2011) propose to derive reliability scores from his-torical information of VGI items. They argue that,from the edit history of features, one can obtain allnecessary information to define trustworthinessscores for the spatial features in a VGI system andreputation scores for their contributors.

GEO-SPATIAL INFORMATION SCIENCE 215

2.2. Data provenance

In order to use historical information to derive qual-ity indicators, it is convenient to use an appropriatemodel. Keßler, Trame, and Kauppinen (2011) tacklethe challenge by adapting the concept of data prove-nance (Hartig 2009) to VGI. More specifically,Keßler, Trame, and Kauppinen (2011) introduce theontology in Figure 1 to represent OSM data prove-nance. They introduce the concept of “EditingPattern”: a sequence of editing actions from whichit is possible to deduce useful information to derive ascore for user reputation and data trustworthiness.The identified patterns are “Confirmations”,“Corrections”, and “Rollbacks”, including the specialcases of “SelfCorrections” and “SelfRollbacks”. Eachediting pattern has a different effect on reputationand trustworthiness. Keßler and Groot (2013) suggestthat the trustworthiness of a feature should growproportionally with the number of versions, contri-butors, and confirmations, and inversely with thenumber of corrections and rollbacks.

3. Modeling trustworthiness and reputation

This section details a novel model to derive data trust-worthiness and user reputation from feature editsequences. A first version of the model appeared inD’Antonio, Fogliaroni, and Kauppinen (2014). Wedrew inspiration from the work of Keßler, Trame, andKauppinen (2011) and Keßler and Groot (2013) – seeSection 2.2–but ourmodel differs fromother approacheswith regard to the following aspects. Our model:

(1) can fit any map-based VGI system (rather thanOSM only) provided that the system imple-ments feature versioning and that differencesbetween versions are ascribable to series ofatomic operations; namely, creation, modifica-tion, and deletion;

(2) associates a reputation score to each VGI author;(3) derives a trustworthiness score for each feature

version (rather than for each feature);(4) accounts for the impact of changes between

two versions according to their relevance;(5) derives trustworthiness and reputation

scores also considering evolution informa-tion (i.e. future edits, rather than only pro-venance information).

3.1. Model overview

Ourfirst goalwas to design amodel to derive trustworthi-ness and reputation scores from both data provenanceand evolution. That is, rather than evaluating a featurecontributed at a specific point in time only by analyzingcurrent and historical information in the system, we alsoexploit information that has to come. For this to bepossible, two conditions must be satisfied: (1) the VGIsystem at hand implements versioning; (2) every time anevent alters the state of the information system, the trust-worthiness score of features affected directly or indirectly(see Section 3.5) by the change are also updated.

Versioning is a method that prevents data frombeing lost and defines a temporal order among featurestates (as shown in Figure 2). The state of a feature f iscalled feature version and is, simply, a set of attributesdescribing the feature at a given point in time. Wedenote by fi the i� th version of a feature f . A newversion is established every time a user of the VGIsystem creates or deletes a feature, or modifies a non-empty subset of the attributes of an existing feature.This user is the author of the version and to eachversion is associated exactly one author. The time inter-val between a feature version and the next is called theversion lifetime. The creation of a feature starts a featureversion lineage. Each new version extends the lineage,which terminates with the deletion of the feature.

RDF predicatesubClassOf

prv:performedBy

changesGeometry

hasKey

hasValue

addsTag

removesTag

changesValueOfKey

hasTag

prv:usedData

prv:precededBy

includesEdit

prv:createdBy

Changeset

Key

User

Edit

FeatureState

WayState

NodeState

Value

prv:DataItem

prv:CreationGuideline prv:DataCreation

rdfs:Literal

prv:HumanActor

prv:Tag

Figure 1. OSM provenance ontology diagram.

216 P. FOGLIARONI ET AL.

3.2. Model dynamics

Every time a new feature version fi is added to theVGI system the feature version lineage of f getsextended and we use the new bit of information toupdate trustworthiness and reputation for a subset ofthe system’s features and authors. For example,assume that feature f , currently at version fi, is mod-ified, generating version fiþ1. The model is designedto derive a trustworthiness value for the new featureversion by comparing it with previous versions.Concurrently, the reputation of the author of thenew feature will also be updated. The new versionalso triggers an adjustment of the trustworthiness ofprevious versions of f , as now we have a new versionthat they can be compared against. Consequently, thereputation of the authors of these versions will also beadjusted. Finally, the newly introduced version indir-ectly affects the trustworthiness of currently aliveversions of “nearby” features (see Section 3.5 fordetails) and the reputation of their authors.

Each time a trustworthiness or reputation score isupdated, it is assigned a validity timestamp, whichdenotes that the score value accounts for all informa-tion available in the VGI system from that point intime; it will be valid until the next update. Also, inorder to support the analysis and the monitoring ofthe system dynamics, scores are never deleted: sim-ply, newer scores supersede older ones.

3.3. Types of edits

We call edit any operation on the VGI system thatgenerates a new feature version. Some previousapproaches (Keßler, Trame, and Kauppinen 2011;Keßler and Groot 2013) focus on specific VGI sys-tems (i.e. OSM) and aims at modeling emergentediting patterns of the system at hand. Such anapproach allows for treating more accurately specificpatterns: one exemplary editing pattern goes underthe name of “edit wars” (Keßler and Groot 2013) or“tag wars” (Mooney and Corcoran 2012); two ormore users keep changing the attributes of a featureback and forth between the same states. Nevertheless,the approach delivers a rigid model, in the sense thatit hardly applies to VGI systems exposing differentediting patterns than the one for which it has been

designed. Conversely, we favored generality and flex-ibility over specialty and designed a model to respondto atomic editing operations.

3.3.1. CreationA new feature is added to the VGI system, along withits first version. Creation starts a version lineage thatis used at later points in time to derive trustworthi-ness scores. At creation time, though, there is nohistorical information available to derive the trust-worthiness of the new feature version, so this is setequal to the reputation of its author.

3.3.2. ModificationA new feature version is generated by adding, alter-ing, or deleting some attributes of the latest version ofthe feature. At modification time, the trustworthinessof the new version is derived and the trustworthinessof previous versions is updated according to thesimilarity to all versions in the lineage, which wasjust extended by the new version.

3.3.3. DeletionA feature is removed from the VGI system. In fact,this creates a special version of the feature denotingthe end of the feature lineage. The trustworthiness ofthis version is set equal to the reputation of its authorand the trustworthiness of previous versions are notupdated. Indeed, a deletion may denote one of twothings: either the feature is not existing any longer inthe reality or the information provided is totallywrong. In the first case, the trustworthiness of pre-vious versions should not be affected as these wereestablished in the past, when the feature was stillexisting. In the latter case, according to the manyeyes principle, the wrong information will eventuallybe corrected by restoring the deleted feature.Restoration is treated as a modification, meaningthat the trustworthiness associated to the “deleted”version will decrease drastically and, proportionally,the reputation of its author.

3.4. Impact and aspects of edits

In general, the trustworthiness of a feature at a givenpoint in time is a function of the similarity among theversions in the feature lineage. We adopt the typicalgeographical information system (GIS) perspective thatconsiders a geographic feature as consisting of thematicand geometric attributes, but we further add a qualita-tive spatial level that takes into account more drasticchanges that might be caused by small geometricchanges. In fact, a small geometric change may changedrastically the qualitative spatial relations holdingamong a feature and its neighbors and vice versa.

t1 tj-1 tj tj+1 tm

f1 fi-1 fnfi+1fi... ...

Figure 2. Versioning establishes a temporal ordering on fea-ture states.

GEO-SPATIAL INFORMATION SCIENCE 217

Accordingly, for the computation of trustworthi-ness and reputation, we consider the aspects reportedin the following subsections.

3.4.1. Thematic aspectThe thematic aspect describes the nongeometric char-acteristics of the feature. For example, the fact thatthe feature represents either a road, a traffic light, or abuilding with an orange facade. Thematic informa-tion is assumed to be represented by key–value pairs.

A simple approach to account for thematic edits isto look at the number of differences between the pairsassociated to the new version and the pairs associatedto other versions in the lineage. A finer-grained alter-native would be to look at the semantic similaritybetween the values associated to the different ver-sions. For example, this can be done by consideringthe lexical distance (i.e. the shortest path) betweentwo terms on the Wordnet graph (http://graphwords.com/) (Miller 1995). Alternatively, the shortest dis-tance in an ontology including both previous andaltered key–value pairs can be used.

For example, consider the case where the key “ame-nity” of feature version fi, initially set to“Establishment”, undergoes two thematic edits. Thefirst edit changes its value to “University”, and thesecond to “Educational Institution”. If we only look atthe number of thematic differences, both edits wouldresult in a score of one. Conversely, using the lexicaldistance on the Wordnet graph (as shown in Figure 3),it would be possible to better characterize the edits: thefirst edit would be characterized by a score of three, thelatter by a score of six with respect to fi.

3.4.2. Geometric aspectThe geometric aspect describes shape and position ofa feature. The relevance of a geometric edit is

evaluated with respect to a series of geometrical prop-erties, such as perimeter, area, number, and positionof vertices. Versions of a feature (as shown inFigure 4) are compared to see the differences inthese geometric components. For example, version fjhas more vertices than version fi, but lesser area andperimeter.

3.4.3. Qualitative spatial aspectThe qualitative spatial aspect refers to changes in thequalitative spatial relations holding between a featureand its neighbors. According to Clementini (2013),qualitative spatial relations belong to one of threeclasses: topological (Egenhofer 1989; Randell andCohn 1989; Tarquini and Clementini 2008), projective(Clementini et al. 2010; Tarquini et al. 2007; Fogliaroniand Clementini 2015; Billen and Clementini 2006,2005b, 2005a), and metric (Hernández, Clementini,and Di Felice 1995; Moratz and Ragni 2008).

The simultaneous consideration of both the geo-metric and qualitative spatial aspects allows forweighting more appropriately spatial edits. A qua-litative spatial change happens when a modificationto the geometric properties of a feature alters atleast one qualitative spatial relation between thefeature and its neighbors. The impact on trust-worthiness is a function of the distance on theconceptual neighborhood graph between the rela-tions holding before and after the qualitative spatialchange.

As an example, let us consider topological relationsonly. Consider the scenario depicted in Figure 5,where two features f and g at version i and j, respec-tively, are very close to each other but topologicallyDisjoint. A geometric edit occurs that slightly modifiesfi into fiþ1, as depicted on the right side of the figure.Although the geometric change is very small, it

Figure 3. Wordnet graph example.

fi fj

ΔPerimeter

ΔPosition

ΔArea

Δ#Vertices

Δ(fj, fj)

Figure 4. Geometric comparison between versions.

218 P. FOGLIARONI ET AL.

changes the topological relation holding between thetwo features. They are not Disjoint any longer but,rather, they Overlap, which, according to the theoryof conceptual neighborhoods (Freksa 1991), is a sen-sible change. The conceptual neighborhood graph fortopological relations (Egenhofer and Mark 1995) (asshown in Figure 6) indicates how topological relationsmay evolve following a continuous transformation.The distance on the graph evaluates the qualitativespatial difference between two topological configura-tions. For example, the relations Disjoint and Overlapare at distance two, because there is an intermediateconfiguration (Meet) between them.

3.5. Trustworthiness and reputation

Data trustworthiness (T) and user reputation (R)are discrete functions of time T;R :t1; t2; . . . ; tnf g ! ½0; 1� defined on the sett1; t2; . . . ; tnf g of time instants when new feature

versions are contributed that influence their value.Their values range in a continuum going from 0(i.e. minimum reliability) to 1 (i.e. maximumreliability). To keep notation cleaner, in the fol-lowing we avoid to make explicit the time depen-dency of trustworthiness and reputation: we writeTðfiÞ and RðaÞ instead of Tðfi; tÞ and Rða; tÞ,respectively.

The reputation RðaÞ of an author a is the averageof the trustworthiness values of all the feature ver-sions authored by a:

RðaÞ ¼P

fi2FðaÞ TðfiÞFðaÞj j (1)

where FðaÞ is the set of feature versions contributedby a until this point in time, �j j denotes the cardin-ality of a set, and TðfiÞ is the trustworthiness cur-rently associated to the i–th of such feature versions.

All the complexity of the model lies in thedefinition of trustworthiness as an implementationof the many eyes principle (Raymond 1999): themore authors contribute to a feature (i.e. moreversions), the higher the probability that errorsare spotted and corrected – so that the contribu-ted feature fits well to reality. Such a concept iscaptured by the notion of confirmation (resp. con-tradiction): the more versions conform (resp. dif-fer) to each other with respect to the aspectsdescribed in Section 3.4, the higher (resp. lower)the trust of these versions.

We differentiate among three different types ofconfirmations, each capturing a different effect onthe overall trustworthiness score: direct effect (Tdir),indirect effect (Tind), and temporal effect (Ttmp).Trustworthiness is defined as their weighted sum:

Figure 5. A small geometric change may correspond to a qualitative spatial change.

Figure 6. Conceptual neighborhood graph of topological relations, as defined in 9–Intersection model.

GEO-SPATIAL INFORMATION SCIENCE 219

TðfiÞ ¼ wdir � TdirðfiÞ þ wind � TindðfiÞ þ wtmp � TtmpðfiÞ(2)

where wdir;wind, and wtmp are weights balancing therespective components. Since T 2 ½0; 1�, we havewdir þ wind þ wtmp ¼ 1, with Tdir;Tind;Ttmp 2 ½0; 1�.

3.5.1. Direct effectThe direct effect (Tdir) expresses the similarity amongall versions in the lineage of a feature. We call thiseffect “direct” as the information conveyed in thesystem by each version of a feature affects the trust-worthiness of all versions of the same feature. So, thisis a direct confirmation (or contradiction) of the stateof a feature. Direct trustworthiness of a feature ver-sion increases as the attribute-wise difference to theaverage feature version diminishes, where the averagefeature version is obtained by averaging (attribute-wise) all versions in the lineage of a feature.

According to the tripartite classification of theaspects of an edit presented in Section 3.4, directeffect is regarded as consisting of three components:direct thematic effect (Tdir;them), direct geometriceffect (Tdir;geom), and direct qualitative spatial effect(Tdir;qual). The overall direct effect is obtained as theweighted sum of the three components:

TdirðfiÞ ¼ wdir;them � Tdir;themðfiÞ þ wdir;geom

� Tdir;geomðfiÞ þ wdir;qual � Tdir;qualðfiÞ (3)

where wdir;them;wdir;geom and wdir;qual are weights bal-ancing the respective components. Since Tdir 2 ½0; 1�,we have wdir;them þ wdir;geom þ wdir;qual ¼ 1,with Tdir;them;Tdir;geom;Tdir;qual 2 ½0; 1�.

3.5.2. Indirect effectThe indirect effect (Tind) is designed to account forthe effect of indirect confirmations on the trust-worthiness of a feature. The concept of indirect con-firmation was first introduced in Keßler and DeGroot (2013). It is based on the consideration that ifan author contributes a new version of a feature butleaves untouched nearby ones, then the latter are

likely to be correctly mapped. Accordingly, theirtrustworthiness must increase. Rather than consider-ing all nearby features, it would probably be moreappropriate to consider nearby landmarks, or moregenerally, outstanding features in the local context.There is a large literature addressing the challenge ofdefining what a landmark is – see Richter and Winter(2014) for an extensive survey – but this falls outsidethe scope of this article. So, in this work we considerindirect confirmation based on spatial proximity andtemporal co-occurrence of feature versions. That is,for a feature version fi to be indirectly confirmed byanother feature version gj, two conditions must befulfilled:

(a) (temporal co-occurrence) fi exists when gj iscreated;

(b) (spatial proximity) gj is within a given rangefrom fi.

The example in Figure 7 illustrates the indirecteffect in case of versions fi, aj, and bk of three differ-ent features. Figure 7(a) represents the lifetime of thefeature versions and can be referred to check tem-poral co-occurrence, while Figure 7(b) depicts spatialproximity, with the dotted line denoting the proxi-mity range of feature f . The entities that satisfy thetemporal co-occurrence or spatial proximity aredepicted in green. Only the trustworthiness of thoseentities that satisfy both criteria (green on both sidesof the figure) are indirectly affected by fi; those ingray are not affected. Version fi is created during ajand bk lifetime, with bk also falling in the proximityarea. Both versions aj and bk satisfy the condition ontemporal co-occurrence, but only bk satisfies the con-dition on spatial proximity. So, bk is indirectly con-firmed by fi, while aj is not. Note that the featureversion ajþ1 is not affected by fi as it is still notexisting. Conversely, when ajþ1 will start to exist, itwill influence the trustworthiness of fi – assumingthat the condition on spatial proximity will also besatisfied.

Figure 7. Versions that determine indirect effect.

220 P. FOGLIARONI ET AL.

Similarly to the direct effect, also the indirect effectconsists of three components: indirect thematic effectTind;them, indirect geometric effect Tind;geom, and indir-ect qualitative spatial effect Tind;qual. The overall indir-ect effect is obtained as the weighted sum of the threecomponents:

TindðfiÞ ¼ wind;them � Tind;themðfiÞ þ wind;geom

� Tind;geomðfiÞ þ wind;qual � Tind;qualðfiÞ (4)

where wind;them;wind;geom, and wind;qual are weights bal-ancing the respective components. Since Tind 2 ½0; 1�,we have wind;them þ wind;geom þ wind;qual ¼ 1,with Tind;them;Tind;geom;Tind;qual 2 ½0; 1�.

3.5.3. Temporal effectThe temporal effect (Ttmp) is designed to account fortemporal confirmation, that is, the longer a versionpersists in the system, the higher the probability thatit is well mapped. Accordingly, we want that thetrustworthiness of a feature version increases overtime as it remains unaltered, converging to T ¼ 1 ast tends to infinity. Again, this is a derivation of themany eyes principle: a version with serious andremarkable errors is more likely to be modifiedsooner than a version with negligible errors.

The temporal effect determines the increment oftrustworthiness until the maximum possible level. Asillustrated in Figure 8, the temporal effect asymptoti-cally increases toward 1, as described by the function:

TtmpðfiÞ ¼ tðfiÞtðf Þ þ c

(5)

where tðfiÞ is the lifetime of feature version fi, tðf Þ isthe overall lifetime of feature f (i.e. the sum of thelifetimes of all versions of f ), and c is a parameter thatcan be adjusted to modify the curve slope.

3.6. A possible implementation of direct andindirect trustworthiness

In previous sections, we introduced our model fortrustworthiness and reputation and gradually brokedown trustworthiness into finer pieces. The finestlevel comprises thematic, geometric, and qualitative

spatial aspects for both direct and indirect effects;specifically, direct thematic effect (Tdir;them), directgeometric effect (Tdir;geom), direct qualitative spatialeffect (Tdir;qual), indirect thematic effect (Tind;them),indirect geometric effect (Tind;geom), and indirect qua-litative spatial effect (Tind;qual).

In this section, we propose one possible imple-mentation of these fine-grained components of trust-worthiness. The empirical evaluation of the modelpresented in Section 5 is based on this implementa-tion. We leave for future work the task of derivingmore complex implementations.

3.6.1. Direct thematic effectThe direct thematic effect (Tdir;them) considers howthematic attributes vary among different versions. Ifa set of attributes is used more consistently through-out different versions of a feature, then the contribu-tion of these attributes to the direct thematic effect ishigher.

A possible behavior can be expressed as follows:

Tdir;themðfiÞ ¼ 1� noDiffThemðfiÞn

(6)

where fi is the version under assessment, n is thenumber of versions of the feature f , andnoDiffThemðfiÞ is the number of versions of f thathave a different thematic attribute set than fi. Asmentioned before, more complex behaviors wouldtake into consideration other metrics for the attri-butes, for example, not only by counting the numberof versions with different attribute sets, but also mea-suring the lexical distance among them.

3.6.2. Direct geometric effectThe direct geometric effect (Tdir;geom) addresses thedifference between a feature version fi and the aver-age feature version favg in terms of geometric proper-ties. Let

Δp ¼ 1� pi � pavg�� ��pi � pavg�� ��þ c

(7)

be a measure of the similarity between fi and favg forwhat concerns the generic geometric property p,

Figure 8. Temporal effect contribution on trustworthiness.

GEO-SPATIAL INFORMATION SCIENCE 221

where pi is the value of the property for fi, pavg is theaverage property value over the versions currentlypresent in the feature version lineage, and c is aparameter that can be adjusted to modify the curveslope. Then, the direct geometric effect can beexpressed as

Tdir;geom ¼P

p2P Δp

Pj j (8)

where P ¼ Area; Perimeter;VertexNumber;VertexfPositiong is the set of considered geometric propertiesand Pj j denotes its cardinality. Note that the set ofconsidered properties can be altered according tothose available in the VGI at hand.

3.6.3. Direct qualitative spatial effectThe direct qualitative spatial effect (Tdir;qual) consid-ers the change in qualitative relations among differ-ent versions. For this effect, speaking of averagevalues makes no sense, as in general qualitativerelations range in a discrete set with no totalorder defined. Therefore, we count the number oftimes a specific qualitative spatial relation occursbetween the feature version at hand fi and theother features in the system. To reduce the compu-tational complexity, we suggest to filter the featuresto be considered. A simple filter can be achieved byconsidering only those features falling within agiven distance d from the feature version fi.Alternatively, filtering can be based on more sophis-ticated criteria that only select features satisfyinggiven thematic or geometric constraints. For exam-ple, one may select only buildings whose area isbigger than a given threshold.

For simplicity, let us consider only current neigh-bors of feature version fi, i.e. the set of feature ver-sions N that are alive when version fi is generated andwhose distance from fi is not greater than d. Wedenote by F ¼ f1; � � � ; fif g the set of all versions offeature f , with fi being its latest version. Let

relj;k ¼ relðfj; kÞwith j 2 ½1; i�; k 2 N (9)

be the qualitative spatial relation holding between thej–th version of f and the k–th neighbor of fi. Finally,we use Iverson bracket notation and define

Cðreli;kÞ ¼Xi

j¼1

½relj;k ¼ reli;k� (10)

as the number of times the relation holding betweenfi and the generic neighbor k also holds for previousversions of f .

Then, we can define direct qualitative spatial trust-worthiness as

Tdir;qualðfiÞ ¼ 1Nj j �

Xk2N

Cðreli;kÞn

(11)

where n denotes the total number of versions offeature f , and Nj j denotes the number of neighborsof fi. Said differently, we first count for each neighbork 2 N of fi the number of times the relation reli;k alsoholds for past versions fj (with j< i) of f and normal-ize it by dividing by the number of versions of f .Then, we average these values over all neighbors of fi.

Note that, although previous versions of neighborsare not considered here, they will affect the trust-worthiness of fi indirectly, provided that we chose acommutative filtering function to generate set N(such as distance).

3.6.4. Indirect effectsOne very easy approach to model indirect effects(Tind;them, Tind;geom, Tind;qual) is to consider the reputa-tion of the author a that contributes the feature versiongj triggering an indirect confirmation for feature fi.

We defined the overall reputation of an author a inEquation 1 as the average trustworthiness of thefeature versions authored by a. Trustworthiness, asdefined in Equation 2, consists of direct, indirect, andtemporal effects. In turn, direct (Equation 3) andindirect (Equation 4) effects consist of thematic, geo-metric, and qualitative spatial aspects. By substitutingEquations 3 and 4 in Equation 2 and by applyingbasic algebraic transformations, we can refactor theoverall trustworthiness in terms of aspects (ratherthan by effects):

TðfiÞ ¼ wthem � TthemðfiÞ þ wgeom � TgeomðfiÞ þ wqual

� TqualðfiÞ þ wtmp � TtmpðfiÞ(12)

with the generic component wx � TxðfiÞ, x 2them; geom; qualf g equal to

wx � TxðfiÞ ¼ wdir � wdir;x � Tdir;xðfiÞ þ wind � wind;x

� Tind;xðfiÞ(13)

Equivalently, we can refactor the overall reputation ofauthor a as follows:

RðaÞ ¼ RthemðaÞ þ RgeomðaÞ þ RqualðaÞþP

fi2FðaÞ wtmp � TtmpðfiÞFðaÞj j (14)

with the generic component RxðaÞ, x 2them; geom; qualf g equal to

RxðaÞ ¼P

fi2FðaÞ wx � TxðfiÞFðaÞj j (15)

where FðaÞ is the set of feature versions contributedby a until this point in time, �j j denotes the cardin-ality of a set, and wx � TxðfiÞ is the trustworthiness

222 P. FOGLIARONI ET AL.

aspect (Equations 12 and 13) currently associated tothe i–th of such feature versions.

Then, we can define indirect trustworthinesseffects as

Tind;themðfiÞ ¼ RthemðaÞ; (16)

Tind;geomðfiÞ ¼ RgeomðaÞ; (17)

Tind;qualðfiÞ ¼ RqualðaÞ (18)

Figure 9 presents a summary of how trustworthi-ness is composed.

4. System architecture

In Section 3, we presented a model to derive trust-worthiness and reputation scores for VGI. This sec-tion presents and analyzes a system architecture thatimplements the model and that we call TandR,shortly for Trustworthiness and Reputation.

4.1. Requirements and features

The TandR framework uses feature evolution infor-mation. The outputs consist of a trustworthinessscore for each spatial feature version and a reputationscore for each contributor. The framework is able tomanage only VGI systems that adhere to the model ofSection 3; that is, feature evolution information isorganized in versions, where each version is the resultof an atomic edit, either a creation or a modificationor a cancellation.

The framework must perform data analysis onmultiple VGI system sources; it does not have toprovide direct support to each specific system format,but it must be able to analyze data from differentsources; such data have to be brought back to an

established format that is flexible and general enoughto be able to manage all necessary information.

The problem of giving the right structure to fea-ture evolution information can be solved by organiz-ing data in versions and defining a common formatfor information sources. Provenance information wasalready adopted and properly structured in Keßler,Trame, and Kauppinen (2011), as already discussedin Section 2.2. Figure 1 shows the ontology designedin this work, which correlates spatial data taken fromthe OSM VGI system with provenance information.We use such an ontology as a starting point to intro-duce, in Section 4.2, a more general ontology thatmodels VGI information.

The model introduced in Section 3 describes abasic strategy for scores calculation; nonetheless, itcan handle changes and extensions. Extensibility ofthe framework is implemented through a mechanismof module selection, with which the framework canhandle different implementations of the trustworthi-ness and reputation model: each module has the taskof managing a different implementation. The selec-tion of a given module is made by the user via astring passed to the framework configuration.

In addition to the last trustworthiness and reputationscores, the framework keeps record of historic scoresthat show how trustworthiness and reputation evolveover time. The model implementation follows eventstime line; therefore, computation takes into account aseries of trustworthiness and reputation scores, eachrepresenting the level of “reliability” that versions andusers had in themoment the scores calculation refers to.

4.2. Data models

Data are modeled following two ontologies: HVGIand TandR. The historical VGI (HVGI) ontology,

Figure 9. Graphic explanation of the composition of trustworthiness (reputation looks identical).

GEO-SPATIAL INFORMATION SCIENCE 223

shown in Figure 10, models the domain of VGIsystems provenance information and is used to man-age VGI feature versions. This ontology includesmany concepts defined in external ontologies, nota-bly, the OSM provenance information ontology pro-posed by Keßler, Trame, and Kauppinen (2011) thatdefines the concept of FeatureState (corresponding toour feature version). Also, it includes the ontologydefined by the Open Geospatial Consortium (OGC)to manage feature geometry, with the Geometry classand its subtypes Point, LineString, and Polygon.Finally, it includes the Friend Of A Friend (FOAF)ontology to manage user details.

The TandR ontology (as shown in Figure 11) mod-els trustworthiness and reputation scores domain. Itallows us to represent the score data of the modelintroduced in Section 3. The ontology provides defini-tion of Trustworthiness and Reputation concepts,linking them to the concept of Effect. This can consistof one or more Aspects, as for the case of direct andindirect effects that are broken down into semantic,geometric, and qualitative aspects. The concepts ofEffect and Aspect provide flexibility to the model, inthe sense that they are general and can be used toextend or modify the model proposed in this work.Note that Trustworthiness and Reputation are con-nected to FeatureState and User concepts, respectively,both defined in the OSM provenance informationontology (Keßler, Trame, and Kauppinen 2011).TrustworthinessValue and ReputationValue grouptwo bits of information: the numerical score valueand a timestamp, indicating the point in time sincewhen the score is valid. Thanks to this information, itis possible to reconstruct the evolution of the trust-worthiness of a feature version and of the reputation ofa user over time. The Effect concept is related to twomore concepts. The EffectDescription provides thename of the effect, a human-readable description ofit, and relates to the person that defines it. TheEffectValue reports the numeric value of the effectand the date from which that value is valid. TheAspect concept is specular to the Effect.

4.3. Framework overview

In this section, we describe the system components dia-gram of the TandR framework (as shown in Figure 12).The system uses semantic technologies, hence informa-tion persistence ismanaged by the Parliament triple store(http://parliament.semwebcentral.org/). Parliament usesthe quadruple pattern (graph {subject predicate object})and is spatially enabled via the GeoSPARQL standard(http://www.opengeospatial.org/standards/geosparql/).The system adopts the Parliament libraries available inthe Java language, that provide an API to interact withthe underlying triple store. As shown in Figure 12, thetriple store manages three different components, each of

which represents a graph that contains a set of relatedRDF (https://www.w3.org/RDF/) triples: the graph thatcontains VGI data, the graph that contains trustworthi-ness and reputation scores, and the graph that containsresults of scores validation.

The TandR framework is implemented in Java. Tomanage interactions with the triple store and processdata, TandR uses two external components: the JavaTopology Suite (JTS) (https://github.com/locationtech/jts/) and Jena (https://jena.apache.org/). The for-mer provides support for spatial operations, while thelatter collaborates with Parliament to manage theconnection to the triple store.

Data sources are divided into two kinds. The firstone, the most commonly used, needs a layer thatadapts data from the VGI system format to the appli-cation format, as specified by the HVGI ontology. Thesecond kind of source provides data directly to theformat required by the application. OSM representsan example of the first kind of source that requiresthe development of a component that translates theOSM history file into the application format: we calledsuch a component OSH2RDF. In Figure 12, the com-ponents in green indicate that they have been imple-mented in our prototype, while the components in grayindicate that they can be added in future developments.

The framework supervises the following activities:

(1) Installation. The installation process aims atpreparing a triple store to host data. More indetail, the installation process carries out themanagement and creation of graphs, the VGIsystem data import, and spatiotemporalindexing.

(2) Scores Computation. As already mentioned,multiple scores are provided for versions andusers, each associated with a validity date. Thelatest score available is always the most reli-able, due to the fact that it was computed witha greater amount of provenance information.The obtained scores are structured and madepersistent in such a way that they are properlydifferentiated by version date.

(3) Scores Validation Scores validation. comparesthe obtained scores against a ground truthdata-set, in order to assess their quality. Wewill explore how validation works in Section 5.

In the framework configuration, it is possible toselect which of the three activity the user wishes toperform or even more than one activity at a time.

4.4. Framework behavior

Events and edits are considered in temporal order.Therefore, the framework will take into account theorder in which versions were generated and process

224 P. FOGLIARONI ET AL.

osp:

Fea

ture

Stat

e

osp:

Use

r

osp:

Edi

t

geos

parq

l: Fe

atur

e

geos

parq

l: G

eom

etry

Lite

ral^

^ge

ospa

rql:w

ktLi

tera

l

foaf

: Onl

ineA

ccou

nt

Lite

ral^

^xs

d:st

ring

Lite

ral^

^xs

d:bo

olea

n

osp:

Tag

osp:

Val

ue

osp:

Key

Lite

ral^

^xs

d:bo

olea

n

hvgi

: VG

IFea

ture

foaf

: ac

coun

t

foaf

: acc

ount

Nam

e

foaf

: acc

ount

Serv

er

Hom

epag

e

prv:

Per

form

edBy

osp:

has

Tag

osp:

cre

ated

By

dcte

rms:

co

ntrib

utor

osp:

ad

dTag

sos

p:

rem

oveT

ags

osp:

ch

ange

sVal

uesO

fKey

osp:

cha

nges

Geo

met

ry

hvgi

: ha

sVer

sion

s

hvgi

: is

Vers

ionO

f

geos

parq

l: as

WKT

geos

parq

l: ha

sGeo

met

ry

prv:

pr

ecee

dedB

y

hvgi

: has

Vers

ion

hvgi

: isD

elet

ed

hvgi

: Va

lid

hvgi

: Poi

nt

hvgi

: Lin

estr

ing

hvgi

: Pol

ygon

hvgi

: Ver

sion

hvgi

: ver

sion

No

Lite

ral^

^xs

d:in

t

foaf

: Doc

umen

t

osp:

has

Valu

e

osp:

has

Key

Lite

ral^

^xs

d:st

ring

hvgi

: isV

alue

Lite

ral^

^xs

d:st

ring

hvgi

: isK

ey

hvgi

: Val

idity

Inte

rval

time:

Inst

ant

hvgi

: val

idFr

om

h vgi

: val

idTo

time:

inXS

DD

ateT

ime

Lite

ral^

^xs

d:D

ateT

ime

dcte

rms:

<ht

tp://

purl.

org/

dc/t

erm

s/>

foaf

: <ht

tp://

xmln

s.com

/foa

f/0.

1/>

geos

parq

l: <h

ttp:

//w

ww

.ope

ngis

.net

/ont

/geo

spar

ql#>

hvgi

: <ht

tp://

sem

antic

.web

/voc

abs/

hist

ory_

vgi/h

vgi#

>

osp:

<ht

tp://

sem

antic

.web

/voc

abs/

osm

_pro

vena

nce/

osp#

>

prv:

<ht

tp://

purl.

org/

net/

prov

enan

ce/n

s#>

time:

<ht

tp://

ww

w.w

3.or

g/20

06/t

ime#

>

xsd:

<ht

tp://

ww

w.w

3.or

g/20

01/X

MLS

chem

a#>

gene

raliz

atio

nRD

F pr

edic

ate

Figu

re10.D

omainon

tology:H

VGI.

GEO-SPATIAL INFORMATION SCIENCE 225

osp:

Fea

ture

Stat

e

tand

r:Tru

stw

orth

ines

sta

ndr:

Repu

tatio

n

foaf

: <ht

tp://

xmln

s.com

/foa

f/0.

1/>

tand

r: <h

ttp:

//se

man

tic.w

eb/v

ocab

s/ta

ndr_

asse

ssm

ent/

tand

r#>

osp:

<ht

tp://

sem

antic

.web

/voc

abs/

osm

_pro

vena

nce/

osp#

>

xsd:

<ht

tp://

ww

w.w

3.or

g/20

01/X

MLS

chem

a#>

tand

r: ha

sTru

stw

orth

ines

sta

ndr:

Refe

rsTo

Feat

ureS

tate

tand

r:Asp

ect

osp:

Use

r

tand

r: ha

sRep

utat

ion

tand

r:Re

fers

ToU

ser

tand

r: ha

sDes

crip

tion

Lite

ral^

^xs

d:do

uble

tand

r:ha

sVal

ue

Lite

ral^

^xs

d:st

ring

tand

r:Nam

eIs

foaf

:Per

son

Lite

ral^

^xs

d:st

ring

Lite

ral^

^xs

d:st

ring

tand

r:ta

ndr:

Des

crip

tionI

s

foaf

:nam

e

Lite

ral^

^xs

d:da

teTi

me

Lite

ral^

^xs

d:do

uble

tand

r:Asp

ectD

escr

iptio

ntand

r: ha

sDes

crip

tion

tand

r: ha

sVal

ue

tand

r:Tru

stw

orth

ines

sVal

ue

tand

r: ha

sVal

ue

tand

r:Rep

utat

ionV

alue

tand

r: ha

sVal

ue

Lite

ral^

^xs

d:da

teTi

me

tand

r:co

mpu

tedA

t

tand

r:Val

ueIs

tand

r:Asp

ectV

alue

Lite

ral^

^xs

d:da

teTi

me

tand

r:has

Asp

ect

Lite

ral^

^xs

d:st

ring

Lite

ral^

^xs

d:st

ring

tand

r:

foaf

:Per

son

Lite

ral^

^xs

d:st

ring

foaf

:nam

e

tand

r:com

pute

dAt

Lite

ral^

^xs

d:da

teTi

me

Lite

ral^

^xs

d:do

uble

tand

r:Val

ueIs

tand

r:N

ameI

s

tand

r: D

escr

iptio

nIs

Lite

ral^

^xs

d:do

uble

tand

r:Va

lueI

s

tand

r:co

mpu

tedA

ttand

r:com

pute

dAt

tand

r:Val

ueIs

RDF

pred

icat

e

Figu

re11.D

omainon

tology:trustworthinessandrepu

tatio

n.

226 P. FOGLIARONI ET AL.

them accordingly. One way to achieve this result ispresented in the following algorithm:

Retrieve all dates in which at least onefeature version was created.For each date:

Retrieve all versions whose life-time begins in that date.

For each version:Delegate scores calculation to

selected module.

Reputation and trustworthiness scores may be cal-culated according to our model (Section 3.6). Afterthe first time a version is processed, trustworthiness isupdated whenever a new feature version (directeffect) or a nearby version (indirect effect) is created;the temporal effect is updated every time either of theother effects are altered. See the following algorithm:

Retrieve, if any, previous versions tocurrent version fv.Retrieve, if any, living fv neighbors.Trigger Trustworthiness calculationof fv.Trigger Reputation calculation for fvauthor.For each fu previous version of fv:

Update Trustworthiness of fu.Update Reputation of fu author.

For each living nearby version nv:Update Trustworthiness of nv.Update Reputation of nv author.

5. Empirical validation

In order to prove the validity of the proposed model,we should show that feature versions with highertrustworthiness resemble more closely the features

mapped in authoritative data-sets. In this section,we provide a qualitative validation of the model by(1) running our TandR Java framework on VGI datato derive trustworthiness and reputation scores and(2) selecting a feature f and comparing its lowest fmin,medium fmed, and highest fmax trustworthy versionsagainst the corresponding feature ft from a groundtruth data-set. The model proves valid if the similar-ity σ of the three feature versions to the referencefeature grows as trustworthiness grows. That is, thefollowing system of equations should be satisfied:

σðfmin; ftÞ � σðfmed; ftÞ � σðfmax; ftÞTðfminÞ � TðfmedÞ � TðfmaxÞ

�(19)

where the similarity function σ accounts for the sameaspects used by the TandR framework to computetrustworthiness and reputation scores. Note that themodel does not require a comparison to a groundtruth data-set to work. This is only done to prove thevalidity of the model.

5.1. Data sources

As VGI data source we use the OSM full historydump file (http://planet.osm.org/planet/full-history/),which contains the evolution history from the begin-ning of the project until today. The ground truthdata-set was obtained from Open Government Wien(https://open.wien.gv.at/), which provides Austrianland cover data; a view of such government data isshown in Figure 13 and is about the municipality ofVienna in 2012.

For the experiment reported in this article, werestrict ourselves to the spatial window depicted inFigure 14 and to a temporal window of four yearsstarting January 2010 and ending December 2013. Theresulting data-set, obtained from the OSM full historydump, consists of 834 feature versions belonging to 552different geographic features edited by 38 users.

Figure 12. The TandR framework: component diagram.

GEO-SPATIAL INFORMATION SCIENCE 227

5.2. Experiment setup and validationmethodology

First, we run the TandR framework on the data-setdefined above with the following weights: wdir ¼ wind ¼wtmp ¼ 1

3 ; wdir;them ¼ wdir;geom ¼ wdir;qual ¼ 13 ; wind;them

¼ wind;geom ¼ wind;qual ¼ 13 . For the slope parameters of

the temporal effect (Equation 5) and the direct geometricaspect (Equation 7), we set the values 500 and 1, respec-tively. For the geometric aspect we adopted the measuresArea; Perimeter;VertexNumberf g and for the qualitative

aspect we considered topological relations as defined inthe 9-Intersection model (Egenhofer 1989). For the the-matic aspect we used the formula in Equation 6.

As a second step, we performed a qualitative valida-tion. The validation methodology associates to eachground truth feature ft the corresponding VGI feature fand compares each version fv of f against the

corresponding feature ft from the ground truth data-set.In other words, we consider the feature ft from theground truth as if it was an extra version of the corre-sponding VGI feature. Then, the comparison is done byapplying the same calculations used for computing thedirect trustworthiness values only (since the ground truthdata-set does not evolve over time, it does notmake senseto account for the temporal and the indirect effects).

Since the ground truth data-set does not reportthematic information, the comparison is per-formed only considering the direct geometricand direct qualitative aspects. Accordingly, thefirst-level weights of the model (Equation 2)have been set as follows: wdir ¼ 1,wind ¼ wtmp ¼ 0. The finer-grained weights forthe direct effect (Equation 3) have been set to:wdir;geom ¼ wdir;qual ¼ 0:5, wdir;them ¼ 0.

Figure 13. Wien ground truth data-set from open government Wien.

Figure 14. Screenshot of the selected area in OSM.

228 P. FOGLIARONI ET AL.

The high-level algorithm used to validate trust-worthiness indexes can be described as follows:For each feature ft in the ground truthdata-set:

Retrieve corresponding VGI feature f.For each version fv of the retrieved

VGI feature:Compare the geometry of fv with the

geometry of ft.Apply comparison logic to obtain

the similarity score σðfv; ftÞ.

To identify correspondences between the VGI andthe ground truth data-set, we consider the versions fvof VGI feature f , whose geometry intersects ft. Hence,for each feature ft in the ground truth data-set, weretrieve the VGI feature f that has in its lineage theversion fv with the largest intersection area with ft .The similarity score σðfv; ftÞ is obtained by comparingthe geometries of the VGI features against those ofthe authoritative ones, as explained in Section 3.6.

5.3. Results

To showcase the results obtained in our experiment, wereport in Figure 15 the computed trustworthiness valuesfor the OSM feature 32506491 over time. Note that whileedits are identified by the date and time they occurred, thex-axis is not scaled over time. That is, dates are nominalvalues in this plot. The different versions of the feature arereported in different colors. Also, note that, although thisspecific feature (in the time range considered in ourexperiment) consists of 16 versions, the number ofplotted trustworthiness values are much more. The first

data point of each version (color) is equal to the reputa-tion of the version’s author at the time of insertion in theVGI system. All the other data points in a feature versionlife correspond to an adjustment of the trustworthinessignited by indirect confirmations.

In general, we can observe how the trustworthinessvalue swings quite smoothly up and down according tothe impact of the effects and aspects discussed in previoussections. However, at the beginning of the time intervalthe trustworthiness value jumps quite abruptly from 0 toapproximately 0.3. This might be related to the specificchoice of the slope parameters influencing the behavior ofthe temporal aspect (Equation 5) and of the direct geo-metric aspect (Equation 7). The value of the parametersshall be tuned in future experiments to obtain a smoothereffect.

As an example, we show in Tables 1 and 2 trust-worthiness and similarity scores obtained for twopolygons: OSM feature ids 38838966 and 45275690,respectively. The trustworthiness scores are com-puted using all the effects and aspects discussed inSection 3.6. The similarity values are computedusing only the direct geometrical aspect and thedirect qualitative spatial aspect, as explained in theprevious section. Both types of values, trustworthi-ness and similarity, do not have an absolute mean-ing. Rather, they have to be interpreted in a relativemanner among several versions of the same features.So, for example, we can read that version 4:0 offeature 38838966 is the most trustworthy among allof its versions, but we cannot infer anything abouthow reliable this version is with respect to otherfeatures.

Both the examples reported below satisfy the qua-litative validation inequalities reported in Equation

0.0

0.1

0.2

0.3

0.4

0.5

2010

−05−

31 1

7:33

:21

2010

−10−

08 1

2:52

:14

2010

−11−

11 2

1:03

:36

2010

−12−

01 2

1:47

:07

2010

−12−

01 2

1:47

:12

2010

−12−

01 2

2:05

:14

2010

−12−

01 2

2:05

:27

2011

−01−

18 1

8:26

:39

2011

−01−

18 1

8:26

:50

2011

−01−

23 0

8:06

:17

2011

−02−

13 1

1:33

:49

2011

−02−

13 1

4:33

:25

2011

−03−

27 2

1:09

:11

2011

−06−

03 1

6:05

:33

2011

−06−

25 0

9:18

:36

2011

−06−

25 0

9:18

:41

2011

−06−

25 0

9:18

:48

2011

−06−

25 0

9:18

:55

2011

−06−

25 0

9:19

:01

2011

−06−

25 2

0:23

:03

2011

−07−

17 1

4:22

:43

2011

−07−

27 0

7:08

:11

2011

−09−

23 1

8:58

:16

2011

−10−

13 1

6:19

:42

2011

−10−

15 1

0:26

:16

2011

−11−

01 1

4:54

:55

2011

−11−

22 2

1:24

:01

2012

−01−

04 2

3:12

:48

2012

−02−

03 1

8:37

:59

2012

−02−

03 1

8:38

:09

2012

−03−

05 2

1:03

:20

2012

−04−

15 1

7:56

:40

2012

−05−

13 1

6:56

:23

2012

−05−

13 1

6:56

:31

2012

−05−

13 1

6:57

:57

2012

−05−

13 1

6:58

:03

2012

−05−

13 1

6:58

:11

2012

−05−

19 0

7:49

:57

2012

−05−

21 1

6:16

:03

Edits

Trus

twor

thin

ess

Version number

v: 52.0

v: 53.0

v: 54.0

v: 55.0

v: 56.0

v: 57.0

v: 58.0

v: 59.0

v: 60.0

v: 61.0

v: 62.0

v: 62.1

v: 62.2

v: 62.3

v: 63.0

v: 64.0

Figure 15. Trustworthiness values for the OSM feature 32506491 varying over edits and feature versions.

GEO-SPATIAL INFORMATION SCIENCE 229

19. For feature 38838966 (Table 1), the qualitativevalidation condition reported in Equation 19 is satis-fied: it is enough to set fmin, fmed, and fmax to versions3:0, 5:0 (version 6:0 would work as well), and 4:0,respectively. Note that we have several versions(namely 4:0, 5:0, and 6:0) of feature 38838966 withdifferent trustworthiness but same similarity scores.This is due to the fact that we could only apply thegeometric and qualitative spatial aspects to compareagainst the ground truth data-set, while these versionsare generated by thematic edits.

For the other sample feature 45275690 (Table 2),we only have two versions. In this case, one cansimply let two of the versions in Equation 19 be thesame. For example, we can set fmin ¼ fmed and fmax toversions 3:1 and 3:0, respectively.

Figures 16 and 17 allow for a visual comparison ofthe versions of the selected VGI features with minimum

(in red) and maximum (in green) trustworthinessagainst the ground truth data-set (in blue) in the back-ground. Figure 16 reports about VGI feature 38838966and clearly shows how better feature version 4:0(Figure 16(a)) geometrically matches the reference fea-ture from the ground truth data-set. The latter is visiblein Figure 16(b), underneath version 3:0 of the VGIfeature (in red). Figure 17 reports about VGI feature45275690: Figure 17(a) shows version 3:0 with max-imum similarity to the ground truth data-set; Figure 17(b) shows version 3:1 with minimum similarity. In thiscase, neither of the two versions really fits well to theground truth reference, yet, according to the similaritymeasures used, version 3:1 results the best fit. In thiscase the lineage of the feature (with only two versions) isdefinitely too short to efficiently implement the manyeyes principle.

All results in the selected territory portion seems toindicate that our model of trustworthiness provides avalid measure of VGI data quality with respect to groundtruth data. Collected data encourages this approach andshows its validity; nevertheless, it is necessary to improveit and perform more exhaustive tests.

6. Conclusions and future work

VGI systems collect spatial information from Internetusers and follow the principles that stand behind crowd-source information. The scope of map-based VGI

Table 1. Validation indexes for feature id 38838966.Version Number Trustworthiness Similarity

3.0 0.1179 0.41664.0 0.6138 0.38885.0 0.4462 0.38886.0 0.4874 0.3888

Table 2. Validation indexes for feature id 45275690.Version Number Trustworthiness Similarity

3.0 0.5065 0.37493.1 0.4419 0.3749

Figure 16. Two versions of feature 38838966 superimposed on the ground truth data-set.

Figure 17. Two versions of feature 45275690 superimposed on the ground truth data-set.

230 P. FOGLIARONI ET AL.

systems, such as OSM, is to build a global map by fillingspatial information and to maintain it freely accessible.One key factor for developing and popularizing thesesystems are “web users” and the contribution they pro-vide; the higher is the number of contributors, the largeris the collected information, in terms of both amountand quality. The definition of a method that quantifieshow good VGI is constitutes an active research topic.Among proposed methods, we based our work on theapproach taken by Keßler, Trame, and Kauppinen(2011) and Keßler and De Groot (2013).

We presented a model to derive trustworthinessand reputation scores for geographic features andcontributors of a VGI system, respectively. Themodel relies on provenance information and appliesto all VGI systems that support versioning, whereeach version is the outcome of an atomic edit action,either creation, modification, or deletion. The modelprovides a trustworthiness score for each feature ver-sion and a reputation score for each author.

The model was implemented as a Java applicationusing semantic technologies: ontologies have beendeveloped for the representation of data that wasmade persistent with triplestores. The system archi-tecture is thought of as a framework that can be easilyextended and modified by introducing new modulesthat implement different score calculation models.

The software framework was run on an OSM data-set (full history dump). The obtained scores werevalidated against a ground truth data-set providedby an authoritative source. For each authoritativefeature, we selected the corresponding VGI featureand verified that versions with increasing trustworthi-ness scores have increasing similarity to the referencefeature. In the validation, we only applied direct geo-metric and qualitative spatial aspects, but we did notconsider the thematic aspect and the temporal effect.The results met the expectations since an increase intrustworthiness corresponds to an increase in thesimilarity to the authoritative reference feature.

The current implementation of the framework canbe extended in many directions. We foresee to extendthe model by also considering the typical region ofcontribution of single users. That is, typically thesame user contributes information about the samegeographic region(s). Consequently, it is reasonable toderive a reputation score for this user that is a proxymeasure of her/his local knowledge. Arguably, overtime the user reputation will increase and, accordingly,the trustworthiness of the features authored by her/him. Yet, if the same user starts contributing informa-tion about a different area (maybe because she/hemoved to a different city or state), there is no reasonto assume that she/he has good local knowledge aboutthe new region. Thus, the reliability of her/his contri-butions should be rated accordingly. This is especiallytrue for the thematic aspect of edits but not much for

her mapping and cognitive skills (i.e. the geometric andqualitative aspect, respectively).

The reputation of a user should also account for thefrequency of contributions, that is, if a user becomesinactive, her/his reputation should diminish over time.This behavior might be modeled similarly to the tem-poral effect of trustworthiness. In this case, the curveshould bemonotonically decreasing and converging to 0.

The computation of trustworthiness scores might beimproved by introducing a temporal window. Indeed,as of now, trustworthiness scores are updated by com-paring each version in a feature lineage against the“average feature version”. This might led to very strangebehaviors in case a feature disappears or changes dras-tically in the real world. For example, consider the casewhen a building is demolished. In this case, the olderversions of the feature will be completely different thanthe newer versions. With time, the trustworthiness ofthe newer versions will increase while the trustworthi-ness of the older versions will decrease. The reputationof their authors will be altered accordingly. Yet, thedeterioration of the reputation of the authors of oldversions should be mitigated as at the time of contribu-tion the building was still existing. To solve this issue,we suggest to introduce a temporal window that, at themoment of updating trustworthiness scores, only selectsa subset of the versions in a feature lineage, leavinguntouched very old versions and, consequently, thereputation of their authors.

Finally, the framework can be more thoroughlyvalidated by considering other case studies, the the-matic aspect, more domain ontologies, and a widerrange of qualitative spatial relations (such as direc-tional and distance operators).

Notes on contributors

Paolo Fogliaroni is a post-doctoral researcher in theDepartment for Geodesy and Geoinformation, ViennaUniversity of Technology, Austria. He received his BSand MS degrees from the University of L’Aquila in 2005and 2008, respectively. He received his Doctoral degree(Dr.-Ing.) from University of Bremen, Germany, in 2012.His research interests are in the areas of spatial Human–Computer Interaction (HCI), spatial databases, spatial cog-nition, and Qualitative Spatial Representation andReasoning (QSR). His latest research endeavors are in theareas of Spatial HCI with cutting-edge media technology,Spatial and Geographic Information Science, and VirtualSpatial Cognition. He investigates how the virtual augmen-tation of physical reality affects spatial cognition and whatare the best metaphors to spatially interact with a virtuallyaugmented world.

Fausto D’Antonio obtained his BS and MS degrees fromthe University of L’Aquila in 2011 and 2014, respectively.He was on Erasmus exchange with Vienna University ofTechnology. He discussed an MS thesis on evaluation ofdata trustworthiness through the analysis of historical datawithin VGI. His research interests are VGI and Semantic

GEO-SPATIAL INFORMATION SCIENCE 231

Web. He is currently collaborating with University ofL’Aquila and working in a private company Thales AleniaSpace on On-Board Software for Space and AvionicSystems.

Eliseo Clementini is an associate professor of computerscience at the Department of Industrial and InformationEngineering and Economics of the University of L’Aquila(Italy). He received his M.Eng. in Electronics Engineeringfrom University of L’Aquila in 1990 and his Ph.D. inComputer Science from University of Lyon (France) in2009. He has been a visiting professor at the NationalCenter for Geographic Information and Analysis of theUniversity of Maine, at the Department of Geographyand Geomatics of the University of Glasgow, at theDepartment of Geography of the University of Liege, andat the Department of Geodesy and Geoinformation ofVienna University of Technology. His research interestsare mainly in the fields of spatial databases and geographi-cal information science. One of his well-known contribu-tions is the DE+9IM model for topological relations, whichis part of Open Geospatial Consortium recommendationsand ISO/TC 211 standard. He is a member of the programcommittee of important conferences such as theConference on Spatial Information Theory (COSIT), theACM SIGSPATIAL International Conference on Advancesin Geographic Information Systems (ACM GIS), and theInternational Conference on Geographic InformationScience (GIScience).

ORCID

Paolo Fogliaroni http://orcid.org/0000-0002-0578-8904Eliseo Clementini http://orcid.org/0000-0002-3057-7105

References

Antoniou, V., and A. Skopeliti. 2015. “Measures andIndicators of VGI Quality: An Overview.” ISPRSAnnals of the Photogrammetry, Remote Sensing andSpatial Information Sciences 1: 345–351. doi:10.5194/isprsannals-II-3-W5-345-2015.

Barron, C., P. Neis, and A. Zipf. 2014. “A ComprehensiveFramework for Intrinsic OpenStreetMap Quality Analysis.”Transactions in GIS 18 (6): 877–895. doi:10.1111/tgis.12073.

Billen, R., and E. Clementini. 2005a. “Introducing aReasoning System based on Ternary ProjectiveRelations.” In Developments In Spatial Data Handling11th International Symposium on Spatial Data Handling,edited by P. Fisher, 381–394. Berlin, Heidelberg:Springer.

Billen, R., and E. Clementini. 2005b. “Semantics ofCollinearity among Regions.” In On the Move toMeaningful Internet Systems 2005: OTM 2005 Workshops,edited by R. Meersman, Z. Tari, and P. Herrero, Vol. 3762,1066–1076. LNCS. Berlin, Heidelberg: Springer.

Billen, R., and E. Clementini. 2006. “Projective Relations in a3D Environment.” In Geographic Information Science: 4thInternational Conference, GIScience 2006, edited by M.Raubal, H. J. Miller, A. U. Frank, and M. F. Goodchild,Vol. 4197, 18–32. LNCS. Berlin, Heidelberg: Springer.

Bishr, M., and K. Janowicz 2010. “Can We TrustInformation? The Case of Volunteered GeographicInformation.” In Towards Digital Earth Search Discoverand Share Geospatial Data Workshop at Future InternetSymposium, edited by A. Devaraju, A. Llaves, P. Maué

and C. Keßler, Vol. CEUR-WS Vol. 640. ISSN 1613-0073. http://ceur-ws.org

Bishr, M., and L. Mantelas. 2008. “A Trust and ReputationModel for Filtering and Classifying Knowledge aboutUrban Growth.” GeoJournal 72 (3–4): 229–237.doi:10.1007/s10708-008-9182-4.

Clementini, E. 2013. “Directional Relations and Frames ofReference.” GeoInformatica 17 (2): 235–255.doi:10.1007/s10707-011-0147-2.

Clementini, E., S. Skiadopoulos, R. Billen, and F. Tarquini.2010. “A Reasoning System of Ternary ProjectiveRelations.” IEEE Transactions on Knowledge and DataEngineering 22 (2): 161–178. doi:10.1109/TKDE.2009.79.

D’Antonio, F., P. Fogliaroni, and T. Kauppinen. 2014.“VGI Edit History Reveals Data Trustworthiness andUser Reputation.” In Proceedings of the 17th AGILEConference on Geographic Information Science,Connecting a Digital Europe through Location andPlace, edited by J. Huerta, S. Schade and C. Granell.Castellon, Spain: AGILE Digital Editions. 03–06 June,2014. ISBN: 978-90-816960-4-3

Dai, C., D. Lin, E. Bertino, and M. Kantarcioglu. 2008. “AnApproach to Evaluate Data Trustworthiness Based on DataProvenance.” In Secure Data Management: 5th VLDBWorkshop, SDM 2008, edited by W. Jonker and M.Petković, Vol. 5159, 82–98. LNCS. Berlin, Heidelberg:Springer.

Degrossi, L. C., J. P. De Albuquerque, R. D. Santos Rocha,and A. Zipf. 2018. “A Taxonomy of Quality AssessmentMethods for Volunteered and CrowdsourcedGeographic Information.” Transactions in GIS 22 (2):542–560. doi:10.1111/tgis.2018.22.issue-2.

Egenhofer, M. J. 1989. “A Formal Definition of BinaryTopological Relationships.” In Foundations of DataOrganization and Algorithms: 3rd International Conference,FODO 1989, edited by W. Litwin and H. J. Schek, Vol. 367,457–472. LNCS. Berlin, Heidelberg: Springer.

Egenhofer, M. J., and D. M. Mark. 1995. “ModellingConceptual Neighbourhoods of Topological Line-Region Relations.” International Journal ofGeographical Information Systems 9 (5): 555–565.doi:10.1080/02693799508902056.

Elwood, S., M. F. Goodchild, and D. Sui. 2013. “Prospectsfor VGI Research and the Emerging Fourth Paradigm.”In Crowdsourcing Geographic Knowledge: VolunteeredGeographic Information (VGI) in Theory and Practice,edited by D. Sui, S. Elwood, and M. F. Goodchild, 361–375. Dordrecht: Springer.

Fan, H., A. Zipf, Q. Fu, and P. Neis. 2014. “Quality Assessmentfor Building Footprints Data on OpenStreetMap.”International Journal of Geographical Information Science28 (4): 700–719. doi:10.1080/13658816.2013.867495.

Fogliaroni, P., and E. Clementini. 2015. “ModelingVisibility in 3D Space: A Qualitative Frame ofReference.” In 3D Geoinformation Science: The SelectedPapers of the 3D GeoInfo 2014, edited by M. Breunig, M.Al-Doori, E. Butwilowski, P. V. Kuper, J. Benner, and K.H. Haefele, 243–258. Cham: Springer.

Fonte, C. C., V. Antoniou, L. Bastin, J. Estima, J. J.Arsanjani, J. C. L. Bayas, L. M. See, and R. Vatseva.2017. “Assessing VGI Data Quality.” In Mapping andthe Citizen Sensor, edited by G. Foody, L. See, S. Fritz, P.Mooney, A. M. Olteanu-Raimond, C. C. Fonte, and V.Antoniou, 137–163. London: Ubiquity Press.

Freksa, C. 1991. “Conceptual Neighborhood and its Role inTemporal and Spatial Reasoning.” In Decision Support

232 P. FOGLIARONI ET AL.

Systems and Qualitative Reasoning, edited by M. Singh andL. Travé-Massuyès, 181–187. Amsterdam: North Holland.

Girres, J. F., and G. Touya. 2010. “Quality Assessment of theFrench OpenStreetMap Dataset.” Transactions in GIS 14(4): 435–459. doi:10.1111/j.1467-9671.2010.01203.x.

Goodchild, M. F. 2007. “Citizens as Sensors: The World ofVolunteered Geography.” GeoJournal 69 (4): 211–221.doi:10.1007/s10708-007-9111-y.

Goodchild, M. F., and L. Li. 2012. “Assuring the Quality ofVolunteered Geographic Information.” Spatial Statistics1: 110–120. doi:10.1016/j.spasta.2012.03.002.

Haklay, M. 2010. “How Good is Volunteered GeographicalInformation? A Comparative Study of OpenStreetMap andOrdnance Survey Datasets.” Environment and Planning B:Planning and Design 37 (4): 682–703. doi:10.1068/b35097.

Hartig, O. 2009. “Provenance Information in the Web ofData.” In Proceedings of the 2nd Workshop on LinkedData on the Web (LDOW2009), edited by C. Bizer, T.Heath, T. Berners-Lee and K. Idehen, vol. CEUR-WS538. ISSN 1613-0073. http://ceur-ws.org

Helbich, M., C. Amelunxen, P. Neis, and A. Zipf. 2012.“Comparative Spatial Analysis of Positional Accuracy ofOpenStreetMap and Proprietary Geodata.” InGeoinformatics Forum 2012: Geovisualization, Society andLearning, edited by T. Jekel, A. Car, J. Strobl, and G.Griesebner, 24–33. Berlin, Offenbach am Main: HerbertWichmann-Verlag

Hernández, D., E. Clementini, and P. Di Felice. 1995.“Qualitative Distances.” In Spatial Information Theory:A Theoretical Basis for GIS, edited by A. Frank and W.Kuhn, Vol. 988, 45–57. LNCS. Berlin: Springer-Verlag.

Keßler, C., and R. T. A. De Groot. 2013. “Trust as a ProxyMeasure for the Quality of Volunteered GeographicInformation in the Case of OpenStreetMap.” InGeographic Information Science at the Heart of Europe,edited by D. Vandenbroucke, B. Bucher, and J.Crompvoets, 21–37. Cham: Springer.

Keßler, C., J. Trame, and T. Kauppinen. 2011. “TrackingEditing Processes in Volunteered Geographic Information:The Case of OpenStreetMap.” In Proceedings of Workshopon Identifying Objects, Processes and Events in Spatio-Temporally Distributed Data (IOPE 2011), Workshop atCOSIT’11, edited by A. Galton, M. Worboys, M. Duckhamand J. Emerson, unpublished. Belfast: Maine. 12–16September 2011.

Lodigiani, C., and M. Melchiori. 2016. “A PageRank-basedReputation Model for VGI Data.” Procedia ComputerScience 98: 566–571. doi:10.1016/j.procs.2016.09.088.

Mark, D. M. 1993. “Human Spatial Cognition.” In HumanFactors in Geographical Information Systems, edited byD. Medyckyj-Scott and H. M. Hearnshaw, 51–60.London: Belhaven Press.

Martella, R., C. Kray, and E. Clementini. 2015. “AGamification Framework for Volunteered GeographicInformation.” In AGILE 2015 Geographic InformationScience as an Enabler of Smarter Cities andCommunities, edited by F. Ba¸Cão, M. Y. Santos, andM. Painho, Vol. 217, 73–89. LNGC. Cham: Springer.

Mezzetti, N. 2004. “A Socially Inspired Reputation Model.”In Public Key Infrastructure. EuroPKI 2004, edited by S.K. Katsikas, S. Gritzalis, and J. López, Vol. 3093, 191–204. LNCS. Berlin, Heidelberg: Springer.

Miller, G. A. 1995. “WordNet: A Lexical Database forEnglish.” Communications of the ACM 38 (11): 39–41.doi:10.1145/219717.219748.

Mooney, P., and P. Corcoran. 2012. “Characteristics ofHeavily Edited Objects in OpenStreetMap.” FutureInternet 4 (1): 285–305. doi:10.3390/fi4010285.

Mooney, P., P. Corcoran, and A. C. Winstanley. 2010.“Towards Quality Metrics for OpenStreetMap.” InProceedings of the 18th SIGSPATIAL InternationalConference on Advances in Geographic InformationSystems, edited by D. Agrawal, P. Zhang, A. ElAbbadi and M. Mokbel, 514–517. New York: ACMPress.

Moratz, R., and M. Ragni. 2008. “Qualitative SpatialReasoning about Relative Point Position.” Journal ofVisual Languages & Computing 19 (1): 75–98.doi:10.1016/j.jvlc.2006.11.001.

Neis, P., D. Zielstra, and A. Zipf. 2012. “The StreetNetwork Evolution of Crowd-sourced Maps:OpenStreetMap in Germany 2007–2011.” FutureInternet 4 (1): 1–21. doi:10.3390/fi4010001.

Randell, D. A., and A. G. Cohn. 1989. “ModellingTopological and Metrical Properties of PhysicalProcesses.” In Proceedings of the 1st InternationalConference on Principles of KnowledgeRepresentation and Reasoning (KR’89), edited by R.J. Brachman, H. J. Levesque and R. Reiter, 357–368.San Francisco, CA: Morgan Kaufmann.

Raymond, E. S. 1999. The Cathedral and the Bazaar:Musings on Linux and Open Source by an AccidentalRevolutionary. Sebastopol, CA: O’Reilly Media.

Richter, K. F., and S. Winter. 2014. Landmarks: GISciencefor Intelligent Services. Cham: Springer InternationalPublishing.

Senaratne, H., A. Mobasheri, A. L. Ali, C. Capineri, andM. Haklay. 2017. “A Review of VolunteeredGeographic Information Quality AssessmentMethods.” International Journal of GeographicalInformation Science 31 (1): 139–167. doi:10.1080/13658816.2016.1189556.

Spielman, S. E. 2014. “Spatial Collective Intelligence?Credibility, Accuracy, and Volunteered GeographicInformation.” Cartography and GeographicInformation Science 41 (2): 115–124. doi:10.1080/15230406.2013.874200.

Sztompka, P. 1999. Trust: A Sociological Theory.Cambridge: Cambridge University Press.

Tarquini, F., and E. Clementini. 2008. “Spatial Relationsbetween Classes as Integrity Constraints.” Transactionsin GIS 12 (SUPPL. 1): 45–57. doi:10.1111/tgis.2008.12.issue-s1.

Tarquini, F., G. De Felice, P. Fogliaroni, and E. Clementini.2007. “A Qualitative Model for Visibility Relations.” InKI 2007: Advances in Artificial Intelligence, edited by J.Hertzberg, M. Beetz, and R. Englert, Vol. 4667, 510–513.LNCS. Berlin, Heidelberg: Springer.

Wand, Y., and R. Y. Wang. 1996. “Anchoring DataQuality Dimensions in Ontological Foundations.”Communications of the ACM 39 (11): 86–95.doi:10.1145/240455.240479.

Zielstra, D., and A. Zipf. 2010. “A Comparative Studyof Proprietary Geodata and Volunteered GeographicInformation for Germany.” In Proceedings of 13thAGILE International Conference on GeographicInformation Science, edited by M. Painho, M. Y.Santos and H. Pundt, 1–15. Guimarães, Portugal:AGILE Digital Editions. ISBN: 978-989-20-1953-6

GEO-SPATIAL INFORMATION SCIENCE 233