Visualisation of the Semantic Web: Topic Maps Visualisation



Bénédicte Le Grand, Michel Soto, Laboratoire d'Informatique de Paris 6, France

[email protected]; [email protected]

Abstract

Topic Maps provide a bridge between the domains of knowledge representation and information management by building a structured semantic network above information resources. Our research at LIP6 aims at visualizing this semantic layer efficiently, which is a critical issue as Topic Maps may contain millions of elements.

This paper is divided into two parts. First, we briefly present the basic concepts of Topic Maps. Then, we discuss Topic Maps visualisation requirements and study how existing visualisation techniques may be applied to Topic Maps representation. We conclude by giving a few directions that could lead to the "ultimate" Topic Map visualisation tool.

1. Introduction

The standards developed in the context of the Semantic Web provide means of adding semantics to the data available on the Web. XML provides a first level of semantics, which allows users to structure data with regard to their content rather than their presentation. However, more semantics can be added with the Resource Description Framework (RDF) or Topic Maps standards. RDF was developed by the World Wide Web Consortium [14] whereas Topic Maps were defined by the International Organisation for Standardization [4]. The Topic Map paradigm was recently adapted to the Web by the TopicMaps.Org consortium [13]. Both RDF and Topic Maps aim at representing knowledge about information resources by annotating them; these standards are compatible.

Topic Maps provide a bridge between the domains of knowledge representation and information management. They build a semantic network above information resources, which allows users to navigate at a higher level of abstraction. However, Topic Maps are multidimensional knowledge bases and they may be very large. Users may still have difficulty finding relevant information within a Topic Map; the issue of Topic Maps visualisation and navigation is therefore essential.

This paper is organised as follows: first we present the basic concepts of Topic Maps; then, we discuss Topic Maps visualisation requirements and study how existing visualisation techniques may be applied to Topic Maps representation. We conclude by giving a few directions that could lead to the "ultimate" Topic Map visualisation tool.

2. Topic Maps basic concepts

Topic Maps are a new ISO standard [4] which makes it possible to describe knowledge and to link it to existing information resources. Topic Maps are described as the "GPS of the information universe", as they are designed to enhance navigation in complex data sets. Although Topic Maps can organise and represent very complex structures, the basic concepts of this model (topics, occurrences and associations) are simple.

Figure 1. Example of Topic Map

A topic is a syntactic construct which corresponds to the expression of a real-world concept in a computer system. Figure 1 represents a very small Topic Map which contains four topics: XML Europe 2000, Paris, Ile-de-France and France. These topics are instances of other topics: XML Europe 2000 is a conference, Paris is a city, Ile-de-France is a region and France is a country.

A topic may be linked to several information resources (e.g. Web pages) which are considered to be somehow related to this topic. These resources are called occurrences of a topic. In the Topic Map represented in Figure 1, occurrences of the topic Paris may be URLs pointing to some pictures or maps. Occurrences help organise data and understand their context by providing means of linking real resources to abstract concepts. It is important to note that topics and information resources belong to two different layers. Users may navigate at an abstract level, the topic level, rather than directly within data.

The concepts presented so far, topics and occurrences, make it possible to organise information resources with regard to a concept. However, it is also useful to describe relationships between these concepts, which is possible in Topic Maps through topic associations. An association adds semantics to data by expressing a relationship between several topics, such as XML Europe 2000 takes place in Paris, Paris is located in Ile-de-France, etc. Every topic involved in an association plays a specific role in this association; for example, Ile-de-France plays the role of container and Paris plays the role of containee.
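The three constructs described above can be sketched as a small in-memory data structure. This is only an illustration, not the ISO/IEC 13250 or XTM syntax: the class and method names, the role names "event" and "place", the example URL and the Ile-de-France/France association are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    name: str
    # Names of the topics this topic is an instance of (e.g. "city").
    types: List[str] = field(default_factory=list)
    # URLs of information resources related to the topic.
    occurrences: List[str] = field(default_factory=list)


@dataclass
class Association:
    kind: str              # e.g. "is located in"
    roles: Dict[str, str]  # role name -> name of the topic playing it


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, types=(), occurrences=()):
        self.topics[name] = Topic(name, list(types), list(occurrences))

    def associate(self, kind, **roles):
        # Every role player must already exist as a topic.
        for player in roles.values():
            if player not in self.topics:
                raise KeyError(player)
        self.associations.append(Association(kind, dict(roles)))

    def roles_played_by(self, name):
        """List (association kind, role) pairs for one topic."""
        return [
            (a.kind, role)
            for a in self.associations
            for role, player in a.roles.items()
            if player == name
        ]


# The small Topic Map of Figure 1 (URL and some role names are hypothetical).
tm = TopicMap()
tm.add_topic("XML Europe 2000", types=["conference"])
tm.add_topic("Paris", types=["city"],
             occurrences=["http://example.org/paris-map.png"])
tm.add_topic("Ile-de-France", types=["region"])
tm.add_topic("France", types=["country"])
tm.associate("takes place in", event="XML Europe 2000", place="Paris")
tm.associate("is located in", containee="Paris", container="Ile-de-France")
tm.associate("is located in", containee="Ile-de-France", container="France")

print(tm.roles_played_by("Paris"))
```

Note that the resources themselves are never modified: the Topic Map only stores references to them, which keeps the topic layer separate from the data layer.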

One advantage of Topic Maps is that they add semantics to existing data, by organising and describing them, without modifying them. Moreover, one single Topic Map may describe several information pools and several Topic Maps may apply to one single information pool.

In this section, we described the basic constructs of Topic Maps: Topic Maps contain topics which are connected by associations and which point to information resources through occurrences. Topic Maps can enhance navigation and information retrieval in complex data sets by adding semantics to these resources. However, a topic may have a high number of dimensions, as it is characterised by its name(s), its type(s), its occurrence(s) (the resources which are related to it) and the role(s) that it plays in associations. Moreover, Topic Maps may also be complex because of their size: they may contain millions of topics and associations.

3. Topic Maps Visualisation

As stated in the previous section, Topic Maps are very powerful but they may be complex. Intuitive visual user interfaces may significantly reduce the cognitive load of users when working with these complex structures. Visualisation is a promising technique for both enhancing users' perception of structure in large information spaces and providing navigation facilities. According to Gershon and Eick [3], it also enables people to use a natural tool of observation and processing (their eyes as well as their brain) to extract knowledge more efficiently and find insights.

In the following, we first present the goals of Topic Maps visualisation. Then we review different visualisation techniques which may be applied to Topic Maps representation.

3.1. Visualisation goals

The goal of Topic Maps visualisation is to help users locate relevant information quickly and explore the structure easily. Thus, there are two kinds of requirements for Topic Maps visualisation: representation and navigation. A good representation helps users identify interesting spots whereas efficient navigation is essential to access information rapidly.

According to Shneiderman [11], "the visual information-seeking mantra is: overview first, zoom and filter, then details-on-demand".

3.1.1. Representation requirements. First of all, users should be provided with an overview of the Topic Map. This overview must show the main features of the structure so that users can deduce the Topic Map's main characteristics at a glance. Visual representations are particularly well suited to these needs, as they exploit human abilities to detect patterns.

The first thing we need to know about a Topic Map is what it deals with, i.e. what its main concepts are. Once they are identified, more structural information is needed, such as the generality or specificity of the Topic Map. This kind of information should appear clearly in the representation so as to help users compare different Topic Maps quickly and explore only the most relevant ones in detail. The position of topics on the visual display should reflect their semantic proximity. These properties can be deduced from the computation of Topic Maps metrics, as shown by Le Grand and Soto [7].

Moreover, Topic Maps are multidimensional knowledge bases. Topics, associations and occurrences are characterised by many parameters, which should appear in some form in the visualisation.

The requirements stated above cannot all be met in a single view, as it is neither possible nor relevant to display general information and details simultaneously. We can compare this to a geographic map: a map of the world cannot, and should not, be precise. If a user is interested in details, she must specify her area of interest, for example by choosing a specific country. As in geographical maps, we need to provide different scales in Topic Maps representations.

Moreover, visualisations should be dynamic to adapt to users' needs in real time. Combinations of time and space can help ground visual images in one's experience of the real world and so tap into the users' knowledge base and inherent structures.

3.1.2. Navigation requirements. Good navigation allows users to explore the Topic Map and access information quickly. Navigation should be intuitive so that it is easy to get from one place to another. Several metaphors are possible: users may travel by car, by plane, by subway or may simply be "teleported", as on the Web, to their destination. The differences lie in what they see during their journey. From a car, they see details, from a plane they have an overview, etc. Navigation is essential because it helps users build their own cognitive map (a map-like cognitive representation of an environment) and increases the rate at which they can assimilate and understand information.

Free navigation should be reserved for small structures or expert users, as the probability of getting lost is very high. For beginners, who do not know where to start their exploration, predefined navigation paths are preferable until topics of interest are identified.

To sum up the visualisation goals, the whole Topic Map should be represented in order to help users understand it globally. This overview should reflect the main properties of the structure. However, users should be able to focus on any part of the Topic Map and see all the dimensions they need. Providing these several scales requires the use of different levels of detail. Finally, users should be able to navigate easily and intuitively at these different levels of detail.

In the following section, we study which visualisation techniques meet our requirements and may be used to represent Topic Maps.

3.2. Visualisation techniques

Many visualisation techniques are currently available to represent complex data. Among them, graphs, trees and maps seem to be the best suited metaphors for Topic Maps visualisation.

3.2.1. Graphs and trees. Representing Topic Maps as graphs seems natural as they can be seen as a network or graph of topics and associations.

The Topic Map shown in Figure 1 is displayed as a graph consisting of nodes and arcs. However, this simple representation may become cluttered and difficult to interpret when the size of the structure increases. As a Topic Map may contain millions of topics and associations, it becomes necessary to use sophisticated graph visualisation techniques.

Graphs and trees are suited for representing the global structure of a Topic Map. However, trees are easier for human beings to interpret than graphs since they are hierarchical. Topic Maps are not hierarchies and thus may not be directly represented as trees. However, it may be useful to transform small parts of the Topic Map into trees: restricting the transformation to a small part avoids clutter while retaining the advantages of trees.
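One possible way to turn a small part of a Topic Map into a tree is a breadth-first traversal from a topic of interest, limited to a few hops. The sketch below assumes that associations have been reduced to undirected pairs of topic names; the function name `local_tree` and the depth limit are illustrative choices, not part of any Topic Maps tool.

```python
from collections import deque


def local_tree(edges, root, max_depth=2):
    """Return {topic: parent} for topics within max_depth hops of root."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    parent = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the requested neighbourhood
        for nxt in neighbours.get(node, []):
            if nxt not in parent:  # first visit wins; other links are dropped
                parent[nxt] = node
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return parent


# Associations of Figure 1, reduced to pairs (the last pair is assumed).
edges = [
    ("XML Europe 2000", "Paris"),
    ("Paris", "Ile-de-France"),
    ("Ile-de-France", "France"),
]
tree = local_tree(edges, "Paris", max_depth=1)
print(tree)
```

The traversal discards any association that would close a cycle, which is the price paid for obtaining a hierarchy; the dropped links remain available in the full graph view.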

The challenge of graph visualisation is to provide graphs which display many nodes but remain readable. A first solution, proposed by Munzner [9], is to use hyperbolic geometry instead of Euclidean geometry, which makes it possible to display a very large number of nodes on the screen. Another solution to the lack of space on the screen is to represent Topic Maps in three dimensions. A 3D interactive Topic Map visualisation tool, UNIVIT (Universal Interactive Visualisation Tool), was implemented by Le Grand and Soto [6]; it uses virtual reality techniques such as 3D, interaction and different levels of detail, as shown in Figure 2.

Figure 2. UNIVIT Representation

Moreover, the quality of the visualisation can be increased by an efficient node positioning, which makes it possible to intuitively derive information from the distance between nodes. For instance:

• topics linked together by an association may be represented close to each other in the graph.

• topics of the same type or pointing to the same occurrences may be clustered.
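The two rules above can be turned into numbers that a layout algorithm could consume, for example as target distances between pairs of topics. The following is a minimal sketch under stated assumptions: the baseline value of 3.0, the equal weight given to each rule, the function name and the topic Lyon are all hypothetical.

```python
def target_distance(a, b, links, types, occurrences):
    """Return a smaller value for topics that should be drawn closer."""
    distance = 3.0  # hypothetical baseline for unrelated topics
    if frozenset((a, b)) in links:
        distance -= 1.0  # first rule: topics joined by an association
    if types[a] & types[b]:
        distance -= 1.0  # second rule: topics sharing a type
    if occurrences[a] & occurrences[b]:
        distance -= 1.0  # second rule: topics sharing an occurrence
    return distance


links = {
    frozenset(("XML Europe 2000", "Paris")),
    frozenset(("Paris", "Ile-de-France")),
}
types = {
    "XML Europe 2000": {"conference"},
    "Paris": {"city"},
    "Ile-de-France": {"region"},
    "Lyon": {"city"},  # hypothetical topic, not part of Figure 1
}
occurrences = {name: set() for name in types}

linked = target_distance("Paris", "Ile-de-France", links, types, occurrences)
same_type = target_distance("Paris", "Lyon", links, types, occurrences)
unrelated = target_distance("Lyon", "XML Europe 2000", links, types, occurrences)
print(linked, same_type, unrelated)
```

A force-directed or scaling-based layout would then try to place nodes so that on-screen distances approximate these targets.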

Graphs and trees meet our first requirement since they may represent the whole Topic Map. However, users also need to see detailed information about the topics they are interested in. This second requirement, which consists in representing all the different parameters of a Topic Map (name, type, etc.), may be really challenging. Different shapes and colors may be used to symbolize various dimensions of nodes and arcs of the graph. However, the number of different shapes, colors, icons and textures is limited, which makes this kind of representation unsuited to a Topic Map containing millions of topics and associations.

In order to display detailed information, it is thus necessary to focus on a part of the Topic Map. A graph-like overview of a Topic Map helps users understand the structure globally and select specific topics they may be interested in. Once a topic is selected, it is easy to display very precise information about it. Once users' needs are clearly identified, a useful way to represent a Topic Map is to display a list (or index) from which it is possible to select a topic and see related information. The navigation is usually the same as on Web sites: users click on a link to open a new topic or association. An example of such a visualization is provided by the Ontopia Omnigator [10], as shown in Figure 3.

Figure 3. Ontopia Navigator

We showed that the global view and precise parameters cannot be displayed at the same level of detail. It is essential that users can navigate easily from one level of detail to another. Some tools already provide interactive graphical visualizations. Mondeca's Topic Navigator [8] builds graph representations in real time, according to what users are allowed or need to see (Figure 4).

Figure 4. Mondeca's Topic Navigator

3.2.2. Maps. Topic Maps aim at enhancing navigation within complex data sets. In the real world, maps are used to achieve this goal. It is thus natural to consider representing Topic Maps as maps.

A map should display the most significant elements of the structure. The location and the size of these elements on the map are essential. In the Tree-Maps proposed by Bruls et al. [1], data is reorganized so as to reflect the relative importance of topics, as shown in Figure 5. These Tree-Maps may be used to represent Topic Maps.

Figure 5. Tree-Map
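To illustrate how a Tree-Map makes size reflect importance, the sketch below implements the simple slice-and-dice layout, which alternates horizontal and vertical splits at each level. It is not the squarified algorithm of Bruls et al. [1], which additionally optimises the aspect ratio of the rectangles. The hierarchy, the weights (which could stand for a number of occurrences) and the topics Lyon, Germany and Freiburg are hypothetical.

```python
def weight(node):
    """A leaf is a number; an inner node is a dict of named children."""
    if isinstance(node, dict):
        return sum(weight(child) for child in node.values())
    return node


def slice_and_dice(tree, x, y, w, h, horizontal=True):
    """Map each leaf name to a rectangle (x, y, width, height)."""
    rects = {}
    total = weight(tree)
    offset = 0.0  # fraction of the parent rectangle already used
    for name, child in tree.items():
        share = weight(child) / total
        if horizontal:
            box = (x + offset * w, y, share * w, h)
        else:
            box = (x, y + offset * h, w, share * h)
        if isinstance(child, dict):
            # Alternate the split direction at each level of the hierarchy.
            rects.update(slice_and_dice(child, *box, horizontal=not horizontal))
        else:
            rects[name] = box
        offset += share
    return rects


# Hypothetical hierarchy with hypothetical weights.
hierarchy = {
    "France": {"Paris": 6, "Lyon": 2},
    "Germany": {"Freiburg": 2},
}
rects = slice_and_dice(hierarchy, 0.0, 0.0, 100.0, 50.0)
for name, rect in rects.items():
    print(name, rect)
```

Each leaf receives an area proportional to its weight, so the most important topics dominate the display. Leaf names are assumed to be unique in this sketch.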

One of the challenges of Topic Maps visualisation as maps is to find optimal coordinates for the topics. The Self-Organizing Maps (SOM) algorithm, proposed by Kaski et al. [5], can be used to achieve this by organising the topics onto a two-dimensional grid so that related topics appear close to each other. Factor analysis can also be used to compute the coordinates of topics. Davison [2] explains how the multi-dimensional scaling (MDS) algorithm uses similarity measures between topics to provide a 2D map of the structure. Figure 6 is an example of a map which represents a small Topic Map about rock music.

Figure 6. MDS Map
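The idea behind MDS can be sketched in a few lines: given pairwise dissimilarities between topics, find 2D coordinates whose distances match them as closely as possible. The version below minimises the raw stress by plain gradient descent, which is one simple variant and not necessarily the procedure described in [2]. The dissimilarity values, the step size, the number of iterations and the circular initial layout are assumptions.

```python
import math


def mds_2d(dissimilarity, steps=2000, rate=0.05):
    """Gradient descent on the sum of (distance - dissimilarity)^2."""
    n = len(dissimilarity)
    # Deterministic start: topics spread evenly on the unit circle.
    pos = [
        [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
        for i in range(n)
    ]
    for _ in range(steps):
        for i in range(n):
            gx = gy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                err = dist - dissimilarity[i][j]
                # Constant factors of the gradient are folded into `rate`.
                gx += err * dx / dist
                gy += err * dy / dist
            pos[i][0] -= rate * gx
            pos[i][1] -= rate * gy
    return pos


# Hypothetical dissimilarities between three topics:
# topics 0 and 1 are closely related, topic 2 is far from both.
dissimilarity = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 3.0],
    [3.0, 3.0, 0.0],
]
coords = mds_2d(dissimilarity)
for point in coords:
    print(round(point[0], 2), round(point[1], 2))
```

Only the relative positions are meaningful: the resulting map may be rotated, mirrored or translated without changing the distances it encodes.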

The ThemeScape software by Cartia Inc. [12] provides different types of maps. They look like topographical maps with mountains and valleys, as shown in Figure 7. The concept of the layout is simple: documents with similar content are located close to each other and peaks appear where there is a concentration of closely related documents. Higher numbers of documents create higher peaks. The valleys between peaks may be interesting because they contain fewer documents and more unique content. The labels reflect the major two or three topics represented in a given area of the map, providing a quick indication of what the documents are about. Additional labels often appear when we zoom into the map for greater detail. We can zoom to different levels of magnification to declutter the map and reveal additional documents and labels.

This visualisation is of particular interest because it combines different representations in several windows. Users may choose one of them according to the selected type of information.

Visual tools may utilise animated 3D landscapes which take advantage of human beings' ability to navigate in three-dimensional spaces, recognize patterns, track movement, and compare objects of different sizes and colors. Users may have complete control over the appearance of data. Virtual reality techniques include interactivity and the use of different levels of detail (LOD). Immersion in virtual worlds makes users feel more involved in the visualisation.

Figure 7. ThemeScape Representation

A representation of Topic Maps as virtual cities, developed by Le Grand and Soto [7], is shown in Figure 8. Topics are represented as buildings whose coordinates are computed from a matrix of similarities between topics. Users may navigate freely or follow a guided tour through the city; they may also choose to walk or fly. The properties of topics are symbolised by the characteristics of the corresponding buildings, such as name, color, height, width, depth, etc. Occurrences and associated topics are displayed in two windows at the bottom of the screen. As human beings are used to 2D, a traditional 2D map is also provided and the two views, the map and the virtual city, are always consistent.

Figure 8. Topic Map as a Virtual World

Users may explore the world and interact with data. However, they may get lost in the virtual world. To avoid this problem, predefined navigation paths are also proposed. The different levels of detail make it possible to display many scales: details appear only when the user is close to the subject of interest.
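The distance-based behaviour described above can be sketched as a function that selects how much of a topic to render from the distance between the viewer and the corresponding building. The three level names and the two thresholds are hypothetical values chosen for the example; they are not taken from the tools presented in this paper.

```python
import math


def level_of_detail(viewer, building, near=10.0, far=50.0):
    """Pick a rendering level from the viewer-to-building distance."""
    distance = math.dist(viewer, building)
    if distance <= near:
        return "full"   # e.g. name, type, occurrences, associated topics
    if distance <= far:
        return "label"  # e.g. building shape plus topic name
    return "shape"      # e.g. building shape only


viewer = (0.0, 0.0, 0.0)
for building in [(3.0, 4.0, 0.0), (30.0, 40.0, 0.0), (60.0, 80.0, 0.0)]:
    print(building, level_of_detail(viewer, building))
```

Re-evaluating this function as the user walks or flies through the city keeps the overview uncluttered while revealing details near the current position.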

4. Conclusion

In this paper, we presented the basic concepts of Topic Maps. These structures may be complex, thus efficient visualisation techniques are essential. We reviewed different types of visualisation metaphors, especially graphs, trees, maps and virtual worlds. Some of them represent the global structure efficiently while others are better at displaying details or providing interaction with data. In fact, each technique is well suited for a specific level of detail. The best way to benefit from the advantages of each method is thus to provide several levels of detail for the representation of Topic Maps. This can be done by displaying several windows or by selecting the most appropriate representation at a given level of detail. A specific visualisation tool is usually adapted to display a certain amount of data. The limit of this approach is that giving users too many options may confuse them and result in an over-complicated interface.

If only one tool is to be used for visualisation, the Topic Map may be filtered or clustered to reach the scale at which the tool is useful, as proposed by Le Grand and Soto [7]. This is achieved through a conceptual analysis of the Topic Map, which allows us to characterize the structure and identify the topics which are the most relevant to users' needs.

5. References

[1] Bruls, M., Huizing, K., van Wijk, J. J., 2000, Squarified Treemaps, Proceedings of Joint Eurographics and IEEE TCVG Symposium on Visualization, IEEE Press, pp. 33-42.

[2] Davison, M. L., 1992, Multidimensional scaling, Malabar, FL: Krieger Publishing.

[3] Gershon, N., Eick, S. G., 1995, Visualisation's New Tack: Making Sense of Information, IEEE Spectrum, pp. 38-56.

[4] International Organisation for Standardization (ISO), International Electrotechnical Commission (IEC), Topic Maps, International Standard ISO/IEC 13250:1999, 19 April 1999.

[5] Kaski, S., Honkela, T., Lagus, K., Kohonen, T., 1998, WEBSOM – self-organising maps of document collections, Neurocomputing, volume 21, pp. 101-117.

[6] Le Grand, B., Soto, M., 2000, Information management – Topic Maps visualization, XML Europe 2000, Paris, France.

[7] Le Grand, B., Soto, M., 2001, XML Topic Maps and Semantic Web Mining, Semantic Web Mining Workshop, ECML/PKDD 2001 conference, Freiburg, Germany.

[8] Mondeca, 2001, Topic Navigator, http://www.mondeca.com/site/products/products.html.

[9] Munzner, T., 1997, H3: Laying Out Large Directed Graphs in 3D Hyperbolic Space, IEEE Symposium on Information Visualization.

[10] Ontopia, 2001, Ontopia Topic Map Navigator, http://www.ontopia.net/solutions/navigator.html

[11] Shneiderman, B., 1996, The eyes have it: a task by data type taxonomy for information visualizations, Proceedings of 1996 IEEE Visual Languages, Boulder, CO, pp. 336-343.

[12] ThemeScape Product Suite, http://www.cartia.com/products/index.html

[13] TopicMaps.Org XTM Authoring Group, 2001, XTM: XML Topic Maps (XTM) 1.0: TopicMaps.Org Specification.

[14] World Wide Web Consortium, 1999, Resource Description Framework (RDF) Model and Syntax Specification, W3C.