70 Int. J. Technology Enhanced Learning, Vol. 1, Nos. 1/2, 2008

Copyright © 2008 Inderscience Enterprises Ltd.

Institutionalised collaborative tagging as an instrument for managing the maturing learning and knowledge resources

Ronald Maier and Stefan Thalmann* Information Systems, University of Innsbruck, Universitaetsstrasse 15, A-6020 Innsbruck, Austria E-mail: [email protected] E-mail: [email protected] *Corresponding author

Abstract: Recently, social software and collaborative tagging have received high levels of attention in Internet communities and have also been discussed as interesting approaches to annotate resources and distribute the cumbersome task of designing ontologies from few domain experts to large numbers of users of digital resources. This paper discusses the suitability of collaborative tagging for annotating knowledge and learning resources in the institutionalised setting of businesses and organisations. Specifically, the paper discusses commitment, convergence and coordination issues and presents the results of a multi-round experiment involving 174 Bachelor students at the Innsbruck University School of Management.

Keywords: collaborative tagging; KM; knowledge management; knowledge maturing; metadata; social software; technology-enhanced learning; learning objects; knowledge objects.

Reference to this paper should be made as follows: Maier, R. and Thalmann, S. (2008) ‘Institutionalised collaborative tagging as an instrument for managing the maturing learning and knowledge resources’, Int. J. Technology Enhanced Learning, Vol. 1, Nos. 1/2, pp.70–84.

Biographical notes: Ronald Maier holds a PhD in Management Information Systems from The Koblenz School of Corporate Management and a habilitation Degree from the University of Regensburg. He worked as Visiting Assistant Professor at the Terry College of Business, University of Georgia, and as full Professor at the Martin-Luther-University Halle-Wittenberg, Germany. Since 2007, he has been Professor of Information Systems at the School of Management, University of Innsbruck, Austria. He has published contributions on Knowledge Management (KM) (systems) in a number of research journals, books and conference proceedings. His research interests include data management, flexible and adaptive business processes, KM and technology-enhanced learning.

Stefan Thalmann received his Diploma in Management Information Systems (MIS) from the Martin Luther University of Halle-Wittenberg. He was research assistant at the University of Halle and since March 2007, he has been research assistant and PhD candidate at the institute for management information systems at the University of Innsbruck. His research interests include knowledge maturity and its implications for the design of adaptive learning material, and collaborative tagging.

1 Introduction

From the 1950s to the 2000s, the share of data work and, especially during the last decade, the share of knowledge work have risen continuously (Wolff, 2005). Many organisations have been engaged in a transformation process turning them into knowledge-intensive organisations to significantly accelerate innovation and improve productivity of knowledge work (Drucker, 1994). There is evidence that many businesses and organisations have established numerous initiatives to implement Knowledge Management (KM) (see Lytras et al., 2008; Maier, 2007 and the studies cited there). KM measures and tools have been bundled as KM instruments applied to provide specific KM services for, e.g., a process, a work group, a user community or an outlet as a solution to a defined business problem. Consequently, a number of fragmented KM instruments have been proposed claiming to solve particular knowledge-related problems (e.g., Holsapple, 2003; Jennex, 2005). Many of these are supported by information and communication technologies.

Particularly in larger organisations, however, many KM initiatives implement a number of KM instruments in parallel, which calls for integrating solutions that bind the individual contents and functions together, e.g., in the form of enterprise information or knowledge portals (Collins, 2003; Firestone, 2003) or KM Systems (KMS) (Maier, 2007). The recent conceptual shift and standardisation efforts surrounding the service-oriented architecture (Alonso et al., 2004; Krafzig et al., 2005; Marks and Bell, 2006) have stressed the importance of establishing an integrative infrastructure for the individual support systems for KM, which can be done in a rather centralised way, also called an enterprise knowledge infrastructure (Maier et al., 2005), or in a decentralised way, as a peer-to-peer KMS (Maier and Hädrich, 2006). At the core of these integrating systems or infrastructures is a set of metadata elements based on standards, e.g., DC,1 exchangeable image file format (Exif),2 ID3,3 IMS LIP4 or IMS LD,5 IEEE LOM,6 Open Document Format,7 vCard8 or vCalendar.9 A typology of these and other organisation-specific metadata helps to structure the resulting application profiles (Maier and Sametinger, 2007), and additional structuring and relating information is typically provided as part of an ontology (Gruber, 1993; Staab et al., 2001; Fensel, 2004). However, the process of designing application profiles and mapping or merging ontologies that are applicable throughout an organisation and encompass all (relevant) IT-supported KM instruments has remained a challenging task.

More specifically, meaningful metadata are required for the administration and exchange of knowledge elements and of learning objects, which are an important part of an organisational knowledge base. Typically, knowledge elements and learning objects are not only text-based; much information refers to context, such as creation and application context, pedagogical methodology, the domain and relationships to other knowledge elements (Motelet et al., 2006). For these important yet complex metadata categories, entirely automatic generation of metadata currently seems hardly feasible (Pinkwart et al., 2004; Motelet et al., 2006). However, annotation done completely by humans does not seem practicable either (Cardinaels et al., 2005; Motelet et al., 2006). Reasons are, e.g., the extra time needed for that task and the problem of formalising and structuring knowledge, especially because of incompatible mental models and technical barriers (Shipman and McCall, 1994). Thus, annotation is an often neglected task.

A blend of both approaches extended with user-friendly technologies could be an appropriate solution. This would comprise automatic metadata generation for the more technical aspects and a human-centric approach for the softer and more context-specific metadata elements (Bauer et al., 2008). For the second part, social software in general, which enables richer capturing of context (Klamma et al., 2006), and tagging in particular, as one exponent providing a collaborative and easy-to-use technique of adding annotations to resources, have been suggested as solutions. Gaining valuable metadata (McGregor and McCulloch, 2006; Rollett et al., 2007) by applying these solutions seems promising for easing the time- and resource-consuming tasks of designing, mapping and merging ontologies that are at the heart of the integration layer of KMS (Maier, 2007) or, in a service-oriented perspective, enterprise knowledge infrastructures (Maier et al., 2005).

This paper argues that using the ‘collective wisdom’ (Surowiecki, 2004) of as many participants of KM solutions as possible, throughout an organisation or beyond, is an approach potentially superior to the design by a small number of human experts who are supposed to come up with such a solution. Of the many facets involved in this proposition, the paper concentrates on the findings resulting from a series of experiments in collaborative tagging and interprets them in order to give recommendations for the design of an institutionalised collaborative tagging solution. Thus, the paper discusses tagging as a participatory approach to gain metadata (Section 2), presents an approach to establish collaborative tagging in businesses and organisations (Section 3) and reports the results of a multi-round experiment on three central aspects of the institutionalisation of collaborative tagging, i.e., commitment, convergence and coordination of the annotated metadata (Section 4), before the paper is concluded in Section 5.

2 Gaining valuable metadata through tagging

A knowledge element is the smallest unit of atomic, explicit, formally defined knowledge content, a record of some form of externalisation viewed as a single organised unit both from a conceptual and from a technical perspective. It is composed of a grouping of formatted information objects, which cannot be separated without substantial loss of meaning, together with metadata describing the element (Maier, 2007). A learning object is defined as any (non-)digital entity that may be used for learning, education or training (IEEE, 2002). Thus, a learning object can be a whole course, a graphic or a table, even in non-digital form. In e-Learning research, learning objects are restricted to digital resources (Hodgens, 2002; Wiley, 2001), which can be (re-)used in defined contexts. For the purpose of this paper, the focus is on the characteristics of (a) a digital resource that can (b) be reused readily for some knowledge activity in the case of knowledge elements or learning activity in the case of learning objects. Thus, for reasons of convenience, the term Knowledge or Learning Resource (KLR) will be used in the remainder of this paper. Consequently, KLRs should be limited to digital resources and economically important reusability should be fixed. However, reusability demands a thorough description of KLR characteristics to support selecting KLRs, typically stored in metadata (Maier and Thalmann, 2007).

In a very general way, metadata can be described as data about data. From a database view, metadata can be defined as structural descriptions of data elements (Heuer and Saake, 2000). In more detail, metadata can be seen as descriptions of an entity that depend on the user (Tannenbaum, 2002). However, metadata are not objective; they are influenced by the context of creation (Nilsson et al., 2002). To reuse KLRs from the organisational knowledge base, contextual information, i.e., metadata, represented in a semantically described specification is needed (Pawlowski and Bick, 2006).

Traditionally, a small group of experts categorises or indexes resources on the basis of an agreed, structured catalogue of keywords, a taxonomy, to make the resources accessible for the organisation (McGregor and McCulloch, 2006). However, with the rapidly increasing amount of resources (Lyman and Varian, 2003), here KLRs, the time and cost this imposes on organisations are unsustainable. Furthermore, experts find it challenging to describe KLRs for all the application areas in which they can be reused, because they cannot be experts in all domains that KLRs are developed for (Shipman and McCall, 1994).

Metadata generation generally remains the responsibility of authors of KLRs. While learners or educational professionals may benefit from metadata, the authors rarely take advantage of them (Motelet et al., 2007). So it is not surprising that one of the most often heard critical remarks about metadata for KLRs is that KLR authors are not willing to spend additional effort to add metadata to their KLRs (Duval and Hodgins, 2003). The use of automatic processes can resolve the problem in part by reducing the number of metadata elements that have to be edited manually (Duval and Hodgins, 2004).

For the remaining metadata elements, tagging could be a solution. Generally, adding annotations to web resources is called tagging. A tag is a comment that is not restricted to a controlled vocabulary; tags are represented in different kinds of structures, which have come to be called folksonomies (Rollett et al., 2007). The basic element of information in such a system is the following triple (Cattuto et al., 2007a, 2007b):

$$\mathit{tag} = (\mathit{user}, \mathit{resource}, \mathit{keyword}).$$

Frequently, a user assigns multiple keywords to the same resource in a single step, which is called a post:

$$\mathit{post}_r(\mathit{user}) = (\mathit{timestamp}, \{\mathit{tags}_n\}) \mid \forall\, \mathit{tags}_n\,(\mathit{user}, \mathit{resource}, \mathit{keyword}).$$
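The two structures can be illustrated with a minimal Python sketch. The class names `Tag` and `Post`, the helper `make_post` and the sample values are illustrative assumptions and are not taken from the cited work; the sketch only mirrors the triple and the post as described above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Tag:
    """One tag assignment: the triple (user, resource, keyword)."""
    user: str
    resource: str
    keyword: str


@dataclass(frozen=True)
class Post:
    """All tags that one user assigns to one resource in a single step."""
    user: str
    resource: str
    timestamp: datetime
    tags: FrozenSet[Tag]


def make_post(user: str, resource: str, keywords: Iterable[str],
              timestamp: datetime) -> Post:
    """Bundle several keywords for the same user and resource into a post."""
    # A frozenset removes keywords that were entered twice in the same step.
    tags = frozenset(Tag(user, resource, keyword) for keyword in keywords)
    return Post(user, resource, timestamp, tags)


# Hypothetical example values.
post = make_post("student_01", "klr_03", ["slides", "database", "slides"],
                 datetime(2007, 5, 14, 10, 30))
print(sorted(tag.keyword for tag in post.tags))
```

Keeping the user and the resource inside every tag makes each triple usable on its own, e.g., for counting keyword frequencies per resource.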

Tags and posts can be statistically analysed and lead to tag clouds, tag networks or tag clusters (Rollett et al., 2007). Tags can be used to organise the personal workspace and for searching and browsing in a shared setting (McGregor and McCulloch, 2006). If a large number of users tag resources while using them, a wide range of keywords and descriptions for several application areas can be gathered. Easy handling can reduce the time required for tagging, the barrier to contribute and the costs for the organisation.

Users find it difficult to contribute with established methods for structuring resources using controlled vocabularies, however, because structuring thoughts in a given format represents a significant barrier to participate (Shipman and McCall, 1994). Also, controlled vocabularies primarily used in libraries are not always transferable to the wide variety of digital resources (McGregor and McCulloch, 2006).

Tagging with its unrestricted opportunities can reduce these barriers and motivate users to contribute. This model has been successfully applied on the Internet. Moreover, with the advent of web technologies allowing for large numbers of users to participate in content production, sometimes termed Web 2.0 (O’Reilly, 2005), ‘collective intelligence’ emerging from the contribution of many has been discussed as a promising phenomenon (Surowiecki, 2004) that requires further investigation. With respect to the task in question here, the annotation of KLRs could be moved from a small number of authors to a potentially much larger number of users with what has come to be called collaborative tagging.

Typically, users can add free-text descriptions to web resources and additionally they can select descriptions from a list of the most frequently used descriptions, for example in delicious10 or connotea.11 In doing so, generally meaningful tags used by many people can be distinguished from personal or specialised tags used by only a few people (Golder and Huberman, 2005; Guy and Tonkin, 2006). The first group of tags might be more interesting for organisations, because this group represents the common agreement within the respective community and can be used to describe resources, here KLRs, on an institutional level. In that sense, the number of top-ranked tags can be interpreted as a measure of the ‘semantic breadth’ of a resource (Cattuto et al., 2007b). Thus, the first proposition that will be tested in the experiment described in Section 4 is “whether taggers commit to a stable set of generally meaningful tags that can be securely separated from specialised tags used by only a few users”.

Organisations look for appropriate and durable descriptions of KLRs. Therefore, in addition to commitment to generally meaningful tags, annotated tags should converge to a stable set of tags with (almost) no more additions of keywords. In an organisational setting with an almost stable number of contributors and a slowly growing number of topics, the number of tags used in a given domain might converge. This convergence might be interpreted as a sign of maturity of a certain topic, which in turn could trigger measures that bring the underlying tagged resources as well as the contributors, the underlying social network or community, to a higher level of maturity (Maier and Schmidt, 2007). The second proposition tested in the experiment is “whether the number of different tags for one resource converges to a stable set in an organisational setting”.

3 Life cycle of institutionalised tagging

Besides the advantages of tagging, there are also some problems. First, the probability of noise in the tags is high and rises over the time of usage (McGregor and McCulloch, 2006). The share of erroneous tags also rises and can be between 28% and 40% (Guy and Tonkin, 2006). General problems of structuring electronic resources without a controlled vocabulary are homonyms, synonyms, plurals and different levels of aggregation of the descriptions (Golder and Huberman, 2005). Considering all these problems, the precision of annotation declines and eventually the risk of insufficient quality in the sense of fitness-for-use increases over time. Klamma et al. (2006) identify quality as one of the key success factors for using social software in professional learning and, therefore, quality should be ensured by an institutionalised process.

In analogy with other life cycles in information technology (Balzert, 1998), a life cycle for institutionalised tagging should be closely monitored to manage, ensure quality and guide usage of electronic resources. The theoretical framework is based on the concept of Seeding, Evolutionary growth and Reseeding (SER) for collaborative design of content and systems introduced by Fischer et al. (2001) and on the knowledge maturity model introduced by Maier and Schmidt (2007). The SER model is an evolutionary model that starts with seeding, in which domain experts initialise the process with the existing domain knowledge. In the following evolutionary growth process, users add information to the artefacts as they use them until there is a need for reorganisation. In the last process, reseeding, information is reformulated so that it fits the new or changed requirements (Fischer et al., 2001).

In the case of tagging, the seeding process can be carried out by introducing a KLR in an organisation. The initial set of metadata can be created by the author, by domain experts or by using automatic techniques. The seeding set of metadata is important for future development, because it might influence users’ choices and enhance the visibility in organisational retrieval systems. Owing to seeding, these elements appear in the list of most frequently used tags, which are highly visible to users. In the usage process, users then tend to apply recently used tags more frequently than older tags (Cattuto et al., 2007b). By defining the initial set, the organisation might be able to influence the tags’ future representations, like level of granularity or technical terms. These deliberations lead to the third proposition, which is: “The initial set of tags significantly influences the converged set of top ranked tags”.

After seeding, users formulate their daily experiences with KLRs and the set of tags increases in evolutionary growth. After a certain period of time, the number of different tags should converge to a stable set (Proposition 2) with a potentially huge amount of noise, i.e., tags that are reused hardly or not at all. Finally, tags could be harvested and a new seeding in a new community could be realised.

The SER process can occur multiple times consecutively during the continuous process of development of KLRs and the environment they are used in. To conceptualise this, the idea of the SER model can be applied to collaborative tagging as shown in Figure 1.

Figure 1 Basic organisational design of collaborative tagging (see online version for colours)

The three steps of SER can be applied repeatedly. Tagging starts with KLRs without descriptions or with descriptions of unknown suitability as the seed of the tagging process. Then, the evolutionary process, in our case tagging of KLRs, takes place. In an institutionalised process, it is proposed to use up to a predefined number of top-ranked tags per KLR and discard the rest of the tags. KLRs thus might be suitably described for a certain period of time. However, after a certain period of evolutionary growth, either no useful annotations for the resource could be gained or the gained annotations do not fit anymore. In both cases, reseeding and a further collaborative tagging process are necessary. In the first case, reseeding in a greater community or even in a changed setting might be useful. In the second case, selected descriptions that were discarded before might be used for reseeding. By modifying the initial set of metadata, usage and therefore further annotations might be influenced. Thus, annotations can be continuously improved and adapted to new or changing requirements or to changes in the contents themselves, and the reseeded collaborative tagging process remains sensitive to new influences.

The collaborative tagging process can be regarded as part of a larger knowledge maturing process (Figure 2). Changeovers in knowledge maturing can be seen as fundamental events in the development of a certain topic with respect to the community owning that topic. The knowledge maturing process consists of the four steps of expressing ideas, distribution in communities, formalisation and ad hoc learning (Maier and Schmidt, 2007). During each of these steps, collaborative tagging can facilitate the knowledge maturing process. Improvements in KLRs can be fostered by annotations accepted throughout the community of users. Lifting KLRs onto a new stage of maturity implies a new seed and a restart of the tagging process. Thus, the basic organisational design of collaborative tagging (Figure 1) can be seen as one evolutionary step in knowledge maturing (Figure 2). As in biological evolution, not every trial has to be successful and thus reseeding on the same level of maturity could be necessary.

Figure 2 Life cycle of collaborative tagging supporting knowledge maturing (see online version for colours)

By applying the proposed life cycle, suitable descriptions for KLRs could be gained and thus reusability of resources increased. To achieve this, the three premises formulated above are:

• commitment to securely separate generally accepted tags from specialised tags

• convergence to ensure a stable set of tags that is suitable for reuse of KLRs and consequently increases productivity of dealing with KLRs

• coordination of the semantic ‘direction’ of descriptions, thus influencing tagging done by the seeding of collaborative tagging processes in order to accelerate the processes of commitment and convergence.

4 Collaborative tagging experiment

The authors conducted a multi-round lab experiment to shed some light on the three issues of commitment, convergence and coordination and thus the actual workings of collaborative tagging. The experiment was realised within a course on introduction to information systems between May and June 2007 at the Innsbruck University School of Management. Altogether, 174 Bachelor students of business and economics participated. The experiment was realised in two experimental series with seven course groups of between 20 and 28 students each.

Each student had to tag ten KLRs (two videos, three presentation slides, three screen shots and two pictures). As is common in many available tagging tools, the most frequently used tags were presented and could be chosen easily. Also, additional tags could be entered. Technically, the study was realised with a questionnaire tool. With this tool, the students could check one or more tags from the five most frequently used tags or enter their own tags into a text box. Any combination of own and given tags was possible, but at least one tag per resource had to be assigned. The set of the five most frequently used tags was calculated after every round, not after every new individual input.

Eight KLRs were assigned with two semantically different initial sets of five tags in order to examine the influence on the semantic direction of description. ‘Semantic direction’ in this case refers to either content- or context-oriented description of the KLRs. Thereby, tags in the first experimental series provided a description of the content, like topic or suitable terminology. In contrast to that, the second experimental series provided descriptions of the KLRs’ contexts, like author or language. Additionally, two resources were assigned with identical initial sets of tags. One resource was seeded with, from the authors’ point of view, unsuitable tags.

Overall, 4246 tags were assigned to the ten resources, and 404 different tags occurred. 10% of the given tags were replaced by new tags in the first experimental series and 20% of the given tags were replaced in the second series. Generally useful tags (high number of assignments) and individual tags could clearly be separated. The distribution of tags assigned to the ten KLRs is summarised in Table 1. The tags are sorted by the relative and absolute frequency of assignments given by the students.

Table 1 Distribution of tags for the second experimental series (number of assignments and share of all assignments per resource)

| Rank | Resource 1 | Resource 2 | Resource 3 | Resource 4 | Resource 5 |
|---|---|---|---|---|---|
| 1 | 70 (23.8%) | 58 (22.1%) | 68 (29.1%) | 92 (29.5%) | 69 (28.0%) |
| 2 | 60 (20.4%) | 55 (21.0%) | 56 (23.9%) | 52 (16.7%) | 47 (19.1%) |
| 3 | 57 (19.4%) | 53 (20.2%) | 40 (17.1%) | 51 (16.3%) | 44 (17.9%) |
| 4 | 37 (12.6%) | 34 (13.0%) | 25 (10.7%) | 46 (14.7%) | 37 (15.0%) |
| 5 | 37 (12.6%) | 12 (4.6%) | 25 (10.7%) | 41 (13.1%) | 22 (8.9%) |
| 6 | 5 (1.7%) | 8 (3.1%) | 3 (1.3%) | 7 (2.2%) | 3 (1.2%) |
| 7 | 3 (1.0%) | 4 (1.5%) | 2 (0.9%) | 4 (1.3%) | 3 (1.2%) |
| Others | 25 (8.5%) | 38 (14.5%) | 15 (6.4%) | 19 (6.1%) | 21 (8.5%) |
| Sum | 294 (100%) | 262 (100%) | 234 (100%) | 312 (100%) | 246 (100%) |

| Rank | Resource 6 | Resource 7 | Resource 8 | Resource 9 | Resource 10 |
|---|---|---|---|---|---|
| 1 | 69 (25.2%) | 82 (34.7%) | 76 (28.6%) | 75 (34.2%) | 79 (30.7%) |
| 2 | 57 (20.8%) | 51 (21.6%) | 57 (21.4%) | 54 (24.7%) | 62 (24.1%) |
| 3 | 46 (16.8%) | 33 (14.0%) | 56 (21.1%) | 31 (14.2%) | 30 (11.7%) |
| 4 | 40 (14.6%) | 30 (12.7%) | 25 (9.4%) | 30 (13.7%) | 30 (11.7%) |
| 5 | 38 (13.9%) | 22 (9.3%) | 23 (8.6%) | 15 (6.8%) | 29 (11.3%) |
| 6 | 13 (4.7%) | 3 (1.3%) | 4 (1.5%) | 1 (0.5%) | 4 (1.6%) |
| 7 | 3 (1.1%) | 2 (0.8%) | 3 (1.1%) | 1 (0.5%) | 3 (1.2%) |
| Others | 8 (2.9%) | 13 (5.5%) | 22 (8.3%) | 12 (5.5%) | 20 (7.8%) |
| Sum | 274 (100%) | 236 (100%) | 266 (100%) | 219 (100%) | 257 (100%) |

The keywords in positions 1–5 or, in the case of resource 6, 1–6 consistently have relatively high frequencies of occurrence when compared with the following positions. The remaining tags are given as one sum; each of them has an average frequency of occurrence close to 1. The threshold for separation can be placed at the sharp decline of the frequency of occurrence. Table 1 shows the data from the second experimental series. However, the distribution over both experimental series is quite similar.

In the presented case, a threshold of 5% of assignments can be used to separate the generally accepted tags from the specialised tags. This means that if a tag has a relative frequency of occurrence greater than or equal to 5%, then it is classified as a generally accepted tag; otherwise, it is classified as a specialised tag. By applying that criterion, between 5 and 6 generally accepted keywords can be extracted. These represent 90% of assigned tags on average. Alternatively, a 0.9 quantile can be used, with the consequence that in some cases tags with a relative frequency of occurrence of about 1% are part of the generally accepted set of tags.
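The 5% criterion can be sketched in a few lines of Python. The function name `split_by_threshold` is an illustrative assumption. The counts are those of resource 1 in Table 1; the keyword labels are placeholders because the table reports ranks only, and the 25 assignments summarised under "others" are modelled as 25 tags used once each, which is an assumption based on the reported average frequency close to 1.

```python
from typing import Dict, List, Tuple


def split_by_threshold(counts: Dict[str, int],
                       threshold: float = 0.05) -> Tuple[List[str], List[str]]:
    """Separate generally accepted tags from specialised tags.

    A tag is generally accepted if its relative frequency of occurrence
    is greater than or equal to the threshold.
    """
    total = sum(counts.values())
    accepted, specialised = [], []
    # Sort by descending number of assignments (stable for equal counts).
    for keyword, n in sorted(counts.items(), key=lambda item: -item[1]):
        if n / total >= threshold:
            accepted.append(keyword)
        else:
            specialised.append(keyword)
    return accepted, specialised


# Resource 1 of Table 1; labels are placeholders (assumption).
counts = {f"rank_{i}": n
          for i, n in enumerate([70, 60, 57, 37, 37, 5, 3], start=1)}
# "Others": 25 assignments, modelled as 25 single-use tags (assumption).
counts.update({f"other_{i}": 1 for i in range(1, 26)})

accepted, specialised = split_by_threshold(counts)
share = sum(counts[k] for k in accepted) / sum(counts.values())
print(accepted)
print(f"{share:.1%} of all assignments")
```

For this resource the five top-ranked tags pass the threshold and cover 261 of the 294 assignments, which is in line with the roughly 90% reported on average.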

In the experimental series, a stabilisation of the set of Top 5 tags could be noticed. The number of changes in the Top 5 ranking decreased monotonically. As shown in Figure 3, in the first round of the two experimental series, 40 and 27 changes out of 50 possible changes were observed. In contrast, 15 and 14 changes happened in the third round. Additionally, the number of new tags in the Top 5 also decreased and had a value of zero after two rounds. Thus, the set of Top 5 tags was fixed after two rounds and only the rankings changed after that.
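One way to count such changes is sketched below. The sketch assumes that a change is a rank position whose tag differs from the previous round, which would be consistent with 50 possible changes for ten resources with five positions each; the text does not spell out the counting rule, so this reading, the function name `ranking_changes` and the sample lists are assumptions.

```python
from typing import List, Tuple


def ranking_changes(previous: List[str], current: List[str]) -> Tuple[int, int]:
    """Compare two Top-k lists of one resource.

    Returns the number of rank positions whose tag changed and the number
    of tags that newly entered the list.
    """
    changed = sum(1 for old, new in zip(previous, current) if old != new)
    # Positions that exist in only one of the two lists also count as changed.
    changed += abs(len(previous) - len(current))
    entrants = len(set(current) - set(previous))
    return changed, entrants


# Hypothetical Top 5 lists of one resource in two consecutive rounds.
before = ["slides", "erp", "process", "author", "german"]
after = ["erp", "slides", "process", "database", "german"]
print(ranking_changes(before, after))
```

Summing the first value over all ten resources gives the number of changes per round, while the second value corresponds to the number of new tags in the Top 5.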

Figure 3 Changes in the ranking of TOP-5 resources (see online version for colours)

Figure 4 depicts the number of distinct tags12 for all resources of the second experimental series13 in relation to all (text-based and clicked) entered tags. The bold diagonal line represents the idealised run of the curve if every tag was unique. The slope of graphs for all resources decreases and the curves are flat when compared with the idealised graph. But nevertheless, the number of distinct tags still increases.
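The curve described here can be computed with a short Python sketch. The function name `distinct_tag_curve` and the sample sequence are illustrative assumptions; the input is the chronological list of all entered tags (typed or clicked) for one resource.

```python
from typing import Iterable, List


def distinct_tag_curve(assignments: Iterable[str]) -> List[int]:
    """Number of distinct tags after each entered tag."""
    seen = set()
    curve = []
    for keyword in assignments:
        seen.add(keyword)
        curve.append(len(seen))
    return curve


# Hypothetical sequence of entered tags for one resource.
entered = ["ppt", "slides", "ppt", "erp", "ppt"]
curve = distinct_tag_curve(entered)
ideal = list(range(1, len(entered) + 1))  # diagonal: every tag is unique
print(curve)
print(ideal)
```

The gap between the two lists grows whenever an existing tag is reused, which corresponds to the flattening of the curves relative to the diagonal.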

As Cattuto et al. indicated, the rate of new tags decreases monotonically and might become zero in an unlimited setting. But they also mentioned that the set of possible keywords in reality as well as in experimental settings is too big for reaching this point (Cattuto et al., 2007a). This leads to the question: What is the sufficient level of convergence for harvesting generally accepted tags?

Figure 4 Relation of distinct tags to number of tags (see online version for colours)

Summing up, the observations depicted in Figure 3 and Figure 4 lead to the assumption that the annotated tags converge to a stable set and that, from a certain point in time, changes are so marginal that they have no influence on the ranking. The number of tags entered additionally by students varied across the groups between 1.6 and 4.0. Overall, on average 2.8 tags were additionally entered.

The number of different tags rose in both experimental series, but the growth decelerated in both cases. To compare our findings with related work, a ratio based on Cattuto et al. (2007a) was calculated. The vocabulary growth exponent γ = log(new tags)/log(distinct tags) was computed, and its distribution P(γ) for the experimental setting is depicted in Figure 5. In contrast to the findings of Cattuto et al. (2007a), which are based on an analysis of the community of delicious users, no normal distribution of the curves can be observed. Furthermore, the values for γ are much smaller than the values reported by Cattuto et al. This might indicate that γ, and thus the time needed for convergence, can be influenced by the interface of the tagging tool or by the research setting. Following this assumption, the design of a setting could influence the number of taggers required to reach a certain level of convergence. This point, however, is part of our future research and beyond the scope of this paper.

Many variants of abbreviations and compound terms that semantically describe the same concept, such as Power_Point_Presentation, PowerPoint, Power_Point or PPT, were entered but counted as individual terms within this experiment. Furthermore, tags were given in both German and English. Because of the huge number of combinable terms, it is practically very difficult to show complete convergence to a stable set of tags. Nevertheless, the indicators mentioned above show that convergence takes place and that changes in the dominant set decrease over time. As the set of tags grows and stabilises, it becomes more and more insensitive to new influences and trends. For that reason, reseeding was proposed in Section 3. The questions of when KLRs should be reseeded and when the set is large enough to yield valuable metadata are still unsolved and subject to further research.

Figure 5 Distribution of vocabulary growth exponent (see online version for colours)

From a qualitative point of view, the textual input of the two experimental series was quite different. In the series with start tags describing content, further tags describing the content of KLRs were entered, and in the series with start tags describing context, further tags describing context were entered. Overall, for the same resources, the textual input differed between the two series by 71% to 93%, with 81% on average. That is, on average 81% of the tags were different between the series. The 19% of tags that were identical on average were mainly words visible in the resource itself, such as headlines.
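The way the share of differing tags was calculated is not given in detail. One plausible reading, stated here purely as an assumption, is the share of distinct tags that occur in only one of the two series, relative to all distinct tags entered for the same resource in either series. The following Python sketch implements this reading; the function name and the example tags are hypothetical.

```python
from typing import Iterable


def share_of_differing_tags(series_a: Iterable[str], series_b: Iterable[str]) -> float:
    """Share of distinct tags that occur in only one of the two series.

    Assumption: the share is taken over the union of distinct tags,
    i.e., it equals one minus the Jaccard similarity of the two sets.
    """
    a, b = set(series_a), set(series_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


# Hypothetical textual input for the same resource in both series.
content_series = ["tagging", "metadata", "ontology", "introduction"]
context_series = ["lecture", "winter_term", "bachelor", "introduction"]

difference = share_of_differing_tags(content_series, context_series)
```

In this example, only the tag visible in the resource itself ("introduction") is shared, so six of the seven distinct tags differ. Other definitions, for instance counting repeated entries rather than distinct tags, would give different percentages.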

With respect to those two resources that were initially described with identical tags, the final Top 5 sets of tags contained the same keywords and differed only marginally in the ranking. These observations support the third proposition on coordination, i.e. the formation of the starting set of tags indeed can influence the semantic direction of the metadata descriptions.

The resource with the unsuitable starting set received above-average textual input from the students (12.8 textual inputs vs. 6.1 textual inputs for the other resources). Furthermore, the number of additionally entered tags appearing in the Top 5 was much higher than on average. Nevertheless, some unsuitable tags were still in the Top 5. Possibly, students tried to somehow relate the unsuitable tags to the resources and therefore they were not unbiased. This would further strengthen the assumption that the starting set of tags strongly influences tagging behaviour.

From the authors’ point of view, the experiment provides first empirical evidence of the following trends, which should be verified in further experiments.

• Convergence of the set of tags cannot be shown conclusively, but several findings indicate a tendency towards convergence.

• Commitment was confirmed, i.e., separation of generally accepted and individual tags seems to be realisable.

• The thesis on coordination was also confirmed, i.e., the design of the starting set of tags seems to influence the direction of describing the resources.

5 Conclusion

This paper has reviewed aspects of collaborative tagging and proposed a systematic process for deploying collaborative tagging in organisational settings for the annotation of KLRs. The process is based on an integration of the SER model with the knowledge maturing model. Three propositions on commitment, convergence and coordination were tested in a multi-round experiment. Whereas the propositions on commitment and coordination were supported, it is as yet unclear whether the set of keywords assigned to digital resources converges. Even though it is intuitively convincing that convergence should eventually happen, this might take a long while and thus may not be a property that can be relied upon in organisational settings with limited time for sensibly annotating KLRs. The opposite case, in which few or no new tags are assigned, might also be a common phenomenon for some KLRs in organisational settings. Based on the findings here, one might suggest that if the number of posts is too low after a certain period of time and, as a consequence, γ is still close to 1, the resource could be reseeded in a larger community or discarded for lack of interest.

We are only at the beginning of designing collaborative tagging processes in an institutionalised organisational setting. Arguably, the level of control and rigidness typically found in organisational settings contrasts with the uncontrolled processes found in collaborative tagging settings that target the entire internet. Comparing our results with those of Cattuto et al. (2007a), a key factor explaining the differences could be the much greater homogeneity of the users in our experiments in terms of educational background (same school, same program of study) compared with the supposed diversity of delicious users. The much lower γ values in our experiments, which reflect the rather homogeneous group of students of business and economics, could however be much closer to the γ values to be expected in an organisational setting of institutionalised collaborative tagging, as backgrounds within organisations are rather similar when compared with the diversity of delicious users. For KLRs that are used in only a portion of an organisation, e.g., in a community of interest, homogeneity might be even higher. Thus, based on the findings presented above, collaborative tagging promises to be an appropriate solution to the problem of annotating KLRs in organisations when designed carefully. The design comprises the determination of the ‘right’ moment for gaining annotations (when does the process converge sufficiently?), criteria for selecting the set of metadata (which metadata should be dropped?) and the ‘right’ moment for reseeding (when should the process of collaborative tagging be restarted with a modified starting set?).

More generally, knowledge maturing is a metaphor deemed suitable for structuring real-world phenomena of dealing with knowledge in businesses and organisations and for systematically elaborating technical solutions. Collaborative tagging is part of the maturing level of communities and might be followed by a formalisation step that allows for distribution of KLRs beyond the current community of users. Furthermore, KLRs might also be didactically refined to better support (ad hoc or self-guided) learning processes in organisations. In this larger scenario, collaborative tagging might provide important indicators for selecting those KLRs that seem to be worth the effort needed for formalising and didactically refining KLRs.

References

Alonso, G., Casati, F., Kuno, H. and Machiraju, V. (2004) Web Services. Concepts, Architectures and Applications, Springer, Berlin.

Balzert, H. (1998) Lehrbuch der Software-Technik, Band 2, Spektrum Akademischer Verlag, Heidelberg.

Bauer, M., Maier, R. and Thalmann, S. (2008) Metadata Generation for Learning Objects: An Experimental Comparison of Automatic and Collaborative Solutions, MKWI 2008, Springer, München.

Cardinaels, K., Meire, M. and Duval, E. (2005) ‘Automatic metadata generation: the simple indexing interface’, The International World Wide Web Conference (WWW05), ACM, Chiba, Japan, pp.548–556.

Cattuto, C., Baldassarri, A., Servedio, V. and Loreto, V. (2007a) Vocabulary Growth in Collaborative Tagging Systems, URL: http://arxiv.org/PS_cache/arxiv/pdf/0704/0704.3316v1.pdf, last access 2008-07-06.

Cattuto, C., Loreto, V. and Pietronero, L. (2007b) ‘Semiotic dynamics and collaborative tagging’, PNAS, Vol. 104, pp.1461–1464.

Collins, H. (2003) Enterprise Knowledge Portals, AMACOM, New York.

Drucker, P.F. (1994) ‘The age of social transformation’, The Atlantic Monthly, Vol. 274, pp.53–80.

Duval, E. and Hodgins, W. (2003) ‘A LOM research agenda’, 12th International World Wide Web Conference, Budapest, pp.659–671.

Duval, E. and Hodgins, W. (2004) ‘Metadata matters’, International Conference on Dublin Core and Metadata Applications, Shanghai, pp.11–14.

Fensel, D. (2004) Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, Springer, Berlin.

Firestone, J.M. (2003) Enterprise Information Portals and Knowledge Management, KMCI-Press, Amsterdam.

Fischer, G., Grudin, J., Mccall, R., Ostwald, J., Redmiles, D., Reeves, B. and Shipman, F. (2001) ‘Seeding, evolutionary growth, and reseeding: the incremental development of collaborative design environments’, in Olson, G.M., Malone, T.W. and Smith, J.B. (Eds.): Coordination Theory and Collaboration Technology, Mahwah, Erlbaum Publishers, New Jersey, USA, pp.447–472.

Golder, S.A. and Huberman, B.A. (2005) The Structure of Collaborative Tagging Systems, Palo Alto, Information Dynamics Lab, HP Labs, CA, USA.

Gruber, T.R. (1993) ‘A translation approach to portable ontology specifications’, Knowledge Acquisition, Vol. 5, pp.199–220.

Guy, M. and Tonkin, E. (2006) ‘Folksonomies: tidying up tags?’, D-Lib Magazine, Vol. 12, URL: http://www.dlib.org/dlib/january06/guy/01guy.html, last access 2008-07-06.

Heuer, A. and Saake, G. (2000) Datenbanken: Konzepte und Sprachen, mitp, Bonn.

Hodgens, W.H. (2002) ‘The future of learning objects’, in Wiley, D.A. (Ed.): The Instructional Use of Learning Objects: Online Version, 2000, URL: www.reusability.org/read/chapters/hodgins.doc, last access 2008-07-06.

Holsapple, C.W. (2003) Handbook on Knowledge Management, Springer, Berlin.

IEEE (2002) Draft Standard for Learning Object Metadata (LOM), IEEE, New York.

Jennex, M.E. (2005) Case Studies in Knowledge Management, Hershey, PA, USA.

Klamma, R., Amine Chatti, M., Duval, E., Fiedler, S., Hummel, H., Hvannberg, E.T., Kaibel, A., Kieslinger, B., Kravcik, M., Law, E., Naeve, A., Scott, P., Specht, M., Tattersall, C. and Vuorikari, R. (2006) ‘Social software for professional learning: examples and research issues’, 6th IEEE International Conference on Advanced Learning Technologies ICALT, IEEE Computer Society, Kerkrade, The Netherlands, pp.912–915.

Krafzig, D., Banke, K. and Slama, D. (2005) Enterprise SOA: Service-Oriented Architecture Best Practices, Prentice Hall International, Upper Saddle River.

Lyman, A. and Varian, B. (2003) How Much Information? Berkeley, University of California, Berkeley, California, USA.

Lytras, M.D., Russ, M., Maier, R.K. and Naeve, A. (2008) Knowledge Management Strategies: The Handbook of Applied Technologies, Idea Group, Hershey, PA, USA.

Maier, R. (2007) Knowledge Management Systems: Information and Communication Technologies for Knowledge Management, Springer, Berlin.

Maier, R. and Hädrich, T. (2006) ‘Centralized versus peer-to-peer knowledge management systems’, Knowledge and Process Management – The Journal of Corporate Transformation, Vol. 13, pp.47–61.

Maier, R., Hädrich, T. and Peinl, R. (2005) Enterprise Knowledge Infrastructures, Springer, Berlin.

Maier, R. and Sametinger, J. (2007) ‘A top-level ontology for smart document access’, in Stary, C., Barachini, F. and Hawamdeh, S. (Eds.): International Conference on Knowledge Management, Singapore, pp.153–164.

Maier, R. and Schmidt, A. (2007) ‘Characterizing knowledge maturing: a conceptional model integrating E-Learning and knowledge management’, 4th Conference of Professional Knowledge Management (WM07), Potsdam, pp.325–333.

Maier, R. and Thalmann, S. (2007) ‘Describing learning objects for situation-oriented knowledge management applications’, in Gronau, N. (Ed.): 4th Conference of Professional Knowledge Management (WM07), GITO Verlag, Potsdam, pp.343–351.

Marks, E.A. and Bell, M. (2006) Service-Oriented Architecture (SOA): A Planning and Implementation Guide for Business and Technology, Wiley, New York.

Mcgregor, G. and Mcculloch, E. (2006) ‘Collaborative tagging as a knowledge organisation and resource discovery tool’, Library Review, Vol. 55, pp.291–300.

Motelet, O., Baloian, N. and Pino, J.A. (2006) Learning Object Metadata and Automatic Processes: Issues and Perspectives, Informing Science Press, Santa Rosa, CA.

Motelet, O., Baloian, N. and Pino, J.A. (2007) ‘Hybrid system for generating learning object metadata’, Journal of Computers, Vol. 2, pp.34–42.

Nilsson, M., Palmér, M. and Naeve, A. (2002) ‘Semantic web meta-data for e-learning – some architectural guidelines’, 11th World Wide Web Conference (WWW2002), Honolulu, Hawaii, USA.

O’Reilly, T. (2005) What Is Web 2.0. Design Patterns and Business Models for the Next Generation of Software, URL: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html, last access 2008-07-06.

Pawlowski, J.M. and Bick, M. (2006) ‘Managing & re-using didactical expertise: the didactical object model’, Educational Technology and Society, Vol. 9, pp.84–96.

Pinkwart, N., Jansen, M., Oelinger, M., Korchounova, L. and Hoppe, U. (2004) ‘Partial generation of contextualized metadata in a collaborative modeling environment’, 2nd International Workshop on Applications of Semantic Web Technologies for E-Learning – International Conference on Adaptive Hypermedia and Adaptive Web-based Systems AH 2004, Eindhoven, The Netherlands, pp.211–220.

Rollett, H., Lux, M., Strohmaier, M., Dösinger, G. and Tochtermann, K. (2007) ‘The web 2.0 way of learning with technologies’, International Journal of Learning Technology, Vol. 3, pp.87–107.

Shipman, F. and Mccall, R. (1994) ‘Supporting knowledge-base evolution with incremental formalization’, in Adelson, B. (Ed.): Conference on Human Factors in Computing Systems, ACM Press, Boston, pp.285–291.

Staab, S., Studer, R., Schnurr, H-P. and Sure, Y. (2001) ‘Knowledge processes and ontologies’, IEEE Intelligent Systems and their Applications, Vol. 16, pp.26–34.

Surowiecki, J. (2004) The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations Little, B&T Press, Mansfield.

Tannenbaum, A. (2002) Metadata Solutions: Using Metamodels, Repositories, XML, and Enterprise Portals to Generate Information on Demand, Addison-Wesley, Boston.

Wiley, D.A. (2001) ‘Connecting learning objects to instructional design theory: a definition, a metaphor, and a taxonomy’, in Wiley, D.A. (Ed.): The Instructional Use of Learning Objects: Online Version, URL: http://reusability.org/read/chapters/wiley.doc, last access 2008-07-06.

Wolff, E.N. (2005) ‘The growth of information workers’, Communications of the ACM, Vol. 48, pp.37–42.

Notes

1 http://dublincore.org/
2 http://www.exif.org/
3 http://www.id3.org/
4 http://www.imsglobal.org/profiles/index.html
5 http://www.imsglobal.org/learningdesign/index.html
6 http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf
7 http://www.odfalliance.org/press/Release%2020060503.pdf
8 http://www.imc.org/pdi/vcardoverview.html
9 http://www.imc.org/pdi/vcaloverview.html
10 http://del.icio.us
11 http://www.connotea.org/
12 Values are interpolated.
13 Values of the first series are similar, but the number of tags is smaller.