Databases as citadels in the web 2.0


DATABASES AS CITADELS IN THE WEB 2.0

/MARTIN WARNKE

NETWORK WEB NODES SCALE INTERNET NETWORKS LINKS NUMBER POWER FREE DATABASES CONNECTED


What if the promises of the Web 2.0 – grassroots democracy – were pure ideology? What if the content we generate also generated massive inequality: power for the very few over the many of us? What if this were annoying and, at the same time, unavoidable?

There is a paradoxical development taking place on the World Wide Web. It consists, on the one hand, of the truly mass-medial use of the web, in which effectively everyone in developed industrialized countries is taking part, a comprehensive popularization if you will, and, on the other hand, of the fact that the places on the net where such communication practices take place are themselves extremely concentrated. Essentially, everyone meets at very few places on the web. And these locations are, one and all, either private or unregulated by nation states. This communal experience is realized by singular institutions whose model seems to be absolutism rather than government by the people. And the palaces of these absolute, mostly private rulers have the technological shape of databases and the military shape of citadels. Whatever you might think of this, it is necessarily the case.

The Promise of the Web 2.0

Fig. 1. Source: http://yarikson.files.wordpress.com/2008/04/web-20-scheme.png.

77THEORY OF SOCIAL MEDIA

The public evidently has quite a different opinion, as can be seen in this perfectly ordinary graphic (Fig. 1): in Web 2.0 there is an outbreak of freedom and joy. Instead of media companies controlling the flows of information, making some happy and many sad, in Web 2.0 our peers have taken over, a word that confusingly refers both to those of equal status and to members of the English aristocracy. The impression, however, is that the many happy faces under a rainbow promise the collective ecstasy of the miracle of Pentecost amid our friends. After all, they control the flow of information themselves by making use of ‘tools’, serviceable means.

And so the many form the body of the Web 2.0, which could remind one of Thomas Hobbes, but we’ll return to that later. The phantasm of the Web 2.0 is one of self-determination, of communion, even of communism, the association of free individuals. It is the promise of equality and – with apologies to Jürgen Habermas – of an egalitarian discourse. But nothing could be farther from the truth.

The Science of Networks

Stable, scalable, very large networks must always have a topology with a highly unequal distribution of links, so that there can be no talk of equality in such constructions. I am drawing here on the work of Albert-László Barabási in Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life,[1] which can be highly recommended for the overview it provides of network theory. Barabási introduces his readers to the emergence of networks with the example of a flight attendant whose numerous sexual contacts around the world contributed considerably to the spread of AIDS.[2] The nodes of the network being considered here are the individuals involved; the link is the sexual contact. The nodes with their links create the network that we will now investigate. The reason this flight attendant became so prominent is that he alone was responsible for a quarter of the roughly 250 first AIDS patients to be registered. He was one of the few with an exceptionally large number of contacts; many of those infected had had contact with him or with one of his partners.

There are many examples of such highly networked individuals. To take another example, in a database of actors and actresses[3] we can see who has appeared with whom in the same film.[4] And now we can ask: how many degrees separate one actor from another? The result is surprisingly small; the answer is three. Typically, any actor or actress is connected to any other by three links, each link corresponding to a joint appearance in a film. If, as an example, we take Kevin Costner and Helmut Qualtinger, we can test this ourselves. What do these two gentlemen have to do with one another? The answer is, in spite of all differences in appearance, a lot. Because Costner appeared with Sean Connery in The Untouchables, and Connery with Qualtinger in The Name of the Rose, they are only two links apart. Another, more extreme example provides us with the highest value: Werner Krauss as Dr. Caligari in the 1920 silent film The Cabinet of Dr. Caligari is three films away from Sam Worthington in James Cameron’s Avatar of 2009. What is at first sight amazing is the minimal distance between the selected actors, considering the meager connectivity of the individuals, which for a third of all actors is less than ten.

1. Albert-László Barabási, Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life, New York: Plume, 2003.

2. Barabási, Linked, p. 123.

3. See, http://oracleofbacon.org.

4. Barabási, Linked, p. 60.
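The Costner and Qualtinger example can be retraced with a breadth-first search over a co-appearance graph. The following is only a minimal sketch: the two casts are reduced to the names mentioned in the text, and the function names `build_graph` and `separation` are illustrative choices, not part of any real film database.

```python
from collections import deque
from itertools import combinations

# Casts reduced to the actors named in the text; a real database such as
# the one behind oracleofbacon.org holds vastly more films and people.
FILMS = {
    "The Untouchables": ["Kevin Costner", "Sean Connery"],
    "The Name of the Rose": ["Sean Connery", "Helmut Qualtinger"],
}

def build_graph(films):
    """Link every pair of actors who appear in the same film."""
    graph = {}
    for cast in films.values():
        for a, b in combinations(cast, 2):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def separation(graph, start, goal):
    """Number of links on a shortest path from start to goal, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

graph = build_graph(FILMS)
print(separation(graph, "Kevin Costner", "Helmut Qualtinger"))  # 2
```

Breadth-first search visits nodes in order of increasing distance, so the first time it reaches the goal it has found the smallest number of links.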

Stanley Milgram – the same one who in 1961 conducted the experiment named after him, which showed that participants were willing, in supposed service to science, to administer seemingly lethal electric shocks to others as long as they were far enough away from the suffering – published in Psychology Today (May 1967) the results of a study[5] on the acquaintance distance between two individuals in the USA. The number here is six. This distance between people on earth became known as ‘six degrees of separation’, and a study from 2007 shows that participants in instant messaging services worldwide are separated by an average distance of 6.6.[6] This world is a small world.

The question now is how to explain this minimal distance between the nodes of such large networks, since the study from 2007 examined a network with 180 million nodes. If we estimate how well connected an individual would have to be in order to achieve a distance of seven in a population of 180 million, dividing 180 million by seven gives us more than 25 million. Not even the most popular guy at the party knows that many people, but he would have to in order to be only seven degrees away from everyone else.

This is actually quite easy to understand: assuming that connectedness is uniform to the degree of k=2 (where k equals the number of links), you would need half as many links as there are nodes to reach the farthest point of a circular network. The diameter of the network then would be N/2 (where N equals the number of nodes). If every node is linked to the node just beyond the one it is immediately connected with, which is a degree of connectivity equal to four, then it is possible to skip a neighboring node and you only need half as many ‘hops’, N/4. The diameter is always the number of nodes divided by the degree of connectivity. And that means that 180 million must be divided by a round 25 million in order to get seven.
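The rule of thumb above (diameter roughly equal to the number of nodes divided by the degree) can be checked on a small circular network. This is a sketch under the assumption that "uniform connectedness" means a ring in which every node is linked to its k/2 nearest neighbours on either side; the helper names are hypothetical, and for sizes that do not divide evenly the result is only approximately N/k.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k/2 nearest neighbours per side."""
    half = k // 2
    return {
        i: {(i + d) % n for d in range(-half, half + 1) if d != 0}
        for i in range(n)
    }

def eccentricity(graph, start):
    """Largest hop count from start to any other node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist.values())

def diameter(graph):
    """Longest of all shortest paths in the network."""
    return max(eccentricity(graph, node) for node in graph)

N = 100
for k in (2, 4, 10):
    # measured diameter next to the estimate N / k
    print(k, diameter(ring_lattice(N, k)), N // k)

# Degree a uniform network of 180 million nodes would need for a diameter of 7
print(round(180_000_000 / 7))
```

For N = 100 the measured diameters are 50, 25 and 10, matching N/k, which is why a uniform network of 180 million nodes would need an implausible degree of more than 25 million per node.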

5. Stanley Milgram, ‘The Small World Problem’, Psychology Today 1.1 (May 1967): 61-67.

6. Jure Leskovec and Eric Horvitz, ‘Planetary-Scale Views on an Instant-Messaging Network’, Microsoft Research Technical Report (June, 2007): 1-28.

Fig. 2. Source: following Barabási 2003, p. 51.


Fig. 3. Source: Leskovec and Horvitz 2007, p. 22.


With a uniform degree of connectedness, and equally with a random one that clusters around a clear average value, such short distances cannot be achieved. It is only when a few nodes are given additional links that distances on the whole become considerably shorter.

A network structure that consists of nodes that are not uniformly connected, with a small number of strongly connected and a large number of weakly connected objects, enables and even requires the qualities being discussed here: a very small diameter with very many nodes, without an extremely high overall degree of connectivity. Most of us potter along among our immediate friends and acquaintances with few but strong ties; a few of us connect these ‘islands’ with weak acquaintance relationships. This at any rate is the claim of Mark Granovetter in his essay on weak ties.[7]

Such very special nodes in a network are called hubs. Only a few are needed to create a network with a small diameter and high cohesion. You may have already guessed that in the World Wide Web our top sites, behind which there are enormous databases, will take on this role.
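How much a single hub can shorten a network may be illustrated with a toy model. The sketch below assumes a plain ring of 200 nodes and one additional, hypothetical hub node that is linked to every tenth node; both the sizes and the function names are chosen for illustration only.

```python
from collections import deque

def ring(n):
    """Closed chain: every node is linked only to its two neighbours."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_hub(graph, hub, targets):
    """Add one extra node and link it to each of the chosen targets."""
    graph[hub] = set(targets)
    for t in targets:
        graph[t].add(hub)

def diameter(graph):
    """Longest shortest path, found by breadth-first search from every node."""
    longest = 0
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        longest = max(longest, max(dist.values()))
    return longest

def mean_degree(graph):
    """Average number of links per node."""
    return sum(len(nbs) for nbs in graph.values()) / len(graph)

N = 200
plain = ring(N)
with_hub = ring(N)
add_hub(with_hub, "hub", range(0, N, 10))  # one hub with 20 long-range links

print(diameter(plain), mean_degree(plain))                  # 100 2.0
print(diameter(with_hub), round(mean_degree(with_hub), 2))  # 12 2.19
```

The diameter drops from 100 to 12 hops while the average degree barely moves, which is the combination described above: a very small diameter without a high overall degree of connectivity.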

Scale-free

In his draft on a distributed communication network, which became the ARPANET and then the internet, Paul Baran differentiated among three types of networks: the star, the tree and the mesh network.[8]

7. Mark S. Granovetter, ‘The Strength of Weak Ties’, American Journal of Sociology 78 (May, 1973): 1360–1380.

8. Paul Baran, On Distributed Communications: IX Summary Overview, Santa Monica: The RAND Corporation, 1964.

Fig. 4. Source: following Barabási 2003, p. 51.


Our trained eye will now recognize in variant (C) a network with a somewhat uniformly distributed degree of connectivity. If we count, we find:

# Links    # Such nodes
2          3
3          8
4          17
5          15
6          3

There are three nodes with two links (upper right and left, lower left), eight with three links, most have four or five links. In a diagram this looks as follows:

What emerges is a roughly normal distribution; the typical degree of connectivity is between four and five. The diameter is approximately ten or eleven hops across the network. Try it out yourself, moving from lower left to upper right!
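The "typical degree of connectivity" can be read off the counts given above. A brief sketch, using only the five table rows from the text:

```python
from math import sqrt

# number of links -> number of nodes with that many links (table above)
degree_counts = {2: 3, 3: 8, 4: 17, 5: 15, 6: 3}

nodes = sum(degree_counts.values())
mean = sum(k * n for k, n in degree_counts.items()) / nodes
variance = sum(n * (k - mean) ** 2 for k, n in degree_counts.items()) / nodes

print(nodes, round(mean, 2), round(sqrt(variance), 2))  # 46 4.15 1.0

# crude text histogram: one '#' per node
for k in sorted(degree_counts):
    print(k, "#" * degree_counts[k])
```

The counts cluster tightly around a mean of about 4.15 links with a spread of about one link, which is what is meant by a network with a characteristic scale.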

Fig. 5. Source: http://www.rand.org/pubs/research_memoranda/2006/RM3767.pdf.

Fig. 6

Not all networks with a very small diameter are of this kind, however, with a characteristic scale, here about four links per node. And the internet does not look like this either: even in the network of all computers connected to the academic internet – and this involves millions – the maximum distance between any two is only about twelve hops.[9] That is not a lot, and it cannot be accomplished with uniform connectedness. A different model has to be found, then, one that does without this characteristic scale. It will be one that has very many nodes with few links and very few nodes with a large number of links. Such networks are called scale-free, because a typical degree of connectivity is missing. The distribution looks more like this:

By the way, this also applies to Shakespeare: if we were to count the words in Shakespeare’s collected works, we would find that 14,376 words occur a single time, 4,343 words twice and 364 as many as ten times. Auxiliary verbs and conjunctions occur very often, and it is wonderful that such words introduce the best known of his monologues: ‘To be or not to be’.[10] There are very few words occurring very often and very many occurring very rarely. Even if Shakespeare had written much more than he did, there is good reason to believe that there would not be a central distribution; instead, the form of the curve would remain the same. This form can be modeled with a power law:

$$ y = a \cdot x^{-k} $$
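On a double-logarithmic scale a power law becomes a straight line with slope -k, so the exponent can be estimated by fitting a line through the logarithms of the data. The following sketch uses only the three Shakespeare word counts quoted above; three points are far too few for a serious estimate, and the function name `fit_power_law` is an illustrative choice.

```python
from math import log10

# (number of occurrences, number of distinct words), as quoted in the text
observations = [(1, 14376), (2, 4343), (10, 364)]

def fit_power_law(points):
    """Least-squares line through (log x, log y); returns (a, k) for y = a * x**-k."""
    xs = [log10(x) for x, _ in points]
    ys = [log10(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return 10 ** intercept, -slope

a, k = fit_power_law(observations)
print(round(a), round(k, 2))

# compare the fitted curve with the quoted counts
for x, y in observations:
    print(x, y, round(a * x ** -k))
```

For these three points the fitted exponent comes out at roughly 1.6; the point of the exercise is only that a single pair of parameters a and k describes counts that span several orders of magnitude.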

Barabási compares the two network types:[11]

9. Martin Warnke, Theorien des Internet, Hamburg: Junius-Verlag, 2011. p. 73.

10. See, http://math.ucdenver.edu/~wbriggs/qr/shakespeare.html.

11. Albert-László Barabási and Eric Bonabeau, ‘Scale-Free Networks’, Scientific American (May, 2003): 50-59.

Fig. 8. Source: Barabási 2003, p. 53.

Fig. 7


On the left-hand side is a highway network and on the right a network of airline routes. Highway intersections do not have an unlimited number of exits; airports, on the other hand, differ in their number of takeoffs and landings. Intersections have a typical number of access points, while the number of takeoffs and landings can vary greatly; there are many small and very few big airports.

Statistical investigations of the internet have shown that the network structure at the level of IP, the router network, follows a power law very closely.[12] How else to explain only twelve hops between European cities? Only by assuming that there are just a few big hubs and many very small network nodes. This immensely shortens the distance between any two network nodes. Massively connected nodes enable great leaps and provide for the overall cohesion of the network. Unevenness is the most important ingredient of scale-free networks, and their hubs are driven with the help of databases.

Stability

One of the most amazing characteristics of scale-free networks is their stability against random disturbances. In his RAND report Paul Baran investigated the behavior of network topologies in the case of a thermonuclear war and discovered that redundancy strengthens the network against destruction. But he studied evenly distributed random networks, not scale-free ones, since these were still unknown. They were first discovered as a result of his work and then more closely investigated.

If scale-free networks are affected by random destruction, the hubs are damaged with the same probability as the unimportant nodes on the far right side of the distribution. But because there are far fewer hubs, it is almost only unimportant nodes that are hit; a scale-free net fragments into isolated islands only when it is completely destroyed.[13] Put more precisely, this is how scale-free networks with an exponent of less than three behave, and this just happens to include the internet. If, however, the hubs are hit, then everything goes down very quickly and the network collapses.

The three figures (Fig. 9-11) show three different scenarios of destruction. If the nodes of a random network are attacked (Fig. 9), then only a very high level of redundancy can help. The scale-free network (Fig. 10) is practically indestructible if it is attacked at random. A targeted attack (Fig. 11), however, quickly has drastic effects. So however resilient the internet may be against attacks, it will have consequences if an entity manages to control the major nodes. This situation can be seen in totalitarian states, such as in China on the occasion of its censoring of search engines. The one hub – the Chinese state – is struggling with another – Google, and the outcome is uncertain. One might suspect at this point that these circumstances give rise to the citadel-like shape of certain network nodes.
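The contrast between random failure and targeted attack can be reproduced in a small simulation. This is a rough sketch, not the procedure behind the figures: it assumes a network of 1,000 nodes grown by linking each newcomer to two existing nodes with a probability proportional to their degree, then removes 10% of the nodes either at random or in order of degree. All sizes, the seed and the function names are illustrative assumptions.

```python
import random
from collections import deque

def preferential_attachment_graph(n, m, rng):
    """Grow a graph; each newcomer links to m nodes picked in proportion to degree."""
    graph = {i: set() for i in range(m + 1)}
    endpoints = []  # every node appears once per link it has
    for i in range(m + 1):  # small fully connected seed
        for j in range(i + 1, m + 1):
            graph[i].add(j)
            graph[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        graph[new] = set()
        for t in targets:
            graph[new].add(t)
            graph[t].add(new)
            endpoints += [new, t]
    return graph

def largest_component_share(graph, removed):
    """Share of the surviving nodes that sit in the largest connected cluster."""
    alive = set(graph) - removed
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in graph[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / len(alive)

rng = random.Random(42)
net = preferential_attachment_graph(1000, 2, rng)
knocked_out = 100  # 10% of all nodes

random_hit = set(rng.sample(sorted(net), knocked_out))
hubs_hit = set(sorted(net, key=lambda v: len(net[v]), reverse=True)[:knocked_out])

after_random = largest_component_share(net, random_hit)
after_attack = largest_component_share(net, hubs_hit)
print("random failure :", round(after_random, 3))
print("targeted attack:", round(after_attack, 3))
```

In runs of this kind the survivors of a random failure remain almost entirely in one connected cluster, while removing the same number of hubs leaves a noticeably smaller cluster and many cut-off nodes.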

12. Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos, ‘On Power-law Relationships of the Internet Topology’, Computer Communication Review 29 (1999): 251.

13. Barabási, Linked, p. 109.

Growth

How do scale-free networks grow? And what happens to them when they are growing? If a network can add new nodes according to a simple rule, which will be introduced in a moment, then limitless growth is possible and it retains its network characteristic, its power law. This simple rule, preferential attachment, prescribes that the probability of a new node attaching to a candidate is proportional to the number of links that the candidate already has:[14] a node preferably attaches itself to a highly connected node. Scale-free networks have their favorites, and most of the newcomers want to go there; it could be called a migration of lemmings, a star culture, a pop culture, the dominance of the taste of the masses or possibly: Favorite Contacts? If a network is generated like this, it is scale-free with an exponent of three.
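The growth rule can be written down in a few lines. The sketch below is one common way to implement preferential attachment, not necessarily the one used by Barabási: every node is stored in a list once per link it has, so a uniform draw from that list picks a node with a probability proportional to its degree. The network size, the two links per newcomer and the seed are arbitrary assumptions.

```python
import random
from collections import Counter

def grow_network(n, m, rng):
    """Return the degree of every node after growing the network to n nodes.

    Each newcomer creates m links; targets are drawn from `endpoints`,
    where every node appears once per link it already has.
    """
    degree = Counter()
    endpoints = []
    for i in range(m + 1):  # small fully connected seed
        for j in range(i + 1, m + 1):
            degree[i] += 1
            degree[j] += 1
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            degree[new] += 1
            degree[t] += 1
            endpoints += [new, t]
    return degree

degree = grow_network(2000, 2, random.Random(1))
histogram = Counter(degree.values())
mean = sum(degree.values()) / len(degree)

print("mean degree:", round(mean, 2))
print("largest hub:", max(degree.values()), "links")
for k in sorted(histogram)[:5]:
    print(k, "links:", histogram[k], "nodes")
```

The output shows the pattern described above: the great majority of nodes keep only a handful of links, while a few early favorites collect many times the average.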

Growth and preferential attachment are both driving forces behind scale-free networks, and that is also true for the internet. This explains why the internet has been able to undergo such breathtaking rates of growth without collapsing into itself: the very large centers attract the greatest share of connectivity and they are the ones best able to handle it. Road traffic would have collapsed long ago if it had had to grow from four to seven hundred million[15] in 40 years, as the number of internet hosts has. Unevenness provides stability.

14. Barabási, Linked, p. 96.

Fig. 9-11. Source: Barabási, Albert-László et al. (2003), pp. 50-59, 57.

Databases: The Citadels of the Web

The distribution of links in the World Wide Web can be estimated from a Nielsen study.[16] In terms of active use, Google reaches about 90% of all WWW users and Facebook 73%, with Wikipedia hovering just over one-third at 38%.

These sites are the major nodes in the web. They administer an enormous number of links; Google, for example, now surely covers several tens of billions of pages and has a correspondingly wide reach. It goes without saying that such an enormous number of links on a website cannot be maintained by hand. The data are stored in gigantic databases; programs create web pages as required from the database contents. This is the obligatory structure of a really large website.

Databases facilitate access by the many; they compensate for a disadvantage of the web as planned by Sir Tim Berners-Lee: with suitable technologies provided by the major database providers, you can take part without having to understand much about technology. A content management system ensures that users can input data through web forms; the data end up in the databases and then, presented in web pages, can be seen by others. This is how Google works, as do Facebook, Wikipedia, Twitter, Flickr, and all the others. And there isn’t any other way to do it, since connecting as many websites as there are now in the WWW is only possible through highly connected, automatically operated centers.
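The principle described above, form input going into a database and pages being generated from its contents, can be shown in miniature. This is a deliberately tiny sketch using Python’s built-in sqlite3 module and an in-memory database; the table layout and the functions `submit_form` and `render_page` are hypothetical stand-ins and bear no relation to how any of the sites named actually work.

```python
import sqlite3
from html import escape

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")

def submit_form(author, body):
    """Stand-in for a web form handler: the user's input lands in the database."""
    db.execute("INSERT INTO posts (author, body) VALUES (?, ?)", (author, body))
    db.commit()

def render_page():
    """Build the page on request from whatever the database currently holds."""
    rows = db.execute("SELECT author, body FROM posts ORDER BY id").fetchall()
    items = "\n".join(
        f"  <li><b>{escape(author)}</b>: {escape(body)}</li>"
        for author, body in rows
    )
    return f"<ul>\n{items}\n</ul>"

# Users contribute without writing any markup themselves.
submit_form("alice", "Hello, web 2.0")
submit_form("bob", "No HTML skills needed <here>")
print(render_page())
```

The user only ever fills in the form; what is stored, how it is escaped and how it is displayed is decided entirely by the operator’s program.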

The equality that had been originally planned has been upset. Only completely unimportant sites still work according to the pattern that was originally planned. The web exists only through its most popular sites, which attract essentially all of the traffic. And, obviously enough, these sites must not be destroyed if the internet’s stability is to be maintained. This function of a highly sheltered place, unreachable by the public and the enemy alike, a space fortified to guarantee the power and prosperity of the sovereign, is in military jargon called a citadel.

The mechanisms of Web 1.0, which were supposedly so non-transparent and autocratic, were overturned. Even citadel web pages can no longer be linked, with only a few exceptions such as Wikipedia. A tweet cannot be referenced from outside Twitter just by using the URL in the address bar.

Within the Twitter citadel a reference in the form of a reply or a retweet is the norm; references on the web are also allowed, but domination over the material of a database-supported website always means that you can only proceed if you follow the rules of the provider. The structure and function of a citadel page are specific and completely under the control of the provider. In Facebook you can only declare someone a friend, not an enemy. Google Ads are excluded by providers, since after all this is a competition for domination on the web, and territories are being divided up. Even in Wikipedia, if you are not high enough in the Wikipedia hierarchy, a page can only be added by following a painstakingly difficult path through a bureaucratic system.

15. See, The Internet Systems Consortium, http://www.isc.org/solutions/survey/history.

16. ‘Top 10 Global Web Parent Companies, Home & Work’, Nielsen, September 2012, http://www.nielsen.com/us/en/insights/top10s/internet.html.

The original idea of the web, participation by everyone under equal conditions, no longer exists. In Web 2.0 everyone is allowed to take part, but only on the terms of the citadel rulers. Without the technical expertise that was needed in the days of Web 1.0, everyone is able to disclose information about themselves and to exchange personal data in return for the services of the database operator. The consequence is that databases are in many respects citadels in Web 2.0: as major nodes they have to hold the web together and so they are extremely well protected, while they exercise unlimited power over content and discourse. Michel Foucault had to write two books to describe this situation. In The Order of Discourse he wrote, ‘We must conceive of discourse as a violence which we do to things […]’.[17] He admonished us to analyze discourse not only by what is articulated and included but also by what is excluded. He could not anticipate that such exclusions and inclusions would be done algorithmically, embedded in technology. And we should add to his observations that not only speaking and writing belong to the practices that structure discourse; silence in the form of listening and reading has to be considered part of these practices as well, even when it is not humans but computer programs that listen and read. On this topic he had yet to write his book on panopticism.[18]

Today the role of social media in autocratic regimes is broadly discussed. The North African uprisings in Egypt, Tunisia, Libya or Syria are even called Facebook or Twitter revolutions by some. But we know that these companies themselves exercise power over the discourse. They do not permit censorship. They only let their databases be shut down by the U.S. government, as happened during the WikiLeaks scandal. Or, when the ‘cloud’ is operated on pontoons outside the territorial waters of a nation, not at all. They permit the self-organization of the masses on the boulevards of emerging countries. They transfer the communication structures of the rich Western world into the bazaars and in the deserts; and local governments can only attempt to turn off, as an access point to the citadels, the internet itself.

The communion in the databases of the Web 2.0 is about gathering in very special places, operated and monitored by private companies wanting to overhear the discourses taking place there in order to sell them to still other companies. Foucault wrote about panopticism:

The ceremonies, the rituals, the marks by which the sovereign’s surplus power was manifested are useless. There is a machinery that assures dissymmetry, disequilibrium, difference. Consequently, it does not matter who exercises power […] Similarly, it does not matter what motive animates him […] the external power may throw off its physical weight; it tends to the non-corporal; and, the more it approaches this limit, the more constant, profound and permanent are its effects: it is a perpetual victory that avoids any physical confrontation and which is always decided in advance.[19]

17. Michel Foucault, Die Ordnung des Diskurses, Frankfurt/ M.: Fischer, 1996. p. 34, translated in Robert Young (ed.) Untying the Text: A Post-structuralist Reader, London: Routledge & Kegan Paul, 1982, p. 67.

18. Michel Foucault, Discipline & Punish: The Birth of the Prison, New York: Vintage Books, 1995.

The discursive power of the databases of stock market-listed companies pursues exclusively the goal of economically exploiting what is said and written. Censorship in the traditional sense of the term does not interest them; it would in fact be bad for business because it would falsify the discourse analysis, known today as data mining. And that is what is incompatible with the politics and culture of an autocratic state such as Tunisia or Libya.

In effect, the community of Web 2.0 forms a body as described by Thomas Hobbes in Leviathan. Only the image of the sovereign, which is formed by the bodies of the individuals, seems highly inappropriate today. In the left hand the unavoidable can of Coke, in the right a credit card, on the head a baseball cap, on the body designer clothes and in front of the house an oversized SUV; this is how the Leviathan should be portrayed today. Just a good consumer, someone who allows himself to be courted and promoted, and is meant to enjoy this cosseted role. This life plan is a blueprint for the whole world, and there are worse guarantees for civil rights. It is no longer necessary to build a citadel of stone to protect this model, as this is done by databases in their air-conditioned, high-security data centers.

References

Barabási, Albert-László. Linked: How Everything is Connected to Everything Else and What It Means for Business, Science, and Everyday Life, New York: Plume, 2003.

Barabási, Albert-László and Eric Bonabeau. ‘Scale-Free Networks’, Scientific American (May, 2003): 50-59.

Baran, Paul. On Distributed Communications: IX Summary Overview, Santa Monica: The RAND Corporation, 1964.

Faloutsos, Michalis, Petros Faloutsos and Christos Faloutsos. ‘On Power-law Relationships of the Internet Topology’, Computer Communication Review 29 (1999): 251-262.

Foucault, Michel. Discipline & Punish: The Birth of the Prison, New York: Vintage Books, 1995.

______. Die Ordnung des Diskurses, Frankfurt/ M.: Fischer, 1996.

Granovetter, Mark S. ‘The Strength of Weak Ties’, American Journal of Sociology 78 (May, 1973): 1360–1380.

Leskovec, Jure and Eric Horvitz. ‘Planetary-Scale Views on an Instant-Messaging Network’, Microsoft Research Technical Report (June, 2007): 1-28.

Milgram, Stanley. ‘The Small World Problem’, Psychology Today 1.1 (May, 1967): 61-67.

‘Top 10 Global Web Parent Companies, Home & Work’. Nielsen, September 2012, http://www.nielsen.com/us/en/insights/top10s/internet.html.

Warnke, Martin. Theorien des Internet, Hamburg: Junius-Verlag, 2011.

Young, Robert (ed.) Untying the Text: A Post-structuralist Reader, London: Routledge & Kegan Paul, 1982.

19. Foucault, Discipline & Punish, p. 195.
