Using social networks to solve crimes: A case study



Alexiei Dingli, Mark Caruana, Robert Zammit

Department of Intelligent Computing Systems, University of Malta, Malta

ABSTRACT

In this paper, we investigate the use of the popular Social Networking Site (SNS) Facebook to solve crimes, using car thefts as a case study. When car owners discover that their vehicle has been stolen, every means of recovering it helps. Reporting the incident immediately to the police is obligatory, but alerting a network of friends on a social networking site could also prove useful. We look into a real case and try to answer several questions: How useful can these sites be in helping an owner recover the vehicle? How far can an appeal reach? What type of feedback do users send? We analyse how people create the appeal on Facebook and what information they share.

Keywords: Social Networking, Networks, Propagation, Information Theory, Crime

INTRODUCTION

The social networking site Facebook was founded in 2004 by Mark Zuckerberg, then a Harvard University undergraduate, as a college-oriented social networking platform. Facebook helps people communicate with their friends, family and co-workers. The company develops technologies that facilitate the sharing of information through the social graph, the digital mapping of people's real-world social connections. By giving developers access to the Facebook application programming interface (API), the platform stimulates the development of Facebook-specific applications and data exchange with other online services. To use most of the features, users must create a Facebook account and be logged in to Facebook. Facebook provides users with privacy control over their profile, allowing profile information to be classified as either private, visible only to their friends, or, the default, public (Facebook Press 2012).

FACEBOOK STATISTICS (Facebook Factsheet 2012)

• More than 800 million active users (users who have returned to the site in the last 30 days)

• More than 50% of our active users log on to Facebook in any given day

• Average user has 130 friends

• More than 900 million objects that people interact with (pages, groups, events and community pages)

• Average user is connected to 80 community pages, groups and events

• On average, more than 250 million photos are uploaded per day

According to Gross et al. (2005), one reason that might shed light on the exponential growth of these social networking sites is that “college-oriented social networking sites provide opportunities to combine online and face-to-face interactions within an ostensibly bounded domain”.

Our findings conflict with the common perception that information spreads widely and quickly across Facebook. Our observations about some of the existing groups may be related to the burnout process in the theory of information diffusion (Rogers 2003). The slow pace of information propagation might reflect the challenges in recovering the stolen vehicle, even if the information is exposed to immediate friends. This is because the propagation of information in social networking sites is very abstract. In fact, propagation through SNSs has been studied and mapped onto different propagation models, mostly viral propagation in computer networks. It can be compared with the propagation of computer viruses and with Internet surfing habits. These studies performed well under specific assumptions, but they cannot be used as a general model for propagation through computer networks. The scope of our work is not to explain the different propagation techniques in detail, but to show how previous studies in different fields provide insight into the propagation paths of information in social networking sites. Instead of listing similarities between concepts such as viral spreading and propagation, we focus on the differences between them in order to avoid confusion.

As mentioned before, most propagation studies are based on and related to epidemiological studies. Mathematical models developed for the propagation of infectious diseases have been adapted to model the propagation of computer worms. In epidemiology, both deterministic and stochastic models exist for the spreading of infectious diseases. In network security, both deterministic and stochastic models of active worms have emerged, based on their respective counterparts in epidemiology (Rushkof 1994). In epidemiology, the spread of viruses and other pathogens has been studied extensively. Research models that assume the same rate of possible contacts for everyone have long been used in different fields, and they generally produce good results as long as everyday contacts can be treated as random. In our case, recurrent contact is made between a fixed number of users, and some contacts are more prone to infection than others. This makes it more challenging to map social network propagation onto epidemiological propagation: only certain aspects can be mapped, and only by studying what is considered relevant from different points of view. The major differences between the two fields fall into two areas: the available data and the format of the network nodes.

A meaningful connection in social networks such as Facebook is defined by the sociotechnical environment of the site itself. In the Boyd-Ellison article on social network definition and scholarship (Boyd et al. 2007), an SNS is defined by its power to articulate a set of connections between users. A connection is explicitly established by the users themselves through friending, commenting and sharing, which together make up a meaningful link between users. However, this does not mean that every connection has the same value to the user. Connections differ in value because friendship on Facebook is different from friendship in reality: online users befriend each other even though they do not know each other in real life. Hence the level of social connection lies in the technical structure of the system itself.

The other difference from epidemiological propagation concerns the nature of the virus itself and the nodes of the network. The metaphor of the media virus (Morley 1980) enjoyed great success with a wide audience. According to Jenkins & Krauskopf (2010), who investigated how cultural content spreads through our society, there are many crucial differences in the way viruses and human habits spread. The epidemiological propagation model, even though it is used in many propagation systems, should generally be avoided. Jenkins’ point stresses the role of end users in the propagation process. In the spreading of a virus, a person's contact with the virus is passive (there is no control over infection and transmission), while in an SNS it is not. In an SNS, a user who is exposed to some information chooses whether or not to spread it. A user can spread information for different reasons, especially when exposed to it unintentionally. Most spreading and propagation happens because a particular user pursues a specific personal interest, wants to expand relationships with other users, or finds the information relevant (Scott et al. 2010).

The uses and relevance of the information being spread are very important, along with how the information is used by an SNS group of people of the same culture. This contrasts with the other major propagation question, namely which information has the best chance of being propagated. For a group of people who share an interest and are exposed to some information, contagion evolves through the spreading of that information; this is the difference from the spreading of a viral agent between the nodes of a social network.
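The distinction drawn above, that exposure on an SNS does not imply transmission, can be illustrated with a minimal sketch. It is not the model used in this study: the friendship graph, the function name `simulate_cascade` and the single `share_probability` parameter are all illustrative assumptions, and each user is assumed to decide only once, on first exposure.

```python
import random
from collections import deque


def simulate_cascade(friends, seed_user, share_probability, rng=None):
    """Return (exposed, sharers) for one simulated appeal.

    friends: dict mapping each user to a list of friends (hypothetical data).
    share_probability: assumed chance that an exposed user chooses to share.
    """
    rng = rng or random.Random()
    exposed = {seed_user}
    sharers = {seed_user}  # the original poster counts as the first sharer
    queue = deque([seed_user])
    while queue:
        sharer = queue.popleft()
        for friend in friends.get(sharer, []):
            if friend in exposed:
                continue  # already saw the appeal; assumed to decide only once
            exposed.add(friend)
            # Unlike passive infection, the exposed user makes a choice.
            if rng.random() < share_probability:
                sharers.add(friend)
                queue.append(friend)
    return exposed, sharers
```

With `share_probability` set to 0 only the poster's own friends are exposed, while with 1 the appeal reaches every user connected to the poster; real behaviour lies somewhere in between and depends on the motivations discussed above.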

MEASURING CONNECTEDNESS IN SOCIAL NETWORKING SITES

One of the most famous claims is that anyone can reach anyone else through a chain of acquaintances no more than six people long. This idea, known as the "six degrees of separation", is a measure of our social networks. It refers to the notion that everyone is, on average, six steps away from anybody else in the world, so that a chain of "a friend of a friend" statements can connect any two people in six steps or fewer (Watts 2003). Although the original idea came from Frigyes Karinthy, in the 1929 short story ‘Chains’ (Láncszemek), several other researchers, sociologists, psychologists and authors have contributed to the study and analysis of the notion of the ‘Small World’. One of the famous studies of the connectedness between people is “An Experimental Study of the Small World Problem” by Travers and Milgram 1967 (Harvard University). In this study, an arbitrarily selected group of people (296) in Nebraska and Boston were asked to generate acquaintance chains to a target person in Massachusetts, employing the “small world method” devised by Milgram (a social psychologist), which uses a series of traceable letters. A letter could be sent only to someone whom the current holder knew by first name and who was presumably more likely than the holder to know the person to whom the letter was ultimately addressed. In all, 64 chains reached the target person. Within this group, the mean number of intermediaries between starters and targets was 5.2. Chains starting in Boston reached the target person with fewer intermediaries than those starting in Nebraska. The study demonstrated the feasibility of the “small world” technique and took a step toward demonstrating, defining and measuring inter-connectedness in a large society; the mean number of intermediaries (the chain length) observed in this study was greater than five (Travers & Milgram 1967).

However, other studies set out to show that the “six steps” are a myth. Milgram's original research notes were re-analysed by Judith Kleinfeld, a professor of psychology at the University of Alaska Fairbanks, who found something surprising. Her analysis of Milgram’s study found that 95% of the letters sent had failed to reach the receiver. This means that most of the people did not reach the target person. Although 296 possible chains were available in the technical research report, only 217 chains were in fact started and only 64 were completed, a success rate of only 29% of started chains. According to Kleinfeld's study, the claim of six degrees of separation was not supported by Milgram's experiments. Furthermore, when Kleinfeld analysed other studies, none of them matched up to the claim. The idea of "six degrees of separation" may, in fact, be plain wrong, the academic equivalent of an urban myth (Kleinfeld 2002).
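The completion figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative; the three counts are the ones reported in the text.

```python
possible_chains = 296   # starters listed in the technical report
started_chains = 217    # chains that were actually started
completed_chains = 64   # chains that reached the target person

start_rate = started_chains / possible_chains
completion_of_started = completed_chains / started_chains
completion_of_possible = completed_chains / possible_chains

print(f"started: {start_rate:.1%}")
print(f"completed (of started chains): {completion_of_started:.1%}")
print(f"completed (of possible chains): {completion_of_possible:.1%}")
```

The middle figure is the roughly 29% success rate cited for started chains; measured against all 296 possible chains, the completion rate is lower still.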

Another important experiment was conducted by Professor Duncan Watts in 2001. The experiment was modelled on Milgram’s method but used the Internet. It consisted of an email message that 48,000 senders had to deliver to 19 receivers in 157 countries. Watts found that the mean number of intermediaries was around six. This experiment was included in his book Small Worlds in 2003, where the connectedness of a social network is illustrated using a variety of models and mathematical theories such as graph theory. Beyond friends and friends of friends in social networks, these studies apply to every type of network: a network of computers; organisations, which are networks of people; the brain, which is a network of neurons; and the global economy, which is a network of national economies, which are networks of markets, which are in turn networks of producers and consumers. Hence the work is valuable to many fields, including physics and mathematics as well as sociology, economics and biology.

These results were both striking and surprising, and they continue to be so today, although SNSs have contributed to further shrinking the median chain length. They are surprising because the conscious construction of such chains of intermediaries is very difficult, since our social world is confined to our group of acquaintances, at most a thousand people.

The 1967 experiment involved 296 people who had to deliver a letter. Its weakness was that it involved only a few people, and there was no way of knowing whether the routes the letters took were the most direct ones possible. The 2001 experiment involved 48,000 senders who had to send an email message to 19 targets. Since it involved a far larger number of people, it was considerably more accurate.

What if the same study were done with 10% of the global population? Now that social networks exist in digital form, researchers can study SNSs on a much bigger scale. Facebook data scientists, in collaboration with researchers at the Università degli Studi di Milano, have released a study (Backstrom et al. 2011) of the Facebook social graph. The researchers processed the available data using a 24-core computer with a 1TB hard drive. The hardware is said to have cost no more than a couple of thousand pounds (BBC News 2011). The study was carried out in 2011 and examined all 721 million active Facebook users, with a total of 69 billion friendships. To date, this is the largest SNS study ever released. It must be noted that celebrities' "Facebook Pages" were excluded from the study, and that the study was done before Facebook introduced the Subscriptions feature, which lets users link to other people they might be interested in. The study has two main branches. First, the number of friends people have was measured, and its distribution was found to differ significantly from previous studies of large-scale social networks. Secondly, the degrees of separation between any two Facebook users were found to be fewer than the commonly cited six, and to have been shrinking over the past three years as Facebook has grown. Finally, it was observed that while the entire world is only a few degrees away, a user’s friends are most likely to be of a similar age and to come from the same country. Although Facebook limits users to 5,000 friends, the median friend count on Facebook was found to be 100, or 0.000014% of Facebook's total membership. Advanced algorithms developed at the Laboratory for Web Algorithmics of the University of Milan enabled the researchers to approximate the number of hops between all pairs of individuals on Facebook. It was found that 99.6% of all pairs of users are connected by paths with 5 degrees (6 hops), and 92% are connected by only four degrees (5 hops). In addition, the study showed that 84% of all connections are between users in the same country.

Figure 1. Percentage of pairs at given distance vs Hop Distance

(Source: Backstrom et al. 2011)

Over the years, Facebook has sustained its growth, and this is making people more connected. The average distance was 5.28 degrees in 2008, when the network was smaller, while now (end of 2011) it is 4.74, hence the four degrees of separation. However, the study suggests that the average distance is stabilising, which means that even if the other 90% of the world joins Facebook, our degree of separation will not get much smaller.
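To make the notion of hop distance concrete, the following minimal sketch computes exact distances on a small friendship graph with a breadth-first search. It is not the method of Backstrom et al. (2011), who relied on approximation algorithms because exact search over 721 million users is impractical; the graph format and the function names `hop_distances` and `average_distance` are assumptions made for illustration.

```python
from collections import deque


def hop_distances(friends, source):
    """Breadth-first search: hops from source to every reachable user."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for friend in friends.get(user, []):
            if friend not in distance:
                distance[friend] = distance[user] + 1
                queue.append(friend)
    return distance


def average_distance(friends):
    """Mean hop distance over all ordered pairs of distinct, connected users."""
    total = 0
    pairs = 0
    for user in friends:
        for other, hops in hop_distances(friends, user).items():
            if other != user:
                total += hops
                pairs += 1
    return total / pairs if pairs else 0.0
```

On a graph given as a dictionary of friend lists, `average_distance` returns the quantity that the study tracks over time, while the per-pair values from `hop_distances` give the kind of distribution plotted in Figure 1.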

ANALYSIS

On Facebook, users can start a discussion by posting a status update to their own wall or to a friend’s wall. Every wall post is visible to the user’s subscribers, and everyone following the poster can in turn comment on, like or share the post. Every comment and share by a particular user creates an entry on that user's wall, which is therefore visible to friends and subscribers. For this study, a discussion consists of an entry followed by a number of comments and shares. A post denotes any text entry or comment posted by a user to a subscriber, an entry denotes a new share or conversation started by a user, and a comment denotes any reply posted by a user on a share or post.

Social networking web sites such as Facebook are available globally and can be used in different ways according to the social and cultural context of a particular geographical location. The role that a social networking medium like Facebook plays within a specific media system can only be understood by applying the findings to a specific cultural group. For this reason, and to make our experiments manageable, the database we extracted was filtered to keep only entries on the Maltese network. Hence our analysis tracks a particular post and its propagation along the Maltese network, which resulted in a stolen car being found in less than 48 hours.

In our analysis, we used the Facebook Graph API to extract any share, post, entry or comment relating to a real stolen car case which happened in 2011. Starting from Mr. L’s wall, specific search criteria were used to retrieve public user posts. At the end of the analysis period, we had a network of interrelated nodes, starting from the initial wall post by Mr. L and covering the whole connected graph of followers. As the search criterion we used the number plate ‘LOI-555’, because it is unique and because most posts, even when shared, included it. In a second run we also searched for posts containing Mr. L’s mobile number, then compared the results to obtain a complete data set.
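The two-pass search described above (number plate first, mobile number second, then a merge) can be sketched as follows. The retrieval step through the Graph API is deliberately left out; the post records, their `id` and `message` fields, and the helper names are hypothetical and serve only to show the filtering and merging logic.

```python
def matching_posts(posts, term):
    """Return posts whose text contains the search term, ignoring case."""
    needle = term.lower()
    return [p for p in posts if needle in p.get("message", "").lower()]


def merge_by_id(*result_sets):
    """Union several result lists, keeping one copy of each post id."""
    merged = {}
    for results in result_sets:
        for post in results:
            merged.setdefault(post["id"], post)
    return list(merged.values())


# Hypothetical, already-retrieved public posts (not the study's data).
posts = [
    {"id": "1", "message": "Stolen car, plate LOI-555, please share"},
    {"id": "2", "message": "If seen call 79XXXXXX"},
    {"id": "3", "message": "LOI-555 stolen last night, call 79XXXXXX"},
    {"id": "4", "message": "Happy birthday!"},
]

by_plate = matching_posts(posts, "LOI-555")    # first run: number plate
by_phone = matching_posts(posts, "79XXXXXX")   # second run: mobile number
dataset = merge_by_id(by_plate, by_phone)      # combined, de-duplicated set
```

Merging on a post identifier means a post that mentions both the plate and the phone number is counted once, which is what comparing the two runs is meant to achieve.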

Propagation, in its simplest form, means transmission or dissemination. Many factors enable the propagation of posts and conversations, but this analysis reduces them to a few well-defined metrics. The study focuses on two main metrics: the number of interactions and the audience. The number of interactions refers to the shares, likes and comments on the original post or its replicas. Interactions measure the ability of a Facebook user or a Facebook post to generate participation in conversations. Audience refers to the number of Facebook users exposed to a post, entry or comment. In our case, for example, the audience comprised all users who had the stolen car post on their wall because a friend had posted, commented on or liked it. These factors were identified by analysing the Facebook activity data. At a high level of abstraction, the model of social data is a friend/subscriber social network in which posts are exchanged as messages. The classification above is used only to simplify the data obtained and the complex web of network interactions. These metrics do not segregate the data; on the contrary, they show the close relationship between a post and its viewers.
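The two metrics can be expressed compactly. The sketch below assumes a hypothetical record format in which each post or replica carries `shares`, `likes` and `comments` counts, and a friendship dictionary; counting the acting users themselves as part of the audience is also an assumption, not something the study specifies.

```python
def count_interactions(posts):
    """Total shares, likes and comments over a post and its replicas."""
    return sum(p["shares"] + p["likes"] + p["comments"] for p in posts)


def audience(actors, friends):
    """Distinct users exposed to the post.

    actors: users who posted, commented on or liked the post.
    friends: dict mapping each user to the friends who see their activity.
    """
    exposed = set(actors)  # assumption: actors count as exposed too
    for actor in actors:
        exposed.update(friends.get(actor, []))
    return exposed
```

Using a set for the audience ensures that a user who sees the appeal through several friends is counted only once.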

The nodes in the network are equally related to each other but differ in how much content they reproduce and in their tendency to reproduce it. Factors such as time of posting, Facebook activity and number of friends all influence the reproduction of data. As our application also collected the time and day of the week, we were able to derive time statistics. From the data gathered, it was possible to describe a rather accurate weekly time trend for nodes reproducing content. In our case, since the car was found within 48 hours, we could only analyse the time of day. From the research done, however, nodes were more likely to reproduce content from Monday to Friday, mainly because of the availability of the Internet at workplaces, whereas posting declined during the weekend, except on Sunday evening. This shows that social media usage is widely popular in workplaces, where it may work alongside other media such as email as a communication base. Social networking has become part of the daily routine, like checking email and browsing daily websites and news portals. This results in greater sharing through social networking sites, since daily issues become topics of conversation. Users seem to use social networking sites less during the weekend. Other issues in daily life, such as overworking, may affect the use of social networks and introduce a high level of variability in average posting and conversation. Conversely, users might post more in periods when they are more likely to be using a computer or online device, for example while studying for an exam or when illness forces them to stay at home. Examining the stolen car case, one can notice that the first post by Mr. L was made at 3:50am and was shared only twice, with 10 comments. At 9:50am, Mr. L posted a photo of his stolen car. The photo was shared 30 times among his friends. The only difference between the two postings is the time of posting, which determines the number of friends online. At 9:50am, shares increased three times. This increase happened because users are more likely to be online at 9:50am than at 3:50am, since they are usually at work.

Figure 2. Analyses of Facebook postings (Source: Vitrue)

As shown in Figure 2, Vitrue, a Social Media Management Company, analysed Facebook data from the 10th of August 2007 to October 2010 for brand streams, posts and comments. Some interesting facts came out of this study as below (Warren 2010):

• The three biggest usage spikes tend to occur on weekdays at 11:00 a.m., 3:00 p.m. and 8:00 p.m. ET.

• The biggest spike occurs at 3:00 p.m. ET on weekdays.

• Weekday usage is pretty steady; however, Wednesday at 3:00 p.m. ET is consistently the busiest period.

• Fans are less active on Sunday compared to all other days of the week.

The same Vitrue study showed that morning posts are more effective, which increases the possibility of interaction and therefore of propagation. As in Figure 2, posts published in the morning tend to perform better than those published in the afternoon, even though most posts appear around 3:00 p.m. Users are 39.7% more likely to interact with posts published in the morning than with those published in the afternoon.

Figure 3. Days of week posts on Facebook (Source: Vitrue)

Another interesting finding is that posting increases in the early minutes of every hour, i.e. between :00 and :15, and again in the second half, between :30 and :45. The increase in the second half of the hour could happen during breaks and in between meetings. It might be that users check Facebook in between meetings, since checking Facebook is more likely to happen at the start of an hour than in the middle.

These network nodes form a network structure, and studies show how that structure affects the propagation of content on a social networking site. Propagation can be described as the combined effect of all these factors, from an originating post to the interaction of other users with that post and its sharing. We also observed that within the network of nodes there exists a closer network made up of the most important connections: for a given post on a wall, these important connections commented more than once. This produces a communication network within the connection network, as specific nodes are more active than others and hence more likely to share, repost and comment. The more frequently a user participates in conversations over social networks, the greater the chance of propagating data. This suggests that interaction is not only a simple exchange of information about the topics raised, but involves a more complex relational context made up of friends and subscribers. In fact, when modelling network propagation, the existence of such preferred paths should be considered. These preferred paths, and others through the social graph, give a single message a greater probability of being commented on and of having its visibility increased. This can be clearly seen on a Facebook user’s wall, where the more a user participates in online conversation, the greater the probability that the user posts on his or her wall and receives comments from friends. In this way the probability of propagation of a message is greatly increased.

When network nodes leave comments on a particular post, a social space is created in which the visibility of published messages is defined. The more likes and comments a share or post generates, the higher the probability that the message is seen outside the original network of subscribers. Hence interaction increases the more a post is shared. Another factor which increases the possibility of propagation is the number of users online at the time a message is posted, which increases the chance that the post is seen. As Facebook is a dynamic environment, most posts which are commented on gain a high level of visibility because of the real-time generation of the post sequence and user interactions. This makes Facebook network-centred rather than user-centred. As mentioned above in the stolen car example, the post at 9:50am had many more shares than the one posted at 3:50am. The more users are online, the more a post is commented on and then propagated. Therefore, the number of users online increases the possibility of post visibility. The lifetime of a post and the lifetime of its distribution do not correlate with the number of comments left on the post. A short lifetime indicates ordinary use of the social network as a tool for informal conversation with no topics of importance.

Propagation of this particular event, which started on Facebook, also extended to other networking media all over the Internet, such as forums and blogs. From the collected data, we found that the initial post propagated not only as seen on the network graph presented, but also on forums frequented by Maltese users and on other social networking sites. Users who saw one of the posts on Facebook might also have reposted pictures, videos or similar media describing the event. These other forms were not catered for in our study, since only text was collected from Facebook and subsequently analysed. The extent of the propagation of this event may actually be much larger than the data presented in our study. For example, searching on Google for the keywords “stolen”, “Range Rover” and the mobile number "79XXXXXX" leads to several results. Considering that the initial post was shared on Facebook, this shows that propagation of information happens both on and off social networking sites, whereas posts on social networking sites usually originate from other media such as news portals. The factors enabling propagation can be reduced to a short list of defined metrics, as stated in Rossi et al. (2010). The probability that a generic user will see a message from another user who is not in the same network of friends was analysed and defined as $P_{u,m,t}(u')$, where $u$ is the user who introduces a message $m$ into a social network site and $u'$ is the generic user. Since $u'$ is not in the circle of friends of user $u$, the participation of $u'$ in the conversation depends on the friends of $u$ posting on their walls, which in turn propagates the message until it reaches the wall of $u'$. The average number of users exposed to a conversation was defined as $\sum_{u' \in U} P_{u,m,t}(u')$, knowing that each user will receive a message. Propagation to user $u'$ depends on the time taken by the user’s circle of friends to interact with the message. This time can be called the delay of reception. In the stolen car case, the delay of reception for the first post was much longer than for the second post. In fact, the first share of the first post was at 8:45am, nearly 5 hours later, whereas the second post was first shared nearly an hour after it was posted, hence lessening the delay of reception.
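Both quantities defined above are simple to compute once the underlying values are known. In the sketch below the per-user exposure probabilities are assumed to be given as a dictionary (the source does not say how they are estimated), the function names are illustrative, and the calendar date is a placeholder because only the times of day are reported.

```python
from datetime import datetime


def expected_audience(exposure_probability):
    """Sum the exposure probabilities P(u') over all generic users u'."""
    return sum(exposure_probability.values())


def delay_of_reception(posted_at, first_share_at):
    """Time elapsed between a post and its first share."""
    return first_share_at - posted_at


# First post of the stolen car case: posted at 3:50am, first shared at 8:45am.
# The date (2011-01-01) is a placeholder, not the real date of the event.
first_post_delay = delay_of_reception(
    datetime(2011, 1, 1, 3, 50),
    datetime(2011, 1, 1, 8, 45),
)
print(first_post_delay)
```

For the first post this gives a delay of 4 hours 55 minutes, the "nearly 5 hours" mentioned above.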

Facebook’s Graph API was used to investigate Mr L’s stolen car case. As every object in the Graph API is represented by a unique ID, the unique ID of Mr L’s profile was needed in order to access his wall’s properties via the Graph API. Going through the wall, we used part of his car’s number plate, ‘LOI’, to search for posts relating to the stolen car case. The post which propagated most was posted at 10:50 am and included a photo of the car. The post was shared 30 times and the comments on the photo totalled three. From the above study, one can see that social networks are somewhat unique: they are well connected and, at the same time, very locally clustered, with the vast majority of connections spanning a short distance.

In order to further our studies and collect data for future work, a public page named Stolen Cars Community - http://www.facebook.com/SCCMalta was created. The description and scope of the page was to enable “friends” and “fans” to use the page to find their stolen car or bike. This enabled friends to communicate via a non-personal wall with a specific purpose. In the first couple of weeks, 52 likes were registered with a steady 20% increase, along with 18,634 connections (i.e. friends of fans) and a weekly total reach of 540 visits.

CONCLUSION

In the above work, the propagation of data on Facebook was analysed, along with how a post on a user’s wall propagates through the social graph, and compared with the propagation of similar posts in groups and pages. From the analysis and the data gathered, data propagation on Facebook was found to be driven by motivation: users either share the data, so that it is propagated in all its forms, or interact with it via comments and likes without any intention of making it available to other users. In fact, according to another study (Kobler et al. 2010), most users (more than 37% daily) not only passively follow the status messages of their friends but actually react to them by writing comments.

With this analysis, we managed to draw up some patterns of data propagation in an SNS. Propagation of data was modelled on epidemiological studies in order to extract different propagation paths. One of the difficulties encountered in mapping user activities to the observational model was that the information could only be analysed by looking into the interactions users had with the initial appeal. Therefore the mapping between propagation models showed direct paths, which are visible in the SNS, and others which are indirect and are assumed to be propagation paths because users interacted with the analysed data.
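To make the epidemiological analogy concrete, the following is a minimal sketch of a discrete-time SIR-style model. It is a generic illustration and not the model used in this study: the compartments are read here as users who have not yet seen the appeal (susceptible), users actively sharing it (infected) and users who have seen it but no longer share it (recovered). The function name `simulate_sir` and the parameters `share_rate` and `stop_rate` are hypothetical, and the values in the example call are arbitrary.

```python
def simulate_sir(population, initial_sharers, share_rate, stop_rate, steps):
    """Simulate a simple discrete-time SIR-style spread of an appeal.

    share_rate: assumed per-step rate at which sharers expose unexposed users.
    stop_rate:  assumed per-step fraction of sharers who stop sharing (0..1).
    Returns a list of (susceptible, sharing, stopped) tuples, one per step.
    """
    s = float(population - initial_sharers)
    i = float(initial_sharers)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        # New sharers depend on contacts between sharers and unexposed users;
        # the count is capped so it never exceeds the users still unexposed.
        new_sharers = min(share_rate * s * i / population, s)
        new_stopped = stop_rate * i
        s -= new_sharers
        i += new_sharers - new_stopped
        r += new_stopped
        history.append((s, i, r))
    return history


# Arbitrary illustrative parameters, not values fitted to the case study.
history = simulate_sir(population=1000, initial_sharers=1,
                       share_rate=0.5, stop_rate=0.1, steps=50)
peak_step = max(range(len(history)), key=lambda t: history[t][1])
print("step with most active sharers:", peak_step)
```

Such a model only captures the direct paths (shares). Indirect paths, such as likes and comments, would need an additional compartment or a separate exposure rate.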

The analysis of the identified propagation paths resulted in the findings below:

1. Time trends seem to be the key factor in data propagation, where posting in specific time periods affects the length of data propagation.

2. User-generated posts generate many more interactions, in the form of shares, likes or comments, than data imported from other sources. This can be seen in the Facebook page experiment ‘Stolen Cars Community’, where imported data generated far fewer shares and comments than wall posts such as that by Mr. L.

3. Propagation trends are determined by events.

4. Data propagation, particularly of an event, happens in two ways: through users who pass on the news and through users who interact with the event by developing a discussion via comments. In Mr. L’s case, some users shared the news he posted himself, while others interacted with the shares rather than directly with his post. Both types of interaction on a user’s wall generate a high number of interactions.

5. The first post with its original content may evolve into other posts.

6. External news, i.e. news imported into Facebook, generated very little interaction. This may happen because users were either already exposed to the news or because the post did not socially affect them or the users in their social network.

7. Time of posting is very important for posts which give information or news. Early posts, such as the one by Mr. L posted at 3.50am, had a saturation effect, while others posted later in the morning had a high number of interactions.

These findings show that data propagation in Facebook is influenced by many factors and can neither be mapped to other models nor compared to other network models in order to come up with a simpler social network model. Our approach suffered from various limitations. Ideally, the Facebook group mentioned above (Stolen Cars Community) should have been expanded further. Membership of this group is very particular and mainly involves people who were either victims of a theft or who know someone whose car was stolen. However, given these limitations, it might be interesting to study group dynamics in such a specialized group. One could delve into issues such as solidarity, which might be shown by acquaintances (who are members of the same group), and the effect of their actions on the group as a whole. Another important factor might be peer pressure, which is known to be very strong in groups. This pressure might thus be a very important factor in the dissemination of the message. All these factors should be identified and studied on their own in order to identify which of them is the strongest in the social world.

Further work is needed to describe the propagation capability of these factors through a formal model, which can be mathematically analysed and simulated. It would be interesting to develop a mathematical model which can help us understand the propagation of these messages and thus provide us with guidelines on how the delivery of such a message can be optimized. It would also be interesting to find out whether the above factors hold in different scenarios, which can be investigated using the methodology in this work. These scenarios are extremely important and can make a huge difference. Some of them might be time sensitive. If we consider a post about someone needing money to treat a terminal illness, this urgency might prompt people to participate and thus spread the message around. Others might not have an explicit time limit; however, if urgency is created, it might push people to act. Another factor to consider is the profile of the person. Some people simply read a post without doing anything, others share it around, whilst some even comment on the post. A person might also be a combination of these. So when posting, it might be more effective to target certain people rather than others. These are all different considerations which should be examined in order to create an optimized posting strategy.

Notwithstanding these shortfalls, it is clear that the system does work. The reason why it works might be that it follows a principle similar to Linus’ Law, which states that “given enough eyeballs, all bugs are shallow”. This law was written for software development. However, we can modify it slightly to say that “if enough people (who count) see the message, action will be taken”.

REFERENCES

Facebook Press (2012). From http://www.facebook.com/press.php

Facebook Factsheet (2012). From http://www.facebook.com/press/info.php?factsheet

Gross, R. & Acquisti, A. (2005). Information revelation and privacy in online social networks. Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society. Alexandria, USA.

Rogers, E. (2003). Diffusion of Innovations. 5th edition. Free Press, New York.

Rushkof, D. (1994). Media Virus!: Hidden Agendas in Popular Culture. Ballantine Books.

Boyd, D. & Ellison, N. (2007). Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication 13.

Morley, D. (1980). The Nationwide audience: structure and decoding. British Film Institute.

Jenkins, H. & Krauskopf, A. (2010). If it does not spread it’s Dead: Media Viruses and Memes. Convergence Culture Consortium.

Scott, G. & Gilad, L. & Danah, B. (2010). Tweet, Tweet, Retweet: Conversational aspects of retweeting on Twitter. Proceedings of the 43rd Hawaii International Conference on System Sciences (HICSS).

Watts, D. (2003). Small Worlds: the dynamics of networks between order and randomness. Princeton University Press.

Travers, J. & Milgram, S. (1967). An Experimental Study of the Small World Problem. Sociometry, Volume 32, No. 4.

Kleinfeld, S. (2002). The Small World Problem. Society, Volume 39, No. 2.

Backstrom, L. & Boldi, P. & Rosa, M. & Ugander, J. & Vigna, S. (2011). The Anatomy of the Facebook Social Graph. Computing Research Repository of the ACM.

BBC News (2011). Facebook users average 3.74 degrees of separation. From http://www.bbc.co.uk/news/technology-15844230

Warren, C. (2010). When are Facebook users most active? From http://mashable.com/2010/10/28/facebook-activity-study/

Rossi, L. & Montesi, D. & Magnani, M. (2010). Information propagation analysis in a social network. Proceedings of the International Conference on Advances in Social Networks Analysis and Mining.

Kobler, F. & Riedl, C. & Vetter, C. & Leimeister, J. & Krcmar, H. (2010). Social Connectedness on Facebook. Proceedings of the Americas Conference on Information Systems (AMCIS).