The Presence of Hyperlinks on Social Network Sites


Journal of Computer-Mediated Communication

The Presence of Hyperlinks on Social Network Sites: A Case Study of Cyworld in Korea∗

Steven Sams
Han Woo Park

Department of Media & Communication, Yeungnam University, 214–1 Dae-dong, Gyeongsan-si, Gyeongsangbuk-do, 712–749, Republic of Korea.

A study was conducted to determine the extent to which hyperlinks appear within user-submitted comments on the Korean social network service Cyworld. Links to social movements were common, as were news stories regarding the bleak economic forecast. Males were found to post hyperlinks more frequently than females, and politicians in the ruling party received more links than those in opposition parties. The purpose of posting a link was evaluated, and tasks such as performing Message Amplification and Network Building were prominent. Natural-language processing revealed primarily negative sentiment towards the governing president. The findings go some way to indicate how the presence of hyperlinks and short messages within online dialogs can provide an insight into public perception as a whole.

Key words: Case Study, Politics, Asia, Online Communities, Social Science, Hyperlink Analysis.

doi:10.1111/jcc4.12053

Social networking services have the potential to mediate, complement, and transform the changing nature of political participation by providing an insight into the perceptions of the electorate as they observe and comment on those who represent them politically. The rise in prominence of social networking services has been observed for some time (Boyd & Ellison, 2007), and this, coupled with online engagement relevant to political debate, suggests that social networking has the potential to influence the political landscape (Elmer et al., 2007). However, it is only within the last decade that online social networks have become technically mature and socially established enough to play a noteworthy role in disseminating information and encouraging citizens to remain engaged in political activities (Park & Kluver, 2009). Williamson (2009) argues that this new campaign strategy has benefits for politicians, as it provides a broad assessment of public opinion and offers both politicians and constituents alike a contemporary method to communicate.

Leveraging the existing use and understanding of Web resources has ensured that hyperlinks have found a place beyond online curation as communication tools in social network services. Turow and Tsui (2008) postulated that hyperlinks within Web-mediated communication technology facilitate social networking among people, organizations, and nation-states. There are two ways in which hyperlinks assist users in moving from personal communities to broader social networks. Hyperlinks can act as communication channels, connecting people as they surf from one site to another, and, according to Adamic and Adar (2003), establish links between homepages in the promotion of social/political agendas, events, and issues. Constructing hyperlinks on public Web spaces, such as visitor boards, has been used to draw attention to information that is not widely diffused but potentially noteworthy, and such an action can receive an immediate and intensive response (Halavais, 2006). Warnick, Xenos, Endres, and Gastil (2005) state that the existence of hyperlinks permits a deeper cognitive engagement with peripheral aspects of the page while the decision to follow the hyperlink is considered. Moreover, links to external services, in practice, often include explanatory messages, and Fogg and Iizawa (2008) have emphasized that the networking ability of hypertext can be enhanced if augmented with contextual information. Therefore, hyperlinks can be seen to have evolved beyond an exclusively practical mechanism to navigate online content into a mechanism to inform and build relationships that can be utilized by end users with limited knowledge of the underlying protocols and infrastructure (Karan, Gimeno, & Tandoc, 2009).

∗Accepted by previous editor Maria Bakardjieva

294 Journal of Computer-Mediated Communication 19 (2014) 294–307 © 2013 International Communication Association

Downloaded from https://academic.oup.com/jcmc/article/19/2/294/4067568 by guest on 25 May 2022

Thelwall (2003) demonstrated that hyperlinks can lead to new sources of information; however, Park and Jankowski (2008) argue that a link is not a single construct but instead can facilitate separate activities with distinct implications for communication. Furthermore, Ackland, Gibson, Lusoli, and Ward (2010) formalized this definition and identified five principal social functions that hyperlinks can be said to perform: 1) Information Provision, 2) Network Strengthening, 3) Identity Building, 4) Audience Sharing, and 5) Message Amplification. Ackland et al. (2010) defined Information Provision as the practice of using hyperlinks to provide new sources of information, whereas Network Strengthening has been used to denote the establishment of linking actors within a network or the formation of new bonds. An Identity Building action is seen as recognizing the work of others and signaling endorsement, with Audience Sharing being performed to ease the transit of users through the Web (Ackland et al., 2010). Finally, Ackland et al. (2010) found that the increased use of messages by a vocal few elevated the presence of marginal political groups with disproportionately strong online support, and Message Amplification aims to model this behavior.

Link analysis in the context of social and political science has yielded insights into online communities of political actors. The findings of Soon and Kluver (2007) lend credence to the argument that the configuration of links within political webospheres can reveal the ideological landscape of parties and advocacy organizations, visualizing the structure of alliances among those who share similar interests, beliefs, or agendas. Focusing on the nature of hyperlinks embedded within individual politician or party sites, Foot and Schneider (2006), while examining the role of hyperlinks on campaign websites, found that both political parties and politicians have used hyperlinks to signal their endorsement of political issues. Similarly, Shumate and Dewitt (2008) discovered that hyperlinks were used to establish associations among nongovernmental organizations, and bipolar clusters were observed which illustrate the existence of the North–South divide. Hyperlinks therefore can be seen to represent an intentional communicative choice, particularly in the context of politically motivated services. For example, Biddix and Park (2008) examined a campus movement using hyperlink analysis, and the pattern of link connectivity among student organizations indicated that interaction was replicated offline. In this context, political-based hyperlinks have been regarded as a public acknowledgement of others and can reflect existing bonds or facilitate the construction of new networks.

Park, Thelwall, and Kluver (2005) highlighted the form of linking that occurs within a political environment online, comprising outlinks and inlinks to candidate websites. Outlinks are susceptible to change over time, reflecting shifting political allegiances between parties and organizations (Foot, Schneider, Dougherty, Xenos, & Larsen, 2003). Foot et al. (2003) observed that network choices differ according to political philosophy, with left-leaning parties showing an affinity for international services whereas parties identified as coming from the right exhibit a marked preference for domestic websites. Herold (2009) goes further by explaining how online services in Asia provide a platform that complements traditional sociocultural norms of participation and offers a means to engage constituents and the wider population, allowing political figures to enhance reputation and elevate prominence. Lee and Park (2010) use the example of Korean politician Geun-Hye Park to demonstrate this phenomenon. Geun-Hye Park was well regarded before the growth of online network services, but the use of social media gave supporters an accessible platform, and the number of inlinks to her online presence steadily increased when she stood as a presidential candidate in 2007 (Lee & Park, 2010).

However, the motivation to share links and amplify content is not limited to admiration. The appearance of Nick Griffin, chairman of the far-right British National Party, on the BBC’s Question Time show in 2010 provoked an immediate response from users on microblogging service Twitter who found the decision to include the BNP figurehead on a prominent program questionable (Anstead & O’Loughlin, 2011). External events such as this have been found to increase online activism and, in the case of South Korea, this was demonstrated during the candlelight protests of May and June 2008, where the reintroduction of American beef imports potentially infected with Bovine Spongiform Encephalopathy (BSE) prompted protests to demand the cessation of a free trade agreement with the United States (Lee, Kim, & Wainwright, 2010). During this time online communities experienced increased usage from citizens commenting on political profile pages to voice their concerns (Park, Lim, Sams, Nam, & Park, 2011), and the fallout from the candlelight protests, and other contentious issues in South Korea, has shown how online networks can mobilize citizens. The use of online environments allows individuals disillusioned with political participation to engage with and influence public debate, and such online activism has the potential to play a part in shaping the development of government policy. Moreover, Ackland et al. (2010) argued that political messages containing hyperlinks can be greatly amplified, and that heavy posters have the ability to magnify political issues and further exaggerate presence to other users. Panagiotopoulos, Sams, Elliman, and Fitzgerald (2011) found that this usage by a vocal minority in both official and unofficial channels elevated the prominence of latent concerns that appear incongruous with mainstream opinion and are often characterized by calls for excessive retribution. In a similar vein, Utz (2009) proposes that social networking services facilitate participatory democracy outside of official election periods, largely unconstrained by budgetary concerns.

The prominence of using hyperlinks and related statements to convey meaning in online message dialogs, particularly those with a political leaning, has been found to infer behavior (Karan et al., 2009). Whilst relying solely on those comments that contain hyperlinks appears to unduly limit the available content to a small subset of the full sample, Park et al. (2005) found that the target of links may indicate trends within the broader sample of potential data. Link choices are rarely randomly constructed, particularly in the context of politically motivated services, and external events have been found to motivate users into sharing links to provide supporting information (Robertson, Vatrapu, & Medina, 2010). Moreover, the intention of the link when contributed to a political message board can, as Ackland et al. (2010) argued, be amplified. Access to such data therefore has the potential to offer a broader understanding of the impact of specific campaign strategies, to track the political process as it unfolds, and to provide a representative sample to examine how technologies can mediate democratic processes.

Method

The study will examine the target of hyperlinks contained in online public dialogs to discover the services and issues that have gained prominence. The location of links will be determined to reveal the extent to which hyperlinks include international services. The World Wide Web is often discussed as a global resource, but the degree to which citizens engaging in national politics utilize this resource remains unclear. Data collection will be achieved by developing a software program to query social network message boards and retrieve the comment and details of the user in question. The political background of the recipient of each comment will be determined to discover which user groups engage with which parties. The purpose of link choices will be evaluated based on established criteria and, following this, machine-based learning algorithms will be applied to the sample of user submissions to classify the textual comment that typically accompanies hyperlinks in online public dialogs.

Sample

The social network profile pages of 130 Korean National Assembly Members were identified and the date parameters of the study were April 2008 – June 2009. April 2008 was chosen as a suitable start date as this reflected the most recent election and appointment of National Assembly Members. The profile pages were maintained on Korea’s most prominent social networking service, Cyworld.com (Kim & Yun, 2007).

Data Collection

To determine the type and content of links in user comments, all submissions posted to the sample of 130 politicians on Cyworld were collected using a Java-based eResearch tool. An HTTP call is made to request a single page of comments from a politician’s visitor board. Comments on Cyworld are paginated and each visitor page contains a maximum of five submissions. An HTML page is returned and the content and date are isolated and held in temporary storage. The developed system then requests the next page of comments and performs the same actions, iterating until the target date parameters have been met. Following this, the content of submissions is evaluated using regular expressions to determine if a URL is present.
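The collection loop described above can be sketched as follows. This is a minimal illustration in Python rather than the Java tool used in the study, and several details are assumptions: the `fetch_page` callable stands in for the HTTP request and HTML parsing, the board is assumed to list comments newest first, and the regular expression is a hypothetical pattern, since the expression actually used is not reported.

```python
import re
from datetime import date
from typing import Callable, Iterator, List, Tuple

# Hypothetical pattern: the study does not report the expression it used.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)

Comment = Tuple[date, str]  # (posting date, comment text)


def extract_urls(text: str) -> List[str]:
    """Return every URL-like substring, trimming trailing punctuation."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def collect_comments(fetch_page: Callable[[int], List[Comment]],
                     start: date, end: date) -> Iterator[Comment]:
    """Walk a visitor board page by page until posts predate `start`.

    Assumes pages are ordered newest first; `fetch_page` is a stand-in
    for requesting one page of comments and parsing the returned HTML.
    """
    page = 1
    while True:
        comments = fetch_page(page)
        if not comments:
            return                      # no further pages
        for posted, text in comments:
            if posted < start:
                return                  # date window exhausted
            if posted <= end:
                yield posted, text
        page += 1
```

Passing the page fetcher in as a parameter keeps the iteration and date logic separate from any site-specific request code, which is one way such a tool could be exercised without network access.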

Data Analysis and Results

A total of 153,602 comments were collected, and this amount consists of 71,499 comments from males, 56,779 from females, and 25,324 from users whose gender could not be determined. It was found that 1,276 comments contained hyperlinks, consisting of 587 for males, 322 for females, and 367 where the user’s gender was unknown. The link count was higher at 1,920 and reflects instances of multiple URLs within individual postings.

Links

The links comprised 762 unique URLs and 259 corresponding domains. These domains were manually categorized into website type, such as portals, media, party and politician homepages, petition sites, online fan clubs, and NGOs.

Webometric Analysis

The sample of 1,920 hyperlinks was analyzed using LexiURL Searcher (Thelwall, 2009; Park, 2010) to generate a standard Webometric report. Ten of the most frequently occurring domains can be seen in Table 1. The 10 domains listed in Table 1 represent 24.5% of the total hyperlinks found; the remaining domains have been omitted for brevity.

The list of 10 prominent domains, by link count, in the sample consists entirely of domestic Korean services. The three main portals, Daum, Nate, and Naver, are all represented to some degree, as is the smaller portal service paran.com. Tistory.com, a community blogging service operated by Daum, was found to occur frequently, as were internal links to other profile pages on Cyworld. The four remaining domains in the sample represent the dominant online news providers in Korea. Donga.com is the online version of the long-running daily newspaper Dong-a Ilbo. Likewise, hani.co.kr is a service operated by Korean newspaper Hankyoreh, chosun.com is owned by the Chosun Ilbo, and joins.com is a website from the JoongAng Ilbo newspaper.

Table 2 shows the distribution of top-level domains (TLDs) reported by LexiURL within the sample of captured domains. Nearly half of links were targeted to .com sites and approximately one-third were found to contain a .kr suffix, the country code top-level domain (ccTLD) that denotes a Korean service. Overseas ccTLDs (.cn for China, .us for USA, .my for Malaysia) and other generic top-level domains (gTLDs), such as .net and .org, accounted for the remaining 20.85% of links and 54 domains.
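A TLD breakdown of this kind can be approximated with a short script. The sketch below is not LexiURL Searcher and makes no claim about how that tool works; it simply takes the last label of each host name and reports percentage shares. The function names are hypothetical, and any URLs passed in are the caller's own data.

```python
from collections import Counter
from urllib.parse import urlparse


def top_level_domain(url: str) -> str:
    """Return the last label of the host name, e.g. 'kr' for 'www.hani.co.kr'."""
    if "://" not in url:
        url = "http://" + url           # urlparse needs a scheme to find the host
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower() if "." in host else ""


def tld_shares(urls):
    """Map each TLD to its percentage share of the given links."""
    counts = Counter(top_level_domain(u) for u in urls)
    total = sum(counts.values())
    return {tld: round(100 * n / total, 2) for tld, n in counts.items()}
```

Because the suffix alone says little about where a service is based (a .com address may well be Korean), a tally like this is only a first pass.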

Country

Generic top-level domains are ubiquitous in Korea, and therefore the previous analysis, which examined the visible URL in isolation, does not reliably denote the physical location of the service that each link refers to. Barnett, Chung, and Park (2010) highlighted the lack of coverage of network analysis based on gTLDs and indicate that an analysis of ccTLDs is the more common scenario despite the prevalence of gTLDs in Korea and elsewhere.

To reduce the ambiguity created by the high usage of gTLDs for national services, a network query tool was developed to perform a DNS (Domain Name System) lookup to resolve the domain into a public Internet Protocol (IP) address, which is then forwarded to an IP locator service to reveal the physical location of the server that is mapped to the IP address in question. This approach will locate the country of the server where the website is hosted; however, the company and server could be located in different countries. To mitigate this limitation, details of the owner of a domain name are accessible through public WHOIS libraries, and a network client was developed to query one such service to determine the country where the company or individual who registered the domain name resides. The WHOIS query and IP country lookup fields containing the two forms of location information are included in Table 3. Typically, these two fields are identical, as most large organizations host their service onsite; however, there are exceptions. For example, ko.wikipedia.org, which appeared in the results, is a subdomain of Wikipedia and is registered in the USA, but the physical location of the server was found to be in The Netherlands. Whilst Table 3 indicates that Korean services are in the majority, only two of the domains listed contained the Korean ccTLD. The remainder utilize gTLDs, such as .com, and this finding may go some way to explain the results in the previous section, in which .com domains occurred most frequently.
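The two-step location check can be sketched as below. Only the DNS step uses a real call (`socket.gethostbyname` from the Python standard library); the IP locator and WHOIS services used in the study are not named, so they appear here as caller-supplied functions (`ip_to_country` and `whois_country`, both hypothetical names).

```python
import socket
from typing import Callable, NamedTuple, Optional


class Location(NamedTuple):
    domain: str
    server_country: Optional[str]       # from DNS lookup + IP geolocation
    registrant_country: Optional[str]   # from the WHOIS record


def resolve_ip(domain: str) -> Optional[str]:
    """DNS lookup: resolve a host name to an IPv4 address, or None on failure."""
    try:
        return socket.gethostbyname(domain)
    except OSError:
        return None


def locate(domain: str,
           ip_to_country: Callable[[str], Optional[str]],
           whois_country: Callable[[str], Optional[str]],
           resolver: Callable[[str], Optional[str]] = resolve_ip) -> Location:
    """Combine both lookups; the two service callables are placeholders."""
    ip = resolver(domain)
    server = ip_to_country(ip) if ip else None
    return Location(domain, server, whois_country(domain))


def mismatched(loc: Location) -> bool:
    """True when both countries are known and they differ."""
    known = loc.server_country is not None and loc.registrant_country is not None
    return known and loc.server_country != loc.registrant_country
```

A record such as the ko.wikipedia.org case, registered in one country and hosted in another, would be flagged by `mismatched`, which is the kind of exception the two fields in Table 3 are meant to expose.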

A total of 1,849 URLs encountered in the sample were found to belong to services based in Korea, and this represents the majority of links as a whole. The predominant site being linked to was a petition service (agora.media.daum.net), and this finding is consistent with similar research that found linking to petitions was a key function of political action within social network services in other nations (Panagiotopoulos et al., 2011). The 210 links to a petition service were found to point to 30 separate petitions, and of these links 152 are associated with just two petitions. The most prominent, with 116 links, was a petition to impeach the governing president Lee Myung-Bak. The second most prominent, with 36 links, was a petition to encourage a government official to resign following allegations of stalking.

In addition to Daum’s agora petition service, blog and cafe (forum) services operated by Daum were found in the results. The 51 hyperlinks to a Daum-hosted cafe comprised 25 unique URLs. The most prominent, with 12 links, was a thread for venting negative sentiment towards President Lee. Seven links were found to point to a thread with support for Kang Ki Kab, a popular National Assembly member who is outspoken in his criticism of American beef imports and the continued military presence on the Korean peninsula.

Twenty-six blogs hosted on Daum were present from the sum of 69 links discovered. A memorialized page previously managed by late ex-president Roh Moo-Hyun accounted for 36 links and represents more than half of the total. A blog managed by the Creative Korea Party on Daum’s Tistory service was linked to 61 times.

Offering similar services to Daum, Naver’s blog, cafe, and news domains were found in the results. Naver’s news service accounted for 55 unique links, with the majority of news stories covering economic issues such as the government deficit and divide between living standards of the rich and poor. The 72 links to Naver’s blog service were distributed between 47 blogs, whereas 76 of the 106 links to a cafe on Naver were found to be pointing to a single thread hosted by nonodemo.com, a service that encourages peaceful demonstrations, such as the candlelight protest, as an alternative to violent street rallies.

Links to Social Solidarity Bank (bss.or.kr), a microcredit organization, were found in 56 comments. Empowering the poor has become a prominent sociopolitical issue in Korea and, similarly, links to Social Enterprise, a government department that encourages philanthropic business ventures, were present in 49 comments. Internal links to Cyworld profiles were found in 139 comments and two of the most frequently occurring were members of the general public and not, as might be expected, politicians or entertainers. This finding may demonstrate how ordinary citizens can become visible, and in some cases more visible than public figures, when online.

The 71 links that refer to international services represent a small proportion of the 1,920 links encountered and therefore only Korean results appear in the domains listed in Table 3. To address this omission, Table 4 shows the most prominent services, by link count, that were found to be located outside Korea. The IP and WHOIS fields of Table 4 indicate that the majority of hyperlinks refer to websites in the USA. Eight other countries, Australia, Canada, China, Germany, Malaysia, The Netherlands, Singapore, and the UK, contributed just 13 links overall.

Whilst the number of international links was small in comparison to domestic services, those international links that were found appeared to have a strong Korean emphasis. For example, the most frequent international link pointed to video-sharing service YouTube. This comprised 15 links to 12 unique URLs. Of these 12 URLs, 10 of the videos were in Korean and one was in English but the topic was the candlelight protests in Korea. The final link pointed to a Western music video. Similarly, all 12 links to Google’s Video service were found to point to a single video that reported on the issue of Mad Cow Disease in Korea.

Five hyperlinks pointing to news provider Reuters linked to a single story regarding the contamination of cattle feed and, although two links to the BBC news service pointed to separate stories, the theme of Mad Cow Disease was present in both. The first reported a case of suspected vCJD (the human form of BSE) in the UK and the second covered the candlelight protests in Korea. A link to CNN’s iReport citizen journalism service was found, and the story being linked was critical of the perceived heavy-handedness of the Korean security services towards demonstrators during the candlelight protests. Similarly, a Portable Document Format (PDF) file hosted by the United States Department of Agriculture detailed the meat classification scheme for cattle. Five links to this file were found and may indicate concern regarding American beef imports. One link to a story covering Korea-Japan relations was found on the online edition of the International Herald Tribune.

The personal homepage of James Won, a second-generation Korean-American aiming to become the president of Korea, was found to occur three times. Despite having a .us ccTLD, the hosting company, AwardSpace, is registered in Germany and the WHOIS field reflects this. Five hyperlinks pointed to thesixsystem.net, a social welfare initiative that aims to reduce the wage discrepancy between rich and poor. The final link, danawa.tk, is believed to be Spam.

Users

The users who composed comments were split between those logged into Cyworld when submitting a comment and those who commented anonymously. Of the comments containing links, 367 were from anonymous users and 909 were from users logged into Cyworld. The existence of mandatory verification for all Cyworld accounts enabled gender information to be determined for users who submitted a comment when logged into Cyworld. Comments containing links from males were found to be more frequent than those from females (Male = 587; 64.58%, Female = 322; 35.42%), although the type of content being linked to did not vary considerably. For the full sample of collected comments where the gender could be determined (n = 128,278 out of N = 153,602), the bias towards male-submitted comments is present though less marked (Male = 71,499; 55.74%, Female = 56,779; 44.26%). This may indicate that although commenting on political profile pages is common for both genders, the submission of hyperlinks is an activity that is practiced more by males than females.

The comments were categorized into six groups, based on the gender of the poster and the political affiliation of the politician that the message was directed to. This is summarized in Table 5. Politicians from the ruling Grand National Party (GNP) received 562 comments containing links, and therefore just under 60% of all links (out of the total N = 939) posted in the sample of Cyworld profile pages were intended for politicians within the GNP. Links posted to GNP politicians from males, at 54.45% (n = 306 out of N = 562), represent a higher proportion than both females (n = 172; 30.6%) and users whose gender was unknown (n = 84; 14.95%).

Males were also the largest gender group to submit a comment containing a link to a profile managed by an opposition party politician, with 42.71% (n = 161) of all comments containing links coming from males within this category (N = 377). Unlike the results for the ruling party, the number of comments containing links from users whose gender was unknown was high, and represents 31.83% (n = 120) of hyperlinks posted to politicians in an opposition party (N = 377). Comments containing links for opposition politicians from females amounted to 25.46% (n = 96), the lowest proportion of gender types commenting to an opposition party (N = 377).
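The proportions reported for the six user groups follow directly from the counts given in the text. As a quick check, the short sketch below recomputes them; the counts are taken from the two paragraphs above, while the helper name `shares` is our own.

```python
def shares(counts):
    """Convert raw counts into percentage shares of their total."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 2) for group, n in counts.items()}


# Comments containing links, by gender of poster (counts from the text).
ruling = {"male": 306, "female": 172, "unknown": 84}        # N = 562
opposition = {"male": 161, "female": 96, "unknown": 120}    # N = 377

by_party = {"ruling": sum(ruling.values()),
            "opposition": sum(opposition.values())}         # N = 939

print(shares(ruling))       # male 54.45, female 30.6, unknown 14.95
print(shares(opposition))   # male 42.71, female 25.46, unknown 31.83
print(shares(by_party))     # ruling 59.85, opposition 40.15
```

The ruling-party share of 59.85% is the figure described as "just under 60%".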

Comments

The previous section extracted links within the sum of comments and analyzed this in isolation. This section will examine the accompanying text that is present with most comments. The first part will determine the intention of using a link for a smaller subset of comments using established criteria. Following this, machine-learning algorithms will be used to categorize the sentiment and determine content across the sample as a whole.

Link Intention

To determine the intention of users posting hyperlinks, a random sample of 50 comments from each of the six user categories was generated. Two coders were employed to categorize the sample following the five link intention types proposed by Ackland et al. (2010), with an additional field to account for Spam. The coders agreed on 206 comments from the full sample of 300. The remaining comments where disagreements occurred were removed from the study to ensure intercoder reliability.
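The filtering step described above, keeping only comments on which both coders agree, can be sketched as follows. The raw agreement rate is our own derivation from the reported counts (206 of 300) and is not a statistic stated in the study; the function names are hypothetical.

```python
def agreed_items(coder_a, coder_b):
    """Keep only the labels on which both coders chose the same category."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same comments")
    return [a for a, b in zip(coder_a, coder_b) if a == b]


def percent_agreement(n_agreed, n_total):
    """Raw (uncorrected) agreement as a percentage."""
    return round(100 * n_agreed / n_total, 2)


print(percent_agreement(206, 300))  # 68.67
```

Raw agreement does not correct for matches that would occur by chance, so it should be read as a descriptive figure only.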

Table 6 summarizes the results. Of the five actions Ackland et al. (2010) reported that links can perform, and including an additional field for Spam, Message Amplification accounted for just over half (n = 106; 51.46%) of all link types. Hyperlinks used for Network Building (n = 58; 28.16%) were also high and together with Message Amplification represent 79.61% (n = 164) of all link types. Spam accounted for 15.53% (n = 32) of links and the remaining link types, Information Provision, Identity Building, and Audience Sharing, contributed just 4.85% (n = 10).

Comments within the opposition party sample (N = 99) were found to be highest among those containing links that were classified as performing a Network Building purpose (n = 35; 35.35%), and those identified as Message Amplification (n = 38; 38.38%). However, it was found that female Cyworld users within this sample tended to favor using hyperlinks for Network Building (n = 20; 48.78%) whereas the male opposition party sample predominantly contributed links found to have a Message Amplification (n = 13; 43.33%) motive. Moreover, the male opposition party sample contributed the fewest links of all user groups identified as having a Network Building purpose and, at just four comments (4.04%), suggests an underrepresented intention for a male commenter compared to the opposition party female contribution, which was recorded at 20 comments (20.2%).

In contrast to the results for opposition parties, ruling party hyperlinks (N = 107) fell primarily within the Message Amplification category (n = 68; 63.55%). Within this category, links from females (n = 29; 42.65%) were more numerous than those from males (n = 23; 33.82%), and the smallest proportion came from users whose gender was unknown (n = 16; 23.53%). Whilst the use of hyperlinks for Network Building within comments directed to the ruling party was less common than Message Amplification, the difference between genders (male n = 5; 21.74% and female n = 6; 26.09%) was less marked. However, the majority of comments within this category (N = 23) came from users whose gender could not be determined (n = 12; 52.17%).

Spam was higher for opposition parties (n = 19; 59.38%) than for the ruling party (n = 13; 40.63%). Overall, links identified as Spam (N = 32) were more often posted by males (n = 15; 46.88%) than by females (n = 12; 37.5%). Males contributed more Spam comments intended for the ruling party than females (male n = 7; 53.85% and female n = 3; 23.08%), although the difference between males (n = 8; 42.11%) and females (n = 9; 47.37%) posting Spam links to politicians in an opposition party was smaller. Spam from users not logged into Cyworld was surprisingly small for both opposition (n = 2; 6.25%) and ruling (n = 3; 9.38%) parties, and represents either the smallest or joint smallest of hyperlink types within opposition and ruling party samples (N = 32). Allowing users to comment anonymously was believed to encourage Spam; however, this sample, albeit limited, appears to suggest that Spam was more common from users whose identity could be verified.

Natural-Language Processing

Comments containing hyperlinks are typically accompanied by textual information, and whilst the previous method of analyzing links provided an indication of the intent of each message, the content and sentiment of the statement were not accounted for. For a large body of text, natural-language processing (NLP) is one approach to categorization that can mitigate the problem of obtaining accurate results from datasets that are infeasible to code manually. In a related field, formally considered word-sense disambiguation, the grading of individual words or pairs of words within a statement is performed to isolate words and determine their association with other words. The procedure for grading a statement is to tokenize the sentence into individual words, remove weak words (for example, 'and' and 'or'), reduce each word to its root form (known as stemming), and finally grade words through a predefined dictionary based on their affective value, such as ANEW (Stevenson, Mikels, & James, 2007) or WordNet (Miller, 1995); the sum of these grades gives the aggregate score for the statement as a whole. Grading words through a dictionary can attain high accuracy when applied to structured text, such as editorially defined tags or metadata; however, the reliability of a given classification is reduced when assessing text that is less well-formed, such as user-generated content (Cui, Zhang, Liu, & Ma, 2011). There also exists a compromise between natural-language processing and word-sense disambiguation that attempts to draw attention to words that frequently collocate, and which may lose meaning when viewed in isolation. Collocation analysis, which finds tokens that frequently appear together, performs better on unclean data; however, the approach will only resolve specific terms and does not, by itself, infer polarity.
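
The dictionary-based grading procedure outlined above (tokenize, remove weak words, stem, look up, sum) can be sketched in a few lines. This is an illustrative toy under stated assumptions: the stop-word list, the crude suffix-stripping `stem` function, and the scores in `LEXICON` are all hypothetical stand-ins, not values from ANEW or WordNet, and the example is in English because no comparable Korean dictionary was available to the study.

```python
import re

# Hypothetical list of weak words to discard before grading.
STOP_WORDS = {"and", "or", "the", "a", "is", "was", "of", "to"}

# Toy affective lexicon keyed by word stem; the scores are invented for
# illustration and are not taken from ANEW or WordNet.
LEXICON = {"good": 1.0, "hope": 0.5, "bad": -1.0, "protest": -0.5, "fail": -1.0}


def stem(word):
    """Crude suffix stripper standing in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def grade(statement):
    """Sum the lexicon scores of the stemmed, non-weak words in a statement."""
    tokens = re.findall(r"[a-z']+", statement.lower())
    stems = [stem(token) for token in tokens if token not in STOP_WORDS]
    return sum(LEXICON.get(s, 0.0) for s in stems)


negative_score = grade("The policy failed and the protests were bad")
positive_score = grade("Good hopes")
print(negative_score, positive_score)
```

Words missing from the lexicon contribute nothing to the total, which illustrates why this approach degrades on informal, user-generated text.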

In contrast to grading individual words or pairs of words within a statement, supervised approaches to machine learning evaluate the context of the sentence by training a classifier on context-specific precoded text from a comparable domain with the salient properties of the corpus present. This has the benefit of constraining analysis to specific characteristics, such as polarity, but can omit characteristics outside the focus of the study. Supervised learning approaches are, when trained on similar domains, able to attain accuracy levels that make the approach a compelling option when evaluating unstructured data. Instead of measuring the association between variables, supervised learning aims to determine the strength of connection between predefined classes. Maintaining the separation between training data and unseen data is vital to ensure the classifier is predicting and not, as the case might be, overfitted to a predefined outcome. In this respect, the more training data there is, the more varied and therefore reliable the classifier will be; however, a poorly trained classifier can still perform well provided the problem scope is limited. Choosing between two categories, such as positive and negative, constrains the set of options, and therefore the impact of a false classification can be mitigated. For example, Pang and Lee (2008) examined the classification of film reviews but restricted evaluation to broad categories, such as positive or negative, as opposed to the level of granularity that typically occurs in reviews for movies, products, and services. In spite of the acknowledged limitations of automated classification solutions, the approach offers a scalable implementation that can go some way towards coding polarity in user-generated content.
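
To make the separation between training data and unseen data concrete, the following sketch trains a small two-class classifier and scores it only on held-out statements. It is a generic multinomial Naive Bayes with add-one smoothing written for illustration; it is an assumption that this resembles the kind of classifier used, and the class name, the English toy sentences, and the labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial Naive Bayes over whitespace-separated tokens."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.class_counts = Counter()            # label -> number of statements
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.class_counts[label] += 1
            for token in text.lower().split():
                self.word_counts[label][token] += 1
                self.vocab.add(token)

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # class prior
            for token in text.lower().split():
                # Add-one smoothing so unseen tokens do not zero out a class.
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical precoded statements; the held-out set is never used in training.
training = [
    ("great policy good work", "positive"),
    ("good leadership great hope", "positive"),
    ("bad policy terrible beef", "negative"),
    ("terrible decision bad leadership", "negative"),
]
held_out = [("good hope", "positive"), ("terrible beef", "negative")]

classifier = NaiveBayes()
classifier.train(training)
correct = sum(classifier.classify(text) == label for text, label in held_out)
accuracy = correct / len(held_out)
print(accuracy)
```

With only two classes, a wrong prediction can only fall into the single alternative category, which is the constrained setting the paragraph above describes.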

A Java program was developed that wrapped a small subset of the LingPipe NLP toolkit to enable two forms of analysis: Collocation and Sentiment Analysis. The absence of an accessible lookup dictionary of the kind produced in English prohibited word grading on Korean text. Instead, frequently occurring words and proper nouns were located through an examination of collocations. The name of Korean President Lee Myung-Bak was found to occur 229 times, although he has no Cyworld account and therefore was not being addressed directly. Similarly, the name of deceased ex-president Roh Moo-Hyun was found to occur frequently (n = 59). This is in contrast to similar studies that examined U.S. candidate profile pages and found that names occur primarily when addressing the profile owner (Robertson, Vatrapu, & Medina, 2010). Mad Cow Disease (n = 159), beef (n = 110), American goods (n = 59), and candlelight protest (n = 139) occurred frequently and are a reference to the candlelight protests of May and June 2008. A Korean word that translates as ‘the gap between rich and poor’ occurred frequently (n = 69), as did the statistical term ‘Gini coefficient’ (n = 138).
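
The idea behind locating frequent collocations can be illustrated with a simple count of adjacent word pairs. This sketch is not the LingPipe-based Java program described above, whose internals are not given in the text; it is a plain frequency count in Python, and the three English comments are hypothetical stand-ins for the Korean data.

```python
from collections import Counter


def bigram_counts(comments):
    """Count how often each pair of adjacent tokens appears across comments."""
    counts = Counter()
    for comment in comments:
        tokens = comment.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical comments, not drawn from the Cyworld sample.
comments = [
    "candlelight protest against beef imports",
    "join the candlelight protest tonight",
    "mad cow disease and beef imports",
]

counts = bigram_counts(comments)
frequent = [pair for pair, n in counts.items() if n >= 2]
print(frequent)
```

A raw frequency count such as this surfaces recurring terms but, as noted earlier, says nothing about whether they are used positively or negatively.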

Unlike an examination of textual collocations, supervised sentiment analysis requires training data: a set of manually coded input files that are representative of each category to be grouped by. To address the absence of positive and negative sentiment text composed in Korean among publicly available corpora, a sample of unclassified text was coded by hired Korean students. Each student coded the same text into subjective-positive and subjective-negative categories. Where disagreements in categorization occurred, the statements in question were removed from the sample.

Previous studies have indicated that the majority of user-generated comments submitted to social networking sites are subjective in tone (Thelwall & Wilkinson, 2010), and therefore the accuracy-reducing method of preclassifying into objective and subjective was not performed. Other studies have suggested that neutral comments may play a part in classification, and two approaches to achieve this are either adding a third neutral set of training data to the positive and negative texts (Ahn, Geyer, Dugan, & Millen, 2009) or relying on positive and negative training data but classifying the statement as neutral if the confidence of a classification falls below a predefined threshold (Cui, Zhang, Liu, & Ma, 2011). The latter approach was chosen as a suitable model to follow owing to the difficulty in identifying neutral comments on social network message boards, and therefore comments evaluated as comprising equally positive and negative sentiment were deemed to be neutral and removed from the sample.
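
The threshold rule can be sketched as follows. The sketch assumes the classifier exposes a log score per class, which is then normalized into probabilities; the function names and the threshold of 0.6 are hypothetical, since the text does not report the value used.

```python
import math


def normalise(log_scores):
    """Convert per-class log scores into probabilities that sum to one."""
    peak = max(log_scores.values())
    # Subtracting the peak before exponentiating avoids numerical underflow.
    exponentiated = {label: math.exp(s - peak) for label, s in log_scores.items()}
    total = sum(exponentiated.values())
    return {label: value / total for label, value in exponentiated.items()}


def label_with_neutral(log_scores, threshold=0.6):
    """Return the winning class, or 'neutral' when confidence is below threshold.

    The default threshold is an assumed value chosen for illustration.
    """
    probabilities = normalise(log_scores)
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= threshold else "neutral"


confident = label_with_neutral({"positive": math.log(0.9), "negative": math.log(0.1)})
balanced = label_with_neutral({"positive": math.log(0.5), "negative": math.log(0.5)})
print(confident, balanced)
```

A statement scored as equally positive and negative has a winning probability of 0.5, so it falls below any threshold above one half and is set aside as neutral.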

Figure 1 indicates that, for the most part, negative comments were in the minority for all months. However, it is worth noting that this sample represents those comments that remained on the message board and were not deleted. Legal mechanisms exist in Korea to forbid the posting of negative comments to politicians during election periods (Lee, 2009), and any subsequent deletion of negative content may result in only those viewpoints with a particular bias remaining persistent. However, despite the possibility of missing potentially deleted comments, May and June 2008 were found to have high numbers of comments containing links that showed negative sentiment, and this date corresponds with the period of the candlelight protest that has been viewed as a time of increased civil action and negative sentiment online (Park et al., 2011). May 2009 also shows a large number of comments containing hyperlinks and negative sentiment, coinciding with the harassment and suicide of ex-president Roh Moo-Hyun.

Although the initial overhead of supervised learning is greater than that for collocations, the approach has the benefit of providing a mechanism to determine the accuracy of the developed classifier. This margin of error is represented in Figure 1, and based on the average conditional probability, a confidence interval of 86.67% was achieved for the polarity model. The outcome of a confirmed dataset, a gold standard (Kim, Zhai, & Han, 2010), can be a factor in ensuring accuracy, and a 10%–15% margin for error is seen as acceptable for the domain providing disagreements between coders are handled correctly (Reidsma & Carletta, 2008). Whilst this result appears promising and the existence of a confidence interval provides a degree of validity when evaluating the polarity of a statement, there are circumstances where sentiment analysis may overlook relevant meaning. Thelwall, Buckley, Paltoglou, Cai, and Kappas (2010) highlight this concern and draw attention to the inherent shortcomings in performing sentiment analysis on diffuse content found within social network and microblogging services. Additionally, Pang and Lee (2008) point out that hidden meaning such as sarcasm and innuendo can be overlooked through machine-learning methods, and limitations experienced as a result of language complexity should be acknowledged when assessing the effectiveness of the approach.

Conclusion

This study has shown how leveraging existing technologies, such as hyperlinks, can mediate democratic processes by determining the motivation, domain choice, sentiment, and message content of users commenting on political profile pages. The motivation of users was determined using established criteria that evaluated the purpose of posting a link and ascertained how hyperlinks lead to different ways in which ideological messaging can spread through social media. The domain choice was discovered through Webometric Analysis, DNS, and WHOIS lookup to locate the services and nations that occur frequently. Finally, the content and sentiment were revealed through the use of NLP, specifically sentiment analysis and collocation.

The findings suggest that hyperlinks are almost solely targeted to Korean services, and the few that do point to overseas sites are usually related in some way to local issues in Korea. This frames the discourse on Cyworld as one with a largely domestic political agenda and indicates that online politics is unlikely to take place across national borders, except where the topic in question involves another nation or comparable issue. Whilst political discourse was, primarily, found to rely on services based within national borders, less clear previously was how online domestic services are employed to strengthen arguments and demonstrate affinity.

Males are marginally more likely to comment on political profile pages using links than females, and those profiles managed by ruling politicians were found to be of greater prominence than those of opposition parties. Message Amplification and Network Building were found to be the dominant purposes for submitting links within user-generated comments. Two forms of machine-based analysis, sentiment analysis and collocation of significant phrases, revealed primarily negative sentiment towards President Lee and his role in the reintroduction of American beef imports. Issues surrounding the suicide of ex-President Roh suggested anger towards those who were seen to be harassing him prior to his death. In both cases, participation increased online, and this reaction to external stimuli has also been reported in other nations.

Whilst extreme online reactions to perceived injustices are widely accepted, less clear previously was how links are used to amplify user comments. The findings suggest that contributing a link has the potential to complement political debate and provide a measure when evaluating discourse online that is scalable to larger datasets.

Acknowledgments

The authors are grateful to Se Jung Park and Yon Soo Lim for their feedback on earlier versions of the article. This research was supported by the World Class University (WCU) project through the National Research Foundation of Korea, funded by the Ministry of Education, Science and Technology (No. 515-82-06574). The first author acknowledges that the work in this article was conducted during his stay at the WCU Webometrics Institute.

References

Ackland, R., Gibson, R. K., Lusoli, W., & Ward, S. (2010). Engaging with the public? Assessing the online presence and communication practices of the nanotechnology industry. Social Science Computer Review, 28(4), 443–465. doi: 10.1177/0894439310362735

Adamic, L., & Adar, E. (2003). Friends and neighbors on the Web. Social Networks, 25(3), 211–230. doi: 10.1016/S0378-8733(03)00009-1

Anstead, N., & O’Loughlin, B. (2011). The emerging viewertariat and BBC Question Time: Television debate and real time commenting online. International Journal of Press/Politics, 16(4), 440–462. doi: 10.1177/1940161211415519

Barnett, G. A., Chung, C. J., & Park, H. W. (2010). Uncovering transnational hyperlink patterns and web-mediated contents: A new approach based on cracking .com domain. Social Science Computer Review, 29(3), 369–384. doi: 10.1177/0894439310382519

Biddix, J. P., & Park, H. W. (2008). Online networks of student protest: The case of the living wage campaign. New Media & Society, 10(6), 871–891. doi: 10.1177/1461444808096249

Boyd, D. M., & Ellison, N. B. (2007). Social network sites: Definition, history and scholarship. Journal of Computer-Mediated Communication, 13(1), 210–230. doi: 10.1111/j.1083-6101.2007.00393.x

Cui, A., Zhang, M., Liu, Y., & Ma, S. (2011). Emotion tokens: Bridging the gap among multilingual Twitter sentiment analysis. In M. Salem, K. Shaalan, F. Oroumchian, A. Shakery, H. Khelalfa (Eds.), Information retrieval technology (pp. 238–249). Berlin: Springer.

Elmer, G., Ryan, P. M., Devereaux, Z., Langlois, G., Redden, J., & McKelvey, F. (2007). Election bloggers: Methods for determining political influence. First Monday, 12(4). Retrieved from http://firstmonday.org/issues/issue12_4/elmer/index.html

Fogg, B. J., & Iizawa, D. (2008). Online persuasion in Facebook and Mixi: A cross-cultural comparison. In H. Oinas-Kukkonen, P. Hasle, M. Harjumaa, K. Segerstahl, & P. Øhrstrøm (Eds.), Persuasive technology (pp. 35–46). Berlin: Springer.

Foot, K., & Schneider, S. M. (2006). Web campaigning. Cambridge, MA: MIT Press.

Foot, K., Schneider, S. M., Dougherty, M., Xenos, M., & Larsen, E. (2003). Analyzing linking practices: Candidate sites in the 2002 US electoral Web sphere. Journal of Computer-Mediated Communication, 8(4). doi: 10.1111/j.1083-6101.2003.tb00220.x

Halavais, A. (2006). Scholarly blogging: Moving towards the visible college. In A. Bruns & J. Jacobs (Eds.), Uses of blogs (pp. 117–126). New York, NY: Peter Lang.

Herold, D. (2009). Cultural politics and political culture of Web 2.0 in Asia. Knowledge, Technology and Policy, 22(2), 89–94. doi: 10.1007/s12130-009-9076-x

Karan, K., Gimeno, J. D. M., & Tandoc, E. (2009). The Internet and mobile technologies in election campaigns: The GABRIELA Women’s Party during the 2007 Philippine elections. Journal of Information, Technology and Politics, 6(3), 326–339. doi: 10.1080/19331680903047420

Kim, K. H., & Yun, H. (2007). Cying for me, cying for us: Relational dialectics in a Korean social network site. Journal of Computer-Mediated Communication, 13(1), 298–318. doi: 10.1111/j.1083-6101.2007.00397.x

Kim, H., Zhai, C. X., & Han, J. (2010). Aggregation of multiple judgments for evaluating ordered lists. In C. Gurrin, Y. He, G. Kazai, U. Kruschwitz, S. Little, T. Roelleke, S. Ruger, K. van Rijsbergen (Eds.), Advances in information retrieval (pp. 166–178). Berlin: Springer.

Lee, Y. O. (2009). Internet Election 2.0? Culture, institutions, and technology in the Korean presidential elections of 2002 and 2007. Journal of Information, Technology and Politics, 6(3), 312–325. doi: 10.1080/19331680903050085

Lee, S. O., Kim, S. J., & Wainwright, J. (2010). Mad cow militancy: Neoliberal hegemony and social resistance in South Korea. Political Geography, 29(7), 359–369. doi: 10.1016/j.polgeo.2010.07.005

Lee, Y. O., & Park, H. W. (2010). The reconfiguration of e-campaign practices in Korea: A case study of the presidential primaries of 2007. International Sociology, 25(1), 29–53. doi: 10.1177/0268580909346705

Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38(11), 39–41. doi: 10.1145/219717.219748

Panagiotopoulos, P., Sams, S., Elliman, T., & Fitzgerald, G. (2011). Do social networking groups support online petitions? Transforming Government: People, Process and Policy, 5(1), 20–31. doi: 10.1108/17506161111114626

Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1), 1–135. doi: 10.1561/1500000011

Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication, 15(2), 211–229. doi: 10.1111/j.1083-6101.2010.01517.x

Park, H. W., & Jankowski, N. (2008). A hyperlink network analysis of citizen blogs in South Korean politics. Javnost-The Public, 15(2), 57–74. Retrieved from http://javnost-thepublic.org/article/2008/2/4/

Park, H. W., & Kluver, R. (2009). Trends in online networking among South Korean politicians: A mixed-method approach. Government Information Quarterly, 26(3), 505–515. doi: 10.1016/j.giq.2009.02.008

Park, S. J., Lim, Y. S., Sams, S., Nam, S. M., & Park, H. W. (2011). Networked politics on Cyworld: The text and sentiment of Korean political profiles. Social Science Computer Review, 29(3), 288–299. doi: 10.1177/0894439310382509

Park, H. W., Thelwall, M., & Kluver, R. (2005). Political hyperlinking in South Korea: Technical indicators of ideology and content. Sociological Research Online, 10(3). Retrieved from http://www.socresonline.org.uk/10/3/park.html

Reidsma, D., & Carletta, J. (2008). Reliability measurement without limits. Computational Linguistics, 34(3), 319–326. doi: 10.1162/coli.2008.34.3.319

Robertson, S., Vatrapu, R., & Medina, R. (2010). Off the wall political discourse: Facebook use in the 2008 U.S. presidential election. Information Polity, 15(1), 11–31. doi: 10.3233/IP-2010-0196

Shumate, M., & Dewitt, L. (2008). The north/south divide in NGO hyperlink networks. Journal of Computer-Mediated Communication, 13(2), 405–428. doi: 10.1111/j.1083-6101.2008.00402.x

Soon, C., & Kluver, R. (2007). The Internet and online political communities in Singapore. Asian Journal of Communication, 17(3), 246–265. doi: 10.1080/01292980701458331

Stevenson, R., Mikels, J., & James, T. (2007). Characterization of the affective norms for English words by discrete emotional categories. Behavior Research Methods, 39(4), 1020–1024. doi: 10.3758/BF03192999

Thelwall, M. (2003). What is this link doing here? Beginning a fine-grained process of identifying reasons for academic hyperlink creation. Information Research, 8(3). Retrieved from http://informationr.net/ir/8-3/paper151.html

Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. San Rafael, CA: Morgan & Claypool.

Thelwall, M., & Wilkinson, D. (2010). Public dialogs in social network sites: What is their purpose? Journal of the American Society for Information Science and Technology, 61(2), 392–404. doi: 10.1002/asi.21241

Turow, J., & Tsui, L. (Eds.). (2008). The hyperlinked society: Questioning connections in the digital age. Ann Arbor, MI: University of Michigan Press.

Utz, S. (2009). The (potential) benefits of campaigning via social network sites. Journal of Computer-Mediated Communication, 14(2), 221–243. doi: 10.1111/j.1083-6101.2009.01438.x

Warnick, B., Xenos, M., Endres, D., & Gastil, J. (2005). Effects of campaign-to-user and text-based interactivity in political candidate campaign web sites. Journal of Computer-Mediated Communication, 10(3). doi: 10.1111/j.1083-6101.2005.tb00253.x

Williamson, A. (2009). The effect of digital media on MPs’ communication with constituents. Parliamentary Affairs, 62(3), 514–527. doi: 10.1093/pa/gsp009

Supporting Information

Additional supporting information may be found in the online version of this article:

Figure 1. Positive and negative sentiment from comments containing hyperlinks
Table 1. LexiURL Unique / Full hosts
Table 2. LexiURL Unique / Full URLs
Table 3. Total links to each domain (Korea)
Table 4. Total links to each domain (Overseas)
Table 5. Comments sorted by poster-gender and politician background
Table 6. Comments categorized by link type from six groups of gender and political affiliation

About the Authors

Steven Sams is a doctoral candidate at the School of Information Systems, Computing, and Mathematics at Brunel University, and a former research fellow of WCU Webometrics Institute at Yeungnam University. His research interests are political communication, social media analytics, and online data collection strategies.

Address: Department of Media & Communication, Yeungnam University, 214–1 Dae-dong, Gyeongsan-si, Gyeongsangbuk-do, 712–749, Republic of Korea. Email: [email protected]

Han Woo Park (Corresponding Author) is an Associate Professor and Director of CyberEmotions Research Institute, Yeungnam University. He serves as coeditor of Journal of Contemporary Eastern Asia and has guest edited special issues of Journal of Computer-Mediated Communication, Social Science Computer Review, and Scientometrics on Computational Social Science and Webometrics.

Address: Department of Media & Communication, Yeungnam University, 214–1 Dae-dong, Gyeongsan-si, Gyeongsangbuk-do, 712–749, Republic of Korea. Email: [email protected]
