
ASKNet: Automatically Creating Semantic Knowledge Networks from Natural Language Text

Brian Harrington

Oxford University Computing Laboratory

University of Oxford

Thesis submitted for the degree of

Doctor of Philosophy

Hilary Term 2009

To Sophie: You are the love of my life and have made me happier than I ever thought possible. You mean everything to me.

I want to spend the rest of my life with you.

Will you marry me?

Acknowledgements

I would like to thank my supervisor Stephen Clark for his continued support throughout my time at Oxford. It was his assistance and guidance that made this thesis possible.

I would also like to thank my family, especially my parents who have always provided me with moral (not to mention financial) support.

Finally, I would like to thank Sophie for her role in motivating me. It is my desire to start a life with her that has pushed me through the final stages of writing this thesis and prevented me from becoming a perpetual student.

This research was funded by the Clarendon Scholarship and in part by the Canadian Centennial Scholarship.

Contents

1 Introduction
   1.1 Motivation
   1.2 Existing Semantic Resources
       1.2.1 Manually Constructed Resources
       1.2.2 Automatically Constructed Resources
   1.3 Contributions of This Thesis
   1.4 Outline

2 Parsing and Semantic Analysis
   2.1 Parsing
       2.1.1 Choosing a Parser
       2.1.2 The C&C Parser
   2.2 Semantic Analysis
       2.2.1 Discourse Representation Theory
       2.2.2 Boxer

3 Semantic Networks
   3.1 A Semantic Network Definition
   3.2 The ASKNet Semantic Network
       3.2.1 Temporality and Other Issues
       3.2.2 Network Implementation
   3.3 Parser Filtering
       3.3.1 C&C Parser Filter
       3.3.2 Boxer Filter

4 Information Integration
   4.1 Spreading Activation
       4.1.1 History of Spreading Activation
   4.2 Spreading Activation in ASKNet
       4.2.1 Update Algorithm: Example
       4.2.2 Update Algorithm: Implementation
       4.2.3 Firing Algorithms

5 Evaluation
   5.1 Network Creation Speed
   5.2 Manual Evaluation
       5.2.1 Building the Network Core
       5.2.2 Evaluating the Network Core
       5.2.3 Results

6 Semantic Relatedness
   6.1 Using ASKNet to Obtain Semantic Relatedness Scores
   6.2 WordSense 353
   6.3 Spearman’s Rank Correlation Coefficient
   6.4 Experiment 1
       6.4.1 Data Collection & Preparation
       6.4.2 Calculating a Baseline Score
       6.4.3 Calculating the ASKNet Score
       6.4.4 Results
       6.4.5 Discussion
   6.5 Experiment 2
       6.5.1 Inspecting the Corpus
       6.5.2 Building a Better Corpus
       6.5.3 Results: New Corpus
       6.5.4 Discussion

7 Conclusions
   7.1 Future Work
       7.1.1 Future Improvements
       7.1.2 External Improvements

A Published Papers

B Semantic Relatedness Scores & Rankings - Initial Corpus

C Semantic Relatedness Scores & Rankings - Improved Corpus

List of Figures

2.1 A simple CCG derivation using forward (>) and backward (<) application.

2.2 Example Boxer output in Prolog format for the sentences Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

2.3 Example Boxer output in “Pretty Print” format for the sentences Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

3.1 An example of an associative network. Objects and concepts are linked without distinction for type or direction of link.

3.2 Semantic network representations of the two parses of “John saw the man with the telescope”

3.3 A lexically ambiguous network

3.4 A Hierarchical Semantic Network

3.5 UML Class diagram of ASKNet’s semantic network architecture

3.6 Sample output from the C&C parser for the text “Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group”.

3.7 The C&C parser filter output for the input given in Figure 3.6

3.8 A simple example of the Boxer DRS (left) and resulting ASKNet network fragment (right) for the sentence “John scored a great goal”

3.9 Sample output from Boxer for the text “Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group”.

3.10 Boxer filter output for the input given in Figure 3.9

4.1 A collection of network fragments taken from various sources, including news, biomedical, geographical and political information

4.2 An integrated network created from the fragments in Figure 4.1

4.3 An example main network containing information about United States politics, writers and mathematicians being updated by a network fragment formed from the sentence “Bush beat Gore to the White House.”

4.4 The activation from the update network is transferred to the main network. For nodes with more than one potential mapping, the activation is split based on the current similarity matrix score.

4.5 The activation from the update network is transferred to the main network. The activation from the bu node is split unevenly, with a higher percentage going to georgebush than johnbush due to our updated similarity scores.

4.6 Network resulting from application of the update algorithm

4.7 An example similarity matrix and the corresponding ScoreMatrix data structure

5.1 Average time to add a new node to the network vs. total number of nodes

5.2 Pseudocode for the algorithm used to create a network core given an ASKNet network

5.3 Graphical representation for topic: “Elian Gonzalez Custody Battle”.

5.4 Expanded section of Figure 5.3.

5.5 Graphical representation for topic: “2001 Election of Vladimir Putin”.

5.6 Expanded section of Figure 5.5.

5.7 Examples of type errors. Left: “Melissa virus”, a computer virus identified as a location. Right: “Gaza Strip”, a location identified as an organisation.

5.8 Examples of label errors. Left: After processing a sentence containing the phrase “...arriving in Israel Sunday to conduct...” the phrase “Israel Sunday” is mistakenly identified as a single entity. Right: The location “Miami” and the person “Miami Judge” collapsed into a single node.

5.9 Examples of path errors. Top: Three independent meetings referenced in the same sentence, all involving relatives of Elian Gonzalez, are identified as a single meeting event. Bottom: The network indicates that the computer, rather than the virus, hides in Microsoft Word documents.

6.1 Scatter plot of rank order of ws-353 scores vs. rank order of baseline scores.

6.2 Scatter plot of rank order of ws-353 scores vs. rank order of ASKNet scores.

6.3 An excerpt from the Wikipedia “disambiguation page” for the query term “American”.

6.4 The text retrieved from the page shown in Figure 6.3 using the original data preparation methods (top) and the improved heuristic (bottom).

6.5 Scatter plot of rank order of ws-353 scores vs. rank order of baseline scores as calculated on the improved corpus.

6.6 Scatter plot of rank order of ws-353 scores vs. rank order of ASKNet scores as calculated on the improved corpus.

Chapter 1

Introduction

This thesis details the creation of ASKNet (Automated Semantic Knowledge Network), a system for creating large scale semantic networks from natural language texts. Using ASKNet as an example, we will show that by using existing natural language processing (nlp) tools, combined with a novel use of spreading activation theory, it is possible to efficiently create high quality semantic networks on a scale never before achievable.

The ASKNet system takes naturally occurring English text (e.g., newspaper articles) and processes it using existing nlp tools. It then uses the output of those tools to create semantic network fragments representing the meaning of each sentence in the text. Those fragments are then combined by a spreading activation based algorithm that attempts to decide which portions of the networks refer to the same real-world entity. This allows ASKNet to combine the small fragments into a single cohesive resource, which has more expressive power than the sum of its parts.

Systems aiming to build semantic resources have typically either overlooked information integration completely, or else dismissed it as being ai-complete, and thus unachievable. In this thesis we will show that information integration is both an integral component of any semantic resource, and achievable through a combination of nlp technologies and novel applications of spreading activation theory. While extraction and integration of all knowledge within a text may be ai-complete, we will show that by processing large quantities of text efficiently, we can compensate for minor processing errors and missed relations with volume and creation speed. If relations are too difficult to extract, or we are unsure which nodes should integrate at any given stage, we can simply leave them to be picked up later, when we have more information or come across a document which explains the concept more clearly.

ASKNet is primarily designed as a proof of concept system. However, this thesis will show that it is capable of creating semantic networks larger than any existing similar resource in a matter of days, and furthermore that the networks it creates are of sufficient quality to be used for real world tasks. We will demonstrate that ASKNet can be used to judge semantic relatedness of words, achieving results comparable to the best state-of-the-art systems.

1.1 Motivation

The generation of large-scale semantic resources from natural language text is becoming a key problem in Natural Language Processing [Pantel et al., 2004, Etzioni et al., 2004]. Manually constructed semantic and lexical networks have been shown to be useful for ai tasks such as Question Answering [Curtis et al., 2005], as well as more specific tasks such as predictive text entry [Stocky et al., 2004] and word sense disambiguation [Curtis et al., 2006]. However, creating these networks requires a great deal of time and resources. Automatic construction would allow the creation of similar resources on a much larger scale, and in a much shorter timeframe.

A long-standing goal of nlp has been to develop tools which can process naturally occurring text and provide the semantic representations needed by ai applications for general knowledge representation and reasoning. Recently we have begun to see a return to this goal, particularly with the advent of wide-coverage parsers and semantic processing tools such as named entity recognisers [Collins, 2003, Tjong Kim Sang and De Meulder, 2003], and the development of wide-coverage semantic resources such as WordNet [Fellbaum, 1998]. qa systems such as PowerAnswer [Moldovan et al., 2002] have been particularly successful at exploiting such resources, now obtaining accuracies of over 50% for factoid questions [Dang et al., 2006]. Textual entailment systems such as lcc's groundhog [Hickl et al., 2006] are beginning to show evidence that systems which use deep linguistic analysis can outperform systems using a simpler bag-of-words representation. We believe that ASKNet could provide a semantic resource that would be useful for ai applications and other areas of research that require general world knowledge.

1.2 Existing Semantic Resources

The potential usefulness of large scale semantic knowledge resources is attested to by the number of projects and the amount of resources that have been dedicated to their construction. In this section we will survey several of the larger and more successful efforts at building a resource similar to that which the ASKNet project hopes to achieve. There are two broad classes of projects that attempt to build large scale knowledge resources: traditionally, manual creation has been the methodology of choice, but more recently projects using automated creation have begun.

1.2.1 Manually Constructed Resources

Manual creation of large scale semantic resources is a very labour intensive task. Projects of this nature can easily take decades to complete and require hundreds of contributors. However, in most cases manual creation ensures that a highly reliable resource is created, and every entry in the network can be relied upon with confidence as it has been tested by humans.

By far the most widely used knowledge resource in development today is WordNet [Fellbaum, 1998]. Begun in 1985 at Princeton University, WordNet organises words into senses, or distinct meanings, which are connected through a discrete number of semantic relations, and contains over 200 000 word senses [WordNet]. WordNet is designed following psycho-linguistic theories of human memory, and is mainly focused on formal taxonomies of words. It is primarily a lexicographic resource, rather than an attempt to create a semantic knowledge resource; however, it has been used in many cases to approximate a semantic network, and is therefore included in this list.

The Cyc Project [Lenat, 1995] attempts to focus on common knowledge, or assertions which are too simple and obvious to be given in dictionaries or other forms of text, but that a native speaker of English can take for granted that his/her audience knows. The Cyc Project is manually created one assertion at a time by a team of knowledge engineers, and contains over 2.2 million assertions relating over 250 000 terms [Matuszek et al., 2006].

ConceptNet [Liu and Singh, 2004a] (previously known as OMCSNet) uses a semantic network similar to the network created for ASKNet. Nodes are small fragments of English connected by directed relations. The primary difference between ConceptNet and the semantic network formalism used in ASKNet is that the relations in ConceptNet are selected from a set of 20 pre-defined relations, and ConceptNet is only able to contain definitional data. ConceptNet uses the OpenMind corpus [Singh et al., 2002] to acquire its knowledge. This is particularly interesting because the OpenMind corpus is created by the general public. Visitors to a webpage are presented with text such as “A knife is used for ...”, and are then asked to provide text fragments to fill in the rest of the sentence. This has allowed ConceptNet to grow rapidly. ConceptNet contains over 1.6 million edges connecting more than 300 000 nodes [Liu and Singh, 2004b].

1.2.2 Automatically Constructed Resources

The labour intensive nature of manually creating semantic resources makes automatic resource creation an obvious goal for researchers [Crestani, 1997]. However, it is only recently that advances in natural language processing techniques have made automatic creation a possibility. Semantic resources created automatically will naturally be more likely to contain errors that would not be introduced to manually created networks; however, for many tasks the great decrease in time and labour required to build a network, combined with the ability to use larger corpora, will more than make up for the decrease in accuracy [Dolan et al., 1993].

There have recently been promising results in semi-automated knowledge acquisition. [Pantel and Pennacchiotti, 2006] details the Espresso system, which attempts to harvest semantic relations from natural language text by building word patterns which signify a specific relation (e.g., “X consists of Y” for the part of(Y,X) relation) and searching large corpora for text which fits those patterns. The building of patterns is weakly-supervised, and each new relation the system extracts must be chosen by a human user. Unlike ASKNet, Espresso only extracts binary relations, and does not build complex node structures or perform any information integration; it is nonetheless very promising that a simple pattern matching based algorithm has been shown to perform well when used on a web based corpus [Pantel et al., 2004].

Schubert and Tong [Schubert and Tong, 2003] have also developed a system for automatically acquiring knowledge from text. However, they attempt to gain “possibilistic propositions” (e.g., “A person can believe a proposition”) rather than extracting direct knowledge from the text. Furthermore, they only extract information from a small treebank corpus rather than raw text. ASKNet can extract information from raw text because of its use of a wide coverage parser. This allows us to use the vast quantities of readily available English text to create networks, instead of comparatively small structured corpora.

Started in 1993 at Microsoft Research, MindNet [Dolan et al., 1993] uses a wide coverage parser to extract pre-defined relations from dictionary definitions. To illustrate the difference in the automated construction approach, the MindNet network of over 150 000 words connected by over 700 000 relations can be created in a matter of hours on a standard personal computer [Richardson et al., 1998].

Of all the projects listed here, MindNet is the most similar in methodology to ASKNet. However, MindNet uses a more traditional phrase-structure parser and only analyses dictionary definitions, which tend to have much less linguistic variation than newspaper text, and are also more limited in the type of information they convey. MindNet also only uses a small set of pre-defined relations and is essentially an atomic network (see Section 3.1). ASKNet's relations are defined by the text itself, and it is capable of handling arbitrarily complex node structures. A major difference between MindNet and ASKNet is therefore that ASKNet can accommodate a much more diverse range of inputs, and can represent a much wider range of information. This will allow ASKNet to use very large document collections to create its network, which will hopefully lead to a larger, more diverse and ultimately more useful network. The largest difference between ASKNet and MindNet, however, is that MindNet does not perform any information integration (see Chapter 4).

1.3 Contributions of This Thesis

This thesis provides the following contributions:

• Detailing the development of a system for the automatic construction of integrated semantic knowledge networks from natural language texts, using spreading activation theory to combine information from disparate sources.

• Developing methodologies for testing systems which create large scale semantic networks, using average node insertion time analysis to prove a linear upper bound on creation time growth, and manual evaluation of network cores (sections of the network which are of manageable size, but still contain the most frequently occurring nodes).

• Providing a novel automatic method for judging semantic relatedness of words that performs at least as well as state of the art systems using other methods.

1.4 Outline

Chapter 1: Introduction

In this chapter we introduce the concept of automatically creating semantic knowledge networks from natural language texts, and attempt to motivate the need for tools such as ASKNet. We also survey existing research projects which share similarities with ASKNet, and highlight the benefits and drawbacks of various methodologies used in these systems in order to motivate the use of similar or opposing methodologies in ASKNet.

Chapter 2: Parsing and Semantic Analysis

In this chapter we discuss the tools and methods used to prepare the natural language text input for inclusion into an ASKNet network. We first explain the criteria for choosing a parser, and then survey some of the available software packages, identifying the C&C parser as the best candidate for our needs. We also discuss our use of the semantic analysis tool Boxer, and its use of Discourse Representation Theory in influencing the creation of ASKNet networks.

Chapter 3: Semantic Networks

In this chapter we first provide a concrete definition of what we will consider a proper semantic network. We then discuss the creation of the ASKNet semantic network formalism within the context of this definition. Finally, we detail the implementation of the network, and the conversion of input text into ASKNet network fragments.

Chapter 4: Information Integration

In this chapter we briefly survey the history of spreading activation, from its roots in the psycho-linguistic literature to its use in natural language processing applications. We then detail the use of spreading activation in ASKNet, and walk through an example iteration of the update algorithm, which is responsible for the information integration in ASKNet. We discuss the merits of information integration, and attempt to motivate the necessity of this often overlooked aspect of knowledge discovery and semantic resource creation.

Chapter 5: Evaluation

In this chapter we detail three evaluation metrics for the ASKNet system and provide details of the methodologies and results of our evaluation. We first evaluate the network creation speed to prove that ASKNet can create large scale networks efficiently, and also empirically show a linear upper bound for network growth time. We then use human evaluation of network cores to establish a direct precision score for constructed networks, achieving a precision of 79.1%.

Chapter 6: Semantic Relatedness

In this chapter, we use ASKNet to rank word pairs by their semantic relatedness and compare those rankings to the results of human evaluations from the psycho-linguistics literature. We compare ASKNet's ability to perform this task to existing methodologies based on WordNet traversal and pointwise mutual information calculation, and we find that an ASKNet based approach performs at least as well as all existing approaches, despite not using a manually created resource or a large human designed corpus.

Chapter 7: Conclusions

In this chapter we summarise the ASKNet project and the potential impact of this thesis on the nlp and ai communities, as well as the academic community at large. We also briefly summarise some areas in which ASKNet could be further improved, as well as some areas which could potentially benefit from the use of ASKNet.


Chapter 2

Parsing and Semantic Analysis

Manipulating natural language in its raw form is a very difficult task. Parsers and semantic analysis tools allow us to work with the content of a document on a semantic level. This simplifies the process of developing a semantic network, both computationally and algorithmically. To this end, ASKNet employs a set of software tools to render the plain text into a discourse representation structure, from which point it can turn the information into a semantic network fragment with relative ease.

In order for ASKNet to create a semantic network fragment representing a sentence, it must know the constituent objects of the sentence and their relations to one another. This information is very difficult to extract, and even the best tools available are far from perfect. ASKNet has been designed to use external tools for parsing and semantic analysis, so that as these tools improve, ASKNet can improve with them. It has also been designed not to rely too heavily on any one tool, so that if a better tool is developed, ASKNet can use it to achieve the best performance possible.


2.1 Parsing

Before we can work with natural language text, we must first analyse and manipulate it into a form that can be easily processed. Parsing converts “plain” natural language text into a data structure which the system can either use to build a semantic network fragment directly, or use as the input to a semantic analysis program.

2.1.1 Choosing a Parser

One of the major strengths of ASKNet lies in its ability to integrate large amounts of differing information into its network. In order to exploit this power, it is necessary for all stages of the system to be able to handle a wide range of inputs, and to process those inputs efficiently.

The choice of which parser to use is therefore very important to the success of ASKNet. For the entire system to perform well, the parser must be both wide coverage and efficient. To this end, many parsers were considered, such as the Charniak parser [Charniak, 2000], the Collins parser [Collins, 1999], the Stanford parser [Klein and Manning, 2003] and RASP [Briscoe and Carroll, 2002]. Eventually, speed considerations and its relational output made the C&C parser [Clark and Curran, 2007] an obvious choice.

2.1.2 The C&C Parser

The C&C parser is a wide coverage statistical parser based on Combinatory Categorial Grammar (CCG) [Steedman, 2000], written in C++. CCG is a lexicalised grammar formalism in which each word in a sentence is paired with a lexical category, which defines how the word can combine with adjacent words and word phrases. For example, in Figure 2.1 the word likes is given the lexical category (S[dcl]\NP)/NP, which means that it is looking for a noun phrase to its right (as indicated by the direction of the slash). In this example it combines with pizza using forward application, and the combined phrase likes pizza receives the category S[dcl]\NP, which is looking for a noun phrase to its left. This then allows likes pizza to be combined with the NP John to its left to become the declarative sentence John likes pizza. The lexical categories of words are combined using a small number of “combinatory rules”, such as forward and backward application, to produce a full derivation for the sentence.

Figure 2.1: A simple CCG derivation using forward (>) and backward (<) application.
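The derivation in Figure 2.1 can be rendered in plain text as follows (a reconstruction from the description in the text above, not the original figure):

John         likes          pizza
 NP     (S[dcl]\NP)/NP       NP
        ------------------------- >
               S[dcl]\NP
--------------------------------- <
                S[dcl]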

The C&C parser uses a grammar extracted from CCGbank [Hockenmaier, 2003], a ccg treebank derived from the Penn Treebank. CCGbank is based on real-world text: 40,000 sentences of Wall Street Journal text manually annotated with ccg derivations. Training on this data has resulted in a very robust parser.

CCG was designed to capture long range dependencies in syntactic phenomena such as coordination and extraction, which are often entirely missed by other parsers. Most parsers use treebank grammars which can discover local dependencies, but dependencies in natural language text can be arbitrarily far apart. Take for example the sentence given in (2.1). We can easily move the dependency farther apart by adding a clause, such as in (2.2). It is easy to continue to add clauses to the sentence to move the initial dependency farther apart, as illustrated in (2.3).

The dog that Mary saw. (2.1)

The dog that John said that Mary saw. (2.2)

The dog that John said that Ann thought that Mary saw. (2.3)

As the dependencies move farther apart, most parsers have greater difficulty in recognising them, and in many cases, once they move farther apart than the parser's set context window, they cannot be found at all. CCG was specifically designed to be able to capture dependencies regardless of the intervening distance in the text, and thus the C&C parser is able to extract the dependencies that most other parsers miss.

CCG is a lexicalised grammar, which means it assigns lexical categories to words. These categories capture the elementary syntactic structure of the sentence, and therefore only a small number of combinatory rules are required for combining lexical categories. However, since the categories can be treated as atomic labels, the C&C parser is able to use a supertagger to very efficiently assign lexical categories to words, in much the same way as standard taggers assign part of speech tags. This results in a parser which is both efficient and robust [Clark and Curran, 2004].

Aside from its speed, the other major advantage of using the C&C parser for ASKNet is the ease with which semantic information can be obtained from the ccg output. This allows the use of semantic analysis tools such as the one described in the next section.

The C&C parser also provides a named entity recogniser (ner), implemented as a separate program that combines its output with that of the parser [Curran and Clark, 2003]. This program is a maximum entropy sequence tagger which uses contextual features to attach entity tags to words, in order to label them as belonging to certain categories such as person, location, organisation, date and monetary amount. The accuracy of the ner tagger ranges from roughly 85% to 90%, depending on the data set and entity type [Curran and Clark, 2003].

While several considerations caused the C&C parser to be chosen for use in ASKNet, the system has been constructed in such a way that it is not heavily dependent on any particular parser or parsing algorithm. In order to integrate a different parser into ASKNet, it would only be necessary to design a new filter (see Section 3.3) for that parser.

2.2 Semantic Analysis

Rather than attempting to use the output of the parser directly as input to ASKNet, there are many advantages to first processing it with a semantic analysis tool. The primary role of semantic analysis tools is to extract semantic relations from the parser output, which is extremely useful for a system such as ASKNet. Further to this, semantic analysis tools can convert the parser output into a well defined structure that is much easier to work with, and allows ASKNet to process the semantic entities within the text without requiring it to deal with much of the lexical and syntactic detail of the sentence.

Another advantage of processing the data with a semantic analyser is that it provides another avenue for the incorporation of new nlp technologies. As semantic analysis tools improve and begin to incorporate novel features and techniques, the data input into ASKNet will improve with no additional cost to us. For example, if the semantic analysis tool used improves its anaphora resolution, then so long as the tool does not change its output format, the ASKNet input will improve without the need for any changes.

The use of a semantic analyser does add some overhead to the processing time; however, this overhead is relatively small and is linear with respect to the number of input sentences. The increase in the quality of the networks created, combined with the ability to simplify ASKNet's text processing, will likely more than make up for the extra time required.

2.2.1 Discourse Representation Theory

Discourse Representation Theory (drt) takes a dynamic perspective of natural language semantics [van Eijck and Kamp, 1997], where each new sentence is viewed in terms of its contribution to an existing discourse. A Discourse Representation Structure (drs) is a formalised representation of the information available at a given point in a discourse. New sentences in the discourse are viewed as updates to the structure [Kamp and Reyle, 1993]. drt was initially designed to solve the problem of unbound anaphora [van Eijck, 2005], and is particularly useful for establishing links between pronouns and their antecedents. drt has equivalent expressive power to first order logic.

When interpreting a sentence as a drs, a discourse referent (essentially a free variable) is created whenever an indefinite noun phrase (e.g., a dog, someone, a car of mine) is encountered. Definite noun phrases (e.g., this dog, him, my car) are always linked to existing discourse referents. For example,1 when processing the discourse in (2.4), the first sentence creates two discourse referents, x and y, referring to the woman and the room respectively. Then three conditions are created: woman(x), room(y) and entered(x,y). This produces the drs seen in (2.5).

1 This example has been simplified for clarity, as well as to be more directly consistent with the type of drs used by Boxer (see Section 2.2.2), and thus does not represent all of the details of the drs formalism as presented in [Kamp and Reyle, 1993].

A woman entered the room. She smiled. (2.4)

(x,y)(woman(x),room(y),entered(x,y)) (2.5)

When interpreted, the second sentence in discourse (2.4) creates the drs seen in (2.6). However, when it is processed as an update to (2.5), it also produces a link between the variable z, assigned to the pronoun she, and the variable x, which represents the antecedent of that pronoun, thus producing the updated drs (2.7).

(z)(smiles(z)) (2.6)

(x,y,z)(woman(x),room(y),entered(x,y),z=x,smiled(z)) (2.7)

[Kamp, 1981] and [Kamp and Reyle, 1993] give a full set of rules for creating drss in formal detail, covering constraints on anaphoric linking, nested drs structures, and special case drs linking such as implication and disjunction.
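To make the update operation concrete, the following is a minimal sketch of a drs as a set of referents and conditions, with discourse update modelled as a merge. It is written in Java, the language used for ASKNet's network implementation (see Section 3.2.2); the class and method names are illustrative and do not come from any actual system.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a drs as discourse referents plus conditions.
class Drs {
    final List<String> referents = new ArrayList<>();
    final List<String> conditions = new ArrayList<>();

    Drs(List<String> refs, List<String> conds) {
        referents.addAll(refs);
        conditions.addAll(conds);
    }

    // Updating a discourse merges referents and conditions; resolving an
    // anaphor adds an equality condition such as "z = x", as in (2.7).
    Drs update(Drs sentence, String anaphoricLink) {
        Drs merged = new Drs(referents, conditions);
        merged.referents.addAll(sentence.referents);
        merged.conditions.addAll(sentence.conditions);
        if (anaphoricLink != null) {
            merged.conditions.add(anaphoricLink);
        }
        return merged;
    }
}

Under this sketch, updating the drs of (2.5) with that of (2.6) while adding the link z = x yields exactly the referents and conditions of (2.7).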

2.2.2 Boxer

Discourse representation theory is particularly useful for the construction of ASKNet because it builds on similar principles of interpreting new information within the context of the current knowledge base (see Section 4.2.1). In order to leverage the power of drt, ASKNet uses a semantic analysis tool called Boxer [Bos, 2005].

Boxer is a Prolog program which uses the output of the C&C parser (see Section 2.1.2) to construct semantic derivations based on drs structures [Bos et al., 2004]. Using Boxer in conjunction with the C&C parser allows a seamless transition from natural language text into a drs, complete with nested structure, entity recognition, and some limited anaphoric pronoun resolution. Some example output of the program can be seen in Figures 2.2 and 2.3.


%%% Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 . Mr. Vinken is chairman of Elsevier N.V. , the Dutch publishing group .

sem(1,

[word(1001,'Pierre'), word(1002,'Vinken'), word(1003, (',')), word(1004,'61'), word(1005, years), word(1006, old), word(1007, (',')), word(1008, will), word(1009,join), word(1010, the), word(1011, board), word(1012, as), word(1013, a), word(1014,nonexecutive), word(1015, director), word(1016,'Nov.'), word(1017,'29'), word(1018,'.'), word(2001,'Mr.'), word(2002,'Vinken'), word(2003, is), word(2004,chairman), word(2005, of), word(2006,'Elsevier'), word(2007,'N.V.'), word(2008,(',')), word(2009, the), word(2010,'Dutch'), word(2011, publishing), word(2012,group), word(2013,'.')

],

[pos(1001,'NNP'), pos(1002,'NNP'), pos(1003,(',')), pos(1004,'CD'), pos(1005,'NNS'), pos(1006,'JJ'), pos(1007,(',')), pos(1008,'MD'), pos(1009,'VB'), pos(1010,'DT'), pos(1011,'NN'), pos(1012,'IN'), pos(1013,'DT'), pos(1014,'JJ'), pos(1015,'NN'), pos(1016,'NNP'), pos(1017,'CD'), pos(1018,'.'), pos(2001,'NNP'), pos(2002,'NNP'), pos(2003,'VBZ'), pos(2004,'NN'), pos(2005,'IN'), pos(2006,'NNP'), pos(2007,'NNP'), pos(2008, (',')), pos(2009,'DT'), pos(2010,'JJ'), pos(2011,'NN'), pos(2012,'NN'), pos(2013,'.')

],

[ne(2002,'I-PER'), ne(2007,'I-LOC'), ne(1001,'I-PER'), ne(1002,'I-PER'), ne(1016,'I-DAT'), ne(1017,'I-DAT')

],

smerge(drs( [[1001, 1002]:x0, [1004, 1005]:x1, [1006]:x2, [1010]:x3,

[1009]:x4, [1013]:x5, [1016]:x6], [[2001]:named(x0, mr, ttl), [1002, 2002]:named(x0, vinken, per), [1001]:named(x0, pierre, per), [1004]:card(x1, 61, ge), [1005]:pred(year, [x1]), [1006]:prop(x2,

drs( [], [[1006]:pred(old, [x0])])),

[1006]:pred(rel, [x2, x1]), []:pred(event, [x2]), [1011]:pred(board, [x3]), [1016, 1017]:timex(x6, date([]:'XXXX', [1016]:'11', [1017]:'29')), [1009]:pred(join, [x4]), [1009]:pred(agent, [x4, x0]), [1009]:pred(patient, [x4, x3]), [1014]:pred(nonexecutive, [x5]), [1015]:pred(director, [x5]), [1012]:pred(as, [x4, x5]), [1016]:pred(rel, [x4, x6]), []:pred(event, [x4])]), drs( [[2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012]:x7,

[2006, 2007]:x8, [2009]:x9, [2011]:x10, [2003]:x11], [[2004]:pred(chairman, [x7]), [2006, 2007]:named(x8, elsevier_nv, loc), [2005]:pred(of, [x7, x8]), [2010]:pred(dutch, [x9]), [2011]:pred(publishing, [x10]), []:pred(nn, [x10, x9]), [2012]:pred(group, [x9]), [2005]:pred(of, [x7, x9]), [2003]:prop(x11,

drs( [], [[2003]:eq(x0, x7)])),

[]:pred(event, [x11])]))

).

Figure 2.2: Example Boxer output in Prolog format for the sentences Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.


 ______________________
| x0 x1 x2 x3 x4 x5 x6 |
|----------------------|
| named(x0,mr)         |
| named(x0,vinken)     |
| named(x0,pierre)     |
| |x1| > 61            |
| year(x1)             |
|      _________       |
|     |         |      |
| x2: | old(x0) |      |
|     |_________|      |
| rel(x2,x1)           |
| event(x2)            |
| board(x3)            |
| timex(x6)=XXXX-11-29 |
| join(x4)             |
| agent(x4,x0)         |
| patient(x4,x3)       |
| nonexecutive(x5)     |
| director(x5)         |
| as(x4,x5)            |
| rel(x4,x6)           |
| event(x4)            |
|______________________|
           +
 _______________________
| x7 x8 x9 x10 x11      |
|-----------------------|
| chairman(x7)          |
| named(x8,elsevier_nv) |
| of(x7,x8)             |
| dutch(x9)             |
| publishing(x10)       |
| nn(x10,x9)            |
| group(x9)             |
| of(x7,x9)             |
|       _________       |
|      |         |      |
| x11: | x0 = x7 |      |
|      |_________|      |
| event(x11)            |
|_______________________|

Figure 2.3: Example Boxer output in “Pretty Print” format for the sentences Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.


Representing the sentence as a drs is ideal for ASKNet for several reasons. The drs structure very closely mirrors the semantic network structure used in ASKNet, with discourse referents being roughly equivalent to object nodes, and the semantic relations being analogous to either node labels or relations (see Section 3.2.2).


Chapter 3

Semantic Networks

A semantic network can loosely be defined as any graphical representation of knowledge using nodes to represent semantic objects and arcs to represent relationships between objects. Semantic networks have been used since at least the 3rd century AD in philosophy, computer implementations have been in use for over 45 years [Masterman, 1962], and a wide variety of formalisms have used the name semantic network [Sowa, 1992].

3.1 A Semantic Network Definition

For the purposes of this thesis, we will posit certain requirements for what we will consider an acceptable semantic network. Primarily, we will require that the relations in the network be labelled and directed. This is to distinguish semantic networks from what we will call associative networks, which connect concepts based simply on the existence of a relationship, without regard to the relationship's nature or direction (see Figure 3.1). Associative networks, often referred to as “pathfinder networks”, are technically a type of semantic network, and are quite often used because they can easily be extracted from co-occurrence statistics [Church and Hanks, 1990] and have proven useful for many tasks [Schvaneveldt, 1990]; however, for our purposes, their lack of power and expressiveness will discount them from consideration.


Figure 3.1: An example of an associative network. Objects and concepts are linked without distinction for type or direction of link.

The second requirement we shall impose upon semantic networks is that they be structurally unambiguous. A given network structure should have only one semantic meaning. Thus, even though the semantically different ideas of John using a telescope to see a man and John seeing a man carrying a telescope can be encoded in the same English sentence John saw the man with the telescope, when that sentence is translated into a semantic network, the structure of the network must uniquely identify one of the two interpretations (see Figure 3.2).

Semantic networks may still contain lexical ambiguity through having ambiguous words used as labels on nodes and arcs. For example, in Figure 3.3, it is impossible to tell whether bank refers to a financial institution or the edge of a river. It is theoretically possible to remove lexical ambiguity from a semantic network by forcing each node to be assigned to a particular sense of the word(s) in its label; however, word sense disambiguation is a very difficult task and there is no complete solution currently available.



(a) John used the telescope to see the man

(b) John saw the man carrying the telescope

Figure 3.2: Semantic network representations of the two parses of “John saw the man with the telescope”


Figure 3.3: A lexically ambiguous network

The third and final requirement we will make for semantic networks is that they must be able to accommodate the complex structures regularly found in natural language text. In particular, we will require that the network allow relations between complex concepts, which may themselves contain many concepts and relations. This is to distinguish proper semantic networks from what we will call atomic networks, which only allow simple nodes representing a single concept. These networks can only accommodate a limited type of information, and thus we will not include them in our definition of semantic networks.

This notion of a semantic network is not definitive, nor is it complete. We have said nothing of the network's ability to deal with temporal, probabilistic or false information. A definitive definition of semantic networks (if indeed such a definition is possible) is beyond the scope of this thesis. We have merely defined the minimum requirements necessary for a network to be acceptable for our purposes.

3.2 The ASKNet Semantic Network

The semantic network formalism developed for ASKNet meets all of the criteria we have set out for consideration as a “proper” semantic network, and also has a few extra features that make it particularly well suited to the ASKNet project. We will first explain how the criteria are met, and then briefly describe the extra features that have been added.

ASKNet trivially meets the first criterion by its design. All relations in ASKNet are labelled and directed, with a defined agent and target node, of which at least one must be present before a relation can be added to the network.

The second criterion is taken care of by the parser. One of the primary functions of a parser is to select one parse from the list of possible parses for a natural language sentence. Since no information is discarded in the translation from the parser output to the network creation, we maintain a single parse and are thus left without any of the original structural ambiguity.

The third criterion is met by the hierarchical structure of the network. This allows complex concepts, and even entire discourses, to be treated as single objects. As we see in Figure 3.4, complex objects can be built up from smaller objects and their relations. The hierarchical structure is unrestrictive, and thus it is possible for any pair of nodes to have a relation connecting them, or for a single node to be a constituent of multiple complex nodes.

In Figure 3.4 we can also see the attribute nodes (denoted by ellipses).


Figure 3.4: A Hierarchical Semantic Network

Any object or relation can have multiple attribute nodes, and attribute nodes can also be complex nodes.

One additional, but very important, feature of ASKNet's semantic network is that every link in the network is assigned a value between 0 and 1. This value represents the confidence or the salience of the link, and can be determined by various means, such as the confidence we have in the source of our information, or the number of different sources which have repeated a particular relation. In practice, the value (or weight) of a link is set by the update algorithm (see Section 4.2.1). Weights can also be assigned to attribute links.

3.2.1 Temporality and Other Issues

The network formalism presented here is robust and flexible; however, it does not deal with all types of information. For example, there is nothing in the network to deal with the temporal nature of information: there is no way for ASKNet to know whether a particular piece of information is meant to be true for a certain time period, or indefinitely true.

For our current purposes, we will not attempt to expand the network formalism to deal with issues such as temporality. We recognise that there will be some limitations to the types of information that can be represented by ASKNet; however, the current definition will be sufficient for the functionality required in this thesis.

3.2.2 Network Implementation

ASKNet's internal semantic network is implemented in Java. It is designed to allow maximum flexibility in node type and hierarchy. A UML class diagram1 for the network is given in Figure 3.5.

1 This is a simplified class diagram and contains only a portion of the total class information. Much of the detail has been omitted to increase the saliency of more important features.

In this section we will explore the details of each of the classes of the network, explaining the functionality and design decisions of each class, and how it interacts with the other parts of the network.

SemNode

SemNodes come in four distinct types (ObjNode, RelNode, AttNode and ParentNode). Each node type has a distinct activation threshold, but all of the node types are implemented almost identically. The primary difference between the node types is the way they are treated by the SemNet class.

Figure 3.5: UML Class diagram of ASKNet's semantic network architecture

• ObjNodes represent atomic semantic objects. They have a special field called neType, which is set if the named entity recogniser (see Section 2.1.2) provided it with a label. For example, if a node represents a person, its neType field would be set to “per”.

• RelNodes represent relations between objects or concepts. While some semantic networks simply label their links with the names of relations, ASKNet uses fully implemented nodes for this purpose, primarily so that relations themselves can have attributes and adjustable firing potentials, and also so that a single relation can have more than one label. All RelNodes have an agent and a target link, which provide the direction of the relation; at least one of these links must be instantiated for a RelNode to be created. This ensures that all relations must have a direction.

• AttNodes represent attributes of an object or concept. They are essentially simplifications of the “attribute of” relationship. Creating a distinct node type for attributes reduces unnecessary relations in the network, improving performance and making the networks more intuitive.

• ParentNodes represent complex semantic concepts made up of two or more nodes. All of the members of the complex concept are labelled as the concept's children, and each node has a link to any ParentNodes of which it is a member. ParentNodes are often vacuous parents, which means that they are unlabeled and provide no additional information beyond the grouping of their constituent nodes. ParentNodes also have an allFire mode, wherein all of their children must have fired before they are allowed to fire. This is to prevent one constituent of a complex concept causing the entire concept to fire.

All nodes have a unique id assigned to them at the time of their creation, which indicates the document from which they originated. A set of labels allows many labels to be added to a single node, which is necessary as the same concept is often referred to in a variety of ways. Each node has arrays of links to the nodes with which it is connected; these are stored based on the type of node linked. All nodes contain a link to the monitor processes for their network, so that they can easily report their status and events such as firing, label changes or deletion requests. Finally, each node contains a link to the SemFire object for the network, which processes firing requests and controls firing sequences for the nodes.

SemNodes can receive a pulse of activation from another node or from the network; this increases the potential variable. If this causes the potential to exceed the firingLevel, then the node sends a request to SemFire to be fired (see Section 4.2.3 for more details). Nodes can also be merged; merging copies all of one node's links and labels to another, and then deletes the first node. Deleting a node sends messages to all connected nodes to delete the appropriate links from both ends, so that no “dead” links can exist in the network.
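The firing behaviour described above can be summarised in the following minimal sketch (an illustrative simplification, not the actual ASKNet source; the SemFire interface stands in for the class described later in this section):

import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of a node's activation behaviour.
class SemNode {
    final String id;                            // unique id, encodes source document
    final Set<String> labels = new HashSet<>(); // a node may carry many labels
    double potential = 0.0;                     // current activation level
    double firingLevel = 1.0;                   // threshold; differs by node type
    final SemFire semFire;                      // shared firing controller

    SemNode(String id, SemFire semFire) {
        this.id = id;
        this.semFire = semFire;
    }

    // Receive a pulse of activation; request firing once the threshold is passed.
    void pulse(double activation) {
        potential += activation;
        if (potential > firingLevel) {
            semFire.requestFire(this);
        }
    }
}

// Firing requests are queued and ordered by the firing algorithm (Section 4.2.3).
interface SemFire {
    void requestFire(SemNode node);
}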

SemLink

SemLinks form the links between nodes. Each link is assigned a strength when it is created, which can either represent the certainty of the link (i.e., how confident the system is that this link exists in the real world) or its salience (i.e., how often this link has been repeated in the input in comparison with other links of a similar type). This can be increased or decreased by the network as more information is gained.


SemNet

The SemNet class is the interface into the semantic network. All of the functionality of the network is available through SemNet's methods. SemNet is used to add, remove, retrieve and manipulate nodes. It also indexes the nodes and contains the update algorithm (see Section 4.2.1).

SemNet must be able to retrieve nodes based on both their unique ID and their label. Since the same label may be used in many nodes, this is achieved with a pair of hashtables. The first hashtable maps a string into a list of the IDs of all nodes which have that string as one of their labels. The second hashtable maps an ID to its corresponding node. The combination of these two hashtables allows SemNet to efficiently retrieve nodes based on either their label or their ID.
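This double-hashtable lookup might be sketched as follows (illustrative only; it reuses the SemNode sketch given earlier, and the class name NodeIndex is invented for the example):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the two-hashtable index described above.
class NodeIndex {
    // label -> IDs of all nodes carrying that label
    private final Map<String, List<String>> idsByLabel = new HashMap<>();
    // unique ID -> node
    private final Map<String, SemNode> nodesById = new HashMap<>();

    void add(SemNode node) {
        nodesById.put(node.id, node);
        for (String label : node.labels) {
            idsByLabel.computeIfAbsent(label, l -> new ArrayList<>()).add(node.id);
        }
    }

    SemNode byId(String id) {
        return nodesById.get(id);
    }

    // Chaining the two tables retrieves all nodes sharing a given label.
    List<SemNode> byLabel(String label) {
        List<SemNode> result = new ArrayList<>();
        for (String id : idsByLabel.getOrDefault(label, new ArrayList<>())) {
            result.add(nodesById.get(id));
        }
        return result;
    }
}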

SemNet's print() method prints the contents of the network in GraphViz [Gansner and North, 2000] format, so that the graph can be displayed visually for manual debugging and evaluation. This is done by calling the print() method of every node in the network. Each node then prints out its own details and the details of all of its links in a format which can be turned into a graphical representation by GraphViz. The majority of the diagrams in this thesis were created in this manner. For examples, see Figures 3.7 and 3.10.

The print() method is very rarely called on an entire network, for the simple reason that the resultant graphical representation would be far too large. For this reason, SemNet has a printFired() method which only prints nodes which have fired since the last time the network was reset.

SemMonitor

SemMonitor receives status reports from every node in the network; this can be used for debugging purposes, but it is also used to track which nodes fired in a given sequence of activation. All nodes have a link to the SemMonitor object for the network, and are required to notify SemMonitor every time they fire.

SemFire

SemFire is structured similarly to SemMonitor, in that every node in the network contains a link to the single SemFire object. When a node wishes to fire, it notifies SemFire. SemFire keeps a list of all requests, and permits nodes to fire in an order specified by the firing algorithm (see Section 4.2.3).
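A minimal sketch of this request queue (illustrative; the real firing order is determined by the algorithms of Section 4.2.3, for which a simple first-in-first-out queue stands in here):

import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of the firing controller described above.
class SimpleSemFire implements SemFire {
    private final Queue<SemNode> requests = new ArrayDeque<>();

    @Override
    public void requestFire(SemNode node) {
        requests.add(node); // record the request; the node is not fired immediately
    }

    // Fire queued nodes in an order chosen by the firing algorithm
    // (here simply FIFO as a placeholder).
    void processRequests() {
        while (!requests.isEmpty()) {
            SemNode node = requests.poll();
            // firing would propagate activation along this node's links
        }
    }
}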


3.3 Parser Filtering

The output of ASKNet's parsing and analysis tools must be manipulated in such a way as to turn each sentence's representation into the form of a semantic network update, which can be used by the update algorithm (see Section 4.2.1). The module which performs this data manipulation is called the parser filter. The parser filter is designed in a modular fashion, so that when one of the parsing and analysis tools changes, the parser filter can be easily replaced or altered without affecting the rest of ASKNet. Two parser filters have been developed for ASKNet: the first to filter the output of the C&C parser, and the second to filter the output of Boxer.

3.3.1 C&C Parser Filter

The first filter developed for the system was designed to work with the C&C parser’s

grammatical relations output2. The filter is essentially a set of rules mapping relations

output by the parser to network features. For example, as can be seen in Table 3.1, upon encountering the output vmod(word1, word2), the filter turns the node for word2 into an attribute for the relational node word1 (if either of the nodes does not exist, it is created; if the node for word1 is not already a relNode, it is turned into one).

Some of the rules are necessarily complex, in order to ensure that links, especially those between parental nodes, are preserved while the various rules are applied. There are also a few “ad hoc” rules created to deal properly with phenomena such as conjunctions and disjunctions. The order in which the rules are applied also greatly affects the performance of this filter.

2 The parser filter as described here is compatible with an older beta version of the grammatical relations output from the parser. This explains why the grammatical relations in this example are different to those in [Clark and Curran, 2007].


Parser Output           Rule
comp(word1, word2)      Merge Node1 and Node2
vmod(word1, word2)      Node2 becomes attNode
                        Node1 becomes relNode
                        Node2 becomes attribute of Node1
                        Parents of Node2 become parents of Node1
ncsubj(word1, word2)    Node1 becomes relNode
                        Subject link of Node1 set to Node2
dobj(word1, word2)      Node1 becomes relNode
                        Object link of Node1 points to Node2

Table 3.1: A sample of the rules used by the C&C parser filter
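As an illustration of how one such rule might be applied in code, the following Java sketch implements the vmod row of Table 3.1 against a hypothetical network-manipulation interface (all names are assumptions, not ASKNet's real API):

// Illustrative stand-ins for the network operations described in the text.
interface SemNetApi {
    void getOrCreateNode(String word);        // create the node if it does not exist
    void ensureRelNode(String word);          // convert to a relational node if needed
    void makeAttNode(String word);            // convert to an attribute node
    void addAttribute(String relWord, String attWord);
    void transferParents(String from, String to);
}

class VmodRule {
    // vmod(word1, word2): word2 becomes an attribute of the relational node word1
    static void apply(SemNetApi net, String word1, String word2) {
        net.getOrCreateNode(word1);
        net.getOrCreateNode(word2);
        net.ensureRelNode(word1);              // word1 must be a relNode
        net.makeAttNode(word2);                // word2 becomes an attNode
        net.addAttribute(word1, word2);        // word2 is an attribute of word1
        net.transferParents(word2, word1);     // parents of word2 become parents of word1
    }
}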

The C&C Parser Filter is no longer used in ASKNet, but it is a good example of the

type of filter that would need to be created if we chose to change the parsing and data

analysis tools. The grammatical relations output by the C&C parser are radically

different to the output of Boxer, which is currently used, but with the creation of a simple filter, such output could be fully integrated into ASKNet with little difficulty.

3.3.2 Boxer Filter

The Boxer filter takes advantage of the recursive nature of Boxer’s prolog output.

The program is written recursively, handling one predicate at a time and continually

calling itself on any sub-predicates.

Like the C&C parser filter, the Boxer filter is essentially a set of rules mapping

predicates to network fragments (See Figure 3.8 for a simple example). However, with

the output of Boxer, the predicates are nested recursively, so the filter must deal with

them recursively. Table 3.2 shows the rules for a number of Boxer’s prolog predicates.

Several of the rules used by the Boxer filter are context sensitive (i.e., if a predicate

tries to label a node which is in one of its parent nodes, it is treated as an attribute

instead). There are also a number of “special case” rules such as those shown where


nmod(Vinken_2, Pierre_1)

nmod(years_5, 61_4)

comp(old_6, years_5)

ncsubj(old_6, Vinken_2)

detmod(board_11, the_10)

dobj(join_9, board_11)

nmod(director_15, nonexecutive_14)

detmod(director_15, a_13)

comp(as_12, director_15)

vmod(join_9, as_12)

comp(Nov._16, 29_17)

vmod(join_9, Nov._16)

xcomp(will_8, join_9)

ncsubj(will_8, Vinken_2)

ncsubj(join_9, Vinken_2)

<c> Pierre|NNP|N/N Vinken|NNP|N ,|,|, 61|CD|N/N years|NNS|N

old|JJ|(S[adj]\NP)\NP ,|,|, will|MD|(S[dcl]\NP)/(S[b]\NP)

join|VB|(S[b]\NP)/NP the|DT|NP[nb]/N board|NN|N

as|IN|((S\NP)\(S\NP))/NP a|DT|NP[nb]/N nonexecutive|JJ|N/N

director|NN|N Nov.|NNP|((S\NP)\(S\NP))/N[num] 29|CD|N[num] .|.|.

nmod(Vinken_2, Mr._1)

nmod(N.V._7, Elsevier_6)

nmod(group_12, publishing_11)

nmod(group_12, Dutch_10)

detmod(group_12, the_9)

conj(,_8, group_12)

conj(,_8, N.V._7)

comp(of_5, group_12)

comp(of_5, N.V._7)

nmod(chairman_4, of_5)

dobj(is_3, chairman_4)

ncsubj(is_3, Vinken_2)

<c> Mr.|NNP|N/N Vinken|NNP|N is|VBZ|(S[dcl]\NP)/NP chairman|NN|N

of|IN|(NP\NP)/NP Elsevier|NNP|N/N N.V.|NNP|N ,|,|, the|DT|NP[nb]/N

Dutch|NNP|N/N publishing|VBG|N/N

group|NN|N .|.|.

Figure 3.6: Sample output from the C&C parser for the text “Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group”.


Figure 3.7: The C&C parser filter output for the input given in Figure 3.6


Figure 3.8: A simple example of the Boxer DRS (left) and resulting ASKNet network fragment (right) for the sentence “John scored a great goal”

the predicate was either ‘agent’ or ‘event’.

The Boxer filter continues calling itself recursively, creating sub-networks within

parent nodes (this results in the hierarchical nature of the network) until it has

processed the entire prolog drs structure and we are left with a semantic network

which represents all of the information in the discourse.
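A minimal Java sketch of this recursive descent is given below; the Term class is a simplified stand-in for Boxer's Prolog terms, the network operations are reduced to stubs, and only a few of the rules from Table 3.2 (below) are shown:

import java.util.Arrays;
import java.util.List;

// A minimal sketch (not ASKNet's real code) of the Boxer filter's recursion:
// one predicate is handled at a time, and the filter calls itself on any
// sub-predicates, building fragments inside the current parent node.
class BoxerFilterSketch {
    // a Prolog term reduced to a functor plus arguments; arguments may be
    // variable/label strings, nested Terms, or lists of either
    static class Term {
        final String functor;
        final List<Object> args;
        Term(String functor, Object... args) {
            this.functor = functor;
            this.args = Arrays.asList(args);
        }
    }

    void filter(Term t, String parent) {
        switch (t.functor) {
            case "drs":    // drs(Referents, Conditions)
                for (Object ref : (List<?>) t.args.get(0))
                    createNode((String) ref, parent);
                for (Object cond : (List<?>) t.args.get(1))
                    filter((Term) cond, parent);        // recurse on each condition
                break;
            case "prop":   // prop(x, B): the fragment from B is nested under x
                filter((Term) t.args.get(1), (String) t.args.get(0));
                break;
            case "named":  // named(x, text, type)
                labelNode((String) t.args.get(0), (String) t.args.get(1));
                break;
            // ... remaining rules from Table 3.2 ...
        }
    }

    void createNode(String id, String parent) { /* add a node under parent */ }
    void labelNode(String id, String label)   { /* attach a label to a node */ }
}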


Prolog Predicate        Rule
drs(A, B)               Create one node for each of the discourse referents in A
                        Recursively call filter on B
prop(x, B)              Recursively call filter on B
                        Set x as the parent node for the network fragment created by B
named(x, text, type)    Set x to named entity type "type"
                        Give node x label "text"
pred(text, x)           Give node x label "text"
pred('event', x)        Set x to type relNode
pred(text, [x,y])       Create relNode z with label "text"
                        Set subject link of z to x
                        Set object link of z to y
pred('agent', [x,y])    Set agent link of y to x
eq(x, y)                Create relNode z with label "is"
                        Set subject link of z to x
                        Set object link of z to y
or(A, B)                Create parentNode x with label "or"
                        Create unlabelled parentNodes y and z
                        Set x as parent of y and z
                        Recursively call filter on A; set y as the parent node
                        for the network fragment created by A
                        Recursively call filter on B; set z as the parent node
                        for the network fragment created by B

Table 3.2: A sample of the rules used by the Boxer filter. Capital letters represent Prolog statements, lower case letters represent Prolog variables


smerge(

drs(

[[1001, 1002]:x0, [1004, 1005]:x1, [1006]:x2, [1010]:x3, [1009]:x4,

[1013]:x5, [1016]:x6],

[ [2001]:named(x0, mr, ttl),

[1002, 2002]:named(x0, vinken, per),

[1001]:named(x0, pierre, per),

[1004]:card(x1, 61, ge),

[1005]:pred(year, [x1]),

[1006]:prop(x2, drs([], [[1006]:pred(old, [x0])])),

[1006]:pred(rel, [x2, x1]),

[]:pred(event, [x2]),

[1011]:pred(board, [x3]),

[1016, 1017]:timex(x6, date([]:’XXXX’, [1016]:’11’, [1017]:’29’)),

[1009]:pred(join, [x4]),

[1009]:pred(agent, [x4, x0]),

[1009]:pred(patient, [x4, x3]),

[1014]:pred(nonexecutive, [x5]),

[1015]:pred(director, [x5]),

[1012]:pred(as, [x4, x5]),

[1016]:pred(rel, [x4, x6]),

[]:pred(event, [x4])]),

drs(

[[2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012]:x7,

[2006, 2007]:x8, [2009]:x9, [2011]:x10, [2003]:x11],

[ [2004]:pred(chairman, [x7]),

[2006, 2007]:named(x8, elsevier_nv, loc),

[2005]:pred(of, [x7, x8]),

[2010]:pred(dutch, [x9]),

[2011]:pred(publishing, [x10]),

[]:pred(nn, [x10, x9]),

[2012]:pred(group, [x9]),

[2005]:pred(of, [x7, x9]),

[2003]:prop(x11, drs([], [[2003]:eq(x0, x7)])),

[]:pred(event, [x11])

]))

).

Figure 3.9: Sample output from Boxer for the text “Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group”.


Figure 3.10: Boxer filter output for the input given in Figure 3.9


Chapter 4

Information Integration

The power of ASKNet comes from its ability to integrate information from various

sources into a single cohesive representation. This is the main goal of the update

algorithm (see Section 4.2.1).

For an example of the type of additional information which can be gained by

information integration, consider the set of network fragments shown in Figure 4.1.

Each fragment is taken from a different source, and without being integrated into a

cohesive resource, the fragments are of little value, particularly as they are likely to be

scattered among many other fragments. However, when we integrate the fragments

into a single network by mapping co-referent nodes together as in Figure 4.2 it becomes

apparent that there is a connection between Chemical F and Disease B that would

not have been apparent from the information given in each fragment separately. In

Figure 4.2 there is a path connecting Chemical F and Disease B even though they

never appeared together within a document. Analysing these paths could lead to the

discovery of novel relationships (This is discussed further in Section 7.1.1).

This simple example shows the potential power that can be gained by integrating


Figure 4.1: A collection of network fragments taken from various sources, including news, biomedical, geographical and political information

networks into cohesive units. We will revisit this example in later chapters when we

explore the uses of the ASKNet system, and at that time we will see how beneficial

these connections can be.

Information Integration and GOFAI

Some might argue that information integration in the manner outlined in this thesis

is ai-complete, requiring “Good Old Fashioned ai” (GOFAI). While that may be the case if one were arguing about a system’s ability to extract all possible information within a corpus, it is certainly possible to integrate a large amount of information without

the need for GOFAI.

With the vast quantities of natural language text readily available through the

internet, a system could integrate only a small percentage of the information it re-

ceived and still produce a resource that is useful to the scientific community. We do

not claim that the methodologies outlined in this thesis will ever be able to extract


Figure 4.2: An integrated network created from the fragments in Figure 4.1

all possible information from a corpus, but we will attempt to show that they can

extract and integrate enough information to create a high quality, large scale useful

resource.

Existing Information Integration Systems

Most of the research on information integration has been done in the database

paradigm, using string similarity measurements to align database fields [Bilenko et al.,

2003]. Research done on natural language information integration has mostly cen-

tered on document clustering based on attributes gained from pattern matching [Wan

et al., 2005]. The majority of automatically created semantic resources, such as those

referenced in Section 1.2.2, have only the simplest forms of information integration.

This limits their ability to create large scale resources from diverse text sources, as it

limits the system’s usefulness when processing data other than dictionary or encyclo-

pedia entries (which are explicit enough to be processed without needing integration).

One particularly interesting line of research is the work of Guha and Garg [Guha

and Garg, 2004]. They propose a search engine which clusters document results


which relate to a particular person. The proposed methodology is to create binary

first-order logic predicates (e.g., first_name(x, Bob), works_for(x, IBM)) which can be treated as attributes of a person, and then to use those attributes to cluster documents

about one particular individual. This amounts to a simplified version of the problem

ASKNet attempts to solve, using a simplified network, and limiting the domain to

personal information; the results, however, are promising.

4.1 Spreading Activation

Spreading activation is a common feature in connectionist models of knowledge and

reasoning, and is usually connected with the neural network paradigm. Spreading

activation in neural networks is the process by which activation can spread from

one node in the network to all adjacent nodes in a similar manner to the firing of a

neurone in the human brain. Nodes in a spreading activation neural network receive

activation from their surrounding nodes, and if the total amount of accumulated

activation exceeds some threshold, that node then fires, sending its activation to all

nodes to which it is connected. The amount of activation sent between any two nodes

is proportional to the strength of the link between those nodes with respect to the

strength of all other links connected to the firing node. The activation function used

in ASKNet is given in (4.1).

Spreading activation algorithms are, by nature, localised algorithms. Due to the

signal attenuation parameter (given as β_{x,y} in Equation 4.1), it is guaranteed that the

signal can only travel a set distance from the originating node. Assuming the network

in which they are implemented is of sufficient size, firing any one node affects only a

small percentage of the nodes (i.e., those strongly linked to the original firing node),

and leaves the remainder of the network unaffected.


\[ \mathrm{activation}_{i,j} \;=\; \frac{\alpha_i \, \mathrm{weight}_{i,j}}{\sum_{k \in \mathrm{link}(i),\; k \neq j} \beta_{i,k} \, \mathrm{weight}_{i,k}} \tag{4.1} \]

Symbol              Definition
\alpha_x            Firing variable which fluctuates depending on node type
activation_{x,y}    Amount of activation sent from node x to node y when node x fires
weight_{x,y}        Strength of the link between node x and node y
\beta_{x,y}         Signal attenuation on link (x,y), 0 < \beta < 1; determines the amount of
                    activation that is lost along each link. Fluctuates depending on link type
link(x)             The set of nodes y such that link(x,y) exists
link(x,y)           The directed link from node x to node y
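A direct transcription of Equation 4.1 into code might look as follows (the node and link types here are illustrative; ASKNet's real classes carry more state than shown):

import java.util.List;

// Sketch of Equation 4.1: when node i fires, each neighbour j receives
// activation proportional to weight_{i,j}, normalised by the attenuated
// weights of i's other links.
class FiringSketch {
    interface NodeI { double alpha(); List<Link> links(); void receive(double activation); }
    interface Link  { double weight(); double beta(); NodeI target(); }

    static void fire(NodeI node) {
        for (Link out : node.links()) {
            double denom = 0.0;
            for (Link other : node.links())
                if (other != out)
                    denom += other.beta() * other.weight();   // sum over k != j
            if (denom == 0.0) continue;   // a single-link node has nothing to normalise by
            double activation = node.alpha() * out.weight() / denom;
            out.target().receive(activation);
        }
    }
}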

4.1.1 History of Spreading Activation

The discovery that human memory is organised semantically and that concepts which

are semantically related can excite one another came from the field of psycholinguistics.

Meyer and Schvaneveldt [Meyer and Schvaneveldt, 1971] showed that when partic-

ipants were asked to classify pairs of words, having a pair of words which were se-

mantically related increased both the speed and the accuracy of the classification.

They hypothesised that when one word is retrieved from memory this causes other

semantically related words to be primed and thus retrieval of those words will be

facilitated.

The formal theory of spreading activation can be traced back to the work of Quil-

lian [Quillian, 1969] who proposed a formal model for spreading activation in a se-

mantic network. This early theory was little more than a marker passing method

where the connection between any two nodes was found by passing markers to all

adjacent nodes until two markers met, similar to a breadth-first search.

It was the work of Collins and Loftus [Collins and Loftus, 1975] that added the main

features of what we today consider spreading activation, such as: signal attenuation,


summation of activation from input nodes and firing thresholds.

Despite the obvious theoretical advantages of Collins and Loftus’ model, due to

computational restraints much of the work which has used the title of “spreading

activation” has very rarely used the full model. Many researchers used a simplified

marker passing model [Hirst, 1987], or used a smaller or simplified network because

the manual creation of semantic networks that fit Collins and Loftus’ model was too

time consuming [Crestani, 1997, Preece, 1981].

The application of spreading activation to information retrieval gained a great deal

of support in the 80s and early 90s [Salton and Buckley, 1988, Kjeldsen and Cohen,

1988, Charniak and Goldman, 1993]. However, the difficulty of manually creating networks, combined with the computational intractability of automatically creating them, caused most researchers to abandon this course [Preece, 1981]. In the past

few years there has been an increase in the number of nlp projects utilising spreading

activation on resources such as WordNet and Wikipedia [Wang et al., 2008, Nastase,

2008].

4.2 Spreading Activation in ASKNet

The semantic network created for ASKNet has been designed specifically for use with

spreading activation. Each node maintains its own activation level and threshold, and

can independently send activation to all surrounding nodes (This can be done with

or without regard to the direction of the links. For the purposes of this thesis, unless

stated otherwise, all activation spread disregards link direction). Monitor processes

control the firing patterns and record the order and frequency of node firing.

Each of the various types of nodes (object, relation, parent, attribute, etc.) can


have its own firing threshold and even its own firing algorithm. Each node type has

a global signal attenuation value that controls the percentage of the activation that a

node of this type passes on to each of its neighbours when firing. This mirrors natural

neural networks, and also ensures that the network will always eventually return to

a stable state, as with each successive firing, some activation is lost and thus firing

cannot continue indefinitely.

Spreading activation is by nature a parallel process, however it is implemented

sequentially in ASKNet for purely computational reasons. While future work may

allow parallelisation of the algorithm, the current system has been designed to ensure

that the sequential nature of the processing does not adversely affect the outcome.

Two separate implementations of the firing algorithm have been created. The first

is a pulsing algorithm where each node which is prepared to fire at any given stage

fires and the activation is suspended until all nodes have finished firing. This is

analogous to having the nodes fire simultaneously on set pulses of time. The second

implementation of the firing algorithm uses a priority queue to allow the nodes with

the greatest amount of activation to fire first (for more detailed information see Section

4.2.3). The second algorithm is more analogous to the asynchronous firing of neurones in the human brain. Both algorithms are fully implemented, and the user can choose which firing method they wish the system to use.

4.2.1 Update Algorithm: Example

The update algorithm takes a smaller network or network fragment (update network)

and integrates it into a larger network (main network). Essentially the same algorithm

is used for updating at the sentence level and at the document level. When updating at

the sentence level, the update network represents the next sentence in the document

and the main network represents all previous sentences in the document. When


updating at the document level, the update network represents the document, and

the main network represents all of the documents that have been processed by the

system.

This section of the thesis attempts to convey an understanding of the update

algorithm by walking through a simple example. This example uses a simplified

network to avoid unnecessary details. The numbers used here are not representative

of the values received in the actual ASKNet system; the changes in values at each

update have been increased so that we can see a change in the network after only one

iteration. In normal performance, changes of this magnitude would require multiple

iterations of the algorithm.

For this example, we consider the update shown in Figure 4.3 where we are at-

tempting to update a network containing information on United States politics, along

with a few “red herrings”, with an update network formed from the sentence “Bush

beat Gore to the White House”. All nodes in this example will be referred to by their

ID field.

Initially, all named entity nodes from the update network are matched with any

similar nodes from the main network1. The nodes are compared on simple similarity

characteristics as computed in Equation 4.2. A similarity score is then calculated for

each node pairing producing the matrix shown in Table 4.1. For the purposes of this

example, we will assume that initially all of the similarity scores for disputed nodes

are equal.
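Equation 4.2 (shown later in this section) combines a check on named entity type with the overlap of the two nodes' label sets, i.e. their Jaccard coefficient. A straightforward Java transcription, with illustrative parameter names, is:

import java.util.HashSet;
import java.util.Set;

// A direct transcription of Equation 4.2; alpha and beta are the weights
// from the symbol table, everything else is an illustrative stand-in.
class SimilaritySketch {
    static double initialScore(boolean sameNeType,
                               Set<String> labelsI, Set<String> labelsJ,
                               double alpha, double beta) {
        Set<String> inter = new HashSet<>(labelsI);
        inter.retainAll(labelsJ);                    // labels_i ∩ labels_j
        Set<String> union = new HashSet<>(labelsI);
        union.addAll(labelsJ);                       // labels_i ∪ labels_j
        double neBool = sameNeType ? 1.0 : 0.0;      // NEBool_{i,j}
        double labelSim = union.isEmpty() ? 0.0
                        : (double) inter.size() / union.size();
        return alpha * neBool + beta * labelSim;
    }
}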

Once the initial scoring is completed, the algorithm chooses a node from the update

network (in this case let us choose bu) and attempts to refine its similarity scores. In

order to do this, it first puts activation into the bu node, then allows the network to

1 This algorithm is not restricted to named entity nodes, and could be computed for any or all node types. However, for clarity of explanation, we will restrict the following example to named entity nodes.


Figure 4.3: An example main network containing information about United States politics, writers and mathematicians being updated by a network fragment formed from the sentence “Bush beat Gore to the White House.”

        georgebush   johnbush   algore   gorevidal   whitehouse
bu      0.5          0.5
go                              0.5      0.5
wh                                                   0.5

Table 4.1: Similarity Matrix: Initial scoring

fire. This results in activation accumulating in the go and wh nodes. The amount of

activation in each node will depend on the structure of the network and the strength

of the various links. For the purposes of this example, it is sufficient to see that the

go and wh nodes will receive some activation.

\[ \mathrm{score}_{i,j} \;=\; \alpha \cdot \mathrm{NEBool}_{i,j} \;+\; \beta \cdot \frac{|\mathrm{labels}_i \cap \mathrm{labels}_j|}{|\mathrm{labels}_i \cup \mathrm{labels}_j|} \tag{4.2} \]

Symbol            Definition
score_{x,y}       The initial similarity score computed for the node pair (x,y)
\alpha            Weighting given to named entity similarity
NEBool_{x,y}      A boolean set to 1 if x and y have the same NE type, otherwise set to 0
\beta             Weighting given to label similarity
labels_x          The set of textual labels of node x

Once the network has settled into a stable state, the activation from all update

network nodes, except the node being refined (in this case bu), is transferred to the

corresponding nodes in the main network as seen in Figure 4.4. The activation for

nodes with more than one potential mapping is split among all potential main network

candidate nodes based on the current similarity matrix score. In this example, since

the similarity matrix scores (go,algore) and (go,gorevidal) are equal, the activation

from go is split evenly.
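A sketch of this proportional split, with the similarity-matrix row reduced to a plain map from candidate node ID to score (illustrative types only, not ASKNet's code):

import java.util.HashMap;
import java.util.Map;

// An update-network node's activation is divided among its candidate
// main-network nodes in proportion to the current similarity scores.
class ActivationSplitSketch {
    static Map<String, Double> split(double activation, Map<String, Double> row) {
        double total = 0.0;
        for (double s : row.values()) total += s;
        Map<String, Double> share = new HashMap<>();
        if (total == 0.0) return share;   // no candidates to receive activation
        for (Map.Entry<String, Double> e : row.entrySet())
            share.put(e.getKey(), activation * e.getValue() / total);
        return share;
    }
}

With the initial scores of Table 4.1, an activation of 1.0 held by go would be divided 0.5/0.5 between algore and gorevidal, exactly the even split described above.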

The main network is then allowed to fire, and the transferred activation spreads

throughout the main network. In our example, some of the activation from the

algore and whitehouse nodes will reach the georgebush node, the activation from

gorevidal node will not reach any named entity nodes, and the johnbush node will

receive no activation at all.

The algorithm can now refine the similarity scores based on the activation received.

Since the georgebush node received some activation, we will increase the similarity

score for (bu,georgebush) slightly. The johnbush node did not receive any activation

at all, and so we will decrease the similarity score for (bu,johnbush). The resulting

similarity matrix is shown in Table 4.2.

The algorithm has now used semantic information to refine the scores for mapping

of the bu node. Since the nodes which bu and georgebush are connected to are


Figure 4.4: The activation from the update network is transferred to the main network. For nodes with more than one potential mapping, the activation is split based on the current similarity matrix score.

        georgebush   johnbush   algore   gorevidal   whitehouse
bu      0.6          0.25
go                              0.5      0.5
wh                                                   0.5

Table 4.2: Similarity Matrix: After refining scores for bu node

similar, we have an increased confidence that they are referring to the same real-

world entity, and since the nodes which bu and johnbush are connected to share no

similarity, we have decreased our confidence in their referring to the same real world

entity.


The algorithm then attempts to refine the scores for another node in the update

network (let us choose go). The process this time is similar, however rather than the

activation from bu being transferred evenly to the potential main network matches

as the activation from go was in the previous iteration, it is instead transferred more

heavily to the georgebush node, since the similarity score for (bu,georgebush) is now

higher than that of (bu,johnbush) As depicted in Figure 4.5. Increasing the activation

in the georgebush node means that the algore node will receive more activation,

and thus we will increase its score more than we would have, had we not refined the

scores for the bu node. Thus, as we refine our similarity matrix becomes more refined

with each iteration. The similarity matrix after the first complete iteration can be

seen in Table 4.3.

        georgebush   johnbush   algore   gorevidal   whitehouse
bu      0.6          0.25
go                              0.65     0.25
wh                                                   0.5

Table 4.3: Similarity Matrix: After refining scores for bu and go nodes

Refining the scores for the bu and go nodes means that when we attempt to refine

the final wh node, less activation is wasted in the activation transfer from update

network to main network (i.e., the georgebush and algore nodes get much more

activation and the johnbush and gorevidal nodes get less). Therefore when the

main network is fired, the whitehouse node receives more activation than it would

have if we had not refined the other scores. Thus we increase its score more, resulting

in the similarity matrix given in Table 4.4.

After one iteration of the update algorithm, the similarity matrix has been im-

proved. As we saw already in the first iteration, the increased confidence in one

mapping leads to large increases in confidence in future mappings (this is in line with

intuition; as we gain increased confidence in one area, we can use that confidence to


Figure 4.5: The activation from the update network is transferred to the main network. The activation from the bu node is split unevenly, with a higher percentage going to georgebush than johnbush due to our updated similarity scores.

        georgebush   johnbush   algore   gorevidal   whitehouse
bu      0.6          0.25
go                              0.65     0.25
wh                                                   0.7

Table 4.4: Similarity Matrix: After one iteration of the update algorithm

make bolder predictions in related areas). Therefore, the update algorithm becomes

a self reinforcing loop, allowing the similarity scores to converge.

The update algorithm is run multiple times on a single update, either for a fixed number of iterations or until all update nodes have a similarity score above a set threshold (called the mapping threshold).


When the algorithm terminates, any pairs with a similarity score above the mapping

threshold are mapped together, and all non-mapped nodes are simply placed into

the main network. Eventually, our example should map wh to whitehouse, bu to

georgebush and go to algore, resulting in the updated network shown in Figure

4.6.

Figure 4.6: Network resulting from application of the update algorithm

This is a simplified example, and many features of the algorithm have been ab-

stracted away. However it gives an overall understanding of how the algorithm works,

and hopefully provides an insight into the intuition behind the update algorithm. Further details are provided in the next section.

4.2.2 Update Algorithm: Implementation

The update algorithm, like the rest of ASKNet, is implemented in Java. The majority

of the algorithm is implemented by the NetMerger class, which takes two networks as

parameters and returns a list of nodes which should be mapped together. The update


algorithm was designed to be executable on very large networks, and thus required

data structures and sub-algorithms that would ensure that the overall algorithm would

be efficient in terms of both time and memory usage.

In this section we will explain the three main classes which perform the update

algorithm, and the data structures which they use to implement the algorithm effi-

ciently.

NetMerger

The NetMerger class implements the actual update algorithm as described in Sec-

tion 4.2.1. The mergeDiscourse method performs the updates at a sentence level,

mapping each sentence’s drs into a single update network. The mergeNetworks

method performs an almost identical algorithm to merge the update networks into

the main network. The two methods are implemented separately so that different

firing parameters can be used for the two types of updates, and also so that in future

additional features such as improved anaphora resolution could be implemented in

the mergeDiscourse method.

The merge methods do not actually perform the mappings, but rather calculate the

similarity scores, and return listings of node pairs which should be mapped together.

The map method then performs the actual mapping, calling appropriate methods

from the SemNet class.
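The division of labour might be sketched as follows; the method names follow the text, but the signatures here are assumptions made for illustration:

import java.util.List;

// Merge methods only decide which nodes should be mapped together;
// map() applies the decision by calling through to SemNet.
interface NetMergerApi {
    // compute similarity scores; return pairs of node IDs to be mapped together
    List<String[]> mergeNetworks(Object mainNet, Object updateNet);
    // perform the actual mappings via the appropriate SemNet methods
    void map(Object mainNet, Object updateNet, List<String[]> pairs);
}

class UpdateFlow {
    static void integrate(NetMergerApi merger, Object mainNet, Object updateNet) {
        List<String[]> pairs = merger.mergeNetworks(mainNet, updateNet); // decide
        merger.map(mainNet, updateNet, pairs);                           // apply
    }
}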

The NetMerger class never directly manipulates network nodes. It works exclu-


sively with node IDs in string form, and interfaces with the SemNet class to perform

all necessary functions. This results in increased modularity of code, and also simpli-

fies the writing and debugging of the algorithms.

ScoreMatrix

With networks potentially reaching millions of nodes, it is obviously inefficient to

calculate and store the similarity scores for all possible node pairs, particularly as the

vast majority of the scores would never be updated. For this reason, the scoreMatrix

class is implemented as a specialised sparse matrix.

The scoresTable hashtable maps the ID of each node in the update network to

a hashset of MapScore objects, which represents all of the similarity scores in its

row. This allows calculation of the relevant elements of the similarity matrix without

making it necessary to create objects for similarity scores which never get updated.

If a score is never updated, it is never created, thus conserving memory. The use

of hashtables and hashsets also allows efficient lookup of individual similarity scores,

thus allowing individual scores to be calculated without the need to search through

entire rows or columns of the matrix. Figure 4.7 shows an example of the data

structures used.


Figure 4.7: An example similarity matrix and the corresponding ScoreMatrix data structure
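A simplified sketch of this lazily-populated structure is shown below. For brevity it keys each row by main-network node ID rather than using a hashset of MapScore objects, which preserves the constant-time lookup described above; all names are illustrative:

import java.util.HashMap;
import java.util.Map;

// Sparse ScoreMatrix sketch: a row exists only for update-network nodes,
// and a MapScore object is created only when a score is first touched.
class SparseScoreMatrix {
    static class MapScore {
        final String updateId, mainId;   // the two networks are deliberately distinguished
        double score;
        MapScore(String u, String m, double s) { updateId = u; mainId = m; score = s; }
    }

    // update-node ID -> that node's row of scores, keyed by main-node ID
    private final Map<String, Map<String, MapScore>> rows = new HashMap<>();

    // create-on-first-use: untouched scores are never materialised
    MapScore get(String updateId, String mainId, double initialScore) {
        return rows.computeIfAbsent(updateId, k -> new HashMap<>())
                   .computeIfAbsent(mainId,
                       k -> new MapScore(updateId, mainId, initialScore));
    }
}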

MapScore

MapScore objects represent a similarity score for a pair of nodes. The update and

main network nodes are differentiated, as several of the functions which use MapScore


are designed around the assumption that the update algorithms will normally be

performed between one very large network and one relatively small network. Thus

differentiating the networks to which the nodes belong is very important.

4.2.3 Firing Algorithms

Ultimately each node in a neural network should act independently, firing whenever

it receives the appropriate amount of activation. This asynchronous communication

between nodes is more directly analogous to the workings of the human brain, and

most spreading activation theories assume a completely asynchronous model.

In practice, it is difficult to have all nodes operating in parallel. ASKNet attempts

to emulate an asynchronous network through its firing algorithm. Each network has

a SemFire object (see Section 3.2.2 for class diagram) which controls the firing of the

nodes in that network.

When a node in the network is prepared to fire, it sends a firing request to

the SemFire object. The SemFire object then holds the request until the appropriate

time before sending a firing permission message to the node allowing it to fire.

Two separate firing algorithms have been implemented in ASKNet.


Pulse Firing

The pulse firing algorithm emulates a network where all nodes fire simultaneously at

a given epoch of time. Each node that is prepared to fire at a given time fires, and the

system waits until all nodes have fired and all activation levels have been calculated

before beginning the next firing round.

To implement this algorithm, the SemFire object retains two lists of requests. The

first is the list of firing requests which will be fulfilled on this pulse; we will call this

list the pulse list. The second list contains all requests made during the current pulse;

we will call this the wait list. The SemFire object fires all of the nodes with requests

in the pulse list, removing a request once it has been fulfilled (in this algorithm the

order of firing is irrelevant), while placing all firing requests it receives into the wait

list. Once the pulse list is empty, and all requests from the current pulse have been

collected in the wait list, the SemFire object simply moves all requests from the wait

list into the pulse list, and is then ready for the next pulse.
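The two-list mechanism might be sketched as follows (firing actions are reduced to Runnables; this is an illustration, not ASKNet's code):

import java.util.ArrayDeque;
import java.util.Queue;

// Requests arriving during the current pulse wait for the next one.
class PulseFiring {
    private Queue<Runnable> pulseList = new ArrayDeque<>(); // fulfilled this pulse
    private Queue<Runnable> waitList  = new ArrayDeque<>(); // collected for the next pulse

    // nodes call this when they wish to fire
    void request(Runnable fireAction) { waitList.add(fireAction); }

    void pulse() {
        while (!pulseList.isEmpty())
            pulseList.poll().run();   // firing may enqueue new requests on waitList
        // all requests made during this pulse become the next pulse's work
        Queue<Runnable> tmp = pulseList;
        pulseList = waitList;
        waitList = tmp;
    }
}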

Priority Firing

The priority firing algorithm emulates a network where the amount of activation

received by a node dictates the speed with which the node fires. Nodes receiving

higher amounts of activation will fire faster than nodes which receive just enough to

meet their firing threshold.

To implement this algorithm, the SemFire object retains a priority queue of re-

quests, where each request is assigned a priority based on the amount of activation

it received over its activation threshold (Equation 4.3). The SemFire object fulfils the highest

priority request; if a new request is received while the first request is being processed,

it is added to the queue immediately.


\[ \mathrm{priority}_i \;=\; \alpha_i \,(\mathrm{act}_i - \mathrm{level}_i) \tag{4.3} \]

Symbol        Definition
priority_x    The priority of node x
\alpha_x      Type priority variable, dependent on the node type of x (can be set
              to give a higher priority to a particular node type)
act_x         Activation level of node x
level_x       Firing level of node x
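A minimal sketch of this priority scheme, computing priorities as in Equation 4.3 (types and names are illustrative, not ASKNet's own):

import java.util.PriorityQueue;

// Requests with more surplus activation fire first.
class PriorityFiring {
    static class Request implements Comparable<Request> {
        final Runnable fireAction;
        final double priority;   // priority_i = alpha_i * (act_i - level_i)
        Request(Runnable fireAction, double alpha, double act, double level) {
            this.fireAction = fireAction;
            this.priority = alpha * (act - level);
        }
        public int compareTo(Request o) {
            return Double.compare(o.priority, this.priority); // highest priority first
        }
    }

    private final PriorityQueue<Request> queue = new PriorityQueue<>();

    // new requests may arrive while earlier ones are being processed
    void request(Request r) { queue.add(r); }

    void run() {
        while (!queue.isEmpty())
            queue.poll().fireAction.run();
    }
}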

The two firing algorithms would be equivalent if all of the activation in the network

spread equally. However when a node fires, it sends out a set amount of activation,

and excess activation received above the firing threshold disappears from the network.

The effect of this disappearing activation is that the order in which nodes fire can

change the final pattern of activation in the network. It is therefore important that

both firing algorithms be implemented and tested so that a choice can be made based

on their contribution to the performance of the system.

In practice both firing algorithms obtain similar results with the minor differences

being cancelled out when processing large data sets. For the experimental data in

this thesis we have chosen to use the pulse firing algorithm as it allows for easier

debugging since we can pause the firing at any step to get a “freeze-frame” of the

system mid-fire.


Chapter 5

Evaluation

Evaluation of large scale semantic network creation systems is a difficult task. The

scale and complexity of the networks makes traditional evaluation metrics impractical,

and since ASKNet is the first system of its kind, there are no existing systems against

which we can directly compare. In this chapter we discuss the metrics we have

developed in order to evaluate ASKNet, and describe the implementation and results

of those metrics.

One of the most important aspects of ASKNet is its ability to create large scale

semantic networks efficiently. In particular, this means that as the size of a network

grows, the time taken to add a new node should not increase exponentially. In

order to evaluate the efficiency of ASKNet, we first show evidence of its ability to

efficiently create semantic networks on a scale comparable with the largest available

resources. We then establish an upper bound on network creation time, showing

that as the network grows, the time required to add new nodes increases linearly.

The establishment of this upper bound is very important, as although it is necessary

to show that ASKNet can efficiently create networks with a size comparable to any

existing resource, the upper bound shows that the existing algorithms can scale up


to networks many orders of magnitude larger, on a scale which simply cannot be

achieved by manual creation.

Creating large networks efficiently is trivial if there is no regard to the quality of

the networks produced (for example, one could simply create nodes and links at

random). It is therefore important that we establish the quality of the networks pro-

duced by ASKNet. This is not a trivial task, as the networks are generally too large

to evaluate manually, and there exists no gold standard against which we can com-

pare. We attempt to solve these problems by evaluating the precision of a “network

core”, a subset of the network containing the most important information from the

input documents. Using humans to evaluate these cores, we were able to establish a

precision score of 79.1%. This is a very promising result for such a difficult task.

While human evaluation of the network cores provides a good direct measure of

the quality of a portion of the network, it is also important to evaluate the produced

networks as a whole. Therefore we implement a task based evaluation, using ASKNet

to perform a real world task and comparing its results against those of state of the

art systems using other methodologies. In this instance, we have chosen automatic

judgement of the semantic relatedness of words, a task for which we believe ASKNet

to be well suited. We compare ASKNet’s scores against human judgements, and find

that it correlates at least as well as the scores of top performing systems utilising

WordNet, pointwise mutual information or vector based approaches. This evaluation

will be described in Chapter 6.

5.1 Network Creation Speed

One of the major goals of the ASKNet system is to efficiently develop semantic re-

sources on a scale never before available. To this end, it is not only important that


we are able to build networks quickly, it is also imperative that the time required

to build networks does not increase exponentially with respect to the network size.

This is one of the advantages of using spreading activation algorithms; since the area

of the network affected by any one firing is not dependent on the size of the overall

network, the time taken by the algorithms should not increase exponentially with the

network size.

In order to evaluate the network creation speed, we chose articles of newspaper

text from the 1998 New York Times as taken from the AQUAINT Corpus of English

News Text1, which mentioned then-United States President Bill Clinton. By choosing

articles mentioning a single individual we hoped to create an overlap in named entity

space without limiting the amount of data available. This ensured that the update

algorithm (see Section 4.2.1) was run more frequently than would be expected in

unrestricted text, thus giving us a good upper bound on the performance of ASKNet.

In order to further ensure that this experiment gave a true representation of the

network creation speed, all spreading activation based parameters were set to allow

for maximum firing strength and spread. This meant that as many nodes as possible

were involved in the update algorithm, thus increasing the algorithmic complexity as

much as possible.

After processing approximately 2 million sentences, ASKNet was able to build a

network of over 1.5 million nodes and 3.5 million links in less than 3 days. This time

also takes into account the parsing and semantic analysis (See Table 5.1). This is a

vast improvement over manually created networks which take years or even decades

to achieve networks of less than half this size [Matuszek et al., 2006].

During the creation of the network, the time taken to add a new node to the

network was monitored and recorded against the number of nodes in the network.

1 Made available by the Linguistic Data Consortium (LDC).


Total Number of Nodes                               1,500,413
Total Number of Edges                               3,781,088
Time: Parsing                                       31 hrs : 30 min
Time: Semantic Analysis                             16 hrs : 54 min
Time: Building Network & Information Integration    22 hrs : 24 min
Time: Total                                         70 hrs : 48 min

Table 5.1: Statistics pertaining to the creation of a large scale semantic network

This allowed us to chart the rate at which network creation speed slowed in relation

to network size. The results are shown in Figure 5.1.

Figure 5.1: Average time to add a new node to the network vs. total number of nodes

As the network size began to grow, the average time required to add a new node

began to climb exponentially. This is to be expected in a small network as the update

algorithm’s spreading activation would reach most or all nodes in the network, and

so each additional node would increase the time required for each run of the update

algorithm.

Because the spreading activation algorithms are localised (see Section 4.1), once


the network becomes so large that the activation does not spread to the majority

of nodes, the addition of a new node is unlikely to have any effect on the spreading

activation algorithm. As we see in Figure 5.1, when the network size hits a critical

point (in this case approximately 850,000 nodes) the average time required to add

a new node begins to grow linearly with respect to network size. This shows that

average node insertion time eventually grows linearly with the size of the network,

which implies that the total time to create a network (assuming the network is large enough) grows at worst quadratically, rather than exponentially, with respect to the network's size.

This result was obtained by choosing input sentences on similar topics, with high

named entity overlap, and the maximum possible activation spread. In practice the

exponential growth of node creation time would terminate at a much smaller network

size. Eventually average node insertion time becomes effectively constant as adding

new nodes becomes less likely to affect firing algorithms for other parts of the network.

This evaluation has empirically established that (for networks over a given size) the

time required to add a new node grows linearly with respect to network size. This

is important in establishing ASKNet’s ability to build very large scale networks. It is

promising that we were able to build a network twice as large as anything previously

in existence in only a matter of days; however it is even more promising that the

growth rate has been shown to be sub-exponential. This means that there is very

little limitation on the size of network that could potentially be created with ASKNet.

5.2 Manual Evaluation

Evaluating large-scale semantic networks is a difficult task. Traditional nlp evalua-

tion metrics such as precision and recall do not apply so readily to semantic networks;

the networks are too large to be directly evaluated by humans; and even the notion


of what a “correct” network should look like is difficult to define.

nlp evaluation metrics also typically assume a uniform importance of information.

However, when considering semantic networks, there is often a distinction between

relevant and irrelevant information. For example, a network containing information

about the Second World War could contain the fact that September 3rd 1939 was

the day that the Allies declared war on Germany, and also the fact that it was a

Sunday. Clearly for many applications the former fact is much more relevant than

the latter. In order to achieve a meaningful precision metric for a semantic network,

it is important to focus the evaluation on high-relevance portions of the network.

There is no gold-standard resource against which these networks can be evaluated,

and given their size and complexity it is highly unlikely that any such resource will

be built. Therefore evaluation can either be performed by direct human evaluation or

indirect, application based evaluation. For this chapter we have chosen direct, human

evaluation.

The size of the networks created by ASKNet makes human evaluation of the entire

network impossible. It is therefore necessary to define a subset of the network on which

to focus evaluation efforts. In early experiments, we found that human evaluators had

difficulty in accurately evaluating networks with more than 20 - 30 object nodes and

30 - 40 relations. Rather than simply evaluating a random subset of the network,

which may be of low-relevance, we evaluated a network core, which we define as a

set of high-relevance nodes, and the network paths which connect them. This allows

us to maintain a reasonable sized network for evaluation, while still ensuring that we

are focusing our efforts on the high-relevance portions of the network. These are also

likely to be the portions of the network which have undergone the most iterations of

the update algorithm. Therefore the evaluation will be more likely to give an accurate

representation of ASKNet’s overall capability, rather than being dominated by the


quality of the nlp tools used.

We evaluated networks based on documents from the 2006 Document Understand-

ing Conference (duc). These documents are taken from multiple newspaper sources

and grouped by topic. This allows us to evaluate ASKNet on a variety of inputs

covering a range of topics, while ensuring that the update algorithm is tested by the

repetition of entities across documents. In total we used 125 documents covering 5

topics, where topics were randomly chosen from the 50 topics covered in duc 2006.

The topics chosen were: Israeli West Bank Settlements, Computer Viruses, NASA’s

Galileo Mission, the 2001 Election of Vladimir Putin and the Elian Gonzalez Custody

Battle.

5.2.1 Building the Network Core

Our task in building the core is to reduce the size of the evaluation network while

maintaining the most relevant information for this particular type of network (news-

paper text). We begin to build the core by adding all named entity nodes which are

mentioned in more than 10% of the documents (a value picked for pragmatic purposes

of obtaining a core with an appropriate size). In evaluating the duc data, we find

that over 50% of the named entity nodes are only mentioned in a single document

(and thus are very unlikely to be central to the understanding of the topic). Applying

this restriction reduces the number of named entities to an average of 12 per topic

network while still ensuring that the most important entities remain in the core.

For each of the named entity nodes in the core, we perform a variation of Dijkstra’s

algorithm [Dijkstra, 1959] to find the strongest path to every other named entity node

in the core. Rather than using the link weights to determine the shortest path, as

in the normal Dijkstra’s algorithm, we use the spreading activation algorithm to


determine the path along which the greatest amount of activation will travel between

the two nodes, which we call the primary path. Adding all of these paths to the

core results in a representation containing the most important named entities in the

network, and the primary path between each pair of nodes (if such a path exists).

Pseudocode for this algorithm is given in Figure 5.2.

Algorithm 1: CreateCore
Data: ASKNet network A = (N, V), where N = nodes, V = links
Result: Core: the network core of A

begin
    NENodes   <- {n in N | n is a named entity node}
    CoreNodes <- {n in N | n appeared in >10% of documents}
    CoreNEs   <- NENodes ∩ CoreNodes
    PathNodes <- {}
    PathLinks <- {}
    Core      <- {}
    for x in CoreNEs do
        for y in CoreNEs, y != x do
            giveActivation(x)
            while notReceivedActivation(y) do
                fireNetwork(A)
                giveActivation(x)
            /* Trace the path of maximum activation from y back to x */
            tempNode <- y
            while tempNode != x do
                prevNode <- tempNode
                /* maxContrib(i) returns the node which sent the most
                   activation to i */
                tempNode <- maxContrib(tempNode)
                PathNodes <- PathNodes ∪ {tempNode}
                PathLinks <- PathLinks ∪ {link(prevNode, tempNode)}
    Core <- (CoreNodes ∪ PathNodes, PathLinks)
end

Figure 5.2: Pseudocode for the algorithm used to create a network core given an ASKNet network

The core that results from the Dijkstra-like algorithm focuses on the relationships

between the primary entities and discards peripheral information about individual


entities within the network. It also focuses on the strongest paths, which represent

the most salient relationships between entities and leaves out the less salient relation-

ships (represented by the weaker paths). As an example, the core obtained from the

“Elian Gonzalez Custody Battle” network (See Figure 5.3) maintained the primary

relationships between the important entities within the network, but discarded infor-

mation such as the dates of many trials, the quotes of less important figures relating

to the case, and information about entities which did not directly relate to the case

itself.

Running the algorithm on each of the topic networks produced from the duc data

results in cores with an average of 20 object nodes and 32 relations per network,

which falls within the acceptable limit for human evaluation. An additional benefit

of building the core in this manner is that, since the resulting core tends to contain

the most salient nodes and relations in the network, human evaluators can easily

identify which portions of the network relate to which aspect of the stories.

We also found during our experiments that the core tended to stabilise over time.

On average only 2 object nodes and no named entity nodes changed within the core

of each network between inputting the 20th and the 25th document of a particular

duc category. This indicates that the core, defined in this way, is a relatively stable

subset of the network, and represents information which is central to the story, and

is therefore being repeated in each article.

5.2.2 Evaluating the Network Core

ASKNet uses the GraphViz [Gansner and North, 2000] library to produce graphical

output. This allows human evaluators to quickly and intuitively assess the correctness

of portions of the network. One network was created for each of the 5 topics, and


[Image: GraphViz rendering of the network core for this topic, showing named entities such as Elian Gonzalez, Juan Miguel Gonzalez, Janet Reno, Fidel Castro, the INS and the 11th Circuit Court, and the relations between them. Legend: Named Entity; Entity; Relation; Attribute; Connector; Semantic Relation (Synonymy, Meronymy, Definition, etc.)]

Figure 5.3: Graphical representation for topic: “Elian Gonzalez Custody Battle”.

graphical representations were output for each network. Examples of the graphical

representations of the network cores used for evaluation are shown in Figure 5.3 and

Figure 5.5. Magnified views of the representations are also given in Figure 5.4 and

Figure 5.6. To ease the evaluator’s task, we have chosen to output the graphs without

the recursive nesting. In some cases, connector nodes (ovals) were added to provide

information that was lost due to the removal of the nesting. Each of the 5 topic

networks was evaluated by 3 human evaluators. (The networks were distributed in

such a way as to ensure that no two networks were evaluated by the same 3 evaluators).

Five evaluators participated in the experiment, all of whom were graduate students in non-computer-science subjects and who spoke English as a first language. None of


Figure 5.4: Expanded section of Figure 5.3.

the evaluators had any prior experience with ASKNet or similar semantic networks.

The evaluators were provided with the graphical output of the networks they were to

assess, the sentences that were used in the formation of each path, and a document

explaining the nature of the project, the formalities of the graphical representation,

and the step-by-step instructions for performing the evaluation.2

The evaluation was divided into 2 sections and errors were classified into 3 types.

The evaluators were first asked to evaluate the named entity nodes in the network, to

determine if each node had a type error (an incorrect named entity type as assigned

by the named entity tagger as shown in Figure 5.7), or a label error (an incorrect set

2 All of the evaluation materials provided to the evaluators can be found online at www.brianharrington.net/asknet.


[Image: GraphViz rendering of the network core for this topic, showing named entities such as Vladimir Putin, Boris Yeltsin, Gennady Zyuganov, Sergei Stepashin, Moscow, Chechnya and the Central Election Commission, and the relations between them.]

Figure 5.5: Graphical representation for topic: “2001 Election of Vladimir Putin”.

of labels, indicating that the node did not correspond to a single real world entity

as shown in Figure 5.8). The evaluators were then asked to evaluate each primary

path. If there was an error at any point in the path, the entire path was said to have


Figure 5.6: Expanded section of Figure 5.5.

a path error (as shown in Figure 5.9) and deemed to be incorrect. In particular,

it is important to notice that in the bottom example of Figure 5.9, the error actually

caused several paths (i.e., “Melissa Virus” - “Microsoft Word”, “Melissa Virus” -

“Microsoft Word documents” and “Microsoft Word” - “Microsoft Word documents”)

to be considered incorrect. This lowered the overall network scores, by potentially

penalising the same mistake multiple times, but as in all stages of this evaluation

we felt it important to err on the side of caution to ensure that our results were

under-estimations of network quality rather than over-estimations.

The error types were recorded separately in an attempt to discover their source.

Type errors are caused by the named entity tagger, label errors by the update algo-

rithm or the semantic analyser (Boxer), and path errors by the parser or Boxer.


Figure 5.7: Examples of type errors. Left: “Melissa virus”, a computer virus identified as a location. Right: “Gaza Strip”, a location identified as an organisation.

Figure 5.8: Examples of label errors. Left: After processing a sentence containing the phrase “...arriving in Israel Sunday to conduct...”, the phrase “Israel Sunday” is mistakenly identified as a single entity. Right: The location “Miami” and the person “Miami Judge” collapsed into a single node.

5.2.3 Results

The scores reported by the human evaluators are given in Table 5.2. The scores given

are the percentage of nodes and paths that were represented entirely correctly. A

named entity node with either a type or label error was considered incorrect, and any

path segment containing a path error resulted in the entire path being labelled as

incorrect.

The overall average precision was 79.1%, with a Kappa Coefficient [Carletta, 1996]

of 0.69 indicating a high level of agreement between evaluators.

Due to the nature of the evaluation, we can perform further analysis on the errors


Figure 5.9: Examples of path errors. Top: Three independent meetings referenced in the same sentence, all involving relatives of Elian Gonzalez, are identified as a single meeting event. Bottom: The network indicates that the computer, rather than the virus, hides in Microsoft Word documents.


Topic             Eval 1   Eval 2   Eval 3   Avg
Elian Gonzalez    88.2%    70.1%    75.0%    77.6%
Galileo Probe     82.6%    87.0%    91.3%    87.0%
Viruses           68.4%    73.7%    73.7%    71.9%
Vladimir Putin    90.3%    82.8%    94.7%    89.9%
West Bank         68.2%    77.3%    70.0%    72.3%
Average Precision: 79.1%

Table 5.2: Evaluation Results

Topic NE Type Label PathElian Gonzalez 8.3% 50.5% 41.7%Galileo Probe 22.2% 55.6% 22.2%Viruses 93.8% 0.0% 6.3%Vladimir Putin 22.2% 33.3% 44.4%West Bank 66.7% 27.8% 5.6%Total: 43.4% 32.9% 23.7%

Table 5.3: Errors by Type

reported by the evaluators, and categorize each error by type as seen in Table 5.3.

The results in Table 5.3 indicate that the errors within the network are not from a

single source, but rather are scattered across each of the steps. The NE Type errors

were made by the ner tool. The Label errors came from either Boxer (mostly from

mis-judged entity variable allocation), or from the Update Algorithm (from merging

nodes which were not co-referent). The Path errors were caused by either the parser

mis-parsing the sentence, Boxer mis-analysing the semantics, or from inappropriate

mappings in the Update Algorithm.

The errors appear to be relatively evenly distributed, indicating that, as each of

the tools used in the system improves, the overall quality of the network will increase.

Some topics tended to cause particular types of problems. Notably, the ner tool

performed very poorly on the Viruses topic. This is to be expected as the majority of

the named entities were names of computer viruses or software programs that would

not have existed in the training data used for the ner tagging model.

An overall precision of 79.1% is highly promising for such a difficult task. The


high score indicates that, while semantic network creation is by no means a solved

problem, it is possible to create a system which combines multiple natural language

inputs into a single cohesive knowledge network and does so with a high level of

precision. In particular we have shown that ASKNet’s use of spreading activation

techniques results in a high quality network core, with the most important named

entities and the relations between those entities being properly represented in the

majority of cases.


Chapter 6

Semantic Relatedness

The ability to determine semantic relatedness between two terms could be of great use

to a wide variety of nlp applications, such as information retrieval, query expansion,

word sense disambiguation and text summarisation [Budanitsky and Hirst, 2006].

However, it is important to draw a distinction between semantic relatedness and

semantic similarity. [Resnik, 1999] illustrates this point by writing “Semantic similarity

represents a special case of semantic relatedness: for example, cars and gasoline

would seem to be more closely related than, say, cars and bicycles, but the latter

pair are certainly more similar”. [Budanitsky and Hirst, 2006] further point out that

“Computational applications typically require relatedness rather than just similarity;

for example, money and river are cues to the in-context meaning of bank that are

just as good as trust company”. Despite these distinctions, many papers continue

to use these terms interchangeably. For the purposes of this thesis, we will continue to

honour this distinction, and we will use the term semantic distance to refer to the

union of these two concepts.

One of the most popular modern methods for automatically judging semantic

distance is the use of WordNet [Fellbaum, 1998], using the paths between words


in the taxonomy as a measure of distance. While many of these approaches have

obtained promising results for measuring semantic similarity [Jiang and Conrath,

1997, Banerjee and Pedersen, 2003], the results for measuring semantic relatedness have

been much less promising [Hughes and Ramage, 2007].

One of the major drawbacks of using WordNet as a basis for evaluating semantic

relatedness is its hierarchical taxonomic structure. This results in terms such as “car”

and “bicycle” being very close in the network, but terms such as “car” and “gasoline”

being separated by a great distance. Another difficulty results from the non-scalability

of WordNet, which we addressed in Section 1.2.1. While the quality of the network is

very high, the manual nature of its construction prohibits it from having the coverage

necessary to be able to reliably obtain scores for any arbitrary word pair. ASKNet’s

non-hierarchical nature and generalised relation links, combined with its robustness in dealing with different types of input, make it, at least in principle, a much more

suitable resource to use for discovering semantic relatedness between terms.

An additional way of obtaining semantic distance scores is to calculate pointwise

mutual information (pmi) across a large corpus. By obtaining the frequency with

which words co-occur in the corpus and dividing by the total number of times each

term appears, one can obtain a measure of association of those terms within that

corpus. If the corpus is large enough, these values can be used as a measure of

semantic distance. The drawback of this methodology is that it requires a very large

corpus, and while word co-occurrences can be computed efficiently, it is still necessary

to process a great deal of information in order to build a representative score.
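For reference, the standard log-based form of pmi, which the procedure just described follows apart from the logarithm, is (a textbook definition, not a formula taken from the thesis's implementation):

\[ \mathrm{pmi}(x, y) = \log \frac{p(x, y)}{p(x)\, p(y)} \]

where p(x, y) is the probability of x and y co-occurring, and p(x) and p(y) are the individual occurrence probabilities estimated from the corpus.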

One alternative to computing pmi based on word co-occurrence is to use the

number of results retrieved by large search engines when searching for individual

words and word pairs. This method has been used successfully in measuring semantic

similarity [Turney, 2001].


A final method for improving upon simple word co-occurrence is the use of vector

space models, which can use word co-occurrence as features in the vectors, but can

also incorporate additional linguistic information such as syntactic relations. Using

these additional types of information has been shown to improve scores in similar

tasks [Pado and Lapata, 2007].

Of these traditional methods for obtaining scores of semantic distance, none are

particularly suited to measuring semantic relatedness, as opposed to similarity. All

of them (with the exception of [Turney, 2001]) require either a manually created

resource or a large, pre-compiled corpus. In this chapter we will detail an alternative

methodology which uses ASKNet to obtain scores for semantic relatedness using a

relatively small corpus automatically harvested from the web with minimal human

intervention.

6.1 Using ASKNet to Obtain Semantic Relatedness Scores

In this chapter we detail two experiments using ASKNet to obtain scores for the

semantic relatedness of word pairs, and comparing those scores against human gen-

erated scores for the same word pairs. In the first experiment, the ASKNet networks

are built on a corpus obtained directly from search engine results. The second experi-

ment has an identical methodology to the first, but the corpus is improved by using a

few simple heuristics to obtain a more representative corpus, which results in a large

improvement to the correlation scores.

Once a large scale ASKNet network is constructed, it is possible to use the spread-

ing activation functions of the network (as described in Section 4.1) to efficiently



obtain a distance score between any node pair (x, y). This score is obtained by placing a set amount of activation (α) in node x, allowing the network to fire until it

stabilises, and then noting the total amount of activation received during this process

by node y, which we will call act(x,y,α). This process is repeated starting with node

y to obtain act(y,x,α). We will call the sum of these two values dist(x,y,α) and, since we will be using a constant value for α, we will shorten this to dist(x,y), as defined in Equation 6.1:

dist(x, y, α) = act(x, y, α) + act(y, x, α)    (6.1)

Symbol         Definition
act(i, j, α)   the total amount of activation received by node j when node i is given α activation and then the network is allowed to fire

dist(x,y) is a measure of the total strength of connection between nodes x and

y, relative to the other nodes in their region. This takes into account not just direct

paths, but also indirect paths, if the links along those paths are of sufficient strength.

Since ASKNet relations are general, and not hierarchical in nature, this score can be

used as a measure of semantic relatedness between two terms. If we take the (car,

gasoline), (car, bicycle) example mentioned earlier, we can see that firing the node

representing car in ASKNet should result in more activation being sent to gasoline

than to bicycle, as the former shares more direct and indirect relations with car. This

means that unlike WordNet or other taxonomic resources, ASKNet can be directly

used to infer semantic relatedness, rather than semantic similarity.
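As a minimal sketch of how dist(x, y) could be computed, assuming a network object with a fire(node, alpha) method that returns, for every node, the total activation it received while the network stabilised (these names are illustrative, not ASKNet's actual interface):

```python
ALPHA = 100.0  # constant initial activation; the value here is arbitrary

def act(network, source, target, alpha=ALPHA):
    """Place alpha activation on `source`, let the network fire until it
    stabilises, and return the total activation received by `target`."""
    received = network.fire(source, alpha)  # assumed: maps node -> activation received
    return received.get(target, 0.0)

def dist(network, x, y):
    """dist(x, y) = act(x, y, alpha) + act(y, x, alpha), as in Equation 6.1."""
    return act(network, x, y) + act(network, y, x)
```

Given nodes for car, gasoline and bicycle, dist(network, car, gasoline) would then be expected to exceed dist(network, car, bicycle) in the example above.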

In order to evaluate ASKNet’s ability to produce measures of semantic relatedness,

we chose to correlate the system’s measurements to those given by humans. In these

experiments we take human judgements to be a gold standard, and attempt to use

ASKNet to replicate those judgements.


6.2 WordSimilarity-353

In order to obtain human judgements against which we could compare ASKNet’s

scoring, we used the WordSimilarity-353 (ws-353) collection [Finkelstein et al., 2002].

Although the name implies that the scores are similarity rankings, human judges were

in fact asked to score 353 pairs of words for their relatedness on a scale of 0 to 10.

The ws-353 collection contains word pairs which are not semantically similar,

but still receive high scores because they are judged to be related (e.g., the pair

(money, bank) receives a score of 8.50). It also contains word pairs which do not

share a part of speech (e.g., (drink, mouth)), and at least one term which does not

appear in WordNet at all (Maradona). All of these have proven difficult for WordNet

based methods, and resulted in significantly poorer results than those obtained with

collections emphasising semantic similarity [Hughes and Ramage, 2007].

6.3 Spearman’s Rank Correlation Coefficient

For consistency with previous literature, we use Spearman’s rank correlation coeffi-

cient (also known as Spearman’s ρ [Spearman, 1987]) as a measure of the correlation

between the ASKNet scores and those from the ws-353. Spearman’s rank correlation

coefficient assesses the measurements based on their relative ranking rather than on

their values.

If we take as vectors the set of human measurements X = 〈x1, . . . , xn〉 and ASKNet measurements Y = 〈y1, . . . , yn〉, and convert them into ranks X′ = 〈x′1, . . . , x′n〉 and Y′ = 〈y′1, . . . , y′n〉 respectively (i.e., if xi is the largest value in X then x′i = 1, if xj is the second highest value in X then x′j = 2, and so on), then the correlation coefficient can be calculated by Equation 6.2.


\rho = \frac{n\left(\sum_{i=1}^{n} x'_i y'_i\right) - \left(\sum_{i=1}^{n} x'_i\right)\left(\sum_{i=1}^{n} y'_i\right)}{\sqrt{n\left(\sum_{i=1}^{n} x'^2_i\right) - \left(\sum_{i=1}^{n} x'_i\right)^2}\,\sqrt{n\left(\sum_{i=1}^{n} y'^2_i\right) - \left(\sum_{i=1}^{n} y'_i\right)^2}}    (6.2)

Symbol   Definition
x′k      the k'th element of vector X′
n        the number of elements in vectors X′ and Y′

The significance value is calculated using a simple permutation test, finding the probability that a random permutation of X′ will achieve an equal or greater value of ρ.
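A small self-contained sketch of this computation (convert raw scores to ranks, then apply Equation 6.2) might look as follows; ties are given distinct ranks here for simplicity, where a full implementation would average tied ranks:

```python
from math import sqrt

def to_ranks(values):
    """Convert raw scores to ranks: the largest value gets rank 1, and so on.
    (Ties are broken arbitrarily here; a full implementation would average them.)"""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rank correlation coefficient, as in Equation 6.2."""
    n = len(xs)
    xr, yr = to_ranks(xs), to_ranks(ys)
    sum_xy = sum(x * y for x, y in zip(xr, yr))
    sum_x, sum_y = sum(xr), sum(yr)
    sum_x2 = sum(x * x for x in xr)
    sum_y2 = sum(y * y for y in yr)
    num = n * sum_xy - sum_x * sum_y
    den = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return num / den
```

The permutation test can then be run by calling spearman_rho repeatedly with one of the rankings randomly shuffled.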

6.4 Experiment 1

6.4.1 Data Collection & Preparation

In order to use ASKNet to develop rankings for each word pair in the ws-353, we

first extracted each individual word from the pairings, resulting in a list of 440 words

(some words were used in multiple pairings). For each of the words in this list, we

then performed a query in a major search engine, in this case Google, and downloaded

the first 5 page results for that query. (The choice of the number 5 as the number of

documents to download for each word was based on a combination of intuition about

the precision and recall of search engines, as well as the purely pragmatic issue of

obtaining a corpus that could be held in system memory).

Each of the downloaded web pages was then cleaned by a set of Perl scripts which

removed all HTML markup and javascript code and comments. Punctuation was

added where necessary (e.g., upon encountering a </li> or </td> tag, if the previous

string did not end in a full stop, one was added). Statistics for the resulting corpus

are given in Table 6.1.
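The cleaning was done with Perl scripts; purely as an illustration of the kind of processing involved (the regular expressions and the punctuation heuristic here are our own reconstruction, not the original scripts), the same logic might look like this in Python:

```python
import re

BLOCK_CLOSERS = re.compile(r'</(li|td|tr|p|div|h[1-6])>', re.IGNORECASE)

def clean_page(html):
    """Strip scripts, comments and markup; add a full stop where a block
    element ends without sentence-final punctuation."""
    html = re.sub(r'<script.*?</script>', ' ', html, flags=re.DOTALL | re.IGNORECASE)
    html = re.sub(r'<!--.*?-->', ' ', html, flags=re.DOTALL)
    # End a sentence at each closing block tag; duplicates are collapsed below.
    html = BLOCK_CLOSERS.sub('. ', html)
    text = re.sub(r'<[^>]+>', ' ', html)        # drop all remaining tags
    text = re.sub(r'\s+', ' ', text).strip()    # normalise whitespace
    text = re.sub(r'\s*\.(\s*\.)+', '.', text)  # collapse duplicated full stops
    return text
```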


Experiment 1 Corpus
Number of Sentences               995,981
Number of Words                 4,471,301
Avg. Number of Sentences/Page       452.7
% Pages from Wikipedia               18.5

Table 6.1: Summary statistics for the corpus generated in Experiment 1

6.4.2 Calculating a Baseline Score

A simple baseline score was calculated using pointwise mutual information (pmi) of

word co-occurrence statistics. The corpus was used as input to a Perl script which

counted the number of paragraphs and documents in which word pairs co-occurred.

The score of a word pair was increased by 1 point for every document in which both

words occurred, and increased by a further point for every paragraph in which both

words occurred. (This means that a word pair co-occurring in a single paragraph

automatically received a score of at least 2 points, this also means that the score

for a particular word pair can actually have a value greater than 1). This score was

then divided by the product of the total number of occurrences of either word in the

corpus. This is in line with the standard definition of pmi. The methodology used

is formalised in Equation 6.3. Note that unlike the traditional definition of pmi we

do not take the log of the scores. This is because the final result is based on the

rank, and the log, being a monotonic transformation, would not change the ordering. The result of performing Spearman's rank correlation on these scores is given in Table 6.3.

Score(x, y) = co-occur(x, y) / (occur(x) · occur(y))    (6.3)

Symbol          Definition
Score(i, j)     the computed distance between words i and j
co-occur(i, j)  the number of documents in which words i and j both occur + the number of paragraphs in which words i and j both occur
occur(i)        the total number of times word i occurs in the corpus
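A sketch of this baseline computation, under the assumption that the corpus is available as a list of documents, each a list of paragraphs, each a list of words (an illustrative layout, not the original script's):

```python
from collections import Counter

def baseline_scores(documents, pairs):
    """Equation 6.3: co-occurrence points divided by the product of the
    individual word counts. +1 point per shared document, +1 per shared
    paragraph, as described above."""
    occur = Counter()
    points = Counter()
    for doc in documents:
        para_sets = [set(p) for p in doc]
        doc_words = set().union(*para_sets) if para_sets else set()
        for paragraph in doc:
            occur.update(paragraph)          # total occurrences of each word
        for x, y in pairs:
            if x in doc_words and y in doc_words:
                points[(x, y)] += 1          # one point per shared document
            points[(x, y)] += sum(1 for s in para_sets if x in s and y in s)
    return {(x, y): (points[(x, y)] / (occur[x] * occur[y])
                     if occur[x] and occur[y] else 0.0)
            for x, y in pairs}
```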


6.4.3 Calculating the ASKNet Score

After processing the corpus to build an ASKNet network with approximately 800,000

nodes and 1,900,000 edges, the appropriate node pairs were fired to obtain the distance

measure as described earlier. Those measurements were then recorded as ASKNet’s

measurement of semantic relatedness between two terms. If a term was used as a

label in two or more nodes, the node containing the fewest extraneous labels was

chosen. If there was more than one node using the term as a label with the same

overall number of labels, the input activation was split evenly.
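This node-selection rule could be sketched as follows (label_nodes, node.labels and fire are again illustrative names, not ASKNet's real interface; the act/dist functions sketched earlier would then be run over the selected nodes):

```python
def nodes_for_term(network, term):
    """Select the node(s) labelled with `term`, preferring the node with the
    fewest extraneous labels; all tied nodes are kept."""
    candidates = network.label_nodes(term)  # assumed: all nodes carrying this label
    if not candidates:
        return []
    fewest = min(len(node.labels) for node in candidates)
    return [node for node in candidates if len(node.labels) == fewest]

def fire_term(network, term, alpha):
    """Split the input activation evenly across the selected nodes and fire."""
    nodes = nodes_for_term(network, term)
    if not nodes:
        return {}
    share = alpha / len(nodes)
    totals = {}
    for node in nodes:
        for target, received in network.fire(node, share).items():
            totals[target] = totals.get(target, 0.0) + received
    return totals
```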

It is important to note that no manual adjustments were made to ASKNet to

facilitate this experiment. All of the firing parameters were set based on intuition and

results from previous experiments before any data was entered into the system. This

means that the system was not fine-tuned to this particular task. This methodology

was chosen for two reasons, firstly because the ws-353 contained few enough pairs

that we thought it unwise to split up the collection for training and testing, and

secondly because we hoped to show that a completed “un-tweaked” network could

perform at least as well as manually tuned systems based on WordNet.

6.4.4 Results

The scores for both the baseline system and ASKNet were compared against those

from the ws-353 collection using Spearman’s rank correlation. Example scores and

ranks are given in Table 6.2. The correlation results are given in Table 6.3. For

comparison, we have included the results of the same correlation on scores from four

additional systems. These scores were obtained from [Hughes and Ramage, 2007].



Word Pair                  ws-353   Baseline   ASKNet    ws-353   Baseline   ASKNet
                           Score    Score      Score     Rank     Rank       Rank
love - sex 6.77 266 5.90 144 62 85
tiger - cat 7.35 296 9.06 109 53 58
tiger - tiger 10 398 58.87 1 33 9
book - paper 7.46 511 45.15 98 20 15
computer - keyboard 7.62 216 9.52 82 83 56
computer - internet 7.58 0 0.00 86 316 316
plane - car 5.77 500 9.16 214 21 57
train - car 6.31 938 37.04 177 5 20
television - radio 6.77 186 6.18 143 99 82
media - radio 7.42 138 3.05 103 120 130
drug - abuse 6.85 64 1.24 138 190 169
bread - butter 6.19 202 6.54 188 90 79
cucumber - potato 5.92 0 0.00 204 316 316
doctor - nurse 7 108 4.17 127 138 115
professor - doctor 6.62 137 15.38 156 121 40
student - professor 6.81 260 14.07 139 69 42
smart - student 4.62 12 0.14 256 260 252
smart - stupid 5.81 4 0.03 213 295 290
company - stock 7.08 408 11.49 121 30 53
stock - market 8.08 411 51.51 52 29 12
stock - phone 1.62 37 0.29 340 222 238
stock - CD 1.31 8 0.05 341 279 279
stock - jaguar 0.92 91 4.30 345 156 112
stock - egg 1.81 9 0.06 335 271 273
fertility - egg 6.69 19 0.19 150 246 247
stock - live 3.73 232 3.07 283 77 129
stock - life 0.92 158 2.81 344 111 135
book - library 7.46 342 38.02 99 40 19
bank - money 8.12 525 32.68 48 18 23
wood - forest 7.73 270 17.32 72 61 36
money - cash 9.15 214 8.93 6 85 59
professor - cucumber 0.31 0 0.00 352 316 316
king - cabbage 0.23 48 0.88 353 206 187
king - queen 8.58 58 1.08 24 197 176
king - rook 5.92 141 3.17 205 118 128
bishop - rabbi 6.69 12 0.11 152 260 259

Table 6.2: Relatedness scores and score rankings for ws-353, baseline system and ASKNet.


The Jiang-Conrath system [Jiang and Conrath, 1997] computes semantic distance between a word pair by using the information content of the two words and their lowest

common subsumer in the WordNet hierarchy. The Lin approach [Lin, 1998] similarly

uses WordNet to discover word pair similarity, which Lin defines as the “ratio between

the amount of information needed to state their commonality and the information

needed to fully describe what they are”. Both of these approaches were identified

as top performers in a survey of methodologies for calculating semantic distance

[Budanitsky and Hirst, 2006].

The Hughes-Ramage system [Hughes and Ramage, 2007] uses random walks over

WordNet to achieve a similar metric to the other systems, but they also augment this

with walks of associative networks generated from word co-occurrence in WordNet's

gloss definitions. This is explicitly done in order to help improve their methodology’s

ability to calculate semantic relatedness as opposed to semantic similarity.

In addition to the WordNet based systems, we performed a simple pmi measure

over the British National Corpus (bnc)¹. These scores were calculated in the same manner as our baseline, both with and without stemming, but on the larger, more general corpus of the bnc.

The results of the pmi based methodology were poor, largely due to the problem

of data sparseness. Several word pairs never co-occurred in the bnc, and some indi-

vidual terms never occurred at all. Stemming helped the pmi approach slightly, but

it still performed far worse than the Hughes & Ramage WordNet based system. This

is likely because many of the word pairs, even after stemming, simply never appeared

together in the corpus.

Also worth noting is the information provided to us by Jeffrey Mitchell, of the

University of Edinburgh School of Informatics. He used vector based models similar

¹ Data cited herein has been extracted from the British National Corpus Online service, managed by Oxford University Computing Services on behalf of the BNC Consortium. All rights in the texts cited are reserved.


to the methodology of [Pado and Lapata, 2007] on the bnc in order to compute scores

for the word pairs given in the ws-353. Through personal communication we have

discovered that this methodology obtained a correlation coefficient of 0.5012.

                                  ρ
Jiang-Conrath                 0.195
Lin                           0.216
Hughes-Ramage                 0.552
pmi: bnc-unstemmed            0.192
pmi: bnc-stemmed              0.250
Baseline: Word co-occurrence  0.310
ASKNet                        0.391

Table 6.3: Rank correlation scores for ASKNet, the baseline system and existing WordNet based systems.

Figures 6.1 and 6.2 provide scatter plots of the rank order of the scores for the

baseline and ASKNet respectively. In Figure 6.1, note the large number of data points

lying directly on the x-axis. These points indicate word pairs for which a score of

zero was obtained, meaning that the word pair never co-occurred in a document.

We can also see that there is very little visible correlation (as would be indicated by

a tendency for the data points to cluster around a line from the bottom left to the

top right of the graph) in either graph. Table 6.2 provides a sample of the scores and

rankings produced by both systems compared to those of the ws-353 gold standard

(the full results can be found in Appendix B).

6.4.5 Discussion

These results were somewhat disappointing. While the ASKNet system did manage to out-perform the Jiang-Conrath and Lin systems, neither of those methodologies specifically targets semantic relatedness, and both failed even to reach the baseline score. ASKNet was out-performed by the Hughes-Ramage system by over 16 percentage points.


Figure 6.1: Scatter plot of rank order of ws-353 scores vs. rank order of baseline scores.

6.5 Experiment 2

6.5.1 Inspecting the Corpus

After the disappointing result in Experiment 1, we used ASKNet's graphical output to manually survey pieces of the network. Upon inspecting the corpus in this way, several problems were identified with the data collection and preparation methodology described in Section 6.4.1, suggesting that the low scores were a result of the corpus rather than of the experimental method itself.

Some pages retrieved contained no meaningful content of any description. For

example, the first link retrieved for the query term Maradona is his public site², which is implemented entirely in Flash and cannot easily be converted into plain text. Therefore the corpus entry for this web page consisted solely of the sentence “CASTELLANO — ENGLISH” (taken from two small language selection links at the bottom of the page).

Figure 6.2: Scatter plot of rank order of ws-353 scores vs. rank order of ASKNet scores.

² http://www.diegomaradona.com

Over 75% of the query words resulted in at least one of the five results being a link

to Wikipedia. This may initially seem like a positive result, as it would indicate a good

percentage of words having at least one document giving an encyclopedic overview

of the term; in fact, it proved to be disastrous to the results. For more than half of

the query terms linking to Wikipedia, the resulting document was a “disambiguation

page”: a page which contains very little text relating to the term itself, but instead

is merely a listing of links to other Wikipedia articles relating to the various senses

of the term. A partial example of such an article is given in Figure 6.3.


Figure 6.3: An excerpt from the Wikipedia “disambiguation page” for the query term “American”.

These “disambiguation pages” are problematic not only in that they provide very

little information for ASKNet to use, but also in that they are structured as a list,

which caused the data preparation script used in Section 6.4.1 to treat each element

of the list as a sentence fragment, which often did not result in a sensible parse being

returned by the C&C parser.

Another problem was that the search engine used (in this case Google) applies word


stemming to query terms. This resulted in a search for “planning” returning docu-

ments which never contained the word “planning”, but instead contained the word

“plan”. Since both the baseline system and ASKNet used the original query terms

un-stemmed, the potentially useful information provided by these documents was

ignored in the score calculation.

6.5.2 Building a Better Corpus

In order to improve the corpus, several changes were made to the data preparation

method described in Section 6.4.1. These changes are listed below:

1. A heuristic was added: if a sentence ended with a colon and was immediately followed by a list, the sentence was copied once for each element in the list and concatenated with that element, excluding the colon. (An example of the change that this made is given in Figure 6.4.)

2. For Wikipedia disambiguation pages, identified as such by either the string

“(disambiguation)” in the title or by the first sentence being of the form “X may

refer to:”, all of the links in the disambiguation list were followed, and the

resulting pages were also added to the corpus. This means that for the example

given in Figure 6.3, the Wikipedia pages for United States, Americas and Indigenous peoples of the Americas would be added to the corpus. (A sketch of both heuristics is given after this list.)
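Purely as an illustration of these two heuristics (the function names and the simplified text handling are ours, not the actual preparation scripts):

```python
def expand_colon_list(sentence, list_items):
    """Heuristic 1: a sentence ending in a colon that is immediately followed
    by a list is copied once per list element, with the element concatenated
    in place of the colon."""
    if not sentence.rstrip().endswith(':'):
        return [sentence] + list_items
    stem = sentence.rstrip()[:-1].rstrip()  # drop the trailing colon
    return [f'{stem} {item}' for item in list_items]

def is_disambiguation_page(title, first_sentence):
    """Heuristic 2: detect a Wikipedia disambiguation page by its title, or
    by a first sentence of the form "X may refer to:"."""
    return ('(disambiguation)' in title
            or first_sentence.strip().endswith('may refer to:'))
```

Applied to the page in Figure 6.3, expand_colon_list("American may refer to:", items) would produce sentences like those shown at the bottom of Figure 6.4.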

Additionally, an implementation of the Porter Stemmer [Porter, 1980] was added

to ASKNet, and for each word added to a node as a label, the stemmed version of the

word was also added if the stemmed word was different from the original. We also

stemmed all of the words in the corpus for the computation of the baseline score.

Summary statistics for the resulting corpus are given in Table 6.4.
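The label stemming could be sketched as follows, using NLTK's Porter stemmer here rather than the implementation actually added to ASKNet:

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

def labels_with_stems(labels):
    """For each node label, also add the stemmed form when it differs from
    the original word, so that e.g. "planning" also carries the label "plan"."""
    expanded = set(labels)
    for label in labels:
        stem = stemmer.stem(label)
        if stem != label:
            expanded.add(stem)
    return expanded
```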


American may refer to:
A person, inhabitant, or attribute of the United States of America.
A person, an inhabitant, or attribute of the Americas, the lands and regions of the Western Hemisphere.
A person or attribute of the indigenous peoples of the Americas.

American may refer to a person, inhabitant, or attribute of the United States of America.
American may refer to a person, an inhabitant, or attribute of the Americas, the lands and regions of the Western Hemisphere.
American may refer to a person or attribute of the indigenous peoples of the Americas.

Figure 6.4: The text retrieved from the page shown in Figure 6.3 using the original data preparation methods (top) and the improved heuristic (bottom).

Experiment 1 Corpus
Number of Sentences               995,981
Number of Words                 4,471,301
Avg. Number of Sentences/Page       452.7
% Pages from Wikipedia               18.5

Experiment 2 Corpus
Number of Sentences             1,042,128
Number of Words                 5,027,947
Avg. Number of Sentences/Page       464.4
% Pages from Wikipedia               22.1

Table 6.4: Summary statistics for the corpus generated in both experiments

6.5.3 Results: New Corpus

The baseline and ASKNet scores were re-computed on the new corpus using the new techniques described in the previous section. The results are shown in Table

6.5.

Figures 6.5 and 6.6 provide scatter plots of the rank order of the scores for the


                                  ρ
Jiang-Conrath                 0.195
Lin                           0.216
Hughes-Ramage                 0.552
pmi: bnc-unstemmed            0.192
pmi: bnc-stemmed              0.250

Original Corpus
Baseline: Word co-occurrence  0.310
ASKNet                        0.391

Improved Corpus
Baseline: Word co-occurrence  0.408
ASKNet                        0.609

Table 6.5: Rank correlation scores for ASKNet, the baseline system and existing WordNet based systems.

baseline and ASKNet respectively as calculated using the new corpus. Here we note

the reduction of data points lying directly on the x-axis in Figure 6.5 when compared

with Figure 6.1, indicating fewer word pairs with no co-occurrences. We can also note

the distinct visible correlation in Figure 6.6. Table 6.6 provides a sample of the scores

and rankings produced by both systems run on the new corpus, compared to

those of the ws-353 gold standard.

6.5.4 Discussion

The changes to the corpus allowed for a large improvement in the quality of

ASKNet’s scores. The rank correlation coefficient improved to 0.609, which indicates

a performance at least on par with any of the existing methodologies. This shows

that, with a properly acquired corpus, ASKNet can be used to judge semantic

relatedness at least as well as any other existing system, without the need for a

manually created network such as WordNet or a large human-collected corpus such

as the bnc.

Additionally, since ASKNet used a web-based corpus to generate its scores, it


Word Pair                  ws-353   Baseline   ASKNet    ws-353   Baseline   ASKNet
                           Score    Score      Score     Rank     Rank       Rank
computer - keyboard 7.62 1403 26.45 82 59 65
computer - internet 7.58 390 5.2 86 144 178
plane - car 5.77 922 1.09 214 76 288
train - car 6.31 3705 5.16 177 26 179
television - radio 6.77 2657 47.63 143 33 38
media - radio 7.42 833 11.61 103 84 112
drug - abuse 6.85 435 19.14 138 136 85
bread - butter 6.19 693 21.91 188 98 77
cucumber - potato 5.92 20 0.26 204 287 321
doctor - nurse 7 426 50.97 127 139 36
professor - doctor 6.62 1545 126.82 156 52 14
student - professor 6.81 1503 33.08 139 53 55
smart - student 4.62 23 2.34 256 281 246
smart - stupid 5.81 3 12.93 213 323 105
company - stock 7.08 3259 41.84 121 30 44
stock - market 8.08 6061 49.58 52 15 37
stock - phone 1.62 33 0.34 340 267 318
stock - CD 1.31 7 0.14 341 311 332
stock - jaguar 0.92 430 16.25 345 138 89
stock - egg 1.81 8 0.2 335 303 326
fertility - egg 6.69 28 2.93 150 274 231
stock - live 3.73 318 2.57 283 157 235
stock - life 0.92 338 2.08 344 152 263
book - library 7.46 4419 67.1 99 23 30
bank - money 8.12 3561 33.85 48 27 53
wood - forest 7.73 1836 25.36 72 42 69
money - cash 9.15 901 37.97 6 77 47
professor - cucumber 0.31 217 2.9 352 177 232
king - cabbage 0.23 88 0.96 353 221 297
king - queen 8.58 108 3.4 24 209 221
king - rook 5.92 476 5.07 205 131 180
bishop - rabbi 6.69 103 1.76 152 212 271

Table 6.6: Relatedness scores and score rankings for ws-353, baseline system and ASKNet as computed on the improved corpus.

did not encounter the same data sparseness problems seen in the other systems.

[Hughes and Ramage, 2007] were forced to remove at least one word pair from their

analysis, because the word Maradona did not appear in WordNet. When performing

experiments on a similar data set using vector space models, [Pado and Lapata, 2007] were forced to remove seven of the 143 word pairs due to one of the words having too low a frequency in their corpus (in this case, the bnc). Since our ASKNet based methodology for acquiring scores retrieved data directly from the internet, it encountered no such problems.

Figure 6.5: Scatter plot of rank order of ws-353 scores vs. rank order of baseline scores as calculated on the improved corpus.

One of the initial goals of this evaluation was to assess ASKNet’s ability to perform

a real-world task with as little human intervention as possible. To that end, all of the

firing parameters of ASKNet were set before any data was collected, and even then,

only the most coarse adjustments were made manually in anticipation of the types

of data that would be found in the corpus. While our improvements to the corpus

creation process could be seen as human intervention, they were relatively minor, and

with appropriate foresight would have been included in the initial corpus creation

process. The data was by no means “hand picked” to be appropriate for ASKNet or for the task at hand.

Figure 6.6: Scatter plot of rank order of ws-353 scores vs. rank order of ASKNet scores as calculated on the improved corpus.

In conclusion, we have demonstrated that a novel approach to automatically mea-

suring the semantic relatedness of words, using a relatively small, task-focused, web-

harvested corpus to build an ASKNet network, can perform at least as well as any

existing system. This shows that the networks produced by ASKNet are of sufficient

quality to be of use in a real world application, and therefore we consider this to be

a very positive result in our evaluation.


Chapter 7

Conclusions

In this thesis we have detailed the conception, development and evaluation of ASKNet,

a system for automatically creating large scale semantic knowledge networks from nat-

ural language text. We have shown that existing nlp tools, an appropriate semantic

network formalism and spreading activation algorithms can be combined to design

a system which is capable of efficiently creating semantic networks on a scale never

before possible and of promising quality.

The primary focus of this thesis has been to combine ai techniques with nlp tools

in order to efficiently achieve cross-document information extraction and integration.

This work promises not only to afford researchers large scale networks, a useful

tool in their own right, but also to provide a new methodology for large scale knowl-

edge acquisition. We have shown that cross-document information integration, a step

which has often been overlooked as either unnecessary or unfeasible, is both necessary

for creating high quality semantic resources, and possible to achieve efficiently on a

large scale using existing tools and algorithms. Furthermore we have shown that it is

possible to automatically create high quality semantic resources on a large scale in a

reasonable time without the need for manually created resources, and crucially, that


by using appropriate algorithms, the scale of those resources can increase indefinitely

with only a linear increase in creation time.

Semantic networks have a wide variety of applications, and a system for auto-

matically generating such networks could have far-reaching benefits to multiple areas

both in the field of nlp and in other research areas which require knowledge acqui-

sition systems which can perform effectively and efficiently on a large scale. There

are many areas of research to which a system such as ASKNet could potentially be of

benefit; biomedical research is one of the most obvious examples. The current state

of information overload is resulting in unforeseen difficulties for many researchers.

The problem of research has, in many areas, ceased to be a search for information

and become an exercise in filtering the vast amounts of information which are readily

available. Developing tools and methodologies that can aid in this filtering process

is an important task for the nlp community, not just for the benefit of researchers

within our own field, but for the benefit of the academic community at large.

This thesis has focused on the ASKNet system as both a real-world tool for de-

veloping semantic resources and a proof of concept system showing what is possible

in the field of knowledge discovery. The ASKNet system has demonstrated that by

combining ai and nlp techniques it is possible to create semantic resources larger,

faster and with higher quality than anything previously obtainable.

7.1 Future Work

In this section we will briefly survey some potential future directions for the ASKNet

project. These are roughly divided into two sections: potential improvements to, and uses of, ASKNet networks; and improvements which are external to the ASKNet project but which could be incorporated to improve the networks it creates.


7.1.1 Future Improvements

ASKNet was designed primarily as a “proof of concept” system in order to test

various hypotheses regarding the use of nlp tools and spreading activation theories

to integrate information from multiple sources into a single cohesive network. While

we have shown in this thesis that network creation time is linear with respect to

network size, it is still unlikely that any networks on a scale much larger than those

created for this thesis would be made using a single CPU. It would be possible in

future to create a version of ASKNet that would work in a distributed fashion, using

a single network that could be built and manipulated by any number of computers

working independently.

A Distributed ASKNet

In order to create a distributed version of ASKNet, one would simply have to add

interfaces into a central network for multiple computers (essentially this would involve

having the SemNet interface described in Section 3.2.2 available to multiple agents).

In order to remove conflicts between agents, each agent would require the ability to “lock” a section of the network and then to receive a copy of that section, which it could update before merging it back into the main “global” network.

Through this distributed process, multiple agents could work on the same network concurrently, without fear of corrupting the network by updating the same nodes at the same time. An agent would only have to wait for other agents to

terminate if there was an agent trying to update a node similar to one in its network;

in a large enough network this is likely to be very infrequent.
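A rough sketch of the checkout/commit protocol such a distributed SemNet interface might expose (entirely hypothetical; nothing like this exists in the current implementation, and sections are represented here as plain dicts):

```python
import threading
from copy import deepcopy

class SharedSemNet:
    """Hypothetical central network offering lock / copy / merge to agents."""
    def __init__(self, sections):
        self._sections = sections                 # section id -> subgraph
        self._locks = {sid: threading.Lock() for sid in sections}

    def checkout(self, section_id):
        """Lock a section and hand the agent a private copy to update."""
        self._locks[section_id].acquire()         # blocks if another agent holds it
        return deepcopy(self._sections[section_id])

    def commit(self, section_id, updated):
        """Merge the agent's updated copy back and release the lock."""
        self._sections[section_id] = updated      # a real merge would reconcile nodes
        self._locks[section_id].release()
```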


Potential Uses for ASKNet

There are many potential uses for ASKNet networks. In this section we will simply

list a few areas which we feel could benefit from ASKNet in the relatively near future,

or to which we feel ASKNet is particularly well suited.

• Entity Relationship Discovery: This is an extension of the semantic relatedness

scores generated in Chapter 6. Rather than finding the semantic relatedness

of words, ASKNet could be used to judge the semantic relatedness of named

entities. This task is even less suited to existing resources like WordNet, and

could be very useful in information retrieval. In particular, if ASKNet can

be ported to the biomedical domain, the ability to discover the existence of

relationships between entities such as genes and proteins would be very useful.

Since the C&C tools are currently being ported to work with biomedical data,

this application should be possible in the very near future.

• Question Answering: An obvious use of ASKNet is in a question answering (qa)

system. Most existing qa systems are only able to extract information from

single documents. ASKNet could be employed to increase the recall of these

systems by allowing them to answer questions which require cross-document

analysis.

• Novel Relationship Discovery: An extension of the Entity Relationship Discov-

ery functionality could be to attempt to discover novel relationships. In order

to do this, one would merely have to find entities which have strong relationships in ASKNet without any relations connecting them directly (a sketch of this search is given after this list).

This would be analogous to entities being strongly related, but never being

mentioned within the same context. An example of this sort of relationship

was found by [Swanson, 1986], where it was discovered through an extensive


literature review that there was a link between Raynaud’s disease and fish-oil,

despite no experiments having ever linked the two directly. The use of ASKNet

to discover these types of novel relationships could potentially evolve into an

interesting ai project, and be of great use to a wide variety of areas.
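A sketch of this search, reusing the dist function sketched in Chapter 6 and assuming an illustrative directly_connected test on the network:

```python
def novel_relationships(network, entities, threshold):
    """Return entity pairs that receive a strong spreading-activation score
    (dist, as sketched in Chapter 6) but share no direct relation."""
    found = []
    for i, x in enumerate(entities):
        for y in entities[i + 1:]:
            if network.directly_connected(x, y):  # assumed helper: direct relation test
                continue
            score = dist(network, x, y)
            if score >= threshold:
                found.append((x, y, score))
    return found
```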

7.1.2 External improvements

ASKNet is designed to adapt to improvements made in other areas of nlp. In this section we list just a few improvements which we believe could contribute to an increase

in the quality and usefulness of the created networks, but whose implementation is

beyond the scope of the project. Each of these is an active area of research, and we

hope that future improvements in any or all of these areas will benefit ASKNet.

• Anaphora resolution: Currently ASKNet only has very rudimentary anaphora

resolution provided by Boxer, combined with a few heuristics added in the

network creation process. This means a good deal of potentially useful infor-

mation is not being integrated even at the sentence level. Improving anaphora

resolution is an active area of research and improvements are being made [Char-

niak and Elsner, 2009]. Improvements in this area could increase the recall of

ASKNet tremendously.

• Temporal relations: The network formalism used throughout this thesis is atem-

poral, and cannot easily accommodate temporal information. This would ob-

viously be a difficulty that would need to be resolved before any sophisticated

automated reasoning could be done on the created networks. This could pos-

sibly be resolved by adding a temporal element to the relations which would

indicate the time period for which they held. This is a difficult task, as no tools are currently available which appropriately capture this information; however, there is active research in this area [Pustejovsky et al., 2005], and we hope that in future this information can be added to ASKNet.

• Domain adaptation: The ability to create networks on new domains, particu-

larly those domains where similar resources are scarce or non-existent would be

very useful for obvious reasons. While the ASKNet framework is not particu-

larly tied to a specific domain, the tools it uses (i.e., C&C and Boxer) are trained

on a particular domain (newspaper text) and will obviously have reduced per-

formance on a domain that is novel to them. Work is currently underway to

port C&C tools to work in the biomedical domain. This will allow us to test

the plausibility and level of difficulty of porting ASKNet to new domains once

the tools it uses have been properly adapted.


Appendix A

Published Papers

The following is a listing of papers extracted from the materials in this dissertation which have been published in other venues. All papers have been co-authored by Brian Harrington and Stephen Clark.

The papers & publications are as follows:

• Journal Papers

– Harrington B. & Clark S. ASKNet: Creating and Evaluating Large Scale Integrated Semantic Networks (expanded version of the ICSC-08 paper). International Journal of Semantic Computing, 2(3), pp. 343-364, 2009.

• Conference Papers

– Harrington B. & Clark S. ASKNet: Automated Semantic Knowledge Network. Proceedings of the Twenty-Second Conference on Artificial Intelligence (AAAI-07), Vancouver, Canada, 2007.

– Harrington B. & Clark S. ASKNet: Creating and Evaluating Large Scale Integrated Semantic Networks. Proceedings of the Second IEEE International Conference on Semantic Computing (ICSC-08), Santa Clara, USA, 2008.


Appendix B

Semantic Relatedness Scores & Rankings - Initial Corpus

This appendix provides the complete set of relatedness scores and score rankings for ws-353, the baseline system and ASKNet as computed on the initial corpus. An extract from this table is given in Table 6.2.

Word Pair                  ws-353   Baseline   ASKNet    ws-353   Baseline   ASKNet
                           Score    Score      Score     Rank     Rank       Rank

love - sex 6.77 266 5.90 144 62 85
tiger - cat 7.35 296 9.06 109 53 58
tiger - tiger 10 398 58.87 1 33 9
book - paper 7.46 511 45.15 98 20 15
computer - keyboard 7.62 216 9.52 82 83 56
computer - internet 7.58 0 0.00 86 316 316
plane - car 5.77 500 9.16 214 21 57
train - car 6.31 938 37.04 177 5 20
telephone - communication 7.5 195 4.79 94 95 103
television - radio 6.77 186 6.18 143 99 82
media - radio 7.42 138 3.05 103 120 130
drug - abuse 6.85 64 1.24 138 190 169
bread - butter 6.19 202 6.54 188 90 79
cucumber - potato 5.92 0 0.00 204 316 316
doctor - nurse 7 108 4.17 127 138 115
professor - doctor 6.62 137 15.38 156 121 40
student - professor 6.81 260 14.07 139 69 42
smart - student 4.62 12 0.14 256 260 252
smart - stupid 5.81 4 0.03 213 295 290


company - stock 7.08 408 11.49 121 30 53
stock - market 8.08 411 51.51 52 29 12
stock - phone 1.62 37 0.29 340 222 238
stock - CD 1.31 8 0.05 341 279 279
stock - jaguar 0.92 91 4.30 345 156 112
stock - egg 1.81 9 0.06 335 271 273
fertility - egg 6.69 19 0.19 150 246 247
stock - live 3.73 232 3.07 283 77 129
stock - life 0.92 158 2.81 344 111 135
book - library 7.46 342 38.02 99 40 19
bank - money 8.12 525 32.68 48 18 23
wood - forest 7.73 270 17.32 72 61 36
money - cash 9.15 214 8.93 6 85 59
professor - cucumber 0.31 0 0.00 352 316 316
king - cabbage 0.23 48 0.88 353 206 187
king - queen 8.58 58 1.08 24 197 176
king - rook 5.92 141 3.17 205 118 128
bishop - rabbi 6.69 12 0.11 152 260 259
Jerusalem - Israel 8.46 622 116.17 28 14 3
Jerusalem - Palestinian 7.65 382 38.41 79 36 18
holy - sex 1.62 25 0.32 339 237 235
fuck - sex 9.44 104 3.91 2 143 121
Maradona - football 8.62 100 8.35 22 148 64
football - soccer 9.03 227 14.02 10 80 43
football - basketball 6.81 229 3.97 140 79 119
football - tennis 6.63 39 0.39 154 219 226
tennis - racket 7.56 50 0.60 89 203 210
Arafat - peace 6.73 198 19.76 147 92 31
Arafat - terror 7.65 196 5.47 78 94 90
Arafat - Jackson 2.5 0 0.00 321 316 316
law - lawyer 8.38 496 137.92 33 22 2
movie - star 7.38 261 11.24 108 67 54
movie - popcorn 6.19 134 3.55 187 124 124
movie - critic 6.73 183 2.07 146 102 151
movie - theater 7.92 306 62.43 62 47 6
physics - proton 8.12 32 0.41 47 231 223
physics - chemistry 7.35 182 4.25 110 103 114
space - chemistry 4.88 87 0.74 248 162 201
alcohol - chemistry 5.54 47 0.53 225 208 217
vodka - gin 8.46 69 0.80 29 183 193
vodka - brandy 8.13 19 0.18 45 246 249
drink - car 3.04 341 5.09 304 41 93
drink - ear 1.31 721 13.26 342 11 46
drink - mouth 5.96 35 0.41 200 226 223
drink - eat 6.87 748 20.93 134 10 29
baby - mother 7.85 36 0.50 67 225 218
drink - mother 2.65 17 0.12 316 249 255
car - automobile 8.94 345 6.16 14 39 83


gem - jewel 8.96 276 59.08 13 57 8
journey - voyage 9.29 3 0.02 3 300 298
boy - lad 8.83 80 0.90 17 171 185
coast - shore 9.1 65 2.41 7 189 142
asylum - madhouse 8.87 9 0.08 16 271 264
magician - wizard 9.02 186 34.34 11 99 22
midday - noon 9.29 20 0.33 4 245 233
furnace - stove 8.79 88 0.86 19 160 191
food - fruit 7.52 205 4.89 92 88 99
bird - cock 7.1 122 4.69 120 130 106
bird - crane 7.38 41 0.80 105 217 193
tool - implement 6.46 73 0.85 166 178 192
brother - monk 6.27 68 4.80 179 184 102
crane - implement 2.69 3 0.02 315 300 298
lad - brother 4.46 14 0.12 263 256 255
journey - car 5.85 84 1.07 212 168 178
monk - oracle 5 61 0.60 241 193 210
cemetery - woodland 2.08 0 0.00 329 316 316
food - rooster 4.42 39 0.35 264 219 231
coast - hill 4.38 33 0.28 265 228 239
forest - graveyard 1.85 0 0.00 333 316 316
shore - woodland 3.08 0 0.00 303 316 316
monk - slave 0.92 7 0.04 347 283 286
coast - forest 3.15 187 2.21 302 97 146
lad - wizard 0.92 4 0.02 346 295 298
chord - smile 0.54 0 0.00 350 316 316
glass - magician 2.08 0 0.00 330 316 316
noon - string 0.54 0 0.00 351 316 316
rooster - voyage 0.62 0 0.00 349 316 316
money - dollar 8.42 244 5.59 32 74 87
money - cash 9.08 214 8.93 8 85 59
money - currency 9.04 246 11.92 9 73 52
money - wealth 8.27 249 5.07 41 71 94
money - property 7.57 215 6.78 87 84 78
money - possession 7.29 95 2.48 112 151 140
money - bank 8.5 525 32.68 26 18 23
money - deposit 7.73 273 8.20 73 60 69
money - withdrawal 6.88 57 0.79 132 199 196
money - laundering 5.65 102 20.88 217 146 30
money - operation 3.31 296 4.54 297 53 107
tiger - jaguar 8 115 4.88 57 134 100
tiger - feline 8 17 0.28 59 249 239
tiger - carnivore 7.08 38 0.73 122 221 203
tiger - mammal 6.85 22 0.45 135 240 220
tiger - animal 7 249 8.69 128 71 62
tiger - organism 4.77 26 0.24 250 235 244
tiger - fauna 5.62 9 0.08 222 271 264
tiger - zoo 5.87 106 3.53 210 139 125


psychology - psychiatry 8.08 0 0.00 51 316 316
psychology - anxiety 7 0 0.00 126 316 316
psychology - fear 6.85 0 0.00 137 316 316
psychology - depression 7.42 0 0.00 101 316 316
psychology - clinic 6.58 0 0.00 158 316 316
psychology - doctor 6.42 0 0.00 169 316 316
psychology - Freud 8.21 0 0.00 42 316 316
psychology - mind 7.69 0 0.00 77 316 316
psychology - health 7.23 0 0.00 115 316 316
psychology - science 6.71 0 0.00 149 316 316
psychology - discipline 5.58 0 0.00 223 316 316
psychology - cognition 7.48 0 0.00 95 316 316
planet - star 8.45 629 23.57 30 13 26
planet - constellation 8.06 127 1.34 53 127 166
planet - moon 8.08 262 5.18 50 65 92
planet - sun 8.02 232 5.35 56 77 91
planet - galaxy 8.11 165 2.21 49 110 146
planet - space 7.92 319 8.33 63 44 66
planet - astronomer 7.94 227 7.15 61 80 75
precedent - example 5.85 88 1.74 211 160 155
precedent - information 3.85 5 0.03 280 292 290
precedent - cognition 2.81 14 0.12 312 256 255
precedent - law 6.65 340 19.29 153 42 32
precedent - collection 2.5 4 0.03 320 295 290
precedent - group 1.77 42 0.40 337 215 225
precedent - antecedent 6.04 0 0.00 193 316 316
cup - coffee 6.58 142 4.46 157 117 109
cup - tableware 6.85 0 0.00 136 316 316
cup - article 2.4 364 4.85 322 38 101
cup - artifact 2.92 11 0.07 310 264 268
cup - object 3.69 80 0.75 288 171 199
cup - entity 2.15 50 0.45 328 203 220
cup - drink 7.25 105 1.67 114 141 158
cup - food 5 139 1.52 243 119 160
cup - substance 1.92 37 0.26 332 222 243
cup - liquid 5.9 51 0.64 208 202 208
jaguar - cat 7.42 136 24.68 102 122 25
jaguar - car 7.27 102 7.79 113 146 72
energy - secretary 1.81 0 0.00 334 316 316
secretary - senate 5.06 0 0.00 239 316 316
energy - laboratory 5.09 61 0.66 238 193 206
computer - laboratory 6.78 46 0.55 142 210 215
weapon - secret 6.06 66 1.01 192 186 180
FBI - fingerprint 6.94 187 7.86 130 97 71
FBI - investigation 8.31 126 13.68 38 128 45
investigation - effort 4.59 32 0.62 257 231 209
Mars - water 2.94 365 8.50 309 37 63
Mars - scientist 5.63 84 0.87 219 168 188
news - report 8.16 136 3.93 43 122 120
canyon - landscape 7.53 16 0.14 91 253 252


image - surface 4.56 254 4.38 258 70 111
discovery - space 6.34 90 0.89 174 157 186
water - seepage 6.56 23 0.39 160 238 226
sign - recess 2.38 66 0.78 324 186 197
Wednesday - news 2.22 3 0.02 327 300 298
mile - kilometer 8.66 86 3.71 21 165 123
computer - news 4.47 50 0.59 260 203 212
territory - surface 5.34 29 0.22 229 234 245
atmosphere - landscape 3.69 22 0.19 287 240 247
president - medal 3 9 0.05 305 271 279
war - troops 8.13 210 4.53 44 87 108
record - number 6.31 920 15.82 178 6 38
skin - eye 6.22 174 1.54 184 106 159
Japanese - American 6.5 197 2.52 162 93 139
theater - history 3.91 73 0.91 275 178 184
volunteer - motto 2.56 3 0.02 318 300 298
prejudice - recognition 3 5 0.03 306 292 290
decoration - valor 5.63 3 0.02 218 300 298
century - year 7.59 1598 38.72 85 3 17
century - nation 3.16 1802 61.30 301 2 7
delay - racism 1.19 0 0.00 343 316 316
delay - news 3.31 9 0.06 298 271 273
minister - party 6.63 58 0.76 155 197 198
peace - plan 4.75 106 2.20 253 139 149
minority - peace 3.69 11 0.07 285 264 268
attempt - peace 4.25 90 2.07 267 157 151
government - crisis 6.56 105 1.71 159 141 157
deployment - departure 4.25 0 0.00 269 316 316
deployment - withdrawal 5.88 0 0.00 209 316 316
energy - crisis 5.94 44 1.39 203 213 165
announcement - news 7.56 8 0.06 88 279 273
announcement - effort 2.75 10 0.06 313 267 273
stroke - hospital 7.03 174 2.88 124 106 134
disability - death 5.47 23 0.35 226 238 231
victim - emergency 6.47 47 0.80 165 208 193
treatment - recovery 7.91 15 0.11 64 255 259
journal - association 4.97 33 0.27 245 228 242
doctor - personnel 5 14 0.11 242 256 259
doctor - liability 5.19 11 0.10 236 264 262
liability - insurance 7.03 265 53.75 125 64 11
school - center 3.44 193 4.44 293 96 110
reason - hypertension 2.31 68 1.26 325 184 168
reason - criterion 5.91 37 0.56 206 222 213
hundred - percent 7.38 120 1.24 106 131 169
Harvard - Yale 8.13 401 35.76 46 31 21
hospital - infrastructure 4.63 7 0.04 255 283 286
death - row 5.25 438 6.97 235 28 77
death - inmate 5.03 78 2.21 240 176 146
lawyer - evidence 6.69 116 1.11 151 133 175
life - death 7.88 607 43.05 66 16 16


life - term 4.5 1507 49.06 259 4 14
word - similarity 4.75 61 0.56 252 193 213
board - recommendation 4.47 9 0.06 261 271 273
governor - interview 3.25 9 0.07 299 271 268
OPEC - country 5.63 40 0.38 220 218 229
peace - atmosphere 3.69 7 0.04 286 283 286
peace - insurance 2.94 6 0.04 308 288 286
territory - kilometer 5.28 22 0.50 233 240 218
travel - activity 5 79 0.93 244 173 183
competition - price 6.44 95 2.92 167 151 133
consumer - confidence 4.13 4 0.02 270 295 298
consumer - energy 4.75 61 1.18 251 193 172
problem - airport 2.38 12 0.08 323 260 264
car - flight 4.94 261 4.14 247 67 116
credit - card 8.06 442 226.92 54 27 1
credit - information 5.31 277 12.98 231 56 48
hotel - reservation 8.03 33 1.19 55 228 171
grocery - money 5.94 0 0.00 202 316 316
registration - arrangement 6 10 0.06 196 267 273
arrangement - accommodation 5.41 2 0.01 228 308 308
month - hotel 1.81 17 0.15 336 249 250
type - kind 8.97 609 12.37 12 15 50
arrival - hotel 6 0 0.00 195 316 316
bed - closet 6.72 43 1.08 148 214 176
closet - clothes 8 31 0.54 58 233 216
situation - conclusion 4.81 48 0.45 249 206 220
situation - isolation 3.88 17 0.14 278 249 252
impartiality - interest 5.16 4 0.03 237 295 290
direction - combination 2.25 52 0.33 326 201 233
street - place 6.44 170 3.24 168 108 127
street - avenue 8.88 87 0.95 15 162 182
street - block 6.88 132 2.93 131 125 131
street - children 4.94 92 1.52 246 154 160
listing - proximity 2.56 2 0.01 319 308 308
listing - category 6.38 7 0.05 171 283 279
cell - phone 7.81 319 102.29 70 44 4
production - hike 1.75 2 0.01 338 308 308
benchmark - index 4.25 2 0.01 268 308 308
media - trading 3.88 45 0.74 276 211 201
media - gain 2.88 544 8.25 311 17 68
dividend - payment 7.63 73 4.08 81 178 117
dividend - calculation 6.48 0 0.00 163 316 316
calculation - computation 8.44 12 0.12 31 260 255
currency - market 7.5 153 2.93 93 113 131
OPEC - oil 8.59 87 13.81 23 162 44
oil - stock 6.34 78 0.66 175 176 206
announcement - production 3.38 0 0.00 295 316 316
announcement - warning 6 9 0.07 197 271 268
profit - warning 3.88 7 0.05 277 283 279
profit - loss 7.63 170 7.49 80 108 74


dollar - yen 7.78 66 4.94 71 186 96
dollar - buck 9.22 10 0.08 5 267 264
dollar - profit 7.38 62 0.69 107 191 204
dollar - loss 6.09 22 0.21 191 240 246
computer - software 8.5 293 17.96 27 55 35
network - hardware 8.31 79 2.28 39 173 145
phone - equipment 7.13 262 4.70 118 65 105
equipment - maker 5.91 42 0.36 207 215 230
luxury - car 6.47 62 0.67 164 191 205
five - month 3.38 182 2.41 294 103 142
report - gain 3.63 461 7.57 290 24 73
investor - earning 7.13 45 1.01 119 211 180
liquid - water 7.89 449 12.17 65 25 51
baseball - season 5.97 114 5.00 199 136 95
game - victory 7.03 104 1.52 123 143 160
game - team 7.69 825 65.19 76 9 5
marathon - sprint 7.47 3 0.02 96 300 298
game - series 6.19 393 12.54 189 34 49
game - defeat 6.97 148 2.16 129 115 150
seven - series 3.56 185 14.63 291 101 41
seafood - sea 7.47 19 0.32 97 246 235
seafood - food 8.34 26 0.39 35 235 226
seafood - lobster 8.7 6 0.05 20 288 279
lobster - food 7.81 71 1.18 69 182 172
lobster - wine 5.7 0 0.00 216 316 316
food - preparation 6.22 131 1.41 185 126 164
video - archive 6.34 6 0.03 173 288 290
start - year 4.06 887 18.01 272 7 34
start - match 4.47 235 6.31 262 76 81
game - round 5.97 709 21.38 198 12 27
boxing - round 7.61 92 10.34 83 154 55
championship - tournament 8.36 83 1.31 34 170 167
fighting - defeating 7.41 2 0.01 104 308 308
line - insurance 2.69 305 18.61 314 48 33
day - summer 3.94 266 6.40 274 62 80
summer - drought 7.16 0 0.00 117 316 316
summer - nature 5.63 55 0.75 221 200 199
day - dawn 7.53 103 1.05 90 145 179
nature - environment 8.31 299 8.29 37 52 67
environment - ecology 8.81 96 5.68 18 149 86
nature - man 6.25 2368 57.34 180 1 10
man - woman 8.3 842 21.15 40 8 28
man - governor 5.25 275 15.78 234 58 39
murder - manslaughter 8.53 222 49.56 25 82 13
soap - opera 7.94 13 0.07 60 259 268
opera - performance 6.88 400 16.19 133 32 37
life - lesson 5.94 3 0.02 201 300 298
focus - life 4.06 314 4.06 271 46 118
production - crew 6.25 115 4.30 182 134 112
television - film 7.72 446 8.35 74 26 64


lover - quarrel 6.19 3 0.02 186 300 298
viewer - serial 2.97 0 0.00 307 316 316
possibility - girl 1.94 2 0.01 331 308 308
population - development 3.75 331 5.49 282 43 87
morality - importance 3.31 10 0.09 296 267 263
morality - marriage 3.69 5 0.03 284 292 290
Mexico - Brazil 7.44 120 2.04 100 131 153
gender - equality 6.41 113 3.43 170 137 126
change - attitude 5.44 180 13.19 227 105 47
family - planning 6.25 89 1.17 181 159 174
opera - industry 2.63 205 2.70 317 88 136
sugar - approach 0.88 16 0.15 348 253 250
practice - institution 3.19 303 5.91 300 50 84
ministry - culture 4.69 8 0.05 254 279 279
problem - challenge 6.75 151 1.79 145 114 154
size - prominence 5.31 95 0.87 230 151 188
country - citizen 7.31 305 4.91 111 48 97
planet - people 5.75 145 2.44 215 116 141
development - issue 3.97 384 4.90 273 35 98
experience - music 3.47 275 2.54 292 58 138
music - project 3.63 155 1.44 289 112 163
glass - metal 5.56 302 8.81 224 51 61
aluminum - metal 7.83 79 2.64 68 173 137
chance - credibility 3.88 0 0.00 279 316 316
exhibit - memorabilia 5.31 2 0.01 232 308 308
concert - virtuoso 6.81 8 0.05 141 279 279
rock - jazz 7.59 240 7.89 84 75 70
museum - theater 7.19 22 0.28 116 240 239
observation - architecture 4.38 6 0.03 266 288 290
space - world 6.53 494 7.07 161 23 76
preservation - world 6.19 96 0.87 190 149 188
admission - ticket 7.69 85 2.39 75 166 144
shower - thunderstorm 6.31 2 0.01 176 308 308
shower - flood 6.03 34 0.31 194 227 237
weather - forecast 8.34 72 3.90 36 181 122
disaster - area 6.25 126 1.73 183 128 156
governor - office 6.34 85 4.79 172 166 103
architecture - century 3.78 202 5.59 281 90 87

Appendix C

Semantic Relatedness Scores & Rankings - Improved Corpus

This appendix provides the complete set of relatedness scores and score rankings for ws-353, the baseline system and ASKNet as computed on the improved corpus. An extract from this table is given in Table 6.6.
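Because the table reports both raw scores and the corresponding rankings, any two score columns can be compared by rank correlation. The sketch below is illustrative only and is not code from the ASKNet implementation; the function names and the four example rows are chosen purely for demonstration. It shows one way to compute Spearman's rank correlation coefficient (cf. Spearman, 1987) between a gold-standard score list and a system score list, with tied scores receiving averaged ranks:

    # Minimal sketch: Spearman's rank correlation between two score lists.
    def average_ranks(scores):
        """Rank 1 = highest score; tied scores share the average of their positions."""
        order = sorted(range(len(scores)), key=lambda i: -scores[i])
        ranks = [0.0] * len(scores)
        i = 0
        while i < len(order):
            j = i
            # Extend j over any run of tied scores.
            while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
                j += 1
            avg_rank = (i + j) / 2.0 + 1.0  # average of the tied positions, 1-based
            for k in range(i, j + 1):
                ranks[order[k]] = avg_rank
            i = j + 1
        return ranks

    def spearman_rho(xs, ys):
        """Pearson correlation computed on the tie-averaged ranks."""
        n = len(xs)
        rx, ry = average_ranks(xs), average_ranks(ys)
        mean = (n + 1) / 2.0  # ranks 1..n always average to (n + 1) / 2
        num = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
        den = (sum((a - mean) ** 2 for a in rx) *
               sum((b - mean) ** 2 for b in ry)) ** 0.5
        return num / den if den else 0.0

    # Example with the first four word pairs of the table below:
    gold = [6.77, 7.35, 10.0, 7.46]      # ws-353 scores
    system = [5.74, 1.82, 175.86, 36.2]  # ASKNet scores
    print(spearman_rho(gold, system))    # 0.8 for these four pairs

Computing the statistic from the score lists rather than from the printed rank columns also sidesteps any discrepancies introduced when the ranks were tabulated.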

Word Pair                   ws-353 Score   Baseline Score   ASKNet Score   ws-353 Rank   Baseline Rank   ASKNet Rank

love - sex 6.77 599 5.74 144 110 169
tiger - cat 7.35 947 1.82 109 73 270
tiger - tiger 10 8773 175.86 1 8 8
book - paper 7.46 5215 36.2 98 20 51
computer - keyboard 7.62 1403 26.45 82 59 65
computer - internet 7.58 390 5.2 86 144 178
plane - car 5.77 922 1.09 214 76 288
train - car 6.31 3705 5.16 177 26 179
telephone - communication 7.5 491 15.52 94 127 95
television - radio 6.77 2657 47.63 143 33 38
media - radio 7.42 833 11.61 103 84 112
drug - abuse 6.85 435 19.14 138 136 85
bread - butter 6.19 693 21.91 188 98 77
cucumber - potato 5.92 20 0.26 204 287 321
doctor - nurse 7 426 50.97 127 139 36
professor - doctor 6.62 1545 126.82 156 52 14
student - professor 6.81 1503 33.08 139 53 55
smart - student 4.62 23 2.34 256 281 246
smart - stupid 5.81 3 12.93 213 323 105
company - stock 7.08 3259 41.84 121 30 44
stock - market 8.08 6061 49.58 52 15 37
stock - phone 1.62 33 0.34 340 267 318
stock - CD 1.31 7 0.14 341 311 332
stock - jaguar 0.92 430 16.25 345 138 89
stock - egg 1.81 8 0.2 335 303 326
fertility - egg 6.69 28 2.93 150 274 231
stock - live 3.73 318 2.57 283 157 235
stock - life 0.92 338 2.08 344 152 263
book - library 7.46 4419 67.1 99 23 30
bank - money 8.12 3561 33.85 48 27 53
wood - forest 7.73 1836 25.36 72 42 69
money - cash 9.15 901 37.97 6 77 47
professor - cucumber 0.31 217 2.9 352 177 232
king - cabbage 0.23 88 0.96 353 221 297
king - queen 8.58 108 3.4 24 209 221
king - rook 5.92 476 5.07 205 131 180
bishop - rabbi 6.69 103 1.76 152 212 271
Jerusalem - Israel 8.46 11617 98.15 28 7 17
Jerusalem - Palestinian 7.65 4212 97.15 79 24 18
holy - sex 1.62 283 4.87 339 161 186
fuck - sex 9.44 392 10.69 2 143 114
Maradona - football 8.62 844 36.74 22 81 50
football - soccer 9.03 1412 133.9 10 57 13
football - basketball 6.81 529 20.01 140 121 84
football - tennis 6.63 84 2.2 154 222 252
tennis - racket 7.56 70 32.38 89 237 57
Arafat - peace 6.73 2093 107.12 147 39 16
Arafat - terror 7.65 673 17.16 78 102 87
Arafat - Jackson 2.5 55 0.73 321 249 304
law - lawyer 8.38 13792 110.76 33 3 15
movie - star 7.38 1479 9.96 108 54 122
movie - popcorn 6.19 1234 42.94 187 64 42
movie - critic 6.73 248 3.88 146 169 209
movie - theater 7.92 6648 168.49 62 11 9
physics - proton 8.12 1378 22.38 47 60 74
physics - chemistry 7.35 433 20.8 110 137 81
space - chemistry 4.88 144 2.42 248 195 239
alcohol - chemistry 5.54 61 2.37 225 244 244
vodka - gin 8.46 81 0.5 29 224 313
vodka - brandy 8.13 20 4.96 45 285 184
drink - car 3.04 621 3.39 304 107 222
drink - ear 1.31 1327 1.65 342 63 272
drink - mouth 5.96 43 4.99 200 260 183
drink - eat 6.87 2170 4.29 134 38 198
baby - mother 7.85 53 14.06 67 252 102
drink - mother 2.65 48 1.08 316 256 289
car - automobile 8.94 621 5.65 14 106 170
gem - jewel 8.96 5917 70.91 13 16 28
journey - voyage 9.29 3 24.71 3 320 71
boy - lad 8.83 90 3.97 17 219 205
coast - shore 9.1 377 63.43 7 148 33
asylum - madhouse 8.87 10 135.62 16 300 12
magician - wizard 9.02 3434 406.39 11 28 3
midday - noon 9.29 33 222.97 4 268 6
furnace - stove 8.79 289 31.9 19 160 58
food - fruit 7.52 489 6.91 92 130 151
bird - cock 7.1 565 16.11 120 116 90
bird - crane 7.38 80 7.52 105 226 142
tool - implement 6.46 136 6.36 166 199 157
brother - monk 6.27 597 87.11 179 111 20
crane - implement 2.69 176 2.96 315 185 230
lad - brother 4.46 12 1.4 263 298 279
journey - car 5.85 107 3.06 212 210 227
monk - oracle 5 61 17.06 241 242 88
cemetery - woodland 2.08 0 0 329 352 352
food - rooster 4.42 35 2.39 264 265 242
coast - hill 4.38 31 2.12 265 271 259
forest - graveyard 1.85 22 0.29 333 282 320
shore - woodland 3.08 1 0.01 303 342 342
monk - slave 0.92 4 0.17 347 318 329
coast - forest 3.15 226 9.72 302 174 124
lad - wizard 0.92 7 0.16 346 313 330
chord - smile 0.54 0 0 350 348 348
glass - magician 2.08 0 0 330 349 349
noon - string 0.54 0 0 351 350 350
rooster - voyage 0.62 1 0.02 349 332 339
money - dollar 8.42 559 15.27 32 118 96
money - cash 9.08 933 38.39 8 75 46
money - currency 9.04 1409 37.1 9 58 49
money - wealth 8.27 596 7.37 41 112 144
money - property 7.57 679 6.19 87 100 164
money - possession 7.29 326 10.43 112 154 116
money - bank 8.5 3271 29.99 26 29 60
money - deposit 7.73 1130 21.39 73 67 78
money - withdrawal 6.88 80 5.62 132 227 173
money - laundering 5.65 8005 199.95 217 9 7
money - operation 3.31 779 10.2 297 91 118
tiger - jaguar 8 490 23.37 57 129 73
tiger - feline 8 42 5.82 59 262 168
tiger - carnivore 7.08 73 10.08 122 234 120
tiger - mammal 6.85 54 1.29 135 251 282
tiger - animal 7 873 7.08 128 79 149
tiger - organism 4.77 24 0.45 250 279 316
tiger - fauna 5.62 8 1.93 222 305 265
tiger - zoo 5.87 353 8.75 210 150 130
psychology - psychiatry 8.08 15 0.2 51 295 325
psychology - anxiety 7 0 0 126 344 344
psychology - fear 6.85 15 0.19 137 296 327
psychology - depression 7.42 2 0.02 101 330 338
psychology - clinic 6.58 16 0.22 158 291 323
psychology - doctor 6.42 2 0.03 169 326 337
psychology - Freud 8.21 161 2.14 42 187 258
psychology - mind 7.69 123 1.64 77 204 273
psychology - health 7.23 158 2.1 115 188 260
psychology - science 6.71 476 6.34 149 132 158
psychology - discipline 5.58 97 1.3 223 214 281
psychology - cognition 7.48 379 5.05 95 146 181
planet - star 8.45 3022 15.23 30 31 97
planet - constellation 8.06 136 6.34 53 200 159
planet - moon 8.08 584 15.76 50 114 91
planet - sun 8.02 1930 24.81 56 40 70
planet - galaxy 8.11 225 5.87 49 175 167
planet - space 7.92 976 7.06 63 71 150
planet - astronomer 7.94 742 27.85 61 95 61
precedent - example 5.85 279 4.34 211 162 196
precedent - information 3.85 62 0.92 280 240 298
precedent - cognition 2.81 13 2.21 312 297 251
precedent - law 6.65 2541 42.93 153 34 43
precedent - collection 2.5 7 0.39 320 308 317
precedent - group 1.77 49 1.16 337 254 285
precedent - antecedent 6.04 8 0.11 193 304 334
cup - coffee 6.58 709 21.11 157 96 79
cup - tableware 6.85 0 0 136 353 353
cup - article 2.4 498 2.25 322 124 250
cup - artifact 2.92 7 0.99 310 310 294
cup - object 3.69 76 0.91 288 231 300
cup - entity 2.15 45 1.44 328 259 278
cup - drink 7.25 210 6.78 114 179 153
cup - food 5 155 2.1 243 190 262
cup - substance 1.92 26 0.79 332 276 303
cup - liquid 5.9 65 2.27 208 239 248
jaguar - cat 7.42 2505 7.62 102 35 139
jaguar - car 7.27 779 4.41 113 90 193
energy - secretary 1.81 77 1.03 334 230 292
secretary - senate 5.06 205 2.74 239 181 234
energy - laboratory 5.09 74 2.01 238 233 264
computer - laboratory 6.78 245 5.63 142 171 171
weapon - secret 6.06 137 7.61 192 197 140
FBI - fingerprint 6.94 792 64.03 130 87 32
FBI - investigation 8.31 1587 240.38 38 50 4
investigation - effort 4.59 72 9.26 257 235 127
Mars - water 2.94 860 4.17 309 80 203
Mars - scientist 5.63 88 3.02 219 220 228
news - report 8.16 1470 25.91 43 55 67
canyon - landscape 7.53 16 9.67 91 294 125
image - surface 4.56 602 6.23 258 109 161
discovery - space 6.34 408 7.23 174 141 145
water - seepage 6.56 56 6.01 160 247 166
sign - recess 2.38 78 1.47 324 229 276
Wednesday - news 2.22 19 0.91 327 288 301
mile - kilometer 8.66 1124 72.48 21 68 27
computer - news 4.47 126 2.15 260 203 257
territory - surface 5.34 24 0.46 229 280 315
atmosphere - landscape 3.69 19 2.25 287 289 249
president - medal 3 22 0.72 305 283 305
war - troops 8.13 457 10.41 44 134 117
record - number 6.31 12330 146.27 178 5 10
skin - eye 6.22 203 4.74 184 182 188
Japanese - American 6.5 377 4.21 162 147 201
theater - history 3.91 116 1.47 275 205 277
volunteer - motto 2.56 2 5.42 318 324 176
prejudice - recognition 3 3 0.88 306 322 302
decoration - valor 5.63 2 54.05 218 328 34
century - year 7.59 4709 13.83 85 21 103
century - nation 3.16 6141 3.47 301 13 218
delay - racism 1.19 0 0 343 346 346
delay - news 3.31 154 3.42 298 191 220
minister - party 6.63 1768 26.24 155 43 66
peace - plan 4.75 227 1.88 253 173 268
minority - peace 3.69 8 1.06 285 306 291
attempt - peace 4.25 208 7.17 267 180 147
government - crisis 6.56 187 3.58 159 184 216
deployment - departure 4.25 782 10.43 269 89 115
deployment - withdrawal 5.88 0 0 209 351 351
energy - crisis 5.94 550 7.67 203 119 138
announcement - news 7.56 158 7.48 88 189 143
announcement - effort 2.75 8 4.44 313 307 191
stroke - hospital 7.03 289 8.71 124 159 131
disability - death 5.47 68 2.15 226 238 256
victim - emergency 6.47 110 11.95 165 207 109
treatment - recovery 7.91 56 2.51 64 248 237
journal - association 4.97 29 2.29 245 272 247
doctor - personnel 5 20 3.47 242 286 219
doctor - liability 5.19 12 1.38 236 299 280
liability - insurance 7.03 6153 225.19 125 12 5
school - center 3.44 623 7.22 293 105 146
reason - hypertension 2.31 126 4.42 325 202 192
reason - criterion 5.91 62 6.2 206 241 163
hundred - percent 7.38 264 6.36 106 168 156
Harvard - Yale 8.13 3798 79.17 46 25 22
hospital - infrastructure 4.63 43 0.92 255 261 299
death - row 5.25 700 2.1 235 97 261
death - inmate 5.03 222 30.52 240 176 59
lawyer - evidence 6.69 134 4.21 151 201 200
life - death 7.88 4703 22.04 66 22 76
life - term 4.5 5242 7.73 259 19 137
word - similarity 4.75 95 6.25 252 217 160
board - recommendation 4.47 17 1 261 290 293
governor - interview 3.25 8 3.9 299 302 207
OPEC - country 5.63 5368 72.98 220 18 26
peace - atmosphere 3.69 4 0.24 286 317 322
peace - insurance 2.94 5 0.16 308 316 331
territory - kilometer 5.28 50 10.19 233 253 119
travel - activity 5 142 3.23 244 196 225
competition - price 6.44 315 5.63 167 158 172
consumer - confidence 4.13 371 5.45 270 149 175
consumer - energy 4.75 216 2.52 251 178 236
problem - airport 2.38 76 1.15 323 232 286
car - flight 4.94 437 2.16 247 135 255
credit - card 8.06 47692 524.34 54 1 2
credit - information 5.31 1705 14.11 231 47 100
hotel - reservation 8.03 276 37.73 55 163 48
grocery - money 5.94 2 0.03 202 325 336
registration - arrangement 6 7 2.18 196 312 254
arrangement - accommodation 5.41 1 0.55 228 338 312
month - hotel 1.81 49 1.55 336 255 275
type - kind 8.97 1355 6.42 12 61 155
arrival - hotel 6 54 0.71 195 250 306
bed - closet 6.72 109 7.81 148 208 136
closet - clothes 8 81 64.49 58 225 31
situation - conclusion 4.81 57 4.17 249 246 202
situation - isolation 3.88 16 2.75 278 292 233
impartiality - interest 5.16 202 3.73 237 183 212
direction - combination 2.25 35 1.9 326 266 267
street - place 6.44 339 2.96 168 151 229
street - avenue 8.88 95 43.64 15 216 41
street - block 6.88 324 17.24 131 155 86
street - children 4.94 153 5.03 246 193 182
listing - proximity 2.56 1 2.39 319 337 243
listing - category 6.38 147 4.63 171 194 190
cell - phone 7.81 11828 78.55 70 6 24
production - hike 1.75 1 0.12 338 334 333
benchmark - index 4.25 2 0.97 268 331 296
media - trading 3.88 79 3.62 276 228 214
media - gain 2.88 831 3.11 311 85 226
dividend - payment 7.63 672 41.56 81 103 45
dividend - calculation 6.48 1 0.01 163 341 341
calculation - computation 8.44 57 8.33 31 245 135
currency - market 7.5 401 7.08 93 142 148
OPEC - oil 8.59 1668 78.74 23 48 23
oil - stock 6.34 72 0.68 175 236 309
announcement - production 3.38 3 0.04 295 321 335
announcement - warning 6 7 26.72 197 309 63
profit - warning 3.88 6 1.07 277 314 290
profit - loss 7.63 837 20.12 80 83 83
dollar - yen 7.78 495 80.5 71 125 21
dollar - buck 9.22 9 3.36 5 301 223
dollar - profit 7.38 113 4.68 107 206 189
dollar - loss 6.09 32 0.98 191 269 295
computer - software 8.5 2696 45.09 27 32 40
network - hardware 8.31 246 25.75 39 170 68
phone - equipment 7.13 491 11.48 118 126 113
equipment - maker 5.91 36 3.92 207 264 206
luxury - car 6.47 95 2.5 164 215 238
five - month 3.38 242 4.38 294 172 194
report - gain 3.63 786 3.8 290 88 211
investor - earning 7.13 101 12.41 119 213 106
liquid - water 7.89 1709 15.54 65 46 94
baseball - season 5.97 1204 35.96 199 66 52
game - victory 7.03 164 6.22 123 186 162
game - team 7.69 6831 24.58 76 10 72
marathon - sprint 7.47 2 1.19 96 329 284
game - series 6.19 6124 74.28 189 14 25
game - defeat 6.97 274 8.43 129 164 134
seven - series 3.56 1554 33.01 291 51 56
seafood - sea 7.47 47 3.59 97 258 215
seafood - food 8.34 41 11.66 35 263 111
seafood - lobster 8.7 21 9.66 20 284 126
lobster - food 7.81 529 9.75 69 120 123
lobster - wine 5.7 0 0 216 345 345
food - preparation 6.22 619 11.89 185 108 110
video - archive 6.34 28 0.59 173 275 311
start - year 4.06 2172 8.63 272 37 132
start - match 4.47 678 8.6 262 101 133
game - round 5.97 2479 7.57 198 36 141
boxing - round 7.61 1035 15.03 83 70 98
championship - tournament 8.36 153 20.84 34 192 80
fighting - defeating 7.41 1 3.27 104 339 224
line - insurance 2.69 1883 6.11 314 41 165
day - summer 3.94 1338 15.6 274 62 93
summer - drought 7.16 0 0 117 343 343
summer - nature 5.63 82 2.4 221 223 240
day - dawn 7.53 137 4.3 90 198 197
nature - environment 8.31 1118 9.97 37 69 121
environment - ecology 8.81 768 33.66 18 92 54
nature - man 6.25 5829 4.27 180 17 199
man - woman 8.3 12541 142.99 40 4 11
man - governor 5.25 1593 3.85 234 49 210
murder - manslaughter 8.53 15307 634.36 25 2 1
soap - opera 7.94 328 4.38 60 153 195
opera - performance 6.88 1755 15.65 133 44 92
life - lesson 5.94 47 0.71 201 257 307
focus - life 4.06 574 4.92 271 115 185
production - crew 6.25 528 13.21 182 122 104
television - film 7.72 1753 22.37 74 45 75
lover - quarrel 6.19 2 20.62 186 327 82
viewer - serial 2.97 1 0.01 307 335 340
possibility - girl 1.94 1 0.17 331 333 328
population - development 3.75 693 4.14 282 99 204
morality - importance 3.31 25 1.85 296 277 269
morality - marriage 3.69 61 1.09 284 243 287
Mexico - Brazil 7.44 272 3.88 100 166 208
gender - equality 6.41 1227 46.43 170 65 39
change - attitude 5.44 1441 14.45 227 56 99
family - planning 6.25 822 14.1 181 86 101
opera - industry 2.63 272 1.91 317 165 266
sugar - approach 0.88 16 0.32 348 293 319
practice - institution 3.19 625 4.74 300 104 187
ministry - culture 4.69 25 0.64 254 278 310
problem - challenge 6.75 490 8.77 145 128 129
size - prominence 5.31 92 2.39 230 218 241
country - citizen 7.31 525 6.84 111 123 152
planet - people 5.75 322 1.63 215 156 274
development - issue 3.97 594 3.49 273 113 217
experience - music 3.47 389 3.69 292 145 213
music - project 3.63 957 12.28 289 72 107
glass - metal 5.56 886 12.1 224 78 108
aluminum - metal 7.83 264 51.45 68 167 35
chance - credibility 3.88 0 0 279 347 347
exhibit - memorabilia 5.31 1 0.68 232 336 308
concert - virtuoso 6.81 5 5.24 141 315 177
rock - jazz 7.59 840 26.46 84 82 64
museum - theater 7.19 28 1.23 116 273 283
observation - architecture 4.38 4 0.2 266 319 324
space - world 6.53 750 2.2 161 93 253
preservation - world 6.19 104 2.35 190 211 245
admission - ticket 7.69 461 70.23 75 133 29
shower - thunderstorm 6.31 1 0.48 176 340 314
shower - flood 6.03 31 9.23 194 270 128
weather - forecast 8.34 937 90.72 36 74 19
disaster - area 6.25 414 6.48 183 140 154
governor - office 6.34 742 27.85 172 94 62
architecture - century 3.78 560 5.55 281 117 174

Bibliography

Satanjeev Banerjee and T. Pedersen. Extended gloss overlaps as a measure of semantic relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), 2003.

M. Bilenko, R. Mooney, W. Cohen, P. Ravikumar, and S. Fienberg. Adaptive name matching in information integration. IEEE Intelligent Systems, 18:16–23, Sep/Oct 2003.

Johan Bos. Towards wide-coverage semantic interpretation. In Proceedings of the Sixth International Workshop on Computational Semantics (IWCS-6), pages 42–53, 2005.

Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, and Julia Hockenmaier. Wide-coverage semantic representations from a CCG parser. In Proceedings of the 20th International Conference on Computational Linguistics (COLING-04), pages 1240–1246, Geneva, Switzerland, 2004.

T. Briscoe and J. Carroll. Robust accurate statistical annotation of general text. In Proceedings of the 3rd International Conference on Language Resources and Evaluation, pages 1499–1504, Las Palmas, Gran Canaria, 2002.

Alexander Budanitsky and Graeme Hirst. Evaluating WordNet-based measures of semantic distance. Computational Linguistics, 32:13–47, March 2006.

J. Carletta. Assessing agreement on classification tasks: the Kappa statistic. Computational Linguistics, 22(2):249–254, 1996.

Eugene Charniak. A maximum-entropy-inspired parser. In Proceedings of the First Conference of the North American Chapter of the Association for Computational Linguistics, pages 132–139, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.

Eugene Charniak and Micha Elsner. EM works for pronoun anaphora resolution. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 148–156, Athens, Greece, 2009.

Eugene Charniak and Robert P. Goldman. A Bayesian model of plan recognition. Artificial Intelligence, 64(1):53–79, 1993.

Kenneth Ward Church and Patrick Hanks. Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1):22–29, 1990.

S. Clark and J. R. Curran. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4):493–552, 2007.

Stephen Clark and James R. Curran. Parsing the WSJ using CCG and log-linear models. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL '04), pages 104–111, Barcelona, Spain, 2004.

Allan M. Collins and Elizabeth F. Loftus. A spreading-activation theory of semantic processing. Psychological Review, 82(6):407–428, 1975.

M. Collins. Head-driven statistical models for natural language parsing. Computational Linguistics, 29(4):589–637, 2003.

Michael Collins. Head-Driven Statistical Models for Natural Language Parsing. PhD thesis, University of Pennsylvania, 1999.

F. Crestani. Application of spreading activation techniques in information retrieval. Artificial Intelligence Review, 11(6):453–482, Dec 1997.

J. R. Curran and S. Clark. Language independent NER using a maximum entropy tagger. In Proceedings of the Seventh Conference on Natural Language Learning (CoNLL-03), pages 164–167, Edmonton, Canada, 2003.

Jon Curtis, G. Matthews, and D. Baxter. On the effective use of Cyc in a question answering system. In Papers from the IJCAI Workshop on Knowledge and Reasoning for Answering Questions, Edinburgh, Scotland, 2005.

Jon Curtis, D. Baxter, and J. Cabral. On the application of the Cyc ontology to word sense disambiguation. In Proceedings of the Nineteenth International FLAIRS Conference, pages 652–657, Melbourne Beach, FL, May 2006.

H. Trang Dang, J. Lin, and D. Kelly. Overview of the TREC 2006 question answering track. In Proceedings of the Fifteenth Text Retrieval Conference (TREC 2006), Gaithersburg, MD, 2006.

E. W. Dijkstra. A note on two problems in connection with graphs. Numerische Mathematik, 1:269–271, 1959.

William B. Dolan, L. Vanderwende, and S. Richardson. Automatically deriving a structured knowledge base from on-line dictionaries. In Proceedings of the Pacific Association for Computational Linguistics, Vancouver, British Columbia, April 1993.

Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. Web-scale information extraction in KnowItAll: (preliminary results). In WWW '04: Proceedings of the 13th International Conference on World Wide Web, pages 100–110, New York, NY, USA, 2004. ACM.

Christiane Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA, USA, 1998.

Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, and Eytan Ruppin. Placing search in context: The concept revisited. ACM Transactions on Information Systems, 20(1):116–131, 2002.

Emden R. Gansner and Stephen C. North. An open graph visualization system and its applications to software engineering. Software: Practice and Experience, 30(11):1203–1233, 2000.

R. V. Guha and A. Garg. Disambiguating people in search. In 13th World Wide Web Conference (WWW 2004), New York, USA, 2004.

A. Hickl, J. Williams, J. Bensley, K. Roberts, B. Rink, and Y. Shi. Recognizing textual entailment with LCC's Groundhog system. In Proceedings of the Second PASCAL Challenges Workshop, Venice, Italy, 2006.

G. Hirst. Semantic Interpretation and the Resolution of Ambiguity. Studies in Natural Language Processing. Cambridge University Press, Cambridge, UK, 1987.

J. Hockenmaier. Data and Models for Statistical Parsing with Combinatory Categorial Grammar. PhD thesis, University of Edinburgh, 2003.

Thad Hughes and Daniel Ramage. Lexical semantic relatedness with random graph walks. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 581–589, Prague, Czech Republic, 2007.

J. J. Jiang and D. W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In International Conference on Research on Computational Linguistics (ROCLING X), Taipei, Taiwan, September 1997.

H. Kamp. A theory of truth and semantic representation. In J. Groenendijk et al., editors, Formal Methods in the Study of Language. Mathematisch Centrum, 1981.

Hans Kamp and Uwe Reyle. From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Kluwer Academic, Dordrecht, 1993.

Rick Kjeldsen and Paul R. Cohen. The evolution and performance of the GRANT system. Technical report, University of Massachusetts, Amherst, MA, USA, 1988.

Dan Klein and Christopher D. Manning. Fast exact inference with a factored model for natural language parsing. Advances in Neural Information Processing Systems, 15:3–10, 2003.

Douglas B. Lenat. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38, 1995.

Dekang Lin. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, 1998.

H. Liu and P. Singh. Commonsense reasoning in and over natural language. In Proceedings of the 8th International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES'2004), Wellington, New Zealand, 2004a.

H. Liu and P. Singh. ConceptNet: A practical commonsense reasoning tool-kit. BT Technology Journal, 22:211–226, Oct 2004b.

Margaret Masterman. Semantic message detection for machine translation, using an interlingua. In Proceedings of the 1961 International Conference on Machine Translation of Languages and Applied Language Analysis, pages 438–475, London, 1962.

Cynthia Matuszek, J. Cabral, M. Witbrock, and J. DeOliveira. An introduction to the syntax and content of Cyc. In 2006 AAAI Spring Symposium on Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering, Stanford, CA, USA, March 2006.

D. E. Meyer and R. W. Schvaneveldt. Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90(2):227–234, 1971.

D. Moldovan, S. Harabagiu, R. Girju, P. Morarescu, F. Lacatusu, A. Novischi, A. Badulescu, and O. Bolohan. LCC tools for question answering. In 11th Text Retrieval Conference, Gaithersburg, MD, 2002.

Vivi Nastase. Topic-driven multi-document summarization with encyclopedic knowledge and spreading activation. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP-2008), pages 763–772, Honolulu, October 2008.

Sebastian Padó and Mirella Lapata. Dependency-based construction of semantic space models. Computational Linguistics, 33(2):161–199, 2007.

Patrick Pantel and Marco Pennacchiotti. Espresso: Leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of the Conference on Computational Linguistics / Association for Computational Linguistics (COLING/ACL-06), Sydney, Australia, 2006.

Patrick Pantel, Deepak Ravichandran, and Eduard Hovy. Towards terascale knowledge acquisition. In Proceedings of the Conference on Computational Linguistics (COLING-04), pages 771–777, Geneva, Switzerland, 2004.

M. F. Porter. An algorithm for suffix stripping. Program, 14(3):130–137, 1980.

S. Preece. A Spreading Activation Model for Information Retrieval. PhD thesis, University of Illinois, Urbana, IL, 1981.

James Pustejovsky, Robert Knippen, Jessica Littman, and Roser Saurí. Temporal and event information in natural language text. Language Resources and Evaluation, 39(2-3):123–164, 2005.

M. Ross Quillian. The teachable language comprehender: A simulation program and theory of language. Communications of the ACM, 12(8):459–476, 1969.

Philip Resnik. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11:95–130, 1999.

Stephen D. Richardson, William B. Dolan, and Lucy Vanderwende. MindNet: Acquiring and structuring semantic information from text. In Proceedings of COLING '98, 1998.

G. Salton and C. Buckley. On the use of spreading activation methods in automatic information retrieval. In SIGIR '88: Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 147–160, New York, NY, USA, 1988. ACM Press.

L. Schubert and M. Tong. Extracting and evaluating general world knowledge from the Brown corpus. In Proceedings of the HLT/NAACL 2003 Workshop on Text Mining, 2003.

Roger W. Schvaneveldt, editor. Pathfinder Associative Networks: Studies in Knowledge Organization. Ablex Publishing Corp., Norwood, NJ, USA, 1990. ISBN 0-89391-624-2.

Push Singh, Thomas Lin, Erik T. Mueller, Grace Lim, Travell Perkins, and Wan Li Zhu. Open Mind Common Sense: Knowledge acquisition from the general public. In Lecture Notes in Computer Science, volume 2519, pages 1223–1237. Springer Berlin / Heidelberg, 2002.

John F. Sowa. Semantic networks. In S. C. Shapiro, editor, Encyclopedia of Artificial Intelligence. Wiley-Interscience, New York, 2nd edition, 1992.

C. Spearman. The proof and measurement of association between two things. The American Journal of Psychology, 100(3-4):441–471, 1987.

Mark Steedman. The Syntactic Process. The MIT Press, Cambridge, MA, 2000.

Tom Stocky, Alexander Faaborg, and Henry Lieberman. A commonsense approach to predictive text entry. In Proceedings of the Conference on Human Factors in Computing Systems, Vienna, Austria, April 2004.

D. R. Swanson. Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspectives in Biology and Medicine, 30(1):7–18, 1986.

E. F. Tjong Kim Sang and F. De Meulder. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Walter Daelemans and Miles Osborne, editors, Proceedings of CoNLL-2003, pages 142–147, 2003.

Peter D. Turney. Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), 2001.

J. van Eijck. Discourse representation theory. In Encyclopedia of Language and Linguistics. Elsevier Science Ltd, 2nd edition, 2005.

J. van Eijck and H. Kamp. Representing discourse in context. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language. MIT Press, Cambridge, MA, USA, 1997.

Xiaojun Wan, Jianfeng Gao, Mu Li, and Binggong Ding. Person resolution in person search results: WebHawk. In CIKM '05: Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pages 163–170, New York, NY, USA, 2005. ACM Press.

Huan Wang, Xing Jiang, Liang-Tien Chia, and Ah-Hwee Tan. Ontology enhanced web image retrieval: aided by Wikipedia & spreading activation theory. In MIR '08: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, pages 195–201, New York, NY, USA, 2008. ACM.

WordNet. WNStats - WordNet 2.1 database statistics. Viewed 25 July 2006. http://wordnet.princeton.edu/.
