Dynamic Prediction of Communication Flow Using Social Context

25
November 5, 2008 @ACM HyperText and HyperMedia 2008, Pittsburgh, PA 1 Dynamic Prediction of Communication Flow using Social Context Munmun De Choudhury Hari Sundaram Ajita John Dorée Duncan Seligmann November 5, 2008 Arts, Media & Engineering Arizona State University, Tempe Collaborative Applications Research Avaya Labs, New Jersey

Transcript of Dynamic Prediction of Communication Flow Using Social Context

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 1

Dynamic Prediction of Communication

Flow using Social Context

Munmun De Choudhury

Hari Sundaram

Ajita John

Dorée Duncan Seligmann

November 5, 2008

Arts, Media & Engineering

Arizona State University, Tempe

Collaborative Applications Research

Avaya Labs, New Jersey

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 2

Introduction

� Why is the problem important?

• Determine information propagation

and the roles of people in the process.

• Targeted advertising, spread of

fashions and fads, innovations,

consumer interests etc.

• Determine community evolution.

November 5, 2008

A temporal representation framework

using communication and social context

to efficiently predict communication

flow in social networks.

Bob

Alice

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 3

Error in Prediction of Intent

0

5

10

15

20

25

30

35

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (Weeks)

Err

or i

n P

red

icti

on

(%

)

Baseline Metho d

OurAp p ro ach

Our Approach

� Estimate two properties of communication flow: the intent to communicate and associated delay [De

Choudhury ’07].

• Design features to characterize communication and social

context.

• Select a subset of optimal contextual features using a voting strategy from five different feature selection techniques.

• Use features in a Support Vector Regression framework to predict the intent and the delay.

November 5, 2008

Baseline

Our Approach

Imp

rov

em

en

t

in p

red

icte

d

err

or

Error in Prediction of Intent to communicate

� Experimental results on MySpace dataset with effective prediction (error ~12-15%).

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 4

Related Work

� Work on information diffusion [Gruhl, Tomkins ’04].

� Early adoption based flow model for recommendation systems [Song et al. ’06].

� Work on information roles of people [Nakajima et al. ‘05, Song et al. ‘06].

� Limitations:• information flow estimated from indirect evidence.

• context not been considered.

• Roles of people with respect to long term habitual behavior not analyzed.

November 5, 2008

Agitators and Summarizers [Nakajima et al. ‘05]

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 5

Introduction / Related work

Problem StatementCommunication Context

Social Context

Prediction

Experimental Results

Conclusions

Outline

November 5, 2008 5

• Two sub-problems:

Intent to communicate

Communication Delay

• Role of context

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 6

� The probability that a person will engage herself in some communication (given a particular topic and at a certain point of time) with another person.• It is contingent upon several factors or features defined by the

communication context [De Choudhury et al. 07] and social context.

November 5, 2008 6

What is Intent to Communicate?

Movie: 40%

Sports: 40%

Movie: 80%

Dinner: 20%Alice

Bob

Ann

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 7

What is Delay in Propagation?

� The amount of time passed between the

reception of a message (on a certain topic) and

the corresponding response by a person.

Movie: 4 hours

Sports: 25 mins

Movie: 2 days

Dinner: 15 hours

November 5, 2008

Alice

Bob

Ann

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 8

What is the role of context?

Gladwell’s approach: information

flows through specific “hot” pointsWatt’s approach: “hot” points are just

coincidental; information flow depends

on the vulnerability of people to

communication

Communication flow depends upon: how Alice and Bob communicate (communication

context), and what intrinsic roles Alice and Bob play in their communication (social

context)

Conventional view of communication flow in the community

Bob

Alice

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 9

Introduction / Related work

Problem Statement

Communication Context

Social Context

Prediction

Experimental Results

Conclusions

Outline

November 5, 2008 9

• What is communication

context?

• Overview of

Neighborhood, Topic and

Recipient context

Neighborhood

Topic

Recipient

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 10

Communication Context

� Communication context [De Choudhury,

Sundaram et al. ‘07] is the set of attributes that

affect communication between two individuals.

� Contextual attributes are dynamic [Dourish ’02].

• relationship between messages

• past communication behavior of a person

• response patterns from others

November 5, 2008 10

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 11

Features of Communication Context

� Neighborhood context

• It refers to the effect of the user’s social network on her communication.

• Susceptibility

• Backscatter

� Topic context

• It refers to the effect of the semantics of a user’s past

communication on a topic on her future communication.

• Message Coherence

• Temporal Coherence

• Topic Relevance

• Topic Quantity

� Recipient context

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 12

Introduction / Related work

Problem Statement

Communication Context

Social Context

Prediction

Experimental Results

Conclusions

Outline

November 5, 2008 12

• What is social context?

• Features of social context

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 13

Social Context

� Social context:

• set of contextual attributes that refer to who is communicating with whom and what is the strength of relationship shared between them.

� Two features: information roles and strength of ties.

Star network with only out-going

communication links: high tendency to

send messages.

Alice Bob

Several in-coming links: receptor of

messages.

Alice

Bob

Charlie

Dan

Emilie

Same clique – strong tie – higher chance of

communication. Different cliques – weak tie – lower

chance of communication.

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 14

Information Roles

� Information role is a contextual attribute acquired over time which impacts a person’s communication behavior.

� Three different categories of roles of people: • generators, people who generate information by themselves or from other

sources,

• mediators, people who act as transmitters of information between people, and

• receptors, people who mostly receive messages.

(b)

Figure : (a) Generators (red) (b) Mediators (green) (c) Receptors

(gray).

(a) (c)

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 15

Strength of ties

� The nature of relationship between two people affects communication [Granovetter ’73].

� Strength of a tie between two people:

• overlap of their friends’ circles.

� The pattern of relationships between actors reveals the likelihood that individuals will be exposed to particular kinds of information.

• Weak tie and flow of new information.

• Strong tie and flow of existing information.

Topic: iPhone 3G [new

information]

Weak tie

Topic: Joe’s birthday

party last week [existing

information]

Strong tie

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 16

Introduction / Related work

Problem Statement

Communication Context

Social Context

PredictionExperimental Results

Conclusions

Outline

November 5, 2008 16

• SVR approach

• Feature Selection

Contextual features Features in transformed

kernel space

Support Vector

Regression

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 17

Prediction of Intent and Delay

� Support Vector Regression framework

• Use features of communication and social context and actual frequency of communication over N weeks for training a Support Vector Machine regressor.

• SVR further used to predict intent and delay at the (N+1)th week.

� Feature Selection

• Need to eliminate redundant features dynamically.

• Several static and dynamic feature selection techniques:

• Correlation coefficient

• Mutual information

• Decision trees

• Principal Component Analysis (PCA)

• kNN based estimation

• Use of a ‘voting strategy’ to determine the best k features at the beginning of each time slice.

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 18

Introduction / Related work

Problem Statement

Communication Context

Social Context

Prediction

Experimental ResultsConclusions

Outline

November 5, 2008 18

• Prediction of intent and delay

• Qualitative Evaluation

MySpace.com

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 19

Prediction Results

� Data description:• 20,000 users from MySpace, 1,425,010 messages from September 2005 to April

2007.

� Baseline Method:• Consider all features of communication context without feature selection [De

Choudhury et al. 2007].

� The training set of 70 weeks.

� Experiments over a period of 40 weeks, averaged over eight different contacts of a person Charlie and four topics.• Prediction error of 12-15 % compared to 25-30 % using the baseline method.

Error in Prediction of Intent

0

5

10

15

20

25

30

35

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (Weeks)

Erro

r i

n P

red

icti

on

(%

)

Baseline Metho d

OurApp ro ach

Error in Prediction of Delay

0

5

10

15

20

25

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (Weeks)

Erro

r i

n P

red

icti

on

(%)

Baseline Method

Our Appro ach

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 20

Qualitative Evaluation of Predicted Intent

Radar Plot of Predicted Intent

0

0.2

0.4

0.6

0.8

1

1

2

3

4

5

6

7

8

9

10

JasonSusan

AnnaJennifer

MikeDavid

BillTom

Figure : Dynamics of predicted intent across

time and contacts for a user.

Direction of increasing

time

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 21

Qualitative Evaluation of Predicted Delay

Figure : Dynamics in delay between Charlie and three of her contacts. The visualizations show

dynamics for ten consecutive weeks (Y axis) and for four topics (X axis). The length of each horizontal

line is proportional to the measure of delay.

CHARLIE→→→→ DAVID CHARLIE→→→→ SUSANCHARLIE→→→→ JASON

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 22

Introduction / Related work

Problem Statement

Communication Context

SVM Based prediction

MySpace dataset

Experimental Results

Conclusions

Outline

November 5, 2008 22

• Summary and Contributions

• Future Work

Error in Prediction of Intent

0

5

10

15

20

25

30

35

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (Weeks)

Erro

r i

n P

red

icti

on

(%

)

Baseline M ethod

OurAppro ach

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 23

� Predict communication flow in large scale social networks by incorporating social context.

� Excellent results on a real world dataset MySpace.com• error ~12-15 % compared to 25-30 % using only

communication context.

� Conclusions• Social context is key to predicting communication

flow.

• Intent to communicate appears to be a property of context; delay seems to be a property of habitual characteristics of people.

November 5, 2008 23

Summary and Conclusions

Error in Prediction of Intent

0

5

10

15

20

25

30

35

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39

Time (Weeks)

Erro

r i

n P

red

icti

on

(%

)

Baseline Metho d

OurAp p ro ach

Radar Plot of Predicted Intent

0

0.2

0.4

0.6

0.8

1

1

2

3

4

5

6

7

8

9

10

JasonSusan

AnnaJennifer

MikeDavid

BillTom

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 24

Future Work

� How does the ‘age’ and nature of information impact communication flow?

� Model of collective user behavior.

� Inherent property of a communication media that affects communication flow.

‘Age’ of information

Co

mm

un

ica

tio

n F

low

Emergent collective user behavior

November 5, 2008@ACM HyperText and HyperMedia 2008, Pittsburgh, PA 25November 5, 2008 25

Thanks!

November 5, 2008 25