Kernels for Link Prediction with Latent Feature Models
Background | Latent Feature Models | Link Kernels | Experiments | Conclusion
Kernels for Link Prediction with Latent Feature Models
Canh Hao Nguyen & Hiroshi Mamitsuka
Bioinformatics Center, Institute for Chemical Research, Kyoto University
{canhhao,mami}@kuicr.kyoto-u.ac.jp
Outline
1 Background: Link Prediction on Networks; Network Structures
2 Latent Feature Models: Motivation; Latent Feature Models
3 Link Kernels: Kernels from Latent Feature Models; Node Kernels; Link Kernels
4 Experiments: Latent Feature Assumption; Time Complexity; Link Prediction Results
5 Conclusion
2/24
Link Prediction on Networks
A formal way of representing relations between entities.
Found in social networks, collaborative filtering, and systems biology.
A link is a relation between two entities.
Problem: predicting new links given a network.
Information used: node information or network structures.
3/24
Assumption of Link Prediction
[Figure: example network over proteins P1-P5 with labelled links]
Fundamental assumption: independence and identical distribution of links.
BUT... networks have structures.
4/24
Network Structures
Networks have been shown to have structures: scale-free, similarity, bipartite, clique, motif, ...
Similarity networks: social networks, metabolic networks, ...
Bipartite networks: collaborative filtering, protein-ligand bindings, ...
Latent-feature-model-based networks: social networks, PPI, GR networks, ...
5/24
Objective
To model network structures with latent features.
To derive latent feature models for non-similarity networks.
To approximate the model using kernel frameworks.
To learn the model optimally and efficiently.
... and to apply it to the link prediction problem in biological networks.
[Figure: (a) Input, (b) Learning, (c) Output]
6/24
Outline
1 Background
2 Latent Feature Models: Motivation; Latent Feature Models
3 Link Kernels
4 Experiments
5 Conclusion
7/24
Biological Motivation
Protein-protein interactions are based on domain-domain interactions.
Protein docking is based on shape complementarity.
These features (domains, shapes) determine PPI ability.
8/24
Latent Feature Models of Links
A node in the graph contains some (latent) features (F).
Some pairs of features interact (W).
Nodes containing interacting pairs of features link to each other.
A = F × W × F^T
[Figure: latent feature vector, latent feature matrix F, feature interaction matrix W, and the adjacency matrix generated from feature interactions]
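As a concrete illustration of the generative model, here is a minimal NumPy sketch (not the authors' code); the network size, feature density, and link threshold are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 16, 3                                   # 16 nodes, 3 latent features (illustrative)
F = (rng.random((n, d)) < 0.3).astype(float)   # binary latent feature matrix (n x d)
W = rng.random((d, d))
W = (W + W.T) / 2                              # symmetric interactions -> undirected network

# A = F x W x F^T: entry (i, j) accumulates the interaction weight of every
# feature pair (k, l) such that node i carries k and node j carries l.
A = F @ W @ F.T

adj = (A > 0.5).astype(int)                    # arbitrary threshold to binarise
np.fill_diagonal(adj, 0)                       # no self-links
```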
9/24
Inferring the Models
Given the adjacency matrix A.
Inference uses Indian Buffet Processes (IBP) and Gibbs sampling:
sample F from some distribution,
then infer W, and repeat the process.
Not globally optimal.
Very time consuming.
Scales only to small graphs (< 500 nodes).
10/24
Outline
1 Background
2 Latent Feature Models
3 Link Kernels: Kernels from Latent Feature Models; Node Kernels; Link Kernels
4 Experiments
5 Conclusion
11/24
Latent Feature Models in Kernels
Kernels encode similarity:
high similarity within the same class,
low similarity otherwise.
[Figure: example network over proteins P1-P5 with labelled links]
Link similarity with latent features:
links are similar if their nodes are similar;
nodes are similar if they share many latent features.
Ideal Kernels: for any diagonal, nonnegative D ∈ R^{d×d},
K*(D) = F × D × F^T
Any node kernel should be close to the ideal kernels.
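Since K*(D) = F D F^T = (F √D)(F √D)^T is a Gram matrix, an ideal kernel is always symmetric positive semidefinite; a quick NumPy check (illustrative sizes, not from the slides) confirms this:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3
F = (rng.random((n, d)) < 0.5).astype(float)   # binary latent features
D = np.diag(rng.random(d))                     # diagonal, nonnegative

# Ideal kernel K*(D) = F D F^T = (F sqrt(D)) (F sqrt(D))^T,
# a Gram matrix, hence symmetric positive semidefinite.
K_star = F @ D @ F.T
eigvals = np.linalg.eigvalsh(K_star)
```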
12/24
Node Kernels
Idea: nodes with more common latent features are more similar.
Observation: nodes with more common latent features have similar connectivity patterns.
We use connectivity patterns to define node similarity.
Kn = norm(A²) (diagonally normalized).
In sparse networks, this kernel Kn approximates the ideal kernels.
PPI networks are sparse (> 80% of nodes in yeast, and all nodes in fruit fly, have degree ≤ 10).
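The node kernel Kn = norm(A²) can be sketched as follows; the slides only say "diagonally normalized", so the cosine-style normalization below is an assumption, not necessarily the authors' exact choice:

```python
import numpy as np

def node_kernel(adj):
    """Kn = norm(A^2): (A^2)[i, j] counts the common neighbours of i and j;
    dividing by sqrt(diag) is a cosine-style diagonal normalisation
    (an assumption) so that Kn[i, i] = 1 for non-isolated nodes."""
    K = adj @ adj
    d = np.sqrt(np.diag(K)).astype(float)
    d[d == 0] = 1.0                     # guard against isolated nodes
    return K / np.outer(d, d)

# Path graph 0-1-2: nodes 0 and 2 share neighbour 1,
# so they get maximal similarity even though they are not linked.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
Kn = node_kernel(A)
```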
13/24
Kernels on Links
Node similarities are encoded in the node kernel Kn.
We use pairwise kernels to encode link similarity:
if the corresponding nodes are similar, the links are similar.
K({a, b}, {c, d}) = Kn(a, c) · Kn(b, d) + Kn(a, d) · Kn(b, c)
Feature-wise: if a_i, b_j are features of a and b, then a_i b_j + a_j b_i is a feature of {a, b}.
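The pairwise link kernel above translates directly into code; Kn stands for any precomputed node-kernel matrix:

```python
import numpy as np

def link_kernel(Kn, a, b, c, d):
    """Pairwise link kernel from the slides:
    K({a,b},{c,d}) = Kn(a,c)*Kn(b,d) + Kn(a,d)*Kn(b,c).
    The symmetrisation makes the value independent of the
    node order within each link."""
    return Kn[a, c] * Kn[b, d] + Kn[a, d] * Kn[b, c]
```

Swapping the nodes inside either link leaves the value unchanged, which is exactly what an unordered pair requires.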
14/24
Demonstration
[Figure: adjacency matrices of the network model and its incomplete observation; scatter plot in latent feature space separating the links class from the nonlinks class]
15/24
Properties (Kn similar to ideal kernels)
Positivity: (Kn)_{ij} for a pair of nodes (i, j) is positive if they share a common feature. If there is a shared feature according to the generative model, Kn should recognize it.
Monotonicity: the more features a pair of nodes shares, the higher the kernel value. Kn respects monotonicity as an ideal kernel does: more shared features, higher kernel values.
16/24
Properties (Kn similar to ideal kernels) cont.
Recall that Kn = F D F^T = F (W E W^T) F^T with E = F^T F, while ideal kernels are K*(D̂) = F D̂ F^T.
Ideal Condition: Kn is an ideal kernel iff the row vectors of W F^T are all uncorrelated.
On Sparse Networks: if the model is sparse in the sense that (Σ_{i≠j} |W_il W_jk E_lk|^p)^{1/p} ≤ δ, then Kn is close to an ideal kernel in the sense that ‖D − D̂‖_p ≤ δ.
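The factorisation behind the first line can be checked numerically: with A = F W F^T, the unnormalised kernel A Aᵀ equals F (W E Wᵀ) Fᵀ for E = FᵀF, so it has the same F · Fᵀ sandwich form as an ideal kernel K*(D) = F D Fᵀ (sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 3
F = (rng.random((n, d)) < 0.4).astype(float)   # latent feature matrix
W = rng.random((d, d))                         # feature interaction matrix

A = F @ W @ F.T
E = F.T @ F                 # d x d feature co-occurrence matrix

# A A^T = F W F^T F W^T F^T = F (W E W^T) F^T: the unnormalised node
# kernel is an F-sandwich, matching the form of K*(D) = F D F^T.
lhs = A @ A.T
rhs = F @ (W @ E @ W.T) @ F.T
assert np.allclose(lhs, rhs)
```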
17/24
Outline
1 Background
2 Latent Feature Models
3 Link Kernels
4 Experiments: Latent Feature Assumption; Time Complexity; Link Prediction Results
5 Conclusion
18/24
Experiments and Data
Input: adjacency matrix A.
90/10 train/test splits.
Learn SVMs on the kernels.
Report: running times, assumption comparison, and prediction AUCs on PPI networks.
Compare to using node attributes: sequence kernels.
PPI networks of yeast and fruit fly: the largest networks available.
Extracted from the DIP databases (hand-curated, direct interactions only).
We used only the connected part of the networks.
We filtered out nodes with degree less than m.
For m = 1, network sizes: 4762 proteins in yeast and 6644 proteins in fruit fly.
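The experimental pipeline (precomputed link kernel, 90/10 split, SVM, AUC) can be sketched with scikit-learn; the Gram matrix and labels below are hypothetical stand-ins, not the DIP data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical stand-ins: K_link would be the link-kernel Gram matrix over
# candidate node pairs, y the link (1) / non-link (0) labels.
rng = np.random.default_rng(0)
m = 200
Z = rng.normal(size=(m, 5))
K_link = Z @ Z.T                      # any PSD Gram matrix serves as a demo
y = np.arange(m) % 2

# 90/10 split as in the slides; stratify keeps both classes in each split.
tr, te = train_test_split(np.arange(m), test_size=0.1,
                          random_state=0, stratify=y)

svm = SVC(kernel="precomputed")
svm.fit(K_link[np.ix_(tr, tr)], y[tr])          # train-vs-train Gram block
scores = svm.decision_function(K_link[np.ix_(te, tr)])  # test-vs-train block
auc = roc_auc_score(y[te], scores)    # the evaluation metric reported here
```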
19/24
On the Latent Feature Model Assumption
Reasonability of the latent feature assumption as opposed to the similarity assumption: compared using nearest neighbor classifiers.
[Figure: AUC vs. minimum degree of nodes, comparing LK-NN (latent feature) and SimNN (similarity)]
The latent feature assumption is more suitable than the similarity assumption.
20/24
Execution Time of Our Method vs. IBP
Our method scales to data of this size, as opposed to explicit feature generation methods.
[Figure: left, time in seconds to train one model vs. minimum degree of nodes (our method, on all subnetworks); right, model log-likelihood convergence over time in seconds (IBP, on the smallest subnetwork)]
Our method takes minutes on data of this size. IBP scales only to the smallest subnetworks, taking many hours.
21/24
Link Prediction Performance on yeast PPI networks
[Figure: AUC vs. minimum degree of nodes on yeast, for linear and Gaussian link kernels and IBP]
Our method gives very high performance compared to IBP.
The results are significantly higher than random, meaning that the networks have significant structures.
Using sequence (spectrum kernels) to predict links in these networks: 0.71 ± 0.008.
22/24
Link Prediction Performance on fruit fly PPI networks
[Figure: AUC vs. minimum degree of nodes on fruit fly, for linear and Gaussian link kernels and IBP]
Our method gives higher results than IBP.
The networks have significant structures.
Using sequence (spectrum kernels) to predict links in these networks: 0.65 ± 0.016.
23/24
Conclusion
We argued for using network structures to predict links.
We proposed a latent feature model to describe the network structures.
We used kernels to encode the model implicitly and to train it efficiently and optimally.
Our method scaled to real PPI network data, unlike IBP.
Our method gave high performance in predicting PPIs.
Our method is efficient and effective for latent feature models of networks.
24/24