CS224W Machine Learning with Graphs
Lecture 5
Title: Spectral Clustering
Date: 2020. 04. 07 (TUE) ~ 2020. 04. 08 (WED)
Materials: Slides YouTube
Additional Materials: ratsgo’s blog
Spectral Clustering
- Clusters a graph using eigenvectors, eigenvalues, and linear algebra
- Algorithms
- Pre-processing
- Build Laplacian matrix \(L\)
- Decomposition (compute eigenvalues and eigenvectors)
- Find eigenvalues \(\lambda\) and eigenvectors \(x\) of the matrix \(L\)
- Find the second smallest eigenvalue \(\lambda_2\) and corresponding eigenvector \(x_2\)
- Grouping
- Sort the components of \(x_2\) and identify clusters by a splitting point
- Split at 0 or at the median value
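The three steps above can be sketched end to end with plain numpy. The toy graph below (two triangles joined by a bridge edge) is my own illustration, not from the lecture:

```python
import numpy as np

# Toy undirected graph: two triangles {0,1,2} and {3,4,5} joined by
# one bridge edge (2,3). Hypothetical example graph.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# 1) Pre-processing: build the Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

# 2) Decomposition: eigenvalues/eigenvectors of the symmetric matrix L.
#    np.linalg.eigh returns eigenvalues in ascending order, so column 1
#    is x_2, the eigenvector of the second-smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(L)
x2 = eigvecs[:, 1]

# 3) Grouping: split at 0 on the sign of x_2.
labels = (x2 > 0).astype(int)
print(labels)  # the two triangles land in different groups
```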
Graph Partitioning
- In terms of network modularity, a good partitioning maximizes the number of within-group connections and minimizes the number of between-group connections
Graph Cuts
- Set of edges with one endpoint in each group
- \[cut(A, B) = \sum_{i\in A,\, j\in B} w_{ij}\]
- Minimum-cut: \(\arg\min_{A,B} cut(A,B)\)
- However, this criterion does not give an optimal cut (degenerate case)
- It only considers external cluster connectivity and ignores internal cluster connectivity
- Conductance (smaller is better)
- \[\varnothing(A,B) = \frac{cut(A,B)}{\min(vol(A),\, vol(B))}\]
- where \(vol(A)\) is the total weighted degree of the nodes in \(A\): \(vol(A) = \sum_{i\in A} k_i\)
- It yields more balanced partitions
- However, computing the minimum-conductance cut is NP-hard
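A minimal sketch of computing conductance on a toy graph; the example graph and the helper name `conductance` are my own illustration. It also shows why conductance avoids the degenerate minimum-cut case:

```python
import numpy as np

# phi(A,B) = cut(A,B) / min(vol(A), vol(B)) on an unweighted toy graph:
# two triangles {0,1,2} and {3,4,5} joined by a bridge edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
adj = np.zeros((n, n))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

def conductance(adj, group_a):
    group_a = set(group_a)
    group_b = set(range(len(adj))) - group_a
    # cut(A,B): total weight of edges with one endpoint in each group
    cut = sum(adj[i, j] for i in group_a for j in group_b)
    # vol(A): total weighted degree of the nodes in A
    vol_a = adj[list(group_a)].sum()
    vol_b = adj[list(group_b)].sum()
    return cut / min(vol_a, vol_b)

print(conductance(adj, {0, 1, 2}))  # 1/7: the balanced cut scores well
print(conductance(adj, {0}))        # 2/2 = 1.0: a degenerate cut scores badly
```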
Spectral Graph Partitioning
- \(A\): adjacency matrix of an undirected graph \(G\)
- \(x\) is a vector in \(\mathbb{R}^n\) with one entry (a label/value) per node
- \(y_i\) is the sum of the labels \(x_j\) of the neighbors of \(i\) (\(y = A \cdot x\))
- \({A}\cdot{x} = {\lambda}\cdot{x}\) where \(x\) is eigenvector and \(\lambda\) is eigenvalue
- For a \(d\)-regular graph (every node has degree \(d\)), \(x = (1, 1, \dots, 1)\) is an eigenvector with eigenvalue \(d\)
- We can split the graph by looking at \(\lambda_n\) and \(\lambda_{n-1}\): if a \(d\)-regular graph is disconnected into two components, \(\lambda_{n-1} = \lambda_n\); if the two groups are only weakly connected, \(\lambda_{n-1}\) is close to \(\lambda_n\), and that weak connection is the split point
- So \(x_{n-1}\) splits the nodes into two groups by sign: nodes with \(x_{n-1}[i] > 0\) go to one group and nodes with \(x_{n-1}[i] < 0\) to the other
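These \(d\)-regular facts are easy to check numerically; the cycle graphs below are my own toy examples:

```python
import numpy as np

# For a connected 2-regular graph (a cycle), the all-ones vector is an
# eigenvector of A with eigenvalue d = 2. If the graph splits into two
# components, the largest eigenvalue d appears twice: lambda_n = lambda_{n-1}.
def ring_adjacency(n):
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

A = ring_adjacency(6)  # one 6-cycle: connected, 2-regular
print(np.allclose(A @ np.ones(6), 2 * np.ones(6)))  # True: A.(1,...,1) = d.(1,...,1)

# Two disjoint 3-cycles: still 2-regular, but disconnected.
A2 = np.zeros((6, 6))
A2[:3, :3] = ring_adjacency(3)
A2[3:, 3:] = ring_adjacency(3)
eigvals = np.linalg.eigvalsh(A2)  # ascending order
print(np.isclose(eigvals[-1], eigvals[-2]))  # True: lambda_n == lambda_{n-1} = d
```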
Matrix Representations
- Adjacency matrix (\(A\))
- Degree matrix (\(D\))
- Laplacian matrix (\(L = D - A\))
- If \(x = (1,...,1)\) then \({L}\cdot{x} = 0\) and so \(\lambda = \lambda_1 = 0\)
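The Laplacian facts above can be verified directly; the small adjacency matrix is my own example:

```python
import numpy as np

# L = D - A, and L . (1,...,1)^T = 0, so the smallest eigenvalue is 0.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))  # degree matrix
L = D - A                   # Laplacian matrix

# Every row of L sums to zero (degree minus the adjacency row sum),
# so the all-ones vector is in the nullspace of L.
ones = np.ones(4)
print(L @ ones)                  # [0. 0. 0. 0.]
print(np.linalg.eigvalsh(L)[0])  # smallest eigenvalue, numerically ~0
```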
\(\lambda_2\) Optimization
- \(\lambda_2\) is the second smallest eigenvalue (the smallest eigenvalue is always zero)
- \[\lambda_2 = \min_{x:\, \sum_i x_i = 0} \frac{\sum_{(i,j)\in E}(x_i - x_j)^2}{\sum_i x_i^2}\]
- \(\sum_{(i,j)\in E}(x_i - x_j)^2\): every edge \((i,j)\) that crosses the split contributes a nonzero term, so minimizing this sum places few edges between the groups
- \(\sum_i x_i^2\): this is just 1, since \(x\) is a unit vector
- So minimizing \(\lambda_2\) corresponds to optimal graph partitioning
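A quick numerical check of this quotient, using the standard identity \(x^T L x = \sum_{(i,j)\in E}(x_i - x_j)^2\); the small graph is my own example:

```python
import numpy as np

# Verify: for the Laplacian L and the unit eigenvector x_2,
#   x2^T L x2 = sum over edges (x_i - x_j)^2 = lambda_2,
# and x2 is orthogonal to (1,...,1), i.e. sum(x2) = 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # 4-cycle plus a chord
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)
x2 = eigvecs[:, 1]

quad = sum((x2[i] - x2[j]) ** 2 for i, j in edges)
print(np.isclose(x2 @ L @ x2, quad))        # True: the edge-sum identity
print(np.isclose(x2 @ L @ x2, eigvals[1]))  # True, since ||x2|| = 1
print(np.isclose(x2.sum(), 0.0))            # True: x2 is orthogonal to (1,...,1)
```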
Rayleigh Theorem
- Find Optimal Cut [Fiedler’73]
- \[{y_i} = \begin{cases} +1 & if\ {i} \in {A} \\ -1 & if\ {i} \in {B}\end{cases}\]
- Minimizing \(\sum_{(i,j)\in E}(y_i - y_j)^2\) subject to the balance constraint \(\sum_i y_i = 0\) gives the optimal cut of the partition \((A, B)\); relaxing \(y\) from \(\pm 1\) labels to real values leads back to the eigenvector \(x_2\)
k-Way Spectral Clustering
- Recursive bi-partitioning: recursively apply the bi-partitioning algorithm in a hierarchical, divisive manner
- Cluster multiple eigenvectors: build a reduced-dimensional space from multiple eigenvectors and cluster in it
- Advantages
- Approximates the optimal cut
- Emphasizes cohesive clusters
- Well-separated space
- Eigengap
- The difference between two consecutive eigenvalues
- \[\Delta_k = \left| \lambda_k - \lambda_{k-1} \right|\]
- Choose the \(k\) that maximizes the eigengap
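Choosing \(k\) by the eigengap can be sketched as follows; the three-cluster toy graph is my own example:

```python
import numpy as np

# Choose k by the largest eigengap Delta_k = |lambda_k - lambda_{k-1}|.
# Toy graph: three triangle clusters with two weak links between them.
edges = [(0, 1), (0, 2), (1, 2),   # cluster {0,1,2}
         (3, 4), (3, 5), (4, 5),   # cluster {3,4,5}
         (6, 7), (6, 8), (7, 8),   # cluster {6,7,8}
         (2, 3), (5, 6)]           # weak links between clusters
n = 9
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

eigvals = np.linalg.eigvalsh(L)  # ascending order
gaps = np.diff(eigvals)          # gaps[i] = eigvals[i+1] - eigvals[i]
k = int(np.argmax(gaps)) + 1     # suggested number of clusters
print(k)  # 3: the largest gap sits after the three smallest eigenvalues
```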
Motif-Based Spectral Clustering
- Find a specific motif in the graph and cluster based on it
- Different motifs reveal different modular structures
- Motif cuts and conductance
- Motif conductance: \(\varnothing(S) = \frac{\#(\text{motifs cut})}{vol_M(S)}\)
- \(\#(\text{motifs cut})\) is the number of motif instances cut by \(S\) (motifs with endpoints on both sides of the partition)
- \(vol_M(S)\) is the number of motif endpoints inside the subgraph \(S\)
- \(vol_M(S)\) includes the endpoints of cut motifs
- Finding set \(S\) with the minimal motif conductance is NP-hard
Motif Spectral Clustering
- Create a new weighted graph \(W^{(M)}\) and apply spectral clustering on it
- Algorithm
- Pre-processing: create the weighted graph \(W^{(M)}\)
- Remove irrelevant edges (edges that participate in no motif)
- Weight each edge by the number of times it participates in the motif \(M\)
- Decomposition: compute the eigenvalues and eigenvectors of the Laplacian of \(W^{(M)}\)
- Grouping (sweep procedure)
- Select the best cut using motif conductance
- Try multiple motifs and select the best one
- The best motif decreases conductance quickly and reaches a lower global minimum conductance
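For the triangle motif, the pre-processing step (building \(W^{(M)}\)) can be sketched as below. The toy graph is my own, and the sketch relies on the standard fact that \((A \cdot A)[i,j]\) counts common neighbors of \(i\) and \(j\):

```python
import numpy as np

# Pre-processing for motif spectral clustering with the triangle motif:
# W[i, j] counts the triangles that edge (i, j) participates in.
# Toy graph: two triangles {0,1,2} and {3,4,5} joined by a bridge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# (A @ A)[i, j] counts common neighbors of i and j; masking by A keeps
# only actual edges, giving per-edge triangle counts.
W = (A @ A) * A

print(W[0, 1])  # 1.0: edge (0,1) lies in one triangle, 0-1-2
print(W[2, 3])  # 0.0: the bridge lies in no triangle, so it is removed

# Decomposition and grouping then proceed as in ordinary spectral
# clustering, but on the Laplacian of W instead of A.
```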