CS224W Machine Learning with Graphs
Lecture 6
Title: Message Passing and Node Classification
Date: 2020. 04. 09 (THU) ~ 2020. 04. 13 (MON)
Materials: Slides YouTube
Node Classification
- Main question: How to predict labels of unlabeled nodes with some labeled nodes
- Collective classification
- Intuition: Correlations exist in networks
- Relational classification, Iterative classification, Belief propagation
Correlations in Networks
- Homophily
- Characteristic of nodes determine the edge(connection)
- Individuals to associate and bond with similar others
- Influence
- Connections determine characteristic of nodes
- Social connections can influence the individual characteristics
- Confounding
- Other factor(environment) determines both of characteristic of nodes and connections
Classification with Network Data(Binary Classification / Homophily network)
-
Similar nodes are typically close together or directly connected
-
Guilt -by-association
- Using Feature of \(O\)(object in network), Label of the \(O\)’s neighborhood, Feature of \(O\)’s neighborhood
-
Markov Assumption
- the label \(Y_i\) depends on the labels of its neighbors \(N_i\)
- \[P(Y_i|i) = P(Y_i|N_i)\]
Collective classification
- Steps
- Local classifier: Assign initial labels
- Standard classification task(only use node attributes)
- Relational classifier: Capture correlations
- Collective inference: Propagate correlations through network
- Iterate until the inconsistency between neighboring labels is minimized
- Local classifier: Assign initial labels
-
Relational classifier
- Probability of \(Y_i\)(class/label of \(i\)) is a weighted average of probability of its neighbors
- Labeled nodes initialize true label \(Y\)
- Unlabeled nodes initialize \(Y\) uniformly
- Update all nodes in a random order until convergence(or until maximum # of iterations)
- \(P(Y_i = c) = { {1}\over{\left\vert N_i \right\vert}}\sum_{(i, j)\in{E}}W(i, j)P(Y_i = c) = { {1}\over{\sum_{\{(i,j)\}\in{E}}W(i,j)}}\sum_{(i, j)\in{E}}W(i, j)P(Y_i = c)\) where \(c\) is label, \(\left\vert{N_i}\right\vert\) is the # of neighbors \(W(i, j)\) is the edge strength(weight)
- Doesn’t determine the label of node 4. It’s a bridge node of the two groups
Iterative classification
- Using attributes and label both
- Train a classifier using Attributes vector(\(a_i\))
- Steps
- Training
- Covert each node \(i\) to a flat vector \(a_i\)
- Train the classifiers
- There are two different classifier
- Classifier trained only attributes vector
- Classifier trained attributes and link vector
- Bootstrap phase
- Use local classifier \(f(a_i)\) to compute best value for \(Y_i\)
- Iteration phase
- Update node vector \(a_i\) (Typically update information about link)
- Update label \(Y_i\) to \(f(a_i)\)
- Continue this step until convergence
- Training
Application of iterative classification framework
- fake reviewer/review detection
- Easy to fake: Behavioral analysis(individual behaviors), Language analysis(content of review)
- Hard to fake: graph structure(relationships between reviewers, reviews, stores)
- Bipartite rating graph(positive review is +1 rating, negative review is -1 rating)
- Fairness(\(F(u)\)), Goodness(\(G(p)\)), Reliability(\(R(u, p)\))
- \[F(u) = { {\sum_{ {(u,p)}\in{Out(u)}} R(u, p)} \over {|out(u)|}}\]
- \[G(p) = { {\sum_{ {(u, p)} \in {In(p)}} \cdot score(u, p)}\over{|In(p)|}}\]
- \[R(u, p) = { { {1} \over {y_1+y_2}} (y_1 \cdot F(u)) + y_2 \cdot (1- { {|score(u, p) - G(p)|} \over {2}}) }\]
Collective Classification: Belief Propagation
- Receive the message(state, attribute, etc) from neighbors and update, then pass toward other neighbors
- After receive from and pass toward every neighbors, node can calculate their own state
- However, it doesn’t work loopy(cyclic) graph
Loopy BP algorithm
Notation
- Label-label potential matrix(\(\psi\)): \(\psi(Y_i, Y_j)\) is probability of a node \(j\) being in the state \(Y_j\) given that it has a \(i\) neighbor in state \(Y_i\)
- Prior belief(\(\phi\)): Probability \(\phi_i(Y_i)\) of node \(i\) being in the state \(Y_i\)
- \(m_{i \to j}(Y_j)\) is \(i\)’s estimate of \(j\) being in the state \(Y_j\)
- \(\mathcal{L}\) is the set of all states
Loopy BP algorithm
- Initialize all message to 1 and repeat to calculate for each node
- After convergence, label-label potential is 1
Application of Belief Propagation
-
Online Auction Fraud(especially, Non-delivery fraud)
-
Fraudsters try to form near-bipartite cores (2 roles)
- Accomplices: trades with honest
- Fraudsters: trades with accomplice and fraud with honest
- These actions maintain their feedback score over than fraudsters who fraud with honest only
-
Three groups have different provabilities