CS224W Machine Learning with Graphs
Lecture 10
Title: Deep Generative Models for Graphs
Date: 2020. 04. 28(TUE) ~ 2020. 05. 04(MON)
Materials: Slides YouTube
Additional Materials:
Deep Graph Encoders
- Generate node embeddings based on local network neighborhoods and neural networks
- Graph Convolutional Networks (GCN): average neighborhood information (a minimal aggregation sketch follows this list)
- GraphSAGE: generalized neighborhood aggregation
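As a refresher, here is a minimal NumPy sketch of the GCN-style mean aggregation described above; the layer shape and the ReLU nonlinearity are illustrative choices, not the exact CS224W formulation.

```python
import numpy as np

# Minimal sketch of one GCN-style layer: average neighborhood features,
# then apply a learned linear transform and a nonlinearity.
# A: adjacency matrix, H: node features, W: layer weights.
def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # degree of each node
    H_avg = (A_hat @ H) / deg               # mean over the neighborhood
    return np.maximum(H_avg @ W, 0.0)       # ReLU(H_avg W)

A = np.array([[0., 1.], [1., 0.]])          # two connected nodes
H = np.eye(2)                               # one-hot node features
print(gcn_layer(A, H, np.ones((2, 2))))     # both rows average to 0.5
```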
Problem of Graph Generation
Graph Generation Tasks
- Realistic graph generation
- Goal-directed graph generation
Why is it hard?
- Large and variable output space
- We need \(n^2\) values to express a graph with \(n\) nodes (adjacency matrix)
- Different graphs have different numbers of nodes, so the output size varies
- Non-Uniqueness
- An \(n\)-node graph can be represented in \(n!\) ways, one per node ordering (a small demo follows this list)
- Complex dependencies
- Edge formation has long-range dependencies: whether to add an edge can depend on nodes generated much earlier, so the model needs more memory
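To make the non-uniqueness point concrete, here is a small NumPy demo (my own illustration, not from the lecture): the same 3-node path graph written under all \(3! = 6\) node orderings.

```python
import itertools
import numpy as np

# The same 3-node path graph (edges 0-1 and 1-2) under every node ordering.
# All 3! = 6 permuted adjacency matrices describe one and the same graph.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

reps = set()
for perm in itertools.permutations(range(3)):
    P = np.eye(3)[list(perm)]          # permutation matrix for this ordering
    reps.add((P @ A @ P.T).tobytes())  # adjacency matrix under the ordering

print(len(reps))  # 3: six orderings collapse to three matrices (path symmetry)
```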
ML Basics for Graph Generation
Graph Generative Models
- Given: a set of graphs sampled from the data distribution \(p_{data}(G)\)
- Goal: learn a distribution \(p_{model}(G)\) and sample new graphs from \(p_{model}(G)\)
- \(p_{model}(G)\) is the learned distribution over graphs
- Generate a new set of graphs from the assumed (learned) distribution, fitted to the input set (observations) of graphs
- \(p_{data}(x)\) is the data distribution in the real world; each \(x_i\) is a sample drawn from \(p_{data}(x)\)
- \(p_{model}(x;\theta)\) is the model we use to approximate \(p_{data}(x)\)
- The goal is to make \(p_{model}(x;\theta)\) close to \(p_{data}(x)\) and to sample from \(p_{model}(x;\theta)\)
Make \(p_{model}(x;\theta)\) close to \(p_{data}(x)\)
- Find \(\theta\) to maximize likelihood
- \[\theta^* = \arg\max_\theta \, \mathbb{E}_{x \sim p_{data}} \log p_{model}(x;\theta)\]
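A minimal maximum-likelihood sketch (my own toy example, assuming PyTorch): fit a 1-D Gaussian \(p_{model}(x;\theta)\) by maximizing \(\mathbb{E}_{x \sim p_{data}} \log p_{model}(x;\theta)\), i.e., minimizing the negative log-likelihood.

```python
import torch

# Fit theta = (mu, sigma) of a 1-D Gaussian to data by maximum likelihood.
data = torch.randn(1000) * 2.0 + 3.0            # samples from p_data
mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)  # parameterize sigma > 0
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for _ in range(500):
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    loss = -dist.log_prob(data).mean()          # negative log-likelihood
    opt.zero_grad(); loss.backward(); opt.step()

print(mu.item(), log_sigma.exp().item())        # close to 3.0 and 2.0
```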
Sample from \(p_{model}(x;\theta)\)
- To sample from a complex distribution, use noise (a seed) drawn from a simple noise distribution
- Feeding the noise from the simple distribution into a function \(f(\cdot)\) makes the output \(x_i\) follow a complex distribution
- Sample from a simple noise distribution: \(z_i \sim N(0,1)\)
- Transform the noise \(z_i\) via \(f(\cdot)\): \(x_i = f(z_i;\theta)\)
- \(f(\cdot)\) is parameterized by a deep neural network (a sketch follows)
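A sketch of this sampling recipe (the network shape here is an arbitrary illustrative choice):

```python
import torch
import torch.nn as nn

# x_i = f(z_i; theta): push simple Gaussian noise through a neural network
# so the outputs follow a complex (learned) distribution.
f = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))

z = torch.randn(5, 8)   # z_i ~ N(0, I): the simple noise ("seed")
x = f(z)                # five samples from the model distribution
print(x.shape)          # torch.Size([5, 2])
```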
Auto-regressive models
- Auto-regressive means that each action depends on all previous actions
- In the graph case, the \(t\)-th action \(x_t\) is adding a node or adding an edge
- Apply the chain rule (a small sketch follows)
- \[p_{model}(x;\theta) = \prod_{t=1}^n p_{model}(x_t|x_1,...,x_{t-1};\theta)\]
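A plain-Python sketch of the chain rule for a binary sequence; `cond_prob` is a hypothetical stand-in for any learned conditional model.

```python
# Joint probability of a binary sequence as a product of conditionals
# p(x_t | x_1, ..., x_{t-1}), per the chain rule above.
def cond_prob(prefix):
    # toy conditional: P(x_t = 1) shrinks as more 1s appear in the prefix
    return 1.0 / (2.0 + sum(prefix))

def joint_prob(x):
    p = 1.0
    for t, x_t in enumerate(x):
        p1 = cond_prob(x[:t])               # p(x_t = 1 | x_1..x_{t-1})
        p *= p1 if x_t == 1 else (1.0 - p1)
    return p

print(joint_prob([1, 0, 1]))  # p(x1=1) * p(x2=0 | x1) * p(x3=1 | x1, x2)
```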
GraphRNN
- Recurrent: every output is fed back as the next input (auto-regressive)
- A graph \(G\) is produced by a generation process \(S^\pi\) (the recipe for drawing the graph), where \(\pi\) is a node ordering
- The sequence \(S^\pi\) has two levels
- Node-level: At each step, a new node is added
- Node-level \(S^\pi\) is the order of adding node
- \[S^\pi = (S_1^\pi, S_2^\pi, S_3^\pi, S_4^\pi, S_5^\pi)\]
- Edge-level: At each step, a new edge is added
- Edge-level \(S_i^\pi\) records whether or not the new node connects to each of the previously added nodes
- \[S_4^\pi = (S_{4,1}^\pi, S_{4,2}^\pi, S_{4,3}^\pi) = (0, 1, 1)\]
- GraphRNN has a node-level RNN and edge-level RNN
- Using this two-level sequence, we build the graph and its adjacency matrix (the yellow area in the slides); a small sequence-building sketch follows
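A small sketch (my own, not the official GraphRNN code) of how the edge-level sequence is read off the adjacency matrix once a node ordering \(\pi\) is fixed:

```python
import numpy as np

# Build the edge-level parts of S^pi from adjacency matrix A under ordering pi.
# Step i records which previously added nodes the new node connects to.
def graph_to_sequence(A, pi):
    A = A[np.ix_(pi, pi)]  # relabel nodes by the ordering pi
    return [A[i, :i].tolist() for i in range(1, A.shape[0])]

A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
print(graph_to_sequence(A, [0, 1, 2, 3]))
# [[1], [1, 0], [0, 1, 1]]  -> the last entry matches the S_4 example above
```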
Recurrent Neural Network
- An RNN cell computes the output and the new state from the input and the previous step's state
- \(s_t\): State of RNN after time \(t\)
- \(x_t\): Input of the RNN at time \(t\); during generation, this is the output of the RNN at time \(t-1\)
- \(y_t\): Output of RNN at time \(t\)
- SOS (the start token, initializing \(s_0\)) and EOS (the end token, output as \(y_t\)) use a zero vector
- In (stochastic) generation, the output is a probability vector and the next input is sampled from that probability (a generation sketch follows)
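A stochastic generation sketch with a GRU cell (the sizes and the sigmoid output head are illustrative choices, not the exact GraphRNN architecture):

```python
import torch
import torch.nn as nn

# At each step: new state from (input, previous state), output a probability
# y_t, then sample the next input x_{t+1} from that probability.
cell = nn.GRUCell(input_size=1, hidden_size=16)
head = nn.Linear(16, 1)

s = torch.zeros(1, 16)            # s_0: SOS as a zero vector
x = torch.zeros(1, 1)             # initial input
samples = []
for _ in range(10):
    s = cell(x, s)                # state update
    y = torch.sigmoid(head(s))    # y_t: probability of an edge
    x = torch.bernoulli(y)        # next input sampled from y_t
    samples.append(int(x.item()))
print(samples)                    # e.g. [1, 0, 1, 1, 0, ...]
```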
Training phase
- At training time, use an observed sequence \(y^*\) and “Teacher Forcing”
- Teacher forcing means the model uses the real sequence \(y^*\) (not its own output) as the next input
- The loss \(L\) is binary cross-entropy: \(L = -[y_1^* \log(y_1) + (1-y_1^*) \log (1-y_1)]\)
- Use the loss and backpropagation to adjust the RNN parameters (a one-step training sketch follows)
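A one-step teacher-forcing sketch under the same assumed GRU model as above (my own illustration):

```python
import torch
import torch.nn as nn

# Teacher forcing: the ground-truth y* (not the model's sample) is fed back
# as the next input; the loss is binary cross-entropy against y*.
cell, head = nn.GRUCell(1, 16), nn.Linear(16, 1)
opt = torch.optim.Adam(list(cell.parameters()) + list(head.parameters()))
bce = nn.BCELoss()

y_star = torch.tensor([1.0, 0.0, 1.0, 1.0])    # observed sequence y*
s, x, loss = torch.zeros(1, 16), torch.zeros(1, 1), 0.0
for t in range(len(y_star)):
    s = cell(x, s)
    y = torch.sigmoid(head(s)).view(1)         # predicted probability y_t
    loss = loss + bce(y, y_star[t].view(1))    # BCE term for this step
    x = y_star[t].view(1, 1)                   # teacher forcing
opt.zero_grad(); loss.backward(); opt.step()   # backprop adjusts parameters
```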
Tractability
- Edge generation takes too many steps, because under a random node ordering any node can connect to any prior node
- Using a BFS node ordering reduces the number of edge-generation steps: a new node can only connect to nodes in the recent BFS frontier (see the sketch below)
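A plain-Python BFS ordering sketch (the graph and its adjacency-list format are my own example):

```python
from collections import deque

# BFS node ordering: with this ordering, a newly added node can only attach
# to nodes in the recent BFS frontier, so the edge-level sequence needs to
# look back only a bounded number of steps.
def bfs_order(adj, start=0):
    order, seen, queue = [], {start}, deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

adj = {0: [2, 3], 1: [3], 2: [0, 3], 3: [0, 1, 2]}
print(bfs_order(adj))  # [0, 2, 3, 1]
```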
Evaluating Generated Graphs
- Define a similarity metric for graphs
- Visual similarity
- Graph statistics similarity (a small comparison sketch follows this list)
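A rough sketch of a graph-statistics comparison, assuming networkx; real evaluations use richer statistics (e.g., clustering coefficient, orbit counts) and metrics such as earth mover's distance, while total variation is used here only for illustration:

```python
import networkx as nx
import numpy as np

# Compare degree distributions of a "real" and a "generated" graph.
def degree_hist(G, max_deg=10):
    h = np.zeros(max_deg + 1)
    for _, d in G.degree():
        h[min(d, max_deg)] += 1  # clip large degrees into the last bin
    return h / h.sum()

G_real = nx.barabasi_albert_graph(100, 2, seed=0)  # stand-in "real" graph
G_gen = nx.gnp_random_graph(100, 0.04, seed=0)     # stand-in "generated" graph
tv = 0.5 * np.abs(degree_hist(G_real) - degree_hist(G_gen)).sum()
print(f"total variation between degree histograms: {tv:.3f}")
```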
Applications and Open Questions
- Goals when generating graphs
- High scores (optimization)
- Valid (obey underlying rules)
- Realistic (imitate the original graphs)
- Graph Convolutional Policy Network (GCPN)
- Combines graph representation + RL
- Graph neural network (valid), reinforcement learning (high scores), adversarial training (realistic)