CS224W Machine Learning with Graphs
Lecture 1
Title: Introduction; Structure of Graphs
Date: 2020. 03. 29 (Sun)
Materials: Slides
Introduction of Network
- Natural Graphs
- General language for describing complex systems of interacting entities
- the interactions between the components
- Why?
- Universal language for describing complex data
- Shared vocabulary between fields
- Data availability & computational challenges
- Impact
Networks and Applications
Sort of Networks analysis
- Node classification (Predict the attributes(type/color) of a given node)
- Link prediction (Predict whether two nodes are linked)
- Community detection (Identify densely linked clusters of nodes)
- Network similarity (Measure similarity of two nodes or networks)
Applications
- Social Networks
- e.g., Social Circle Detection(Community detection)
-
Infrastructure
-
Knowledge
- e.g., Content recommendation(Link prediction)
- Type of networks
- Knowledge Graphs
- Heterogeneous Graphs
- Multimodal Graphs
- Embedding Nodes
- Map nodes to d-dimensional embeddings such that nodes with similar network neighborhoods are embedded close together
- Transfer nodes to vectors for computation
-
Online Media
- e.g., Hoax article detection (Community detection?), Prediction Virality, Product Adoption
- Biomedicine
- e.g., Protein-protein interaction(PPI) networks, Metabolic networks, Side effects detection(Predict adverse when taking a pair of drugs simultaneously)
- Biomedical Graph(Heterogeneous graph, Link prediction)
Network Analysis Tools
- SNAP.PY
- SNAP C++
- NetworksX
- graph-tool
Structure of Graphs
- Network: a collection of objects where some pairs of objects are connected by links
- Objects(\(N\)): nodes, vertices
- Interactions(\(E\)): links, edges
- System(\(G(N,E)\)): networks, graph
- Difference between Network and Graph
- Network refers to real-world systems
- Graph is mathematical representation of a network
Choice of Network Representation
Type of Graphs
- Undirected Graph: symmetrical, reciprocal, without direction
- Directed Graph: with direction(arcs)
-
Subtypes of graphs
-
Unweighted(undirected)
-
Weighted(undirected)
-
Self-edges(self-loops, undirected)
-
Multigraph(undirected)
-
Node Degrees(\(k_i\))
- # of edges adjacent to node \(i\)
- \[k_A = 4\]
- In directed networks node degree separates in-degree and out-degree
- In-degree: # of the edges to node \(i\)
- Out-degree: # of the edges from node \(i\)
- \(k^{in}_C=2\), \(k^{out}_C = 1\), \(k_C=3\)
Complete Graph
- Maximum # of edges(\(E_{max} = \{\{N(N-1)\}\over\{2\}\}\))
- If the graph is \(E = E_{max}\), it’s called a complete graph
- Average degree \(= N-1\)
- However, most real-world networks are sparse graph
- \(E << E_{max}\) or \(\bar{k} << N-1\)
Bipartite Graph
-
The graph whose nodes can be divided into two disjoint sets \(U\) and \(V\)
- e.g., Authors-to-Papers, Actors-to-Movies
Representing Graph
- Adjacency Matrix
- Representing(Mapping) graph to n-dimensional vector(matrix)
-
Edge List
- Edges represent a pair of nodes combination (\((Node_{From}, Node_{To})\))
-
Adjacency list
- It’s an easier way to represent graph
Edge Attributes
- Properties depending on the structure of the rest of the graph
- Possible options: Weight, Ranking, Type, Sign(Flag, Boolean)
Connectivity of Undirected Graphs
- Connected Graph means that any two vertices can be joined by a path
- Disconnected Graph is made up of two or more connected components
- Connected Graph can be disconnected by Bridge edge and Articulation nodes
- Strongly connected and Weakly connected
- Strongly connected graph has every path from each node to every other node
- Weakly connected graph has more than one missing paths from each node to every other node
- Strongly connected components(SCCs) is sub-graph which can make strongly connected graph