Page 1: Deep Graph Library

Deep Graph Library: Overview, Updates, and Future Directions

George Karypis
University of Minnesota ([email protected])
AWS Deep Learning Science ([email protected]) (on leave)

www.dgl.ai

Page 2: Deep Graph Library

DGL: The history

Timeline, 2018–2020:

• Development started; first prototype
• V0.1 (NeurIPS'18)
• V0.2: sampling APIs
• V0.3: fused message passing; multi-GPU/multi-core
• V0.3.1: NN modules; DGL-Chem
• V0.4: heterogeneous graphs; DGL-KE

Page 3: Deep Graph Library

DGL: Design & API

Page 4: Deep Graph Library

DGL meta-objective & architecture

• Forward and backward compatible
  • Forward: easy to develop new models
  • Backward: seamless integration with existing frameworks (MXNet/PyTorch/TensorFlow)
• Fast and scalable

Page 5: Deep Graph Library

Flexible message handling [Gilmer 2017, Wang 2017, Battaglia 2018]

• Message function
• Reduce function
• Update function

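As a rough illustration of the three roles, here is a minimal pure-Python sketch of one round of message passing. This is not DGL's implementation; the graph, feature values, and function names are made up for illustration:

```python
# One round of message passing on a tiny directed graph.
# edges: list of (src, dst); feats: node id -> feature value.
def message_passing_round(edges, feats, message, reduce, update):
    # Message function: compute a message on every edge.
    mailbox = {}
    for src, dst in edges:
        mailbox.setdefault(dst, []).append(message(feats[src]))
    # Reduce function: aggregate the messages arriving at each node.
    # Update function: combine the aggregate with the node's own feature.
    return {v: update(feats[v], reduce(mailbox.get(v, [])))
            for v in feats}

# Example: sum incoming neighbor features, then add the self feature.
edges = [(0, 1), (0, 2), (1, 2)]
feats = {0: 1.0, 1: 2.0, 2: 3.0}
new = message_passing_round(
    edges, feats,
    message=lambda h: h,            # identity message
    reduce=lambda msgs: sum(msgs),  # sum aggregation
    update=lambda h, agg: h + agg,  # add aggregate to own feature
)
```

Swapping the three lambdas is all it takes to express mean aggregation, weighted messages, or gated updates, which is the flexibility the slide refers to.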

Page 6: Deep Graph Library

Flexible message propagation

• Full propagation ("everyone shouts to everyone near you")
• Propagation by graph traversal
  • Topological order on a sentence parse tree
  • Belief propagation order
  • Sampling
• Propagation by random walk
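Topological-order propagation (e.g. on a parse tree) can be sketched with Kahn's algorithm, which yields an order in which every node is processed only after all of its children; the toy tree below is hypothetical:

```python
from collections import deque

def topological_order(edges, num_nodes):
    # edges: (child, parent) pairs; messages flow child -> parent,
    # so a node is ready once all of its children have been processed.
    indeg = [0] * num_nodes
    out = {v: [] for v in range(num_nodes)}
    for c, p in edges:
        out[c].append(p)
        indeg[p] += 1
    ready = deque(v for v in range(num_nodes) if indeg[v] == 0)  # leaves first
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for p in out[v]:
            indeg[p] -= 1
            if indeg[p] == 0:
                ready.append(p)
    return order

# Toy parse tree: leaves 2 and 3 feed node 1; nodes 1 and 4 feed root 0.
order = topological_order([(2, 1), (3, 1), (1, 0), (4, 0)], num_nodes=5)
```

The resulting order processes all leaves first and the root last, which is exactly the schedule tree-structured models need.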


Page 7: Deep Graph Library

DGL programming interface

• Graph as the core abstraction
  • DGLGraph
  • g.ndata['h']

• Simple but versatile message passing APIs

The active set specifies which nodes/edges to trigger the computation on. The message and reduce functions can be user-defined functions (UDFs) or built-in symbolic functions.
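The shape of this API can be mimicked in a few lines of plain Python: a graph object holding `ndata`, plus `send`/`recv` that only touch the given active set. This is a mock for illustration only, not DGL's real classes:

```python
class MiniGraph:
    """Toy stand-in for a DGLGraph-like object (illustration only)."""
    def __init__(self, edges, num_nodes):
        self.edges = edges            # list of (src, dst)
        self.num_nodes = num_nodes
        self.ndata = {}               # feature name -> {node: value}
        self._mailbox = {}

    def send(self, active_edges, message_func):
        # Run the message function only on the active edge set.
        for src, dst in active_edges:
            self._mailbox.setdefault(dst, []).append(
                message_func(self.ndata['h'][src]))

    def recv(self, active_nodes, reduce_func):
        # Aggregate mailbox contents only on the active node set.
        for v in active_nodes:
            if v in self._mailbox:
                self.ndata['h'][v] = reduce_func(self._mailbox.pop(v))

g = MiniGraph(edges=[(0, 1), (1, 2)], num_nodes=3)
g.ndata['h'] = {0: 10, 1: 20, 2: 30}
g.send([(0, 1)], message_func=lambda h: h)   # only edge (0, 1) is active
g.recv([1], reduce_func=sum)                 # only node 1 is updated
```

Node 2 is untouched because it was not in the active set, which is the point of triggering computation on subsets.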

Page 8: Deep Graph Library

Writing GNNs is intuitive in DGL


update_all is a shortcut for send(G.edges()) + recv(G.nodes())

Page 9: Deep Graph Library

Writing GNNs is intuitive in DGL (GAT)

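The core of GAT is the attention computation over each node's incoming edges. A small numeric sketch, using scalar features and made-up attention parameters just to show the LeakyReLU-then-softmax-over-neighbors step:

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_attention(h_dst, h_srcs, a_src=1.0, a_dst=1.0):
    # Unnormalized scores e_ij = LeakyReLU(a_src * h_i + a_dst * h_j),
    # then softmax over the destination node's incoming edges.
    scores = [leaky_relu(a_src * h + a_dst * h_dst) for h in h_srcs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    z = sum(exps)
    return [e / z for e in exps]

# Three neighbors with increasing features get increasing attention.
alphas = gat_attention(h_dst=0.5, h_srcs=[1.0, 2.0, 3.0])
```

In DGL this per-edge score plus per-node softmax maps naturally onto the message/reduce pattern from the earlier slides.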

Page 10: Deep Graph Library

Different scenarios require different support:

• Single giant graph: sampling
• Many moderate-sized graphs: batching graphs
• Dynamic graphs: mutation
• Heterogeneous graphs: heterogeneous graph support
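Batching many moderate-sized graphs into one big disjoint graph can be sketched by offsetting node IDs; this is a plain-Python sketch of the idea, not the library's batching routine:

```python
def batch_graphs(graphs):
    # Each graph is (num_nodes, edge_list). The batched graph is their
    # disjoint union: node IDs of later graphs are shifted by an offset.
    total_nodes, batched_edges, offset = 0, [], 0
    for num_nodes, edges in graphs:
        batched_edges.extend((u + offset, v + offset) for u, v in edges)
        offset += num_nodes
        total_nodes += num_nodes
    return total_nodes, batched_edges

g1 = (3, [(0, 1), (1, 2)])   # a 3-node path
g2 = (2, [(0, 1)])           # a 2-node edge
n, edges = batch_graphs([g1, g2])
```

Because the union is disjoint, one message-passing call over the batched graph trains all member graphs at once without messages leaking between them.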

Page 11: Deep Graph Library

Performance

Page 12: Deep Graph Library

Scalability: single machine, single GPU

[Figures: scalability with graph size and with graph density; up to 3.4x and 7.5x speedups over PyG (PyTorch Geometric)]

Page 13: Deep Graph Library

Scalability: single machine, NUMA

Testbed: X1 instance, 2TB RAM, 128 vCPUs
Data set: Reddit (232K nodes, 114M edges)
Control-variate sampling

Page 14: Deep Graph Library

Scalability: single machine, multi-GPU

Testbed: p3.16xlarge, 8 V100 GPUs, 64 vCPUs
Data set: Reddit (232K nodes, 114M edges)
Trained with neighbor sampling

Page 15: Deep Graph Library

What’s new and what’s in the pipeline?

Page 16: Deep Graph Library

Heterogeneous graph

Example: recommendation systems, GCMC (Graph Convolutional Matrix Completion)
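A heterogeneous graph like the GCMC rating graph keeps one edge set per (source type, relation, destination type) triple. The shape of that representation can be sketched as a dict; the user/movie IDs below are toy data for illustration only:

```python
# Edge sets keyed by canonical edge type, as in a heterogeneous graph.
hetero_edges = {
    ('user', 'rated-1', 'movie'): [(0, 0)],          # one-star rating
    ('user', 'rated-5', 'movie'): [(0, 1), (1, 1)],  # five-star ratings
}

def neighbors_by_relation(hetero_edges, movie_id):
    # Collect, per rating relation, which users rated this movie.
    # GCMC-style models aggregate messages separately per relation.
    return {etype[1]: [u for u, m in pairs if m == movie_id]
            for etype, pairs in hetero_edges.items()}

by_rel = neighbors_by_relation(hetero_edges, movie_id=1)
```

Keeping each relation's edges separate is what lets the model learn a different transformation per rating level.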

Page 17: Deep Graph Library

Supporting Heterogeneous Graph


Page 18: Deep Graph Library

Example: graph convolutional matrix completion

Dataset          RMSE (DGL)   RMSE (Official)   Speed (DGL)     Speed (Official)   Speedup
MovieLens-100K   0.9077       0.910             0.025 s/epoch   0.101 s/epoch      5x
MovieLens-1M     0.8377       0.832             0.070 s/epoch   1.538 s/epoch      22x
MovieLens-10M    0.7875       0.777*            0.648 s/epoch   Long*              -

*Official training on MovieLens-10M has to be run in mini-batch mode and takes over 24 hours.

Page 19: Deep Graph Library

Distributed training: GCN (preliminary)

Neighbor sampling
Data set: Reddit (232K nodes, 114M edges)
Testbed: c5n.18xlarge, 100Gb/s network, 72 vCPUs

Page 20: Deep Graph Library

TF backend (preliminary)

Vanilla TF (TF 1.0) vs. DGL + TF (TF 2.0)

Page 21: Deep Graph Library

DGL Package: DGL-LifeSci

• Utilities for data processing
• Models for molecular property prediction and molecule generation: Graph Conv, GAT, MPNN, AttentiveFP, SchNet, MGCN, ACNN, DGMG, JTNN
• Efficient implementations
• Training scripts
• Pre-trained models, including pre-trained molecule generation models

Page 22: Deep Graph Library

DGL Package: DGL-KE

• An open-source package to efficiently compute knowledge graph embeddings on various hardware:
  • Many-core CPU machines
  • Multi-GPU machines
  • Clusters of machines
• DGL-KE supports popular KGE models:
  • TransE, TransR
  • DistMult, ComplEx, RESCAL
  • RotatE
• Applications: search, recommendation, question answering
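TransE, the first model in the list above, scores a triple (h, r, t) by how well the relation vector translates the head embedding onto the tail. A minimal sketch with made-up two-dimensional embeddings:

```python
import math

def transe_score(h, r, t):
    # TransE: score = -||h + r - t||_2; scores closer to 0 are better.
    return -math.sqrt(sum((hi + ri - ti) ** 2
                          for hi, ri, ti in zip(h, r, t)))

h, r = [0.1, 0.2], [0.3, -0.1]
true_tail = [0.4, 0.1]       # equals h + r: a near-perfect translation
corrupt_tail = [1.0, 1.0]    # a corrupted (negative) tail
```

Training pushes true triples toward score 0 and corrupted triples away from it; the other models in the list differ mainly in this scoring function.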

Page 23: Deep Graph Library

DGL-KE – Focus on high performance

• Maximize locality:
  • METIS graph partitioning to reduce network communication in distributed training
  • Relation partitioning to avoid communication for relations in multi-GPU training
• Increase computation-to-memory intensity:
  • Joint negative sampling to reduce the number of entities in a mini-batch
• Reduce the demands on memory bandwidth:
  • Sparse relation embeddings to reduce computation and data access in a batch
• Hide data access latency:
  • Overlap gradient updates with batch computation
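Joint negative sampling shares one pool of sampled negative entities across all positive triples in a mini-batch, so far fewer entity embeddings must be fetched. A sketch of the idea with a made-up sampler, not DGL-KE's actual code:

```python
import random

def corrupt_batch_jointly(batch, num_entities, num_negatives, seed=0):
    # Sample ONE shared pool of negative entities for the whole batch,
    # instead of num_negatives fresh entities per positive triple.
    rng = random.Random(seed)
    shared = [rng.randrange(num_entities) for _ in range(num_negatives)]
    # Every positive (h, r, t) is corrupted against the same shared pool.
    return [(h, r, neg) for (h, r, t) in batch for neg in shared]

batch = [(0, 0, 1), (2, 1, 3)]
negs = corrupt_batch_jointly(batch, num_entities=100, num_negatives=4)
# Distinct tail embeddings touched: at most 4, not 4 per positive triple.
```

Fewer distinct entities per batch means better cache behavior and less embedding traffic, which is the computation-to-memory intensity gain the slide describes.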

Page 24: Deep Graph Library

DGL-KE: Performance

Datasets: FB15k (15K nodes, 592K edges); Freebase (86M nodes, 338M edges)

• Multi-GPU performance on FB15k: p3.8xlarge instance, up to 8 V100 GPUs
• Many-core performance on Freebase: x1.32xlarge instance, 128 vCPUs, 2TB RAM
• Distributed performance on Freebase: x1.32xlarge instances, 128 vCPUs, 2TB RAM

Page 25: Deep Graph Library

DGL: next step(s)

Timeline, 2018–2020:

• Development started; first prototype
• V0.1 (NeurIPS'18)
• V0.2: sampling APIs
• V0.3: fused message passing; multi-GPU/multi-core
• V0.3.1: NN modules; DGL-LifeSci
• V0.4: heterogeneous graphs; DGL-KE
• V0.5: DGL-RecSys; TF support; distributed training
• Ahead: more model zoos, more NN modules, faster training, …

Page 26: Deep Graph Library

Community

Page 27: Deep Graph Library

Open source, the source of innovation

• 3,975 GitHub stars
• 312K pip downloads (all versions); 8.8K Conda downloads (all versions); 1.8K Anaconda downloads of 0.4.1
• 32 model examples; 28 NN modules (including 14 GNN convolution modules)
• 6 pre-trained models for chemistry
• GCN, generative models, KG, RecSys, …
• 47 contributors; 10 core developers

Page 28: Deep Graph Library

Channels

• Discussion forum (https://discuss.dgl.ai): any questions about DGL; average response time under 1 day
• GitHub Issues (https://github.com/dmlc/dgl/issues): bug reports and feature requests
• Twitter (@GraphDeep): latest news and releases
• WeChat group: 24/7 on-call :)

Page 29: Deep Graph Library

Do you want to contribute?

• Data scientist, researcher, or just an ML lover? Develop new models and applications.
• Tech writer or native speaker? Revise the documentation.
• System hacker? Add more algorithms and operators on graphs.
• Share your work and experience using DGL: https://github.com/dglai/awesome-dgl

Page 30: Deep Graph Library

Q&A

We are hiring!

https://www.dgl.ai