DEEP LEARNING FRAMEWORKS
Where we are and where we should be going
JACK LEE | University of Toronto
AMY WANG | Huawei Canada
BACKGROUND
Architecture
Frontend API
Graph IR
Graph Executor
Kernel Library
INPUTS OUTPUTS
TensorFlow Frontend
Graph IR
Kernel Implementation
MOTIVATIONS
Frontend Interface
TensorFlow Frontend
Autograph
PyTorch Frontend
MOTIVATIONS
Graph Optimizations
PyTorch Frontend
Trace-based JIT
AST-based JIT
TensorFlow Graph IR
XLA Lower Level ops
Automatic differentiation every iteration.
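The difference between the trace-based and AST-based JIT named on this slide can be sketched in plain Python. This is a toy illustration only, not PyTorch's implementation: `Traced`, `trace`, `ast_node_types`, and `MODEL_SRC` are hypothetical names introduced here. The tracer records only the operations executed for one example input, so a data-dependent branch disappears from the recording, while parsing the source keeps the branch visible.

```python
import ast

# Hypothetical model with data-dependent control flow, kept as source text
# so that it can be both executed and parsed.
MODEL_SRC = '''
def model(x):
    if x > 0:
        return x * 2
    return x + 1
'''
namespace = {}
exec(MODEL_SRC, namespace)
model = namespace["model"]


class Traced:
    """Toy proxy value that logs every arithmetic op applied to it."""

    def __init__(self, value, name, log):
        self.value, self.name, self.log = value, name, log

    def _record(self, op, other, result):
        out = Traced(result, f"t{len(self.log)}", self.log)
        self.log.append(f"{out.name} = {op}({self.name}, {other})")
        return out

    def __mul__(self, other):
        return self._record("mul", other, self.value * other)

    def __add__(self, other):
        return self._record("add", other, self.value + other)

    def __gt__(self, other):
        # Resolved eagerly to a plain bool: the branch never reaches the log.
        return self.value > other


def trace(fn, example):
    """Run fn once on a proxy and return the recorded op list."""
    log = []
    fn(Traced(example, "x", log))
    return log


def ast_node_types(source):
    """Return the set of AST node type names found in the source."""
    return {type(node).__name__ for node in ast.walk(ast.parse(source))}


print(trace(model, 3))    # only the multiply path is recorded
print(trace(model, -3))   # only the add path is recorded
print("If" in ast_node_types(MODEL_SRC))  # the AST still contains the branch
```

Each trace is valid only for inputs that take the same path as the example, whereas the AST view has to handle the full language, which is the trade-off between the two approaches.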
MOTIVATIONS
Kernel Specialization
XLA Lower Level ops
Benchmarks
Deep Fusion Tiling
Graph Lowering
Frontend API
INPUTS OUTPUTS
Graph IR
Compiler IR
COMPILED NETWORK
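The fusion item on this slide can be illustrated with a minimal sketch, assuming a chain of elementwise ops. The names `unfused` and `fuse` are hypothetical, and this is plain Python rather than XLA's lowering: the unfused version makes one pass over the data and one temporary buffer per op, while the fused version applies all stages to each element in a single pass.

```python
def unfused(xs, scale, shift):
    """Three separate 'kernels', each materializing a full intermediate list."""
    scaled = [x * scale for x in xs]
    shifted = [x + shift for x in scaled]
    return [max(x, 0.0) for x in shifted]


def fuse(*stages):
    """Combine elementwise stages into one kernel with a single loop."""
    def kernel(xs):
        out = []
        for x in xs:
            for stage in stages:
                x = stage(x)  # intermediate stays a scalar, no temporary list
            out.append(x)
        return out
    return kernel


# Same computation as unfused(xs, 2.0, -3.0), expressed as one fused kernel.
fused = fuse(lambda x: x * 2.0, lambda x: x + (-3.0), lambda x: max(x, 0.0))

data = [1.0, 2.0, 4.0]
print(unfused(data, 2.0, -3.0))
print(fused(data))
```

A real compiler would generate a specialized loop body instead of calling Python closures, but the memory-traffic argument is the same: fewer passes and no intermediate buffers.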
MOTIVATIONS
Kernel Specialization
NNVM API
TVM API
OUTPUTS
Compiler IR
Generated GPU Code
INPUTS OUTPUTS
NNVM Graph IR
TVM Halide IR
COMPILED KERNELS
CUSTOM RUNTIME
STATE OF THE ART SUMMARY
TENSORFLOW | TENSORFLOW XLA | PYTORCH | PYTORCH - GLOW | NNVM + TVM
Staged Frontend
✘ ✘
Native Frontend
✘ ✘ ✘
Graph Optimization
✘
Kernel Specialization
✘ ✘
Runtime Specialization
✘ ✘ ✘ ✘ ✘
Execution Level
C++ | C++ | Python | Machine Code | C++
THE DVM FRONTEND
Deep Learning Compilation Framework
TENSORFLOW | PYTORCH | NATIVE SYNTAX
IR Transformation | IR Transformation | Parser (Clang/Python AST)
IR Builder
THE DVM MIDEND
Deep Learning Compilation Framework
SSA-based IR
Low level ops
Control Flow
Data Flow
Automatic Differentiation
Graph Optimizations
Profile Guided Optimizations
Compiler Optimizations
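How automatic differentiation can operate on an SSA-based IR of low-level ops is sketched below. This is a minimal illustration under stated assumptions, not DVM's IR: `Instr`, `run`, and `grad` are hypothetical names, and only `mul` and `add` are supported. Because every SSA name is assigned exactly once, reverse-mode differentiation reduces to walking the instruction list backwards and accumulating adjoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str    # SSA name, assigned exactly once
    op: str      # "mul" or "add" (the only ops in this sketch)
    args: tuple  # names of two previously defined values


def run(program, env):
    """Evaluate instructions in order and return every SSA value."""
    vals = dict(env)
    for ins in program:
        a, b = (vals[name] for name in ins.args)
        vals[ins.dest] = a * b if ins.op == "mul" else a + b
    return vals


def grad(program, env, output):
    """Reverse-mode AD: propagate adjoints from the output back to the inputs."""
    vals = run(program, env)
    adj = {name: 0.0 for name in vals}
    adj[output] = 1.0
    for ins in reversed(program):
        a, b = ins.args
        g = adj[ins.dest]
        if ins.op == "mul":
            adj[a] += g * vals[b]
            adj[b] += g * vals[a]
        else:  # add
            adj[a] += g
            adj[b] += g
    return {name: adj[name] for name in env}


# y = x * w + b
program = [
    Instr("t0", "mul", ("x", "w")),
    Instr("y", "add", ("t0", "b")),
]
inputs = {"x": 3.0, "w": 2.0, "b": 1.0}
print(run(program, inputs)["y"])
print(grad(program, inputs, "y"))
```

Running the backward pass over the IR once at compile time, rather than rebuilding it on every iteration, is the kind of work a midend of this shape could hoist out of the training loop.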
THE DVM BACKENDS
Deep Learning Compilation Framework
Default Runtime Codegen
Compatible Compiler
Specialized Runtime Source Code
Handwritten Kernel Source Code
Compiled Network
Runtime + Kernel Codegen
Clustered Specialized Runtime Source Code
Fused Specialized Kernel Source Code
Compatible Compiler
Compiled Network
Q&A