Dr. Christoph Angerer, 09.01.2017 CHALLENGES IN MACHINE LEARNING FOR COMPLEX PHYSICAL SYSTEMS


Monitoring Effects of Carbon and Greenhouse Gas Emissions

Minute-by-minute AI Weather Forecasting

insideHPC.com Survey, November 2016

92% believe AI will impact their work

93% of those using deep learning are seeing positive results

DEEP LEARNING IS ENTERING HPC


WHY THE EXCITEMENT?

GPUs as Enablers of Breakthrough Results

[Chart: AlexNet Training Performance, 2013 to 2016. Bars: K40, K80 + cuDNN1, M40 + cuDNN4, P100 + cuDNN5. Vertical axis: 0x to 70x. Callout: 65x in 3 Years]

Paper: H.Zhang et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, arXiv:1612.03242

We can generate photorealistic images from textual descriptions now!

Achieve super-human accuracy in classification

And we are getting faster fast


AGENDA

A Quick Introduction to Neural Networks

Four Questions and Partial Answers

Concluding Remarks


1-SLIDE INTRO TO CONVOLUTIONAL NEURAL NETS

Forward/Backward Propagation

• Forward-propagation: Input → Convolution → Activation → Fully-connected → Classification (Softmax) → Loss Function (Cross Entropy)
• Backward-propagation (gradient computation): gradients 𝛻 flow back from the loss through every layer
• Optimization Algorithm (SGD): weight updates
• All layers are differentiable
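The forward pass, backward pass, and SGD update named on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's code: it assumes a single fully-connected layer followed by softmax and cross-entropy (the convolution and activation layers are left out for brevity), and the function name `train_step` and the toy data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_step(W, b, x, y_onehot, lr=0.1):
    """One forward pass, one backward pass, one SGD weight update."""
    # Forward-propagation: fully-connected layer, softmax, cross-entropy loss
    probs = softmax(x @ W + b)
    loss = -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))
    # Backward-propagation: gradient w.r.t. the logits, then w.r.t. the weights
    dlogits = (probs - y_onehot) / x.shape[0]
    dW = x.T @ dlogits
    db = dlogits.sum(axis=0)
    # SGD: step against the gradient
    return W - lr * dW, b - lr * db, loss

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 5))               # hypothetical toy inputs
labels = (x[:, 0] > 0).astype(int)         # hypothetical toy labels
y = np.eye(2)[labels]
W, b = np.zeros((5, 2)), np.zeros(2)
losses = []
for _ in range(100):
    W, b, loss = train_step(W, b, x, y)
    losses.append(loss)
```

Because every operation is differentiable, the same pattern extends layer by layer to deeper networks; frameworks automate the backward pass.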


1-SLIDE INTRO TO RECURRENT NEURAL NETS

Network + Internal State => Dependencies Over Time

[Diagram: a recurrent network with a loop on itself = the same network unrolled over time steps …]

Diagrams from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
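The "network + internal state" idea can be illustrated with a plain recurrent cell unrolled over time. This sketch assumes a vanilla tanh RNN rather than the LSTM shown in the referenced diagrams; all weight names and sizes are hypothetical.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Apply the same cell at every time step, carrying the hidden state forward."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in xs:                          # one iteration per unrolled step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 3))                # hypothetical: 6 time steps, 3 features
W_xh = rng.normal(size=(4, 3)) * 0.5
W_hh = rng.normal(size=(4, 4)) * 0.5
b_h = np.zeros(4)
states = rnn_forward(xs, W_xh, W_hh, b_h)

# Changing only the first input also changes the last state:
# the internal state carries a dependency over time.
xs2 = xs.copy()
xs2[0] += 1.0
states2 = rnn_forward(xs2, W_xh, W_hh, b_h)
```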


CATEGORIZATION BY SIGNAL

[Diagram, four panels]

• Unsupervised learning: Unlabeled Training Data, Model
• Supervised learning: Training Data, Labels (expected results), Model, Testing Data
• Reinforcement learning: Model, Environment
• One-shot learning: Very small set of training data, Model, Use


CATEGORIZATION BY INPUT/OUTPUT

Diagram from: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Image Classification

Image Captions

Sentiment Analysis

Text Recognition

Generative (diabolo and others)

Recurrent

Auto-encoder, GANs


WHAT IF I DON’T HAVE ENOUGH TRAINING DATA?


HOW MUCH TRAINING DATA IS NEEDED?

A recursive answer

• No general answer; need to experiment
  • Test error >> training error: probably need more data (overfitting?)
  • Test error ≈ training error: more data probably doesn’t help
  • Look at learned filters: noisy filters generally want more training
• For N functions, need > log(N)+c training cases (see: A Theory of the Learnable, L.G. Valiant, 1984)
  • Example: N parameters of type float32 = at most 2^(32N) distinct networks, so > log2(2^(32N)) = 32N samples
• Rough Guideline: some constant (e.g. 10) multiple of # parameters to avoid overfitting
• Batch normalization, regularization, etc. can give improvement
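The counting argument and the rule of thumb above reduce to simple arithmetic. The helpers below are a hypothetical sketch: the function names and the example network size are assumptions, and the numbers are rough guidance only.

```python
import math

def min_samples_counting_bound(n_params, bits_per_param=32):
    """log2 of the number of distinct networks: log2(2 ** (32 * N)) = 32 * N."""
    return n_params * bits_per_param

def rule_of_thumb_samples(n_params, multiple=10):
    """Rough guideline from the slide: a constant multiple of the parameter count."""
    return multiple * n_params

n = 1000                                    # hypothetical network size
print(min_samples_counting_bound(n))        # 32000
print(rule_of_thumb_samples(n))             # 10000
print(math.log2(2 ** (32 * 4)))             # 128.0, i.e. 32 * N for N = 4
```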


HOW LARGE SHOULD MY NETWORK BE?

A recursive answer

• Depends on the amount of training data available
  • Too small: bad generalization; too large: overfitting
• And on the complexity of the function to be learned [1]
  • 1-hidden layer (grows exponentially) vs. deep networks (may grow linearly)
• Rough Design Guideline:
  • First and last layer are given by the model
  • Number of nodes of a hidden layer somewhere between the size of its input and output layer
  • Number of nodes in a layer should be < 2 * #input nodes to avoid overfitting
• The rest is Art(?)

[1] Y.Bengio, Y.LeCun. Scaling learning algorithms towards AI. Large-scale Kernel Machines, 2007
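To relate network size to the data guidelines, it helps to count parameters and check the hidden-layer rules of thumb. The sketch below assumes a plain fully-connected network; the helper names and the layer widths are hypothetical.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully-connected network, e.g. [784, 100, 10]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def hidden_size_ok(n_in, n_hidden, n_out):
    """Slide guidelines: between input and output size, and below 2 * inputs."""
    between = min(n_in, n_out) <= n_hidden <= max(n_in, n_out)
    return between and n_hidden < 2 * n_in

sizes = [784, 100, 10]                      # hypothetical layer widths
print(mlp_param_count(sizes))               # 79510
print(hidden_size_ok(784, 100, 10))         # True
print(hidden_size_ok(784, 2000, 10))        # False
```

Multiplying the parameter count by the constant from the previous slide (e.g. 10) gives a first estimate of how much training data such a network would want.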


HOW TO GET MORE TRAINING DATA?

And their Labels

• Data Augmentation and Data Synthesis
  • e.g., adding artificial background noise to speech samples (10x increase for Baidu)
  • e.g., adding shifts, rotations, distortions to images
• Training and Testing on Simulators
  • Google DeepMind Lab
  • Self-Driving Vehicles (Playing for Data: Ground Truth from Computer Games, S.Richter et al., ECCV, 2016)
  • Robotics
• One-shot Learning, GANs, Autoencoders?
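Image augmentation by shifts, rotations, and added noise might look like the following. This is a minimal sketch under stated assumptions: shifts wrap around the image border, rotations are limited to multiples of 90 degrees, and the function name, parameters, and image are hypothetical. The label of each augmented copy stays that of the original image.

```python
import numpy as np

def augment_image(img, rng, max_shift=2, noise_std=0.05):
    """Return a randomly shifted, rotated, noisy copy of a square 2-D image."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(img, (dy, dx), axis=(0, 1))        # wrap-around shift, for brevity
    out = np.rot90(out, k=int(rng.integers(0, 4)))   # rotation by 0/90/180/270 degrees
    return out + rng.normal(0.0, noise_std, size=out.shape)

rng = np.random.default_rng(0)
img = rng.random((8, 8))                             # hypothetical 8x8 training image
batch = np.stack([augment_image(img, rng) for _ in range(10)])
```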


From: K.Cranmer. Machine Learning & Likelihood Free Inference in Particle Physics, NIPS 2016

EXAMPLE: PARTICLE PHYSICS (CERN)


MY DATA IS SYMMETRIC OR INVARIANT IN XYZ?


INVARIANTS AND SYMMETRIES IN DATA

Pattern Recognition

• CNNs don’t understand invariants and symmetries out of the box
  • Pooling and downsampling help with some transformations
• (Training and test-time) data augmentation may explode the training set
  • Scale/rotate/transform/perturb each training image many times?
• Approaches:
  • Teach networks about certain symmetries (e.g. rotation)
  • Normalize/preprocess data to ensure well-known layout
  • Find encoding of the data that is invariant to certain operations


EXPLOITING SYMMETRY IN CONV NETS

Teaching CNNs about Rotation

From: S.Dieleman et al. Exploiting Cyclic Symmetry in Convolutional Neural Networks, CoRR, 2016
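One way to read "teaching CNNs about rotation" is to run the same network on the four 90-degree rotations of the input and pool the results, which makes the output invariant to those rotations. The sketch below is a simplified illustration of that idea only, not the cited paper's implementation; the `features` function is a hypothetical stand-in for a CNN.

```python
import numpy as np

def features(img, kernel):
    """Hypothetical stand-in for a CNN: valid cross-correlation, ReLU, global max."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.array([[np.sum(img[i:i + kh, j:j + kw] * kernel)
                     for j in range(w - kw + 1)]
                    for i in range(h - kh + 1)])
    return np.maximum(out, 0.0).max()

def cyclic_pooled_features(img, kernel):
    """Run the same network on the 4 rotated copies and pool (mean) over them."""
    return np.mean([features(np.rot90(img, k), kernel) for k in range(4)])

rng = np.random.default_rng(0)
img = rng.normal(size=(6, 6))
kernel = rng.normal(size=(3, 3))
a = cyclic_pooled_features(img, kernel)
b = cyclic_pooled_features(np.rot90(img), kernel)    # rotated input, same pooled output
```

The pooled value is the same for an image and its rotated copy because both see the same set of four orientations, only in a different order.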


NORMALIZING AND PRE-PROCESSING

DL trained on jet images vs. physically-motivated feature driven approaches

From: L. de Oliveira et al., Jet-Images -- Deep Learning Edition, JHEP07, 2016


INVARIANT ENCODING

Molecules are encoded as Vectors of Nuclear Charges and Inter-atomic Distance Matrices

=> Translation and rotation Invariant Representation

From: K.Schütt et al., Quantum-Chemical Insights from Deep Tensor Neural Networks, arXiv:1609.08259
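The invariance claim can be checked numerically: inter-atomic distances do not change when a molecule is rotated or translated. The snippet is a minimal sketch; the molecule, its coordinates, and the transformation are hypothetical values chosen for illustration.

```python
import numpy as np

def distance_matrix(positions):
    """Pairwise Euclidean distances between atoms, shape (n_atoms, n_atoms)."""
    diff = positions[:, None, :] - positions[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Hypothetical water-like molecule: nuclear charges and 3-D coordinates
charges = np.array([8, 1, 1])
pos = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])

theta = 0.7                                          # rotate about the z-axis ...
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
moved = pos @ R.T + np.array([1.0, -2.0, 3.0])       # ... and translate

D, D_moved = distance_matrix(pos), distance_matrix(moved)
print(np.allclose(D, D_moved))                       # True
```

The pair (charges, D) therefore describes the molecule without depending on where it sits or how it is oriented.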


HOW DO I REPRESENT MY DATA IN NEURAL NETWORKS?


SIMPLE EXAMPLE: CLASSIFICATION

One-Hot Encoding

Training Data    Scalar Encoding    One-Hot Encoding
[figure]         [0.0]              [1,0,0,0,0,0,0,0,0,0]
[figure]         [1.0]              [0,1,0,0,0,0,0,0,0,0]
[figure]         [2.0]              [0,0,1,0,0,0,0,0,0,0]
[figure]         [3.0]              [0,0,0,1,0,0,0,0,0,0]
[figure]         [4.0]              [0,0,0,0,1,0,0,0,0,0]
[figure]         [5.0]              [0,0,0,0,0,1,0,0,0,0]
[figure]         [6.0]              [0,0,0,0,0,0,1,0,0,0]
[figure]         [7.0]              [0,0,0,0,0,0,0,1,0,0]
[figure]         [8.0]              [0,0,0,0,0,0,0,0,1,0]
[figure]         [9.0]              [0,0,0,0,0,0,0,0,0,1]
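The mapping from scalar labels to one-hot vectors is short to write down. A minimal NumPy sketch follows; the function name `one_hot` is an assumption for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer class labels to rows with a single 1 at the label index."""
    encoded = np.zeros((len(labels), num_classes), dtype=int)
    encoded[np.arange(len(labels)), labels] = 1
    return encoded

labels = np.array([0, 3, 9])
encoded = one_hot(labels, 10)
print(encoded[1])                # [0 0 0 1 0 0 0 0 0 0]
print(encoded.argmax(axis=1))    # [0 3 9], decoding back to scalar labels
```

Unlike the scalar encoding, the one-hot form does not suggest that class 8 is "closer" to class 9 than to class 0.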


IMAGE SEGMENTATION & BOUNDING BOXES

Creative use of feature channels

Diagram From: B.Li, T.Zhang, T.Xia. Vehicle Detection from 3D Lidar Using Fully Convolutional Network, CoRR, 2016

GIF from: https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/

Downsampling, Upsampling (same-size output)

1 Channel per Object Class (incl. Background)

1 Channel per BB point coordinate
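The "1 channel per object class" target can be built from an integer class map as sketched below. The function name, the tiny mask, and the class numbering are hypothetical; the bounding-box coordinate channels are not shown.

```python
import numpy as np

def mask_to_channels(mask, num_classes):
    """(H, W) integer class map -> (num_classes, H, W) binary channels."""
    return (np.arange(num_classes)[:, None, None] == mask[None]).astype(np.float32)

mask = np.array([[0, 0, 1],
                 [0, 2, 1]])     # hypothetical: 0 = background, 1 and 2 = object classes
channels = mask_to_channels(mask, 3)
print(channels.shape)            # (3, 2, 3)
```

Taking the argmax over the channel axis of a network's same-size output recovers a per-pixel class map.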


ORGANIZING SPEECH INTO FEATURE MAPS

Reducing Problems to Image Recognition

From: O.Abdel-Hamid et al. Convolutional Neural Networks for Speech Recognition, IEEE/ACM Trans. Audio, Speech, and Lang. Proc, 2014
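A simple way to turn audio into an image-like feature map is a frequency-by-time spectrogram. The sketch below assumes a plain log-magnitude short-time Fourier transform, which is a swapped-in simplification and not the filter-bank features of the cited paper; the sampling rate, frame sizes, and test tone are hypothetical.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Frame the signal, window each frame, and take log-magnitude FFTs."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    return np.log(spectrum + 1e-8).T           # rows: frequency, columns: time

sr = 8000                                       # hypothetical sampling rate in Hz
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 1000 * t)           # 1 kHz test tone, one second long
feature_map = log_spectrogram(signal)
print(feature_map.shape)                        # (129, 61)
```

The resulting 2-D array can be fed to a convolutional network in the same way as a single-channel image.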


ENCODING TIME SERIES AS IMAGES

Gramian Angular Fields (GAF) and Markov Transition Fields (MTF)

From: Z.Wang, T.Oates. Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks, AAAI Workshop, 2015
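A Gramian Angular Field can be computed in a few lines. The sketch assumes the summation variant (entries cos(φ_i + φ_j) after rescaling the series to [-1, 1] and setting φ = arccos(x)) and a non-constant input series; the function name and the sine-wave input are hypothetical. The Markov Transition Field is not shown.

```python
import numpy as np

def gramian_angular_field(series):
    """Gramian Angular Summation Field: cos(phi_i + phi_j) for a 1-D series."""
    x = np.asarray(series, dtype=float)
    x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0   # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))                # polar-coordinate angle
    return np.cos(phi[:, None] + phi[None, :])

t = np.linspace(0.0, 2.0 * np.pi, 32)
gaf = gramian_angular_field(np.sin(t))                    # 32x32 "image" of the series
```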


DL FOR SIGNAL PROCESSING

Looking for Gravitational Waves

From: D.George, E.A.Huerta. Deep Neural Networks to Enable Real-time Multimessenger Astrophysics, arXiv:1701.00008 [astro-ph.IM]

Classifier: Detect Presence of GWs

Regression: Parameter Estimation (i.e., masses of the two black holes)


HOW CAN I TRUST THE NETWORK?


“DEEP NEURAL NETS ARE BLACK BOXES”

… even if you can look at the internals…

• If a network performs well on the test data and appears to work reasonably well on real data…
  • Can we trust it?
  • Are there formal error bounds on the recognition accuracy?
  • E.g., would you trust a trained NN to operate your nuclear power plant?
• Field of active research (DARPA, MIT, Capital One, many others)
  • Debugging and Understanding NN behavior
  • Rationales for network decisions


ATTACKING NEURAL NETWORKS

Spoofing and Malicious Misclassification

1. Run input x through the classifier model
2. Derive a perturbation tensor that maximizes chances of misclassification:
   1. Find blind spots in input space; or
   2. Linear perturbation in direction of neural network’s cost function gradient; or
   3. Select only input dimensions with high saliency*
3. Apply scaled effective perturbation (δx) to x
   1. Larger perturbation == higher probability for misclassification
   2. Smaller perturbation == less likely for human detection

[Figure: Small Pixel-level Perturbations. Recognized Amount: 900.00]
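The three steps can be sketched for the gradient-direction variant (step 2.2), commonly called the fast gradient sign method. The example assumes a hypothetical logistic-regression classifier so that the gradient with respect to the input has a closed form; the weights, input, and step size are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Shift x by eps along the sign of the loss gradient w.r.t. the input."""
    p = sigmoid(w @ x + b)               # step 1: run x through the classifier
    grad_x = (p - y) * w                 # step 2: d(cross-entropy)/dx for this model
    return x + eps * np.sign(grad_x)     # step 3: apply the scaled perturbation

# Hypothetical linear classifier and input
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.2, 0.4])           # w @ x = 0.9, classified as class 1
y = 1.0
x_small = fgsm_perturb(x, y, w, b, eps=0.1)   # w @ x_small = 0.55, still class 1
x_adv = fgsm_perturb(x, y, w, b, eps=0.4)     # w @ x_adv = -0.5, now class 0
```

The two step sizes show the trade-off from the list above: the larger perturbation flips the prediction, while the smaller one is harder to notice but leaves the class unchanged.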


LOOKING INSIDE NEURAL NETS

Debugging, Understanding, Verifying

• Inspecting the NN
  • Visualize activations, filters, generate input that maximizes activation of a neuron
  • Occlude parts of the input and check expectations
  • (e.g., http://cs231n.github.io/understanding-cnn/)
• Capture Model Confidence, Estimate Uncertainty
  • Place Gaussian Distribution over Weights => Bayesian Neural Networks
  • Y.Gal. Uncertainty in Deep Learning, PhD Thesis, University of Cambridge, 2016
• How to gain scientific insight from a trained network?
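The occlusion check can be sketched as follows: cover one patch of the input at a time and record how much the model's score drops. This is a minimal illustration; `toy_model` is a hypothetical stand-in for a trained network, and the patch size and fill value are assumptions.

```python
import numpy as np

def occlusion_map(model, img, patch=2, fill=0.0):
    """Drop in model score when each patch of the input is occluded."""
    base = model(img)
    heat = np.zeros((img.shape[0] // patch, img.shape[1] // patch))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            occluded = img.copy()
            occluded[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            heat[i, j] = base - model(occluded)
    return heat

# Hypothetical "model": scores only the top-left 2x2 corner of the image
def toy_model(img):
    return float(img[:2, :2].sum())

img = np.ones((4, 4))
heat = occlusion_map(toy_model, img)
print(heat)    # only the top-left entry is non-zero
```

If the hot regions of such a map do not match what a domain expert expects the network to look at, that is a reason to distrust the prediction.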


CONCLUDING REMARKS


DEEP LEARNING —A NEW COMPUTING MODEL

“Software that writes software”

“little girl is eating piece of cake”

LEARNING ALGORITHM

“millions of trillions of FLOPS”


PIONEERS ADOPTING HPC FOR DEEP LEARNING

“Investments in computer systems — and I think

the bleeding-edge of AI, and deep learning

specifically, is shifting to HPC — can cut down

the time to run an experiment from a week to

a day and sometimes even faster.”

— Andrew Ng, Baidu

Dr. Andrew Ng, Chief Scientist, Baidu


ROOM FOR FUTURE WORK

The Four Questions Revisited

• Data Acquisition
  • How to get enough high-quality labeled data (or unlabeled learning or less need for input data); how much simulated data is okay without generating artefacts?
• Exploiting Properties of the Data (Symmetries, Invariants, …)
  • To speed up learning, improve precision, guarantee properties
• Data Representations, especially for non-image data
• Trusting the Network
  • Formal verification, attack models, error bounds, cost of misclassification, gaining scientific insight
• Other topics: new use cases (signal processing), how to design networks, new layer types, generative models, debugging, optimizing


NVIDIA EXPERTISE AT EVERY STEP

• Solution Architects: 1:1 support; Network training setup; Network optimization
• Deep Learning Institute: Certified expert instructors; Worldwide workshops; Online courses
• GTC Conferences: Epicenter of industry leaders; Onsite training; Global reach
• Global Network of Partners: NVIDIA Partner Network; OEMs; Startups


NVIDIA DEEP LEARNING PARTNERS

• Graph Analytics
• Enterprises
• Data Management
• DL Frameworks
• Enterprise DL Services
• Core Analytics Tech


REFERENCES

• Y.Bengio, Y.LeCun. Scaling learning algorithms towards AI. Large-scale Kernel Machines, 2007
• K.Cranmer. Machine Learning & Likelihood Free Inference in Particle Physics, NIPS 2016
• S.Dieleman et al. Exploiting Cyclic Symmetry in Convolutional Neural Networks, CoRR, 2016
• L. de Oliveira et al. Jet-Images -- Deep Learning Edition, JHEP07, 2016
• K.Schütt et al. Quantum-Chemical Insights from Deep Tensor Neural Networks, arXiv:1609.08259
• O.Abdel-Hamid et al. Convolutional Neural Networks for Speech Recognition, IEEE/ACM Trans. Audio, Speech, and Lang. Proc, 2014
• Z.Wang, T.Oates. Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks, AAAI Workshop, 2015
• D.George, E.A.Huerta. Deep Neural Networks to Enable Real-time Multimessenger Astrophysics, arXiv:1701.00008 [astro-ph.IM]
• B.Li, T.Zhang, T.Xia. Vehicle Detection from 3D Lidar Using Fully Convolutional Network, CoRR, 2016
• https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/
• H.Zhang et al. StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, arXiv:1612.03242


Dr. Christoph Angerer