Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroimaging


Diagnosis of Alzheimer’s disease with deep learning

2016. 7. 4

Seonho Park


Outline

• Introduction to Machine Learning
• Convolutional Neural Network
• Diagnosis of Alzheimer’s disease


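Introduction to Machine Learning: Category of Machine Learning

• Supervised Learning: classification, regression
• Unsupervised Learning: clustering

[Figure: example plots for classification (x1 vs. x2), regression (x1 vs. y), and clustering (x1 vs. x2)]

Supervised learning
• Training: data + labels (problem + answer pairs) → machine learning training → machine learning model
• Prediction: new data (problem + ???) → machine learning model → predicted answer
• Tasks: classification, regression

Unsupervised learning
• Unlabeled data (e.g., Cat, Computer, Lion, Pencil, Pig) → machine learning training → clustering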

Introduction to Machine Learning: Scikit-Learn

• Machine Learning Library in Python
• http://scikit-learn.org/
• Classification: Decision trees, SVM, NN
• Regression: GP, Ordinary LS, Ridge Regression, SVR
• Clustering: k-Means, Spectral Clustering

Introduction to Machine Learning: Why Deep Learning?

• Deep Learning = Deep Neural Network
• Data and Machine Learning

† http://cs229.stanford.edu/materials/CS229-DeepLearning.pdf

Introduction to Machine Learning: Artificial neural networks

• Training = find the weights (parameters)
• Inference = compute the output for a specific input using the trained weights
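The distinction between training and inference can be sketched with a single neuron. This is only an illustration: the sigmoid activation and the parameter values are assumptions, with the fixed numbers standing in for whatever training would have found.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def infer(x, weights, bias):
    """Forward pass of one neuron: weighted sum followed by a sigmoid."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Hypothetical "trained" parameters; training is the process that would find them.
trained_weights = [2.0, -1.0]
trained_bias = 0.5

# Inference: the parameters stay fixed, only the input changes.
output = infer([1.0, 3.0], trained_weights, trained_bias)
print(round(output, 4))
```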

Introduction to Machine Learning: Convolutional Neural Network (CNN)

• Image Processing (Computer Vision)

Introduction to Machine Learning: Recurrent Neural Network (RNN)

• Time Series Data
• Natural Language Processing
• Used for automatic translation, speech recognition, image caption generation, and similar tasks

† Towards End-to-End Speech Recognition with Recurrent Neural Networks, Alex Graves et al (2014)

Introduction to Machine Learning: Why GPU?

• cuDNN: GPU-accelerated library of primitives for deep neural networks
• VRAM limitation, Double/Single/Half Precision
• Linear Algebra: cuBLAS, MAGMA

Introduction to Machine Learning: Frameworks

• Cuda-ConvNet
• Pylearn2
• Lasagne

Introduction to Machine Learning: Open-Source Software for Deep Learning

† Comparative Study of Deep Learning Software Frameworks, Soheil Bahrampour et al (2015)

Introduction to Machine Learning: Pioneers

• Yann LeCun
• Geoffrey Hinton
• Yoshua Bengio
• Andrew Ng
• Jürgen Schmidhuber

Introduction to Machine Learning: Applications

• Image Recognition
• Speech Recognition
• Auto Caption
• Self Driving Car
• Natural Language Processing
• Recommendation System


Convolutional Neural Network: Overview

• Classification
• Convolution Operation + MLP
• Architecture
  • Convolutional Layer (Convolution Operator, Activation)
  • Subsampling (Downsampling, Pooling)
  • Fully Connected Layer
  • Classifier

Convolutional Neural Network: LeNet5† (Convolution Operation)

† Gradient-Based Learning Applied to Document Recognition, Yann LeCun et al (1998)

• Digit Recognition
• Weight matrix (filter): 4D tensor [# of features at layer m, # of features at layer m-1, height, width]
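The 4D filter layout can be made concrete with a naive pure-Python sketch. It assumes no padding and stride 1, and it computes cross-correlation (no kernel flip), which is what deep learning libraries usually call convolution. The function name `conv2d` and the toy filters are illustrative only.

```python
def conv2d(inputs, filters):
    """Valid (no padding), stride-1 cross-correlation.

    inputs:  [in_features][H][W]
    filters: [out_features][in_features][kh][kw]
    returns: [out_features][H-kh+1][W-kw+1]
    """
    in_features = len(inputs)
    height, width = len(inputs[0]), len(inputs[0][0])
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    output = []
    for f in filters:                      # one output feature map per filter
        fmap = []
        for i in range(height - kh + 1):
            row = []
            for j in range(width - kw + 1):
                total = 0.0
                for c in range(in_features):   # sum over input feature maps
                    for di in range(kh):
                        for dj in range(kw):
                            total += inputs[c][i + di][j + dj] * f[c][di][dj]
                row.append(total)
            fmap.append(row)
        output.append(fmap)
    return output

# One 4x4 input feature map and two 3x3 filters: filter tensor shape [2, 1, 3, 3].
image = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
box = [[[1.0] * 3 for _ in range(3)]]                           # sums a 3x3 window
center = [[[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]]  # picks the centre pixel
result = conv2d(image, [box, center])
print(result)
```

Two filters over one input map give two 2x2 output maps, matching the [out_features, in_features, height, width] layout.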

Convolutional Neural Network: Activation function (nonlinearity)

† Systematic evaluation of CNN advances on the ImageNet, Dmytro Mishkin, et al (2016)
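The slide's comparison figure did not survive extraction. As a rough illustration, a few commonly used nonlinearities can be written as follows; the selection and the leaky slope of 0.01 are assumptions, not the set evaluated in the cited paper.

```python
import math

def relu(z):
    return max(0.0, z)

def leaky_relu(z, slope=0.01):
    # Keeps a small gradient for negative inputs instead of zeroing them.
    return z if z > 0 else slope * z

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

activations = {"relu": relu, "leaky_relu": leaky_relu, "sigmoid": sigmoid, "tanh": math.tanh}
for name, fn in activations.items():
    print(name, [round(fn(z), 3) for z in (-2.0, 0.0, 2.0)])
```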

Convolutional Neural Network: Pooling Layer

• Erase Noise
• Reduce Feature Map Size (Saves Memory)

† Systematic evaluation of CNN advances on the ImageNet, Dmytro Mishkin, et al (2016)
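How pooling shrinks a feature map can be seen in a small max-pooling sketch. The 2x2 window with stride 2 is an assumed, common configuration, and `max_pool2d` is an illustrative name.

```python
def max_pool2d(fmap, size=2, stride=2):
    """Max pooling over a 2D feature map given as a list of rows (no padding)."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(
                fmap[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```

The 4x4 map becomes 2x2, so the next layer stores and processes a quarter of the values.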

Convolutional Neural Network: Training

• Error (Loss) Function: Categorical Cross Entropy
• Design Variables: weights (W), biases (b)
• Backpropagation, used in conjunction with an optimization method such as gradient descent
• Vanishing gradient
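As a minimal sketch of the loss named above, the following computes a softmax over class scores and the categorical cross entropy for one example. The logits are made-up values. The last line gives the gradient with respect to the logits, which is the quantity backpropagation passes to earlier layers.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, target_index):
    """Loss for one example whose true class is target_index."""
    return -math.log(probs[target_index])

logits = [2.0, 1.0, 0.1]          # hypothetical network outputs for 3 classes
probs = softmax(logits)
loss = categorical_cross_entropy(probs, target_index=0)

# Gradient of the loss with respect to the logits: probs - one_hot(target).
grad = [p - (1.0 if k == 0 else 0.0) for k, p in enumerate(probs)]
print(round(loss, 4), [round(g, 4) for g in grad])
```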

Convolutional Neural Network: Mini-Batch Method

• Computational Efficiency
• Memory Use
• Iteration & Epoch

Vanilla Gradient Descent

Stochastic Gradient Descent
• Parameter update for each training example x^(i) and label y^(i)
• Step size (η) is typically set to 10^-3
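The relation between mini-batches, iterations, and epochs can be sketched on a toy regression problem. The data, the batch size of 10, and the 20 epochs are assumptions for illustration; only the step size 10^-3 comes from the slide.

```python
import random

def minibatches(data, batch_size, rng):
    """Yield shuffled mini-batches; one pass over all of them is one epoch."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

def mse(w, b, data):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

# Toy data from y = 2x + 1 (hypothetical, for illustration only).
data = [(x / 10.0, 2.0 * x / 10.0 + 1.0) for x in range(100)]
w, b = 0.0, 0.0
eta = 1e-3          # step size from the slide
batch_size = 10
rng = random.Random(0)

initial_loss = mse(w, b, data)
iterations = 0
for epoch in range(20):
    for batch in minibatches(data, batch_size, rng):
        # Gradient of the mean squared error over this mini-batch only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
        grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
        w -= eta * grad_w
        b -= eta * grad_b
        iterations += 1   # one iteration = one parameter update

final_loss = mse(w, b, data)
print(iterations, round(initial_loss, 2), round(final_loss, 4))
```

With 100 examples and a batch size of 10, one epoch is 10 iterations, so 20 epochs give 200 parameter updates.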

Convolutional Neural Network: Training (Optimization)

• Update Functions
• Second-order methods (e.g., L-BFGS) are not common in practice
• NAG (Nesterov accelerated gradient) is more standard
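The slide's update formulas did not survive extraction. The sketch below shows three widely used first-order update rules (plain gradient descent, momentum, and NAG) on the toy objective f(θ) = θ². The objective and the values of η and μ are assumptions for illustration.

```python
def grad(theta):
    """Gradient of the toy objective f(theta) = theta**2."""
    return 2.0 * theta

def sgd_step(theta, eta):
    return theta - eta * grad(theta)

def momentum_step(theta, v, eta, mu):
    # Velocity accumulates past gradients, damped by mu.
    v = mu * v - eta * grad(theta)
    return theta + v, v

def nag_step(theta, v, eta, mu):
    # Nesterov: evaluate the gradient at the look-ahead point theta + mu * v.
    v = mu * v - eta * grad(theta + mu * v)
    return theta + v, v

eta, mu = 0.1, 0.9
theta_sgd = theta_mom = theta_nag = 5.0
v_mom = v_nag = 0.0
for _ in range(100):
    theta_sgd = sgd_step(theta_sgd, eta)
    theta_mom, v_mom = momentum_step(theta_mom, v_mom, eta, mu)
    theta_nag, v_nag = nag_step(theta_nag, v_nag, eta, mu)

print(theta_sgd, theta_mom, theta_nag)
```

The only difference between momentum and NAG is where the gradient is evaluated. On this toy problem the look-ahead damps the oscillation that plain momentum shows.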

Convolutional Neural Network: Overfitting and Regularization

• Dropout
• Relaxation: Add Regularization Term to Loss Function
• Remove Layer (Reduce Parameters), Add Feature

† Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Nitish Srivastava et al (2014)
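A minimal sketch of dropout follows. It uses the "inverted" variant, which rescales the kept activations during training. The cited paper instead scales the weights at test time, which has the same expected effect. The drop probability of 0.5 and the helper name `dropout` are assumptions.

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop during training
    and rescale the survivors so the expected activation is unchanged."""
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 10000
dropped = dropout(hidden, p_drop=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)
print(round(kept_fraction, 3), round(mean_activation, 3))
```

At inference time (`training=False`) the layer passes activations through unchanged.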

Convolutional Neural Network: Local Optimum?

† Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, Yann N. Dauphin et al (2014)

• Non-convex optimization problem
• A deeper and more profound difficulty originates from the proliferation of saddle points, not local minima, especially in high-dimensional problems of practical interest

Convolutional Neural Network: Parallel Computation

• Architecture Parallel: divide the channels across devices
• Data Parallel: divide the batch across devices
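The data-parallel idea can be sketched without any GPU code. Split a batch into equal shards, compute a gradient per shard, and average; for a mean loss and equally sized shards this equals the gradient over the whole batch. The toy model, the batch of 80 pairs, and the two "devices" are assumptions for illustration.

```python
def gradient(w, batch):
    """Mean gradient of the squared error (w * x - y)**2 over a batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_gradient(w, batch, num_devices):
    """Split the batch evenly, compute per-device gradients, then average.

    Assumes len(batch) is divisible by num_devices.
    """
    shard_size = len(batch) // num_devices
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_devices)]
    per_device = [gradient(w, shard) for shard in shards]  # would run concurrently
    return sum(per_device) / num_devices

# Hypothetical batch of 80 (x, y) pairs split across 2 devices.
batch = [(float(i), 3.0 * i) for i in range(80)]
full = gradient(0.5, batch)
parallel = data_parallel_gradient(0.5, batch, num_devices=2)
print(full, parallel)
```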

Convolutional Neural Network: ILSVRC

• Evaluate algorithms for object detection and image classification at large scale
• Training: 1.3M images / Test: 100k images, 1000 categories

AlexNet

• ILSVRC12 1st Place
• 15.3% top-5 error rate (2nd place achieved 26.2%)
• Architecture Parallel (2 GPUs used)

† ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky et al. (2012)


VGG Net

• Visual Geometry Group, University of Oxford (authors later at DeepMind)
• ILSVRC14 2nd Place
• 6.8% error rate

† Very Deep Convolutional Networks for Large-Scale Image Recognition, Karen Simonyan et al. (2014)


GoogLeNet

• Google
• Inception module
• ILSVRC14 1st Place
• 6.67% error rate

† Going Deeper with Convolutions, Christian Szegedy et al. (2014)


MSRA

• Microsoft
• PReLU activation
• Weight initialization
• 4.94% error rate (surpasses human-level performance, 5.1%)

† Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, Kaiming He et al. (2015)


Inception-v3

• Google
• Inception Module Upgrade
• 50 GPUs
• 3.46% error rate
• Public Use with TensorFlow

† Rethinking the Inception Architecture for Computer Vision, Christian Szegedy et al. (2015)


Convolutional Neural Network: Deep Neural Networks are Easily Fooled†

† Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, A Nguyen et al (2014)

• It is possible to produce images that are totally unrecognizable to human eyes yet are classified by a DNN with high confidence
• Reveals interesting differences between human vision and current DNNs
• Raises questions about the generality of DNN computer vision

Convolutional Neural Network: Neural Style

† A Neural Algorithm of Artistic Style, Leon A. Gatys et al (2015)

• Style + Contents reconstruction
• Caffe framework
• https://github.com/jcjohnson/neural-style


Diagnosis of Alzheimer’s disease: Traditional Diagnosis

• Review medical history
• Mini-Mental State Examination (MMSE)
• Physical Exam
• Neurological Exam
• Brain Image: Structural (MRI, CT), Functional (fMRI)
• NC (Normal Condition), MCI (Mild Cognitive Impairment), AD
• AD: Vascular/Non-Vascular

Diagnosis of Alzheimer’s disease: AD Patients’ MRI Features

• Temporal Lobe: Hippocampus
• Ventricle

Diagnosis of Alzheimer’s disease: Case Study, Machine Learning for Diagnosing AD

• PET, MRI images
• Patch Extraction
• Restricted Boltzmann Machine
• Accuracy: 92.4% (MRI), 95.35% (MRI+PET)

† Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis, Heung-Il Suk et al (2014)

Diagnosis of Alzheimer’s disease: Case Study, Machine Learning for Diagnosing AD

• Feature: Cortical Thickness
• FreeSurfer
• Linear discriminant analysis (LDA)
• Performance: Sensitivity 82%, Specificity 93%
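Sensitivity and specificity are computed from the confusion matrix as sketched below. The counts are hypothetical, chosen only so that the formulas reproduce the reported percentages; they are not taken from the cited study.

```python
def sensitivity(tp, fn):
    """True positive rate: fraction of actual patients that are detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: fraction of healthy subjects correctly cleared."""
    return tn / (tn + fp)

# Hypothetical confusion-matrix counts (per 100 patients and 100 healthy subjects).
tp, fn = 82, 18
tn, fp = 93, 7
print(f"sensitivity={sensitivity(tp, fn):.0%}, specificity={specificity(tn, fp):.0%}")
```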

† Individual subject classification for Alzheimer’s disease based on incremental learning using a spatial frequency representation of cortical thickness data, Young-Sang Cho et al (2012)

Diagnosis of Alzheimer’s disease: Preprocessing

• Data Set: about 1,400 T1 MRI scans from SMC
• FreeSurfer: Skull Stripping, which reduces the volume size: [256,256,256] → [190,190,190] / 67MB → 27MB
• Pixel Value Normalization: [0,255] → [-1,1]
• Mirrored cropping
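The normalization and the size reduction can be checked with a few lines of arithmetic. The 4-byte (float32) voxel size is an assumption; it is consistent with the 67MB and 27MB figures above. The helper names are illustrative.

```python
def normalize(value):
    """Map an intensity from [0, 255] to [-1, 1]."""
    return value / 127.5 - 1.0

def volume_megabytes(side, bytes_per_voxel=4):
    """Size of a cubic volume, assuming 4-byte (float32) voxels."""
    return side ** 3 * bytes_per_voxel / 1e6

print(normalize(0), normalize(255))      # -1.0 1.0
print(round(volume_megabytes(256), 1))   # 67.1
print(round(volume_megabytes(190), 1))   # 27.4
```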

Diagnosis of Alzheimer’s disease: Architecture

• CNN
• Lasagne (Theano) Framework
• Inception Module, Batch Normalization
• 3D Convolution + cuDNN v3 (GitHub)
• 2 TITAN X GPUs: Data Parallel (PyCUDA)
• Batch Size: 80

Data

• Training Set: Healthy Condition (HC): 761, Alzheimer’s Disease (AD): 389
• Test Set: Healthy Condition (HC): 105, Alzheimer’s Disease (AD): 84
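A few derived quantities follow directly from these counts. The sketch assumes the last, partial mini-batch of each epoch is kept. The majority-class baseline is added here only as a reference point for reading accuracy figures.

```python
import math

train = {"HC": 761, "AD": 389}
test = {"HC": 105, "AD": 84}
batch_size = 80

n_train = sum(train.values())
n_test = sum(test.values())
iterations_per_epoch = math.ceil(n_train / batch_size)  # assumes the last partial batch is kept
ad_fraction_train = train["AD"] / n_train
majority_baseline_test = max(test.values()) / n_test    # always predicting HC

print(n_train, n_test, iterations_per_epoch)
print(f"{ad_fraction_train:.1%} AD in training, {majority_baseline_test:.1%} baseline accuracy")
```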

Diagnosis of Alzheimer’s disease: Architecture

[Figure: layer diagrams of MidasNet1, MidasNet2, and MidasNet3. Layers are listed in the order they were extracted from each diagram.]

MidasNet1
• input
• 24*Conv 11/5
• MaxPool 7/2
• 288*Conv 3/2
• FC 120
• DropOut
• SoftMax

MidasNet2
• input
• 36*Conv 16/6
• MaxPool 3/2
• 120*Conv 4/1
• Batch Norm
• MaxPool 3/2
• Inception module: 60*Conv 1/1, 96*Conv 3/1, 12*Conv 1/1, 24*Conv 5/1, 24*Conv 1/1, MaxPool 3/1, 48*Conv 1/1, Concatenate
• MaxPool 3/2
• FC 150
• Inception module: 128*Conv 1/1, 192*Conv 3/1, 32*Conv 1/1, 96*Conv 5/1, 64*Conv 1/1, MaxPool 3/1, 128*Conv 1/1, Concatenate
• Inception module: 96*Conv 1/1, 208*Conv 3/1, 16*Conv 1/1, 48*Conv 5/1, 64*Conv 1/1, MaxPool 3/1, 192*Conv 1/1, Concatenate
• SoftMax

MidasNet3
• input
• 60*Conv 10/2
• MaxPool 2/2
• 144*Conv 3/1
• Batch Norm
• MaxPool 3/2
• Inception module: 48*Conv 1/1, 72*Conv 3/1, 18*Conv 1/1, 36*Conv 5/1, 48*Conv 1/1, MaxPool 3/1, 48*Conv 1/1, Concatenate
• MaxPool 3/2
• FC 500
• Inception module: 96*Conv 1/1, 208*Conv 3/1, 16*Conv 1/1, 48*Conv 5/1, 64*Conv 1/1, MaxPool 3/1, 192*Conv 1/1, Concatenate
• Inception module: 160*Conv 1/1, 320*Conv 3/1, 32*Conv 1/1, 128*Conv 5/1, 128*Conv 1/1, MaxPool 3/1, 256*Conv 1/1, Concatenate
• SoftMax
• Inception module: 280*Conv 1/1, 340*Conv 3/1, 32*Conv 1/1, 128*Conv 5/1, 128*Conv 1/1, MaxPool 3/1, 228*Conv 1/1, Concatenate
• AvgPool 3/1
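The feature-map sizes through MidasNet1 can be estimated with the usual output-size formula. Everything here rests on assumptions not confirmed by the slides: that "N*Conv k/s" means N filters with kernel size k and stride s, that no padding is used, and that the input is a 190×190×190 volume.

```python
def out_size(in_size, kernel, stride):
    """Output length along one axis for a convolution or pooling without padding."""
    return (in_size - kernel) // stride + 1

size = 190                       # assumed input side length after skull stripping
size = out_size(size, 11, 5)     # 24*Conv 11/5
after_conv1 = size
size = out_size(size, 7, 2)      # MaxPool 7/2
after_pool = size
size = out_size(size, 3, 2)      # 288*Conv 3/2
after_conv2 = size
fc_inputs = 288 * size ** 3      # flattened 3D feature maps feeding FC 120

print(after_conv1, after_pool, after_conv2, fc_inputs)  # 36 15 7 98784
```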

Diagnosis of Alzheimer’s disease: Result

Model      Accuracy
MidasNet1  167/189 (88.4%)
MidasNet2  169/189 (89.4%)
MidasNet3  169/189 (89.4%)

[Figure: Convergence History. Cost (log scale, 0.01 to 10) versus Epoch (0 to 196).]
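The reported percentages follow from the counts of correctly classified test subjects, as this short check shows. The test-set size of 189 is the sum of the 105 HC and 84 AD test subjects.

```python
results = {"MidasNet1": 167, "MidasNet2": 169, "MidasNet3": 169}
n_test = 189  # 105 HC + 84 AD

accuracy = {name: correct / n_test for name, correct in results.items()}
for name, acc in accuracy.items():
    print(f"{name}: {acc:.1%}")
```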

Thank You
