Kaggle Deep Learning Competition
in Computer Vision
Alexander Eckert <[email protected]>
Oct 2018
Thanks to Nicole Finnie for preparing the bulk of the slides!
Agenda
- A very brief introduction to Kaggle
- Kaggle 2018 Data Science Bowl – nuclei detection
- Q & A
https://www.kaggle.com/c/data-science-bowl-2018
Ranking & Rewards
- Competition ranking
  - Public leaderboard
    - evaluated on a subset of the ground truth
    - visible during the competition
  - Private leaderboard
    - unseen data
    - determines the final ranking
  - Rewards: money, medals, ranking points
- Global ranking
  - Competitions
  - Kernels (notebooks)
  - Forum discussions
https://www.kaggle.com
Our team
https://www.kaggle.com/c/data-science-bowl-2018
- Our objectives
  - Apply and learn state-of-the-art CV algorithms, ANN frameworks & architectures
  - Take the perspective of a data scientist, discuss approaches
  - Compete

2018 Data Science Bowl – Nuclei Detection
[figure: sample image with the mask of one nucleus to detect]
https://www.kaggle.com/c/data-science-bowl-2018
(Mini) U-Net – a CNN autoencoder
[architecture diagram: stacked 3×3 convolution blocks with 16, 32, 64, 128 and 256 filters, dropout between 0.1 and 0.3]
- input: 256×256×2, output: 256×256×2
- activation: sigmoid
- loss: binary cross-entropy
- optimizer: Adam
https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
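The mini U-Net above can be sketched in Keras roughly as follows. The filter counts and dropout rates follow the slide's annotations, but their exact placement in the encoder/decoder and the transposed-convolution upsampling are assumptions, not the team's actual implementation.

```python
# Sketch of a mini U-Net: encoder/decoder with skip connections.
# Filter counts (16..256) and dropout rates (0.1..0.3) from the slide;
# block layout and upsampling choice are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters, dropout):
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def mini_unet(input_shape=(256, 256, 2)):
    inputs = layers.Input(input_shape)
    # Encoder: each level halves the spatial resolution.
    c1 = conv_block(inputs, 16, 0.1); p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32, 0.1);     p2 = layers.MaxPooling2D()(c2)
    c3 = conv_block(p2, 64, 0.2);     p3 = layers.MaxPooling2D()(c3)
    c4 = conv_block(p3, 128, 0.2);    p4 = layers.MaxPooling2D()(c4)
    # Bottleneck.
    c5 = conv_block(p4, 256, 0.3)
    # Decoder: upsample and concatenate the matching encoder features.
    u6 = layers.concatenate([layers.Conv2DTranspose(128, 2, strides=2, padding="same")(c5), c4])
    c6 = conv_block(u6, 128, 0.2)
    u7 = layers.concatenate([layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c6), c3])
    c7 = conv_block(u7, 64, 0.2)
    u8 = layers.concatenate([layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c7), c2])
    c8 = conv_block(u8, 32, 0.1)
    u9 = layers.concatenate([layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c8), c1])
    c9 = conv_block(u9, 16, 0.1)
    # Two output channels (e.g. mask & contour), sigmoid + binary cross-entropy.
    outputs = layers.Conv2D(2, 1, activation="sigmoid")(c9)
    model = Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```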
Hidden feature engineering
[pipeline: train image → KNN on edge pixels → channel 0 & channel 1 → windowed crops, flips (data augmentation) → two U-Net models (colour & grey) → train]
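The windowed-crop and flip augmentation can be sketched with plain NumPy; the window size and stride below are illustrative assumptions, not the values used in the competition.

```python
# Sketch of the data augmentation: sliding-window crops plus flips.
# window/stride values are illustrative assumptions.
import numpy as np

def window_crops(image, window=256, stride=128):
    """Slide a fixed-size window over the image and collect the crops."""
    h, w = image.shape[:2]
    crops = []
    for y in range(0, max(h - window, 0) + 1, stride):
        for x in range(0, max(w - window, 0) + 1, stride):
            crops.append(image[y:y + window, x:x + window])
    return crops

def flips(image):
    """Return the original plus horizontal and vertical flips."""
    return [image, np.fliplr(image), np.flipud(image)]
```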
Post-processing of predictions
- Necessary for instance segmentation
- (Predicted masks – predicted contours) => label seeds
- Labelling (using the random walker / watershed algorithm)
[figure: test image → predicted contours → labelling (instance segmentation) → post-processed result vs. ground truth]
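The post-processing steps above can be sketched with scipy/scikit-image: subtract the predicted contours from the predicted masks to obtain label seeds, then grow each seed back out with the watershed algorithm. The 0.5 threshold and the distance-transform elevation map are assumptions for illustration.

```python
# Sketch of instance labelling: (masks - contours) -> seeds -> watershed.
# The threshold and the use of a distance transform are assumptions.
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

def label_instances(pred_mask, pred_contour, threshold=0.5):
    mask = pred_mask > threshold
    # Seeds: mask pixels clearly inside a nucleus (i.e. not on a contour).
    seeds, _ = ndimage.label(mask & ~(pred_contour > threshold))
    # Grow each seed back out to the full mask via watershed on the
    # negated distance transform (basins at nucleus centres).
    distance = ndimage.distance_transform_edt(mask)
    return watershed(-distance, seeds, mask=mask)
```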
Model ensembling
[figure: baseline, transform, and noise (data augmentation) models trained on the train images, each predicting on the test image]
- Pixel prediction using a weighted majority vote
- Final mean IoU: 0.545
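A pixel-wise weighted majority vote can be sketched as follows: each model's binary prediction contributes its weight, and a pixel is foreground when the weighted votes exceed half the total weight. The weights are illustrative assumptions.

```python
# Sketch of pixel-wise ensembling by weighted majority vote.
# Model weights are illustrative assumptions.
import numpy as np

def weighted_majority_vote(predictions, weights):
    """predictions: list of binary HxW arrays, one per ensemble member."""
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack(predictions, axis=0)         # (n_models, H, W)
    votes = np.tensordot(weights, stacked, axes=1)  # weighted votes per pixel
    return votes > weights.sum() / 2.0
```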
https://github.com/nicolefinnie/kaggle-dsb2018
Silver medal
Learnings
- Very time intensive
  - Important to evaluate promising paths efficiently (e.g. in parallel) and to check classification results/errors
  - GPUs and patience required for (deep-learning) model training
- Follow forum discussions for new ideas, knowledge sharing and the competition timeline
- Combine orthogonal approaches and create weighted ensembles
- Steep learning curve, but great for acquiring (or improving) skills
  - Keras/TensorFlow & PyTorch NN frameworks
  - Python libraries like OpenCV, Pandas, Scikit, Jupyter
  - Image processing techniques
  - Network architectures for semantic segmentation
  - Building modelling & prediction pipelines