Deep Learning Ruslan Salakhutdinov Department of Computer Science University of Toronto

Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Jul 15, 2015

Transcript
Page 1: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Deep Learning

Ruslan Salakhutdinov

Department of Computer Science

University of Toronto

Page 2: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Images & Video

Relational Data/ Social Network

Massive increase in both computational power and the amount of data available from the web, video cameras, and laboratory measurements.

Mining for Structure

Speech & Audio

Gene Expression

Text & Language

Geological Data

Product Recommendation

Climate Change

Mostly Unlabeled

• Develop statistical models that can discover underlying structure, cause, or statistical correlation from data in an unsupervised or semi-supervised way.

• Multiple application domains.

Deep Learning

Page 3: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Impact of Deep Learning

• Speech Recognition

• Computer Vision

• Language Understanding

• Recommender Systems

• Drug Discovery and Medical Image Analysis

Page 4: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Deep Learning in Action

• Achieves state-of-the-art on many object recognition tasks! Try it at deeplearning.cs.toronto.edu!

Page 5: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Example: Understanding Images

Model Samples

• a group of people in a crowded area.

• a group of people are walking and talking.

• a group of people, standing around and talking.

• a group of people that are in the outside.

TAGS: strangers, coworkers, conventioneers, attendants, patrons

Nearest Neighbor Sentence: people taking pictures of a crazy person

Page 6: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Image Tagging and Retrieval

mosque, tower, building, cathedral, dome, castle

kitchen, stove, oven, refrigerator, microwave

ski, skiing, skiers, skiiers, snowmobile

bowl, cup, soup, cups, coffee

beach

snow

Page 7: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Speech Recognition

Page 8: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Merck Molecular Activity Challenge

• Deep Learning technique: Predict biological activities of different molecules, given numerical descriptors generated from their chemical structures.

• To develop new medicines, it is important to identify molecules that are highly active toward their intended targets.

Toronto team takes first place!

Page 9: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Netflix uses:

- Restricted Boltzmann machines

- Probabilistic Matrix Factorization

(Salakhutdinov et al., ICML 2007; Salakhutdinov and Mnih, 2008)

• From their blog:

"To put these algorithms to use, we had to work to overcome some limitations, for instance that they were built to handle 100 million ratings, instead of the more than 5 billion that we have, and that they were not built to adapt as members added more ratings. But once we overcame those challenges, we put the two algorithms into production, where they are still used as part of our recommendation engine."

Both of these algorithms were developed by us at Toronto!
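As a rough illustration of the matrix factorization idea named on this slide, the sketch below fits user and movie factor vectors by stochastic gradient descent on squared rating error with L2 penalties. It is a simplified toy under stated assumptions, not the authors' or Netflix's implementation: the function names, the toy ratings, and all hyperparameters (`dim`, `lr`, `reg`, `epochs`) are hypothetical.

```python
import math
import random

def train_pmf(ratings, n_users, n_items, dim=2, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Fit user factors U and item factors V so that dot(U[u], V[i]) ~ rating.

    ratings is a list of (user, item, rating) triples. Each SGD step
    reduces the squared error of one rating plus an L2 penalty.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(dim):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error of the factor model on the given ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(a * b for a, b in zip(U[u], V[i]))
        total += (r - pred) ** 2
    return math.sqrt(total / len(ratings))

# Hypothetical toy data: (user, movie, rating on a 1-5 scale).
toy_ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
```

Calling `train_pmf(toy_ratings, 4, 3)` returns the two factor matrices; comparing `rmse` before and after training shows how much of the observed ratings the low-dimensional factors explain.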

Page 10: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership
Page 11: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Deep Learning in the News

Page 12: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Key Computational Challenges

Scaling up our deep learning algorithms:

- Learning from billions of (unlabeled) data points

- Developing new parallel algorithms

- Scaling up computation using clusters of GPUs and FPGAs

Building bigger models using more data improves performance of deep learning algorithms!

Page 13: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Building Artificial Intelligence

Develop computer algorithms that can:

- See and recognize objects around us

- Perceive human speech

- Understand natural language

- Navigate around autonomously

- Display human-like intelligence

Personal assistants, self-driving cars, etc.

Page 14: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Talk Roadmap

• Introduction

• Key Deep Learning Models

• Applications: Multimodal Learning and Language Modeling

Page 15: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learning Feature Representations

[Figure: Segway and non-Segway images plotted in the input space (axes: pixel 1, pixel 2) and passed to a learning algorithm.]

Page 16: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learning Feature Representations

[Figure: Segway and non-Segway images in the input space (axes: pixel 1, pixel 2) are mapped to a feature representation and passed to a learning algorithm; in the feature space the axes are Handle and Wheel.]

Page 17: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Traditional Approaches

Data → Feature extraction → Learning algorithm

Image → vision features → Recognition, object detection

Audio → audio features → Audio classification, speaker identification

Page 18: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Computer Vision Features

SIFT Spin image

HoG RIFT

Textons GLOH

Page 19: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Computer Vision Features

SIFT Spin image

HoG RIFT

Textons GLOH

Deep Learning

Page 20: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

ZCR

Spectrogram MFCC

Rolloff Flux

Audio Features
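To make the idea of a hand-engineered audio feature concrete, here is a minimal sketch of one item from the list above, the zero-crossing rate (ZCR). It assumes the common definition (the fraction of adjacent sample pairs whose signs differ); the function name and the convention of treating zero as non-negative are choices made for this illustration.

```python
def zero_crossing_rate(signal):
    """Return the fraction of adjacent sample pairs whose signs differ.

    Samples equal to zero are treated as non-negative. Signals with
    fewer than two samples have no adjacent pairs, so the rate is 0.0.
    """
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)
```

A feature like this is fixed in advance by a human designer, which is the contrast the talk draws with representations that are learned from data.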

Page 21: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Audio Features

ZCR

Spectrogram MFCC

Rolloff Flux

Deep Learning

Page 22: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Example: Boltzmann Machine

Input data (e.g. pixel intensities of an image, words from webpages, speech signal).

Target variables (response) (e.g. class labels, categories, phonemes).

Model parameters

Latent (hidden) variables

Markov Random Fields, Undirected Graphical Models.

Page 23: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Unsupervised Learning

Vector of word counts on a webpage

Latent variables: semantic topics

804,414 newswire stories

(Hinton & Salakhutdinov, Science 2006)
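The word-count input described on this slide can be sketched as follows. The tokenizer (lowercase runs of ASCII letters) and the tiny vocabulary are assumptions made for illustration; the slide does not say how the newswire stories were preprocessed.

```python
import re
from collections import Counter

def bag_of_words(text, vocabulary):
    """Map a document to a vector of word counts over a fixed vocabulary.

    Word order is discarded; position k of the result holds the number
    of times vocabulary[k] occurs in the text.
    """
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and document.
vocab = ["stock", "market", "russia"]
vector = bag_of_words("Stock prices rose as the stock market rallied.", vocab)
```

Here `vector` is `[2, 1, 0]`. A model with latent variables would take such vectors as its observed input.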

Page 24: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Talk Roadmap

• Introduction

• Key Deep Learning Models

• Applications: Multimodal Learning and Language Modeling.

Page 25: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Restricted Boltzmann Machines

Image: visible variables

Feature detectors

Pair-wise and unary terms

Related models: Markov random fields, Boltzmann machines, log-linear models.

Define a proper probabilistic model:

- Can characterize uncertainty.

- Deal with missing or noisy data.

- Can simulate from the model.
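The slide's equation did not survive in this transcript. For reference, a binary restricted Boltzmann machine is commonly defined as below; the symbols ($v$ for visible units, $h$ for hidden units, $W$, $b$, $a$ for parameters) are the standard textbook notation and are an assumption here, not a reconstruction of the slide. The "pair-wise" and "unary" labels presumably refer to the first term and the last two terms of the energy, respectively.

```latex
E(v, h; \theta) = -\sum_{i}\sum_{j} W_{ij} v_i h_j \;-\; \sum_{i} b_i v_i \;-\; \sum_{j} a_j h_j,
\qquad
P_\theta(v, h) = \frac{1}{Z(\theta)} \exp\bigl(-E(v, h; \theta)\bigr),

p(h_j = 1 \mid v) = \sigma\Bigl(\sum_{i} W_{ij} v_i + a_j\Bigr),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}.
```

Because there are no hidden-to-hidden or visible-to-visible connections, the hidden units are conditionally independent given the visible units, which is what makes sampling from the model tractable in blocks.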

Page 26: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Modeling Images

(Salakhutdinov & Hinton, NIPS 2007; Salakhutdinov & Murray, ICML 2008)

Learned features (out of 10,000)

4 million unlabelled images

New Image = 0.9 * [feature] + 0.8 * [feature] + 0.6 * [feature] …
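The decomposition on this slide expresses a new image as a weighted sum of learned features. The sketch below shows only that arithmetic; the two-pixel "features" are hypothetical stand-ins, and the coefficients 0.9, 0.8, and 0.6 are taken from the slide without any claim about how the model computes them.

```python
def combine_features(features, weights):
    """Return the weighted sum of equally sized feature vectors."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")
    image = [0.0] * len(features[0])
    for feature, weight in zip(features, weights):
        for pixel, value in enumerate(feature):
            image[pixel] += weight * value
    return image

# Hypothetical features over a two-pixel "image".
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_image = combine_features(toy_features, [0.9, 0.8, 0.6])
```

With these toy values the first pixel receives 0.9 + 0.6 and the second 0.8 + 0.6, so the result is approximately `[1.5, 1.4]`.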

Page 27: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Modeling Images and Text

(Salakhutdinov & Hinton, NIPS 2007; Salakhutdinov & Murray, ICML 2008)

Learned “strokes”

Data: Handwritten characters

Learned features: “topics”

russian, russia, moscow, yeltsin, soviet

clinton, house, president, bill, congress

computer, system, product, software, develop

trade, country, import, world, economy

stock, wall, street, point, dow

Reuters dataset: 804,414 unlabeled newswire stories

Bag-of-Words

Page 28: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learned features: “genre”

Fahrenheit 9/11, Bowling for Columbine, The People vs. Larry Flynt, Canadian Bacon, La Dolce Vita

Independence Day, The Day After Tomorrow, Con Air, Men in Black II, Men in Black

Friday the 13th, The Texas Chainsaw Massacre, Children of the Corn, Child's Play, The Return of Michael Myers

Scary Movie, Naked Gun, Hot Shots!, American Pie, Police Academy

Netflix dataset: 480,189 users, 17,770 movies, over 100 million ratings

State-of-the-art performance on the Netflix dataset.

Recommender Engine

(Salakhutdinov, Mnih, Hinton, ICML 2007)

Multinomial visible: user ratings

Binary hidden: user preferences

Page 29: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Image

Low-level features: Edges

Input: Pixels

(Salakhutdinov & Hinton, Neural Computation 2012)

Deep Boltzmann Machines: Learning Hierarchies of Features

Page 30: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Image

Higher-level features: Combination of edges

Low-level features: Edges

Input: Pixels

Learn simpler representations, then compose more complex ones

(Salakhutdinov & Hinton, Neural Computation 2012)

Deep Boltzmann Machines: Learning Hierarchies of Features

Page 31: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learning Multiple Layers

• Biological and theoretical justification for learning multiple layers of representation

• Biologically inspired learning:

- Brain has hierarchical architecture

- Cortex appears to have a generic learning algorithm

- Humans learn simpler representations, then compose more complex ones

Page 32: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learning Feature Hierarchies

Layer 1 Primitives

Lee et al., ICML 2009

Layer 2 Parts

Layer 3 Objects

Learn simpler representations, then compose more complex ones.

Page 33: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

Handwritten Characters

Page 34: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

Handwritten Characters

Page 35: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

Handwritten Characters

Real Data Simulated

Page 36: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

Handwritten Characters

Real Data Simulated

Page 37: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

Handwritten Characters

Page 38: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Good Generative Model?

MNIST Handwritten Digit Dataset

Page 39: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Talk Roadmap

• Introduction

• Key Deep Learning Models

• Applications: Multimodal Learning and Language Modeling.

Page 40: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Data – Collection of Modalities

• Multimedia content on the web: image + text + audio.

• Product recommendation systems.

• Robotics applications.

Audio, Vision

Touch sensors, Motor control

sunset, pacificocean, bakerbeach, seashore, ocean

car, automobile

Page 41: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Shared Concept

“Modality-free” representation

“Modality-full” representation

“Concept”

sunset, pacific ocean, baker beach, seashore,

ocean

Page 42: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

• Improve Classification

Multi-Modal Input

pentax, k10d, kangarooisland, southaustralia, sa, australia, australiansealion, 300mm

SEA / NOT SEA

• Retrieve data from one modality when queried using data from another modality

beach, sea, surf, strand, shore, wave, seascape, sand, ocean, waves

• Fill in Missing Modalities

beach, sea, surf, strand, shore, wave, seascape, sand, ocean, waves

Page 43: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Challenges - I

Very different input representations

Image

Text: sunset, pacific ocean, baker beach, seashore, ocean

• Images: real-valued, dense

• Text: discrete, sparse

Difficult to learn cross-modal features from low-level representations.

Page 44: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Challenges - II

Noisy and missing data

Image, Text

pentax, k10d, pentaxda50200, kangarooisland, sa, australiansealion

mickikrimmel, mickipedia, headshot

unseulpixel, naturey

<no text>

Page 45: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Challenges - II

Image, Text, Text generated by the model

beach, sea, surf, strand, shore, wave, seascape, sand, ocean, waves

portrait, girl, woman, lady, blonde, pretty, gorgeous, expression, model

night, notte, traffic, light, lights, parking, darkness, lowlight, nacht, glow

fall, autumn, trees, leaves, foliage, forest, woods, branches, path

pentax, k10d, pentaxda50200, kangarooisland, sa, australiansealion

mickikrimmel, mickipedia, headshot

unseulpixel, naturey

<no text>

Page 46: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multimodal DBM

Dense, real-valued image features: Gaussian model

Word counts (e.g. 0 0 1 0 0): Replicated Softmax

(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)

Page 47: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multimodal DBM

Dense, real-valued image features: Gaussian model

Word counts (e.g. 0 0 1 0 0): Replicated Softmax

(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)

Page 48: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multimodal DBM

Dense, real-valued image features: Gaussian model

Word counts (e.g. 0 0 1 0 0): Replicated Softmax

(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)

Page 49: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Text Generated from Images

canada, nature, sunrise, ontario, fog, mist, bc, morning

insect, butterfly, insects, bug, butterflies, lepidoptera

graffiti, streetart, stencil, sticker, urbanart, graff, sanfrancisco

portrait, child, kid, ritratto, kids, children, boy, cute, boys, italy

dog, cat, pet, kitten, puppy, ginger, tongue, kitty, dogs, furry

sea, france, boat, mer, beach, river, bretagne, plage, brittany

Given Generated Given Generated

Page 50: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Text Generated from Images

Given Generated

water, glass, beer, bottle, drink, wine, bubbles, splash, drops, drop

portrait, women, army, soldier, mother, postcard, soldiers

obama, barackobama, election, politics, president, hope, change, sanfrancisco, convention, rally

Page 51: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Images from Text

water, red, sunset

nature, flower, red, green

blue, green, yellow, colors

chocolate, cake

Given Retrieved

Page 52: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

MIR-Flickr Dataset

Huiskes et al.

• 1 million images along with user-assigned tags.

sculpture, beauty, stone

nikon, green, light, photoshop, apple, d70

white, yellow, abstract, lines, bus, graphic

sky, geotagged, reflection, cielo, bilbao, reflejo

food, cupcake, vegan

d80

anawesomeshot, theperfectphotographer, flash, damniwishidtakenthat, spiritofphotography

nikon, abigfave, goldstaraward, d80, nikond80

Page 53: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Results

• Logistic regression on top-level representation.

• Multimodal Inputs

Learning Algorithm       MAP     Precision@50
Random                   0.124   0.124
LDA [Huiskes et al.]     0.492   0.754
SVM [Huiskes et al.]     0.475   0.758
DBM-Labelled             0.526   0.791
Deep Belief Net          0.638   0.867
Autoencoder              0.638   0.875
DBM                      0.641   0.873

MAP: Mean Average Precision

Labeled 25K examples

+ 1 Million unlabelled

State-of-the-art performance
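The two metrics in the table can be sketched as follows, using their usual information-retrieval definitions. The slide does not spell out how they were computed, so treat this as an assumption; in particular, averaging over one ranked list per class is a guess at the evaluation protocol, and the function names are illustrative.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top k ranked items that are relevant.

    ranked_relevance holds 0/1 relevance flags ordered from the
    highest-scored item to the lowest. The divisor is always k.
    """
    return sum(ranked_relevance[:k]) / k

def average_precision(ranked_relevance):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Average of average_precision over several ranked lists."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For the ranking `[1, 0, 1, 0]`, the relevant items sit at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2, and the precision at 2 is 0.5.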

Page 54: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Generating Sentences

Input

A man skiing down the snow covered mountain with a dark sky in the background.

Output

• More challenging problem.

• How can we generate complete descriptions of images?

Page 55: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Learning Semantic Representation

• Key Idea: Each word w is represented as a D-dimensional real-valued vector $r_w \in \mathbb{R}^D$.

[Figure: Semantic Space, a two-dimensional plot (axis label on the slide: "Dimension 2") showing the words table, chair, dolphin, whale, and November.]
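One way to read the semantic-space picture is that related words have nearby vectors. The sketch below measures closeness with cosine similarity, which is a common choice but an assumption here, and the two-dimensional vectors are invented for illustration rather than learned by any model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_word(query, vectors):
    """Return the word, other than the query, whose vector is most similar."""
    candidates = (word for word in vectors if word != query)
    return max(
        candidates,
        key=lambda word: cosine_similarity(vectors[query], vectors[word]),
    )

# Hypothetical 2-dimensional word vectors (D = 2).
toy_vectors = {
    "table": [0.9, 0.1],
    "chair": [0.8, 0.2],
    "dolphin": [0.1, 0.9],
    "whale": [0.2, 0.8],
}
```

With these toy vectors, `nearest_word("table", toy_vectors)` returns `"chair"` and `nearest_word("dolphin", toy_vectors)` returns `"whale"`.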

Page 56: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Joint Feature space

A castle and reflecting water

A ship sailing in the ocean

A plane flying in the sky

Multimodal Neural Language Models (Kiros et al., ICML 2014)

Learning Semantic Representation

Page 57: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Tagging and Retrieval

mosque, tower, building, cathedral, dome, castle

kitchen, stove, oven, refrigerator, microwave

ski, skiing, skiers, skiiers, snowmobile

bowl, cup, soup, cups, coffee

beach

snow

Page 58: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multimodal Linguistic Regularities

Nearest Images

Ryan Kiros, 2014

Page 59: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multimodal Linguistic Regularities

Nearest Images

Ryan Kiros, 2014

Page 60: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Caption Generation

Page 61: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Caption Generation

Model Samples

• Two men in a room talking on a table.

• Two men are sitting next to each other.

• Two men are having a conversation at a table.

• Two men sitting at a desk next to each other.

TAGS: colleagues, waiters, waiter, entrepreneurs, busboy

Page 62: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

More Examples

TAGS: spider, spiders, arachnid, insects, insect

creepy, spooky, elfin

Model Samples

Giant spider found in the Netherlands.

Look at the new spider web.

This was near the black spider web.

I like the spider.

The pattern of one spider web.

Page 63: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Multi-Modal Models

Laser scans

Images

Video

Text & Language

Time series data

Speech & Audio

Develop learning systems that come closer to displaying human-like intelligence

Page 64: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Summary

• Efficient learning algorithms for Hierarchical Generative Models. Learning more adaptive, robust, and structured representations.

• Deep models can improve the current state of the art in many application domains: object recognition and detection, text and image retrieval, handwritten character and speech recognition, and others.

Text & image retrieval / Object recognition

Learning a Category Hierarchy

Dealing with missing/occluded data

HMM decoder

Speech Recognition

sunset, pacific ocean, beach, seashore

Multimodal Data

Object Detection

Page 65: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Our Toronto Lab

We collaborate with and consult for various organizations

Page 66: Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS Global Leadership

Thank you