Andrew Ng CS228: Deep Learning & Unsupervised Feature Learning Andrew Ng

Dec 24, 2015

Sandra Parker
Transcript
Page 1:

CS228: Deep Learning & Unsupervised Feature Learning

Andrew Ng

Page 2:

How is computer perception done?

Object detection: Image -> Low-level vision features -> Recognition

Computer vision is hard!

Page 3:

How is computer perception done?

Object detection: Image -> Vision features -> Recognition

Audio classification: Audio -> Audio features -> Speaker ID

NLP: Text -> Text features -> Text classification, MT, IR, etc.

Page 4:

Sensor representations

Input -> Low-level features -> Learning/AI algorithm
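
The pipeline on this slide (input, then low-level features, then a learning algorithm) can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the lecture: the two features (mean level and mean absolute change), the nearest-centroid classifier, the toy signals, and every function name below are hypothetical choices made to keep the example small.

```python
# Stage 1: hand-designed low-level features (a hypothetical choice for illustration).
def low_level_features(signal):
    """Return (mean level, mean absolute change between neighboring samples)."""
    mean = sum(signal) / len(signal)
    change = sum(abs(b - a) for a, b in zip(signal, signal[1:])) / (len(signal) - 1)
    return (mean, change)


# Stage 2: a simple learning algorithm operating on the features, not on the raw input.
def train_nearest_centroid(examples):
    """examples: list of (signal, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for signal, label in examples:
        feats = low_level_features(signal)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def predict(centroids, signal):
    """Assign the label whose centroid is closest in feature space."""
    feats = low_level_features(signal)

    def squared_distance(centroid):
        return sum((x - y) ** 2 for x, y in zip(feats, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy training data (made up for this sketch).
train = [
    ([0.1, 0.1, 0.2, 0.1], "smooth"),
    ([0.2, 0.2, 0.1, 0.2], "smooth"),
    ([0.0, 1.0, 0.0, 1.0], "jagged"),
    ([1.0, 0.1, 0.9, 0.0], "jagged"),
]
centroids = train_nearest_centroid(train)
print(predict(centroids, [0.15, 0.1, 0.15, 0.2]))
print(predict(centroids, [0.9, 0.0, 1.0, 0.1]))
```

In this sketch the classifier never sees the raw samples, only the output of `low_level_features`, so its accuracy depends on how well those hand-picked features separate the classes. That dependence is what motivates asking whether the representation itself can be learned.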

Page 5:

A plethora of sensors

Camera array

3D range scan (laser scanner)

3D range scans (flash lidar)

Audio

Visible light image

Thermal infrared

A general-purpose algorithm for good sensor representations?

Page 6:

Sensor representation in the brain

[BrainPort; Martinez et al.; Roe et al.]

Seeing with your tongue

Human echolocation (sonar)

Auditory cortex learns to see.

Auditory Cortex

Page 7:

Learning abstract representations

pixels

edges

object parts (combination of edges)

object models

[Related work: Deep learning, Hinton, Bengio, LeCun, and others.]

Page 8:

Feature learning for audio

Learned features correspond to phonemes and other “basic units” of sound.

Learned features

Algorithm:

Page 9:

Audio

TIMIT phone classification             Accuracy
Prior art (Clarkson et al., 1999)      79.6%
Stanford feature learning              80.3%

TIMIT speaker identification           Accuracy
Prior art (Reynolds, 1995)             99.7%
Stanford feature learning              100.0%

Images

CIFAR object classification            Accuracy
Prior art (Yu and Zhang, 2010)         74.5%
Stanford feature learning              79.6%

NORB object classification             Accuracy
Prior art (Ranzato et al., 2009)       94.4%
Stanford feature learning              97.0%

Multimodal (audio/video)

AVLetters lip reading                  Accuracy
Prior art (Zhao et al., 2009)          58.9%
Stanford feature learning              65.8%

Galaxy

Video

Hollywood2 classification              Accuracy
Prior art (Laptev et al., 2004)        48%
Stanford feature learning              53%

KTH                                    Accuracy
Prior art (Wang et al., 2010)          92.1%
Stanford feature learning              93.9%

UCF                                    Accuracy
Prior art (Wang et al., 2010)          85.6%
Stanford feature learning              86.5%

YouTube                                Accuracy
Prior art (Liu et al., 2009)           71.2%
Stanford feature learning              75.8%

Other feature learning records: Different phone recognition task (Hinton), PASCAL VOC object classification (Yu)
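
The accuracy figures on this slide are easier to compare when expressed as gains over prior art. The sketch below copies the numbers from the slide and computes the absolute gain in percentage points. It also computes a relative error reduction, which is an added metric not stated on the slide and assumes that error is simply 100% minus accuracy. The names `RESULTS`, `absolute_gain`, and `relative_error_reduction` are hypothetical.

```python
# (task, prior-art accuracy %, Stanford feature-learning accuracy %), copied from the slide.
RESULTS = [
    ("TIMIT phone classification", 79.6, 80.3),
    ("TIMIT speaker identification", 99.7, 100.0),
    ("CIFAR object classification", 74.5, 79.6),
    ("NORB object classification", 94.4, 97.0),
    ("AVLetters lip reading", 58.9, 65.8),
    ("Hollywood2 classification", 48.0, 53.0),
    ("KTH", 92.1, 93.9),
    ("UCF", 85.6, 86.5),
    ("YouTube", 71.2, 75.8),
]


def absolute_gain(prior, new):
    """Improvement in percentage points."""
    return new - prior


def relative_error_reduction(prior, new):
    """Fraction of the prior method's errors that were removed.

    Assumes error = 100 - accuracy; this metric is not reported on the slide.
    """
    prior_error = 100.0 - prior
    return (new - prior) / prior_error


for task, prior, new in RESULTS:
    gain = absolute_gain(prior, new)
    reduction = relative_error_reduction(prior, new)
    print(f"{task:<30} +{gain:.1f} points, {reduction:.0%} of prior errors removed")

# Task with the largest absolute gain.
best = max(RESULTS, key=lambda row: absolute_gain(row[1], row[2]))
print("Largest absolute gain:", best[0])
```

The two views can rank tasks differently: a small absolute gain on a task that was already near 100% can still remove a large share of the remaining errors.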