CS228: Deep Learning & Unsupervised Feature Learning

Andrew Ng

How is computer perception done?

Object detection: Image -> Low-level vision features -> Recognition

Computer vision is hard!

How is computer perception done?

Object detection: Image -> Vision features -> Recognition
Audio classification: Audio -> Audio features -> Speaker ID
NLP: Text -> Text features -> Text classification, MT, IR, etc.

Sensor representations

Input -> Low-level features -> Learning/AI algorithm
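
A minimal sketch of this pipeline, under assumptions that are not in the slides: the input is a tiny grayscale image stored as a list of lists, the low-level features are hand-designed edge statistics (the hypothetical `edge_features`), and the learning algorithm is a nearest-centroid classifier (the hypothetical `train_centroids` and `predict`). It only illustrates how the learner sees the features rather than the raw input.

```python
# Hypothetical illustration: input -> low-level features -> learning algorithm.
# All names and the toy data are assumptions made for this sketch.

def edge_features(image):
    """Low-level features: mean absolute horizontal and vertical pixel differences."""
    rows, cols = len(image), len(image[0])
    horiz = sum(abs(image[r][c + 1] - image[r][c])
                for r in range(rows) for c in range(cols - 1))
    vert = sum(abs(image[r + 1][c] - image[r][c])
               for r in range(rows - 1) for c in range(cols))
    return [horiz / (rows * (cols - 1)), vert / ((rows - 1) * cols)]

def train_centroids(examples):
    """Learning algorithm: average the feature vectors of each label."""
    sums, counts = {}, {}
    for image, label in examples:
        feats = edge_features(image)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def predict(centroids, image):
    """Assign the label whose centroid is closest in feature space."""
    feats = edge_features(image)
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[label]))
    return min(centroids, key=dist)

# Toy inputs: vertical stripes vs. horizontal stripes.
vertical_stripes = [[0, 1, 0, 1] for _ in range(4)]
horizontal_stripes = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]

centroids = train_centroids([
    (vertical_stripes, "vertical"),
    (horizontal_stripes, "horizontal"),
])

# A shifted stripe pattern differs pixel by pixel from the training image,
# but has the same edge features.
test_image = [[1, 0, 1, 0] for _ in range(4)]
prediction = predict(centroids, test_image)
print(prediction)
```

The classifier only ever sees the two-number feature vector, so the quality of the representation decides how easy the learning problem is.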

A plethora of sensors

Camera array
3D range scan (laser scanner)
3D range scans (flash lidar)
Audio
Visible light image
Thermal infrared

A general-purpose algorithm for good sensor representations?

Sensor representation in the brain

[BrainPort; Martinez et al; Roe et al.]

Seeing with your tongue

Human echolocation (sonar)

Auditory cortex learns to see.

Auditory Cortex

Learning abstract representations

pixels

edges

object parts (combination of edges)

object models

[Related work: Deep learning, Hinton, Bengio, LeCun, and others.]

Feature learning for audio

Learned features correspond to phonemes and other “basic units” of sound.

Learned features

Algorithm:

Audio

TIMIT phone classification (accuracy)
  Prior art (Clarkson et al., 1999): 79.6%
  Stanford feature learning: 80.3%

TIMIT speaker identification (accuracy)
  Prior art (Reynolds, 1995): 99.7%
  Stanford feature learning: 100.0%

Images

CIFAR object classification (accuracy)
  Prior art (Yu and Zhang, 2010): 74.5%

NORB object classification (accuracy)
  Prior art (Ranzato et al., 2009): 94.4%

Multimodal (audio/video)

AVLetters lip reading (accuracy)
  Prior art (Zhao et al., 2009): 58.9%

Galaxy

Video

Hollywood2 classification (accuracy)
  Prior art (Laptev et al., 2004): 48%
  Stanford feature learning: 53%

KTH (accuracy)
  Prior art (Wang et al., 2010): 92.1%

UCF (accuracy)
  Prior art (Wang et al., 2010): 85.6%

YouTube (accuracy)
  Prior art (Liu et al., 2009): 71.2%

Other feature learning records: a different phone recognition task (Hinton), PASCAL VOC object classification (Yu).
