
Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Jan 24, 2017

Page 1: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

AI for 100 million people with deep learning

Adam Coates (@adampaulcoates)

Silicon Valley AI Lab

Page 2: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Silicon Valley AI Lab

• Mission: Develop hard AI technologies that let us have significant impact on hundreds of millions of users.

➢ Choose problems that significantly affect large numbers of people.

Page 3: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

AI for 100 million people

• First goal: speech recognition everywhere.

If you’re connecting to internet for first time in 2017, you’re likely using a mobile device.

Speech will transform mobile device interfaces.

Page 4: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

AI for 100 million people

• First goal: speech recognition everywhere.

Mobile devices

Captioning

Home devices

Cars / Hands-free interfaces

Page 5: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

AI for 100 million people

• First goal: speech recognition everywhere.

[Figure: chart with axes "Diversity" and "Accuracy", positioning "Specialized solutions (e.g., IVR)", "Traditional LVCSR", and "Human-level speech recognition".]

Page 6: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Speech recognition: Traditional ASR

• Traditional speech systems built on standard ML + engineering practices.

[Figure: pipeline of Features (compute features) → Acoustic Model (predict phonemes) → Sequence model (combine into state sequence) → Language Model (merge with pronunciation and language data) → Transcription: “What time is it in Beijing?”]

Some applications can be solved this way. But it’s hard to scale our own cleverness.

Page 7: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Deep learning

[Figure: "Performance" plotted against "Data & Computation", with curves for "Deep Learning algorithms" and "Many previous methods"; additional label: "Effort".]

Major advantage of deep learning: scalability.

Page 8: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Speech recognition with deep learning

• Replace most of speech system with large neural network.

[Figure: pipeline of Spectrograms → Deep Learning → Language model (simple LM, no linguistic info) → Transcription: “What time is it in Beijing?”]

Page 9: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

“Deep Speech”

• Pour effort into data + computation.

– Try to catch up to human accuracy by scaling.

[Figure: "Performance" plotted against "Data & Computation", with curves for "Deep Speech" and "Many previous methods".]

Page 10: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Deep Speech

• Train giant neural networks to predict characters from audio.

– Train “end to end” [e.g., Graves et al., 2006]

[Figure: spectrogram input.]

➢ Hard part is training at scale and searching for best model.

➢ Need data + computing power.
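
The Graves et al. (2006) reference is the paper that introduced connectionist temporal classification (CTC). As a minimal sketch of how per-frame character predictions could be collapsed into a transcript under CTC, the snippet below does greedy decoding: pick the most likely symbol per frame, merge consecutive repeats, and drop blanks. The alphabet, the blank symbol, and the toy one-hot frames are assumptions for illustration; the slides do not describe the Deep Speech decoder.

```python
BLANK = "_"  # hypothetical blank symbol; not taken from the slides


def ctc_greedy_decode(frame_probs, alphabet):
    """Greedy CTC decoding.

    frame_probs: one list of probabilities per audio frame,
                 with one entry per symbol in `alphabet`.
    """
    # Most likely symbol for each frame.
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]

    decoded = []
    previous = None
    for symbol in best:
        # Keep a symbol only when it differs from the previous frame
        # and is not the blank.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


# Toy example: one-hot "probabilities" spelling c c _ a a _ t t.
alphabet = [BLANK, "a", "c", "t"]


def one_hot(symbol):
    return [1.0 if s == symbol else 0.0 for s in alphabet]


frames = [one_hot(s) for s in "cc_aa_tt"]
print(ctc_greedy_decode(frames, alphabet))  # cat
```

The blank is what lets the network emit a doubled letter: "a_a" decodes to "aa", while "aa" decodes to "a".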

Page 11: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Raw Training Data

[Figure: bar chart, hours of raw speech data (axis 0 to 14,000).]

WSJ: 80

Switchboard: 300

Fisher: 2000

Deep Speech: 11940

We need a lot of data for end to end DL systems: use read speech.

Page 12: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Data augmentation

• Many forms of distortion that model should be robust to:

– Reverb, noise, far field effects, echo, compression artifacts, changes in tempo.

• Learn to be robust by training from data with distortions!

– Easier to engineer data pipeline than to engineer recognition pipeline.

[Figure: Raw audio ($$$$) → Augmentation → Novel audio]
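
One of the distortions listed above, additive noise, can be sketched in a few lines. This is a generic illustration rather than the pipeline used for Deep Speech: the function name `mix_at_snr`, the chosen signal-to-noise ratio, and the synthetic sine "speech" are all assumptions. It scales a noise clip so that the mix has a requested SNR in decibels.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise with the requested SNR in dB.

    speech, noise: non-empty lists of float samples; noise must be
    at least as long as speech.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]

    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0:
        return list(speech)  # silent noise clip: nothing to add

    # Choose gain so that p_speech / (gain**2 * p_noise) == 10 ** (snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy usage: a sine wave standing in for clean speech.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Drawing a different noise clip and SNR each time a clean utterance is used yields a new training example, which is the sense in which augmentation turns expensive raw audio into novel audio.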

Page 13: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Example: far field

Speech * Impulse Response = Augmented far-field speech (* denotes convolution)

Compare: Real far-field speech

[Billy Jun, Rewon Child, Sanjeev Satheesh]

Reduces errors on far-field data by 15%-20%. Relies on model search + large-scale training.
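
The far-field recipe on this slide is a convolution of clean speech with a room impulse response. A minimal sketch of that operation follows, written in plain Python for clarity; the toy impulse response values are invented for illustration, and a real pipeline would use measured or simulated room responses and a faster FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample contributes a delayed, scaled copy
            # of the impulse response.
            out[i + j] += s * h
    return out


# Toy example: a direct path plus one quieter echo two samples later.
clean = [1.0, 0.0, 0.0, 0.0]
room = [1.0, 0.0, 0.5]  # hypothetical impulse response
print(convolve(clean, room))  # [1.0, 0.0, 0.5, 0.0, 0.0, 0.0]
```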

Page 14: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Augmented dataset

[Figure: bar chart of dataset sizes (axis 0 to 100,000).]

WSJ: 80

Switchboard: 300

Fisher: 2000

Deep Speech: ~96,000 (synthesized data)

• Augmentation greatly expands available data.

– Trained models have heard decades of unique audio.

Page 15: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Compute

Titan X GPU ~6 TeraFLOPs

“Speed of light” = 3-6 weeks on 1 GPU

1 experiment requires >10,000,000,000,000,000,000 FLOPs (10s of ExaFLOPs)
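
The three figures on this slide are consistent with each other, which a short calculation makes explicit. The sketch below reads "~6 TeraFLOPs" as a peak rate of 6e12 floating-point operations per second and "speed of light" as running at that peak the whole time; both readings are interpretations, not statements from the slide.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
PEAK_FLOPS_PER_SEC = 6e12  # Titan X figure from the slide, read as a rate


def total_flops(weeks, flops_per_sec=PEAK_FLOPS_PER_SEC):
    """Operations performed in `weeks` of running at the given rate."""
    return weeks * SECONDS_PER_WEEK * flops_per_sec


for weeks in (3, 6):
    print(f"{weeks} weeks: {total_flops(weeks):.2e} FLOPs")
# 3 weeks: 1.09e+19 FLOPs
# 6 weeks: 2.18e+19 FLOPs
```

Both results exceed 10^19 operations, that is, tens of exaFLOPs per experiment.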

Page 16: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Compute

InfiniBand network

Scale out to large numbers of GPUs (e.g., 8–64)

• Cut experiment times to ~3-5 days.

– Achieve ~50% of peak FLOPs on 8+ GPUs.

– Comparable to supercomputing workloads.
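
A back-of-the-envelope estimate shows how the numbers above fit together. This sketch assumes the 6e12 operations per second Titan X peak from the previous slide and assumes the ~50% efficiency holds at every GPU count; the function name and the workload sizes are illustrative choices, not figures from the talk.

```python
SECONDS_PER_DAY = 24 * 3600


def experiment_days(total_flops, num_gpus,
                    peak_flops_per_sec=6e12, efficiency=0.5):
    """Estimated wall-clock days for one experiment on num_gpus GPUs."""
    sustained = num_gpus * peak_flops_per_sec * efficiency
    return total_flops / sustained / SECONDS_PER_DAY


# A 1e19-operation experiment on 8 GPUs, and a 2e19 one on 16 GPUs.
print(round(experiment_days(1e19, 8), 1))   # 4.8
print(round(experiment_days(2e19, 16), 1))  # 4.8
```

Under these assumptions, experiments in the 10^19 range land in the quoted window of a few days once 8 to 16 GPUs are used.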

Page 17: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Deep Speech for Mandarin

• Deep Speech is driven by data.

– Mandarin is very different from English.

• “Tonal”, thousands of characters

[Figure: English training data → Deep Speech Training → “My favorite singer is…”]

Page 18: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Deep Speech for Mandarin

• Deep Speech is driven by data.

– Mandarin is very different from English.

• “Tonal”, thousands of characters

[Figure: Mandarin training data → Deep Speech Training → “我最喜欢的歌手是…” (“My favorite singer is…”)]

Page 19: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Deep Speech for Mandarin

• With a few changes, a single algorithm can learn an entirely new language.

– Competitive with committee of native speakers for short audio clips.

➢ Learns hybrid speech (e.g., famous people, iPhones):

我最喜欢的歌手是Angelababy (“My favorite singer is Angelababy”)
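
Because the network predicts characters directly, one way to picture the change between languages is a swap of the output alphabet. The sketch below is purely illustrative and assumes the alphabet is collected from the training transcripts; the slides do not say how Deep Speech builds its output layer, and `build_alphabet` and `encode` are hypothetical helpers.

```python
def build_alphabet(transcripts):
    """Map every distinct character in the transcripts to an integer id."""
    symbols = sorted({ch for text in transcripts for ch in text})
    return {ch: idx for idx, ch in enumerate(symbols)}


def encode(text, alphabet):
    """Turn a transcript into the sequence of output ids to predict."""
    return [alphabet[ch] for ch in text]


# Mixed-script transcript from the slide: Chinese characters and Latin
# letters simply become entries in the same alphabet.
alphabet = build_alphabet(["我最喜欢的歌手是Angelababy"])
targets = encode("我最喜欢的歌手是Angelababy", alphabet)
print(len(alphabet), len(targets))  # 16 18
```

With real Mandarin data the alphabet would grow to thousands of entries, matching the "thousands of characters" point on the earlier slides.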

Page 20: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Making devices easier and more efficient

• Does speech make a difference? YES!

Page 21: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Comparing speech with keyboard input

• Compare user performance/experience for speech vs. traditional keyboard. [Ruan et al., arxiv.org/abs/1608.07323]

Page 22: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Speech is 3x faster than typing

Ruan et al., arxiv.org/abs/1608.07323

Page 23: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

…and more accurate.

Ruan et al., arxiv.org/abs/1608.07323

Page 24: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

TalkType – voice-centric keyboard for Android

• Opportunity to rethink product experiences around speech and AI.

[Nicky Chan, Bijit Halder, Kenny Liou, Thuan Nguyen, Nina Wei]

talktype.baidu.com

Page 25: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

AI for 100 million people

• Deep Learning is closing gap with humans on speech, through scalability.

– Still more to do; but it keeps getting better.

• Speech already enabling proliferation of new AI products.

– Let’s make them work for everyone.

Page 26: Adam Coates at AI Frontiers: AI for 100 Million People with Deep Learning

Thank you!

Adam Coates

@adampaulcoates

If you want to help bring AI to 100s of millions of people, come talk to us!

http://research.baidu.com