A Snake Learns Machine Learning and Python Igor Guerrero @igorgue

Apr 07, 2018

Page 1: A Snake Learns

8/6/2019 A Snake Learns

http://slidepdf.com/reader/full/a-snake-learns 1/33

A Snake Learns
Machine Learning and Python

Igor Guerrero

@igorgue


What's Machine Learning?

"A branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases".

- Wikipedia (http://en.wikipedia.org/wiki/Machine_Learning)


Cool Story, Bro!

Machine Learning is more than just algorithms!


Machine Learning in real life.

Data Input

Algorithms

Data Output

Runtime


Big Data is Big


I'm not telling you to switch databases...

If your current relational database doesn't cut it for ML, there are alternatives!

And really good ones!

http://aws.amazon.com/elasticmapreduce/ (let them run your stuff, based on Hadoop)


Brute-force "learning"

Data is the algorithm


Silly Google practices this!

89,600 < 714,000,000

Brute-forcing their spell checker...

Not so genius now, right?
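
One way to read "data is the algorithm" for a spell checker: instead of hand-written spelling rules, count how often each spelling occurs in a large corpus and suggest the most frequent known word that is one edit away. This is a minimal sketch under that assumption, not Google's actual implementation; the COUNTS table and the function names are made up for illustration.

```python
import string

# Hypothetical word counts; a real system would derive these from a huge corpus.
COUNTS = {"python": 900, "snake": 500, "learns": 300, "data": 1200}


def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)


def correct(word):
    """Suggest the most frequent known word within one edit of `word`."""
    if word in COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in COUNTS]
    if not candidates:
        return word  # nothing known nearby, leave the input alone
    return max(candidates, key=COUNTS.get)


print(correct("pyton"))
print(correct("snaek"))
```

The more text is counted, the better the suggestions get, without touching the code.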


http://code.google.com/apis/predict/


The Netflix Challenge winner was a collection of results generated by multiple algorithms:

http://www.netflixprize.com/leaderboard 
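
The idea of combining the output of several algorithms can be sketched as a weighted average of their predicted ratings. The ratings and weights below are placeholders invented for this example, and a plain weighted average is only the simplest form of blending, not the winning team's method.

```python
def blend(predictions, weights=None):
    """Combine several predicted ratings into one weighted average."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Hypothetical ratings (1 to 5 stars) from three different models
# for the same user and movie.
model_outputs = [3.0, 4.0, 5.0]

print(blend(model_outputs))                     # plain average
print(blend(model_outputs, weights=[1, 1, 2]))  # trust the third model more
```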


NLP

Natural Language Processing, I knew grammar was useful.


A field of computer science and linguistics concerned with the interactions between computers and human (natural) languages.


Guess the first word!

dataisbig

Word?(d) + ataisbig

Word?(da) + taisbig

Word?(dat) + aisbig

Word?(data) + isbig

(repeat procedure with the rest)

This is known as word segmentation, and it is very useful for search in foreign languages!
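
The guessing procedure above can be sketched in Python: try every prefix of the text, keep the ones that are known words, and repeat with the remainder. The VOCABULARY set is a toy stand-in for the Word?() test (an assumption for this example), and trying the longest prefix first with backtracking is just one simple strategy.

```python
VOCABULARY = {"data", "is", "big", "a", "at"}  # toy stand-in for Word?()


def splits(text):
    """Return every (first, rest) pair, e.g. ("d", "ataisbig")."""
    return [(text[:i], text[i:]) for i in range(1, len(text) + 1)]


def segment(text):
    """Split `text` into known words, or return None if that is impossible."""
    if not text:
        return []
    # Try the longest prefix first, and back off if the rest cannot be split.
    for first, rest in reversed(splits(text)):
        if first in VOCABULARY:
            remainder = segment(rest)
            if remainder is not None:
                return [first] + remainder
    return None


print(segment("dataisbig"))
```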


Word?(word) = #Google hits / ~#pages of the web

It works, I promise!

http://ngrams.googlelabs.com/datasets

Google's n-gram datasets, built from Google Books scans.
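
As a rough sketch of how the Word?() score can drive segmentation: estimate each word's probability as its count divided by a total, score a segmentation by the product of its word probabilities, and keep the best one. The counts below are invented for illustration (they are not real Google hit counts), and giving unseen strings a small, length-penalized probability is an assumption of this sketch.

```python
from functools import lru_cache

# Hypothetical counts standing in for "#Google hits"; TOTAL stands in for
# the approximate number of pages on the web.
COUNTS = {"data": 500, "is": 900, "big": 300, "dat": 5, "a": 800, "bi": 2}
TOTAL = 10_000


def word_probability(word):
    """Word?(word) = count / total, with a tiny fallback for unseen strings."""
    if word in COUNTS:
        return COUNTS[word] / TOTAL
    return 1.0 / (TOTAL * 10 ** len(word))


@lru_cache(maxsize=None)
def best_segmentation(text):
    """Return (probability, words) for the most probable split of `text`."""
    if not text:
        return 1.0, ()
    candidates = []
    for i in range(1, len(text) + 1):
        first, rest = text[:i], text[i:]
        rest_probability, rest_words = best_segmentation(rest)
        candidates.append((word_probability(first) * rest_probability,
                           (first,) + rest_words))
    return max(candidates)


probability, words = best_segmentation("dataisbig")
print(words)
```

The cache matters: without it the same suffixes would be re-segmented over and over as the text grows.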


Recommendations

Based on your viewing history you might like "Snakes on a Plane"...


Amazon loves these


Euclidean Distance Algorithm

\[ d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2} \]
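
For reference, the two-dimensional Euclidean distance generalizes to any number of dimensions. A minimal Python sketch (the function name and the sample points are chosen here for illustration):

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))  # the 3-4-5 right triangle
```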


Toby might enjoy "Lady in the Water" and "The Night Listener".

And he'd hate "Just My Luck"...
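
A sketch of how a recommendation like this can be computed, in the style of Programming Collective Intelligence: turn the Euclidean distance between two people's ratings into a similarity score, then predict unseen titles as a similarity-weighted average of other people's ratings. The names, titles and ratings below are made-up toy data, not the dataset behind the slide.

```python
import math

# Hypothetical ratings on a 1-5 scale; names and numbers are illustrative only.
RATINGS = {
    "Ana": {"Movie A": 5.0, "Movie B": 3.0, "Movie C": 4.5, "Movie D": 1.0},
    "Ben": {"Movie A": 4.5, "Movie B": 3.5, "Movie C": 4.0, "Movie D": 1.5},
    "Cy": {"Movie A": 1.0, "Movie B": 4.0, "Movie C": 2.0, "Movie D": 5.0},
    "Toby": {"Movie A": 5.0, "Movie B": 3.0},
}


def similarity(a, b):
    """Map Euclidean distance over shared titles to a score in (0, 1]."""
    shared = set(RATINGS[a]) & set(RATINGS[b])
    if not shared:
        return 0.0
    distance = math.sqrt(
        sum((RATINGS[a][t] - RATINGS[b][t]) ** 2 for t in shared))
    return 1.0 / (1.0 + distance)


def recommend(person):
    """Rank titles `person` has not rated by similarity-weighted average."""
    totals, weights = {}, {}
    for other in RATINGS:
        if other == person:
            continue
        score = similarity(person, other)
        for title, rating in RATINGS[other].items():
            if title not in RATINGS[person]:
                totals[title] = totals.get(title, 0.0) + rating * score
                weights[title] = weights.get(title, 0.0) + score
    ranked = [(totals[t] / weights[t], t) for t in totals if weights[t] > 0]
    return sorted(ranked, reverse=True)


for predicted_rating, title in recommend("Toby"):
    print(title, round(predicted_rating, 2))
```

Titles at the top of the list are the "might enjoy" candidates; titles at the bottom are the ones the person would probably dislike.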


Classification

"Dividing" data sets


Great for face recognition!

Facebook implemented it!

http://face.com offers a Free API!


Support Vector Machines

The line that divides the objects is calculated with a Support Vector Machine (SVM).

http://www.csie.ntu.edu.tw/~cjlin/libsvm/ 
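
To make the "dividing line" concrete, here is a small pure-Python sketch of a linear SVM trained by sub-gradient descent on the hinge loss. It does not use LIBSVM and is not how LIBSVM works internally; the data points, hyperparameters and function names are all assumptions made for this example.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Fit weights w and bias b so that sign(w.x + b) separates the labels."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularizer always shrinks w a little; the hinge term only
            # acts on points that are misclassified or inside the margin.
            w = [wi - learning_rate * reg * wi for wi in w]
            if margin < 1:
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the line x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data: labels are -1 or +1.
POINTS = [(-2, -2), (-1, -2), (-2, -1), (2, 2), (1, 2), (2, 1)]
LABELS = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(POINTS, LABELS)
print(predict(w, b, (3, 3)), predict(w, b, (-3, -3)))
```

For real work a library such as LIBSVM is the better choice, since it also handles kernels and data that cannot be split by a straight line.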


Clustering

"Similarities" between different sets


This is how compression algorithms work

1. AAAA AAA AA AAAAAA

2. BB BBBBB BBB BBBBBB

3. CCC CCCC CCCC CCC

Use Euclidean Distance to know what elements are similar!
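
One common way to use Euclidean distance for clustering is k-means, sketched below. The slide does not name a specific algorithm, so k-means is an assumption here, as are the sample points and the starting centroids.

```python
import math


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, centroids, iterations=10):
    """Group `points` around `centroids` using Euclidean distance."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(
                    sum(coords) / len(cluster) for coords in zip(*cluster))
    return clusters, centroids


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
clusters, centroids = k_means(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```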


Resources

● Programming Collective Intelligence: http://oreilly.com/catalog/9780596529321

● Hadoop tutorial: http://developer.yahoo.com/hadoop/tutorial/

● R Programming language: http://www.r-project.org/

● My favorite Machine Learning community members:
  ○ Ilya Grigorik (Google): http://www.igvita.com/
  ○ Jonathan Harris (We Feel Fine): http://www.wefeelfine.org/

● Contact me: http://igorgue.com