Overview of Machine Learning & Feature Engineering Machine Learning 101 Tutorial Strata + Hadoop World, NYC, Sep 2015 Alice Zheng, Dato 1

Overview of Machine Learning and Feature Engineering

Apr 21, 2017

Turi, Inc.
Transcript
Page 1: Overview of Machine Learning and Feature Engineering

1

Overview of Machine Learning & Feature Engineering

Machine Learning 101 Tutorial
Strata + Hadoop World, NYC, Sep 2015
Alice Zheng, Dato

Page 2: Overview of Machine Learning and Feature Engineering

2

About us

Chris DuBois: Intro to recommenders

Alice Zheng: Overview of ML

Piotr Teterwak: Intro to image search & deep learning

Krishna Sridhar: Deploying ML as a predictive service

Danny Bickson: TA

Alon Palombo: TA

Page 3: Overview of Machine Learning and Feature Engineering

3

Why machine learning?

Model data.
Make predictions.
Build intelligent applications.

Page 4: Overview of Machine Learning and Feature Engineering

4

Classification

Predict amongst a discrete set of classes

Page 5: Overview of Machine Learning and Feature Engineering

5

Input Output

Page 6: Overview of Machine Learning and Feature Engineering

6

Spam filtering

Data → prediction: spam vs. not spam

Page 7: Overview of Machine Learning and Feature Engineering

Text classification

EDUCATION

FINANCE

TECHNOLOGY

Page 8: Overview of Machine Learning and Feature Engineering

8

Regression

Predict real/numeric values

Page 9: Overview of Machine Learning and Feature Engineering

9

Stock market

Input

Output

Page 10: Overview of Machine Learning and Feature Engineering

10

Similarity

Find things like this

Page 11: Overview of Machine Learning and Feature Engineering

11

Similar products

Product I’m buying

Output: other products I might be interested in

Page 12: Overview of Machine Learning and Feature Engineering

12

Given image, find similar images

http://www.tiltomo.com/

Page 13: Overview of Machine Learning and Feature Engineering

13

Recommender systems

Learn what I want before I know it

Page 14: Overview of Machine Learning and Feature Engineering

14

Page 15: Overview of Machine Learning and Feature Engineering

15

Playlist recommendations

Recommendations form a coherent & diverse sequence

Page 16: Overview of Machine Learning and Feature Engineering

16

Friend recommendations

Users and “items” are of the same type

Page 17: Overview of Machine Learning and Feature Engineering

17

Clustering

Grouping similar items

Page 18: Overview of Machine Learning and Feature Engineering

18

Clustering images

Goldberger et al.

Set of Images

Page 19: Overview of Machine Learning and Feature Engineering

19

Clustering web search results

Page 20: Overview of Machine Learning and Feature Engineering

20

Machine learning … how?

Data

Answers

I fell in love the instant I laid my eyes on that puppy. His big eyes and playful tail, his soft furry paws, …

Many systems

Many tools

Many teams

Lots of methods/jargon

Page 21: Overview of Machine Learning and Feature Engineering

21

The machine learning pipeline

I fell in love the instant I laid my eyes on that puppy. His big eyes and playful tail, his soft furry paws, …

Raw data → Features → Models → Predictions → Deploy in production

Page 22: Overview of Machine Learning and Feature Engineering

22

Three things to know about ML
• Feature = numeric representation of raw data
• Model = mathematical “summary” of features
• Making something that works = choose the right model and features, given data and task

Page 23: Overview of Machine Learning and Feature Engineering

Feature = numeric representation of raw data

Page 24: Overview of Machine Learning and Feature Engineering

24

Representing natural text

It is a puppy and it is extremely cute.

What’s important? Phrases? Specific words? Ordering?

Subject, object, verb?

Classify: puppy or not?

Raw Text

{“it”:2, “is”:2, “a”:1, “puppy”:1, “and”:1, “extremely”:1, “cute”:1 }

Bag of Words

Page 25: Overview of Machine Learning and Feature Engineering

25

Representing natural text

It is a puppy and it is extremely cute.

Classify: puppy or not?

Raw Text → Bag of Words

word        count
it          2
they        0
I           0
am          0
how         0
puppy       1
and         1
cat         0
aardvark    0
cute        1
extremely   1
…           …

Sparse vector representation
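
The bag-of-words counts for the example sentence can be reproduced in a few lines of Python. This is a minimal sketch, not the tutorial's own code: the function name `bag_of_words` and the tokenizer (lowercasing plus a simple regular expression) are assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    # Assumed tokenizer: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


bow = bag_of_words("It is a puppy and it is extremely cute.")
print(dict(bow))

# Words that never occur (for example "cat") have an implicit count of 0,
# which is why a sparse representation only stores the nonzero entries.
print(bow["cat"])
```

Storing only the nonzero counts is what makes the representation sparse: the vocabulary may hold many thousands of words, but a single sentence touches only a handful of them.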

Page 26: Overview of Machine Learning and Feature Engineering

26

Representing images

Image source: “Recognizing and learning object categories,” Li Fei-Fei, Rob Fergus, Antonio Torralba, ICCV 2005–2009.

Raw image: millions of RGB triplets, one for each pixel

Classify: person or animal?

Raw Image → Bag of Visual Words

Page 27: Overview of Machine Learning and Feature Engineering

27

Representing images

Classify: person or animal?

Raw Image → Deep learning features

[3.29, -15, -5.24, 48.3, 1.36, 47.1, -1.92, 36.5, 2.83, 95.4, -19, -89, 5.09, 37.8]

Dense vector representation

Page 28: Overview of Machine Learning and Feature Engineering

28

Feature space in machine learning
• Raw data → high-dimensional vectors
• Collection of data points → point cloud in feature space
• Feature engineering = creating features of the appropriate granularity for the task

Page 29: Overview of Machine Learning and Feature Engineering

Crudely speaking, mathematicians fall into two categories: the algebraists, who find it easiest to reduce all problems to sets of numbers and variables, and the geometers, who understand the world through shapes.

-- Masha Gessen, “Perfect Rigor”

Page 30: Overview of Machine Learning and Feature Engineering

30

Algebra vs. Geometry

Algebra: $a^2 + b^2 = c^2$

Geometry: a right triangle with sides a, b, and c

Pythagorean Theorem (Euclidean space)

Page 31: Overview of Machine Learning and Feature Engineering

31

Visualizing a sphere in 2D

$x^2 + y^2 = 1$

Pythagorean theorem: $a^2 + b^2 = c^2$

[Figure: unit circle on x and y axes, each marked at 1, with a right triangle of sides a, b, and c]

Page 32: Overview of Machine Learning and Feature Engineering

32

Visualizing a sphere in 3D

$x^2 + y^2 + z^2 = 1$

[Figure: unit sphere on x, y, and z axes, each marked at 1]

Page 33: Overview of Machine Learning and Feature Engineering

33

Visualizing a sphere in 4D

$x^2 + y^2 + z^2 + t^2 = 1$

[Figure: x, y, and z axes, each marked at 1]

Page 34: Overview of Machine Learning and Feature Engineering

34

Why are we looking at spheres?

[Figure: objects joined by “=” signs]

Poincaré Conjecture: All physical objects without holes are “equivalent” to a sphere.

Page 35: Overview of Machine Learning and Feature Engineering

35

The power of higher dimensions
• A sphere in 4D can model the birth and death process of physical objects
• High dimensional features can model many things

Page 36: Overview of Machine Learning and Feature Engineering

Visualizing Feature Space

Page 37: Overview of Machine Learning and Feature Engineering

37

The challenge of high dimension geometry
• Feature space can have hundreds to millions of dimensions
• In high dimensions, our geometric imagination is limited
  - Algebra comes to our aid
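
One way to see algebra coming to our aid: the sphere equation (sum of squared coordinates equals 1) can be checked in any number of dimensions, even where no picture is possible. The sketch below is an illustration only; the helper names `norm` and `random_point_on_unit_sphere` are hypothetical and the dimensions tried are arbitrary.

```python
import math
import random


def norm(v):
    """Euclidean length: square root of the sum of squared coordinates."""
    return math.sqrt(sum(x * x for x in v))


def random_point_on_unit_sphere(dim, rng):
    """Draw Gaussian coordinates, then rescale the vector to length 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    length = norm(v)
    return [x / length for x in v]


rng = random.Random(0)
for dim in (2, 3, 4, 1000):
    p = random_point_on_unit_sphere(dim, rng)
    # The same algebraic test works whether or not we can draw the shape.
    print(dim, round(sum(x * x for x in p), 6))
```

The same two-line definition of length covers the circle, the 3D sphere, the 4D sphere, and a 1000-dimensional feature space alike.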

Page 38: Overview of Machine Learning and Feature Engineering

38

Visualizing bag-of-words

I have a puppy and it is extremely cute

word        count
it          1
they        0
I           1
am          0
how         0
puppy       1
and         1
cat         0
aardvark    0
zebra       0
cute        1
extremely   1
…           …

[Figure: the sentence plotted as a point on axes “puppy” and “cute”, each marked at 1]

Page 39: Overview of Machine Learning and Feature Engineering

39

Visualizing bag-of-words

[Figure: three documents plotted as points on axes “puppy”, “cute”, and “extremely”, each marked at 1]

• I have a puppy and it is extremely cute
• I have an extremely cute cat
• I have a cute puppy
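
To make the picture concrete, each document can be turned into coordinates on the three axes “puppy”, “cute”, and “extremely”. This is a minimal sketch under assumptions: the function `to_point` and the simple regular-expression tokenizer are illustrative, not part of the tutorial.

```python
import re

# The three axes of the feature space.
AXES = ["puppy", "cute", "extremely"]


def to_point(text, axes=AXES):
    """Return the word counts of `text` along each axis, in axis order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tokens.count(word) for word in axes]


docs = [
    "I have a puppy and it is extremely cute",
    "I have an extremely cute cat",
    "I have a cute puppy",
]

for doc in docs:
    print(to_point(doc), doc)
```

Each document becomes one point, and a collection of documents becomes a point cloud. A real vocabulary simply adds more axes.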

Page 40: Overview of Machine Learning and Feature Engineering

40

Document point cloud

[Figure: documents plotted as points on axes “word 1” and “word 2”]

Page 41: Overview of Machine Learning and Feature Engineering

Model = mathematical “summary” of features

Page 42: Overview of Machine Learning and Feature Engineering

42

What is a summary?
• Data → point cloud in feature space
• Model = a geometric shape that best “fits” the point cloud

Page 43: Overview of Machine Learning and Feature Engineering

43

Clustering model

[Figure: point cloud on axes “Feature 1” and “Feature 2”]

Group data points tightly
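
As one possible illustration of grouping points tightly, here is a small k-means sketch (Lloyd's algorithm). The slide does not name a specific algorithm, so k-means is an assumption here, as are the toy points, the starting centroids, and the function name `kmeans`.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical toy data: two visibly separate groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

The model here is just the list of centroids: a compact geometric summary of where the point cloud is dense.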

Page 44: Overview of Machine Learning and Feature Engineering

44

Classification model

[Figure: point cloud on axes “Feature 1” and “Feature 2”]

Decide between two classes
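
A linear classifier is one simple shape for deciding between two classes: a line (or hyperplane) with one class on each side. The sketch below uses the perceptron rule as an example; the slide does not specify a method, so the algorithm, the toy data, and the names `train_perceptron` and `predict` are all assumptions.

```python
def train_perceptron(points, labels, epochs=100):
    """Learn weights w and bias b so that the sign of w.x + b
    matches the labels, which must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable toy data.
points = [(2.0, 3.0), (3.0, 3.0), (3.0, 4.0),
          (-1.0, -1.0), (-2.0, 0.0), (0.0, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(points, labels)
print(w, b)
print([predict(w, b, x) for x in points])
```

The learned `w` and `b` define the decision boundary `w.x + b = 0`. The perceptron only settles on such a boundary when the two classes can actually be separated by a straight line.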

Page 45: Overview of Machine Learning and Feature Engineering

45

Regression model

[Figure: data points on axes “Feature” and “Target”]

Fit the target values
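
For a single feature, fitting the target values can be sketched with ordinary least squares, which has a closed form. The slide does not say which regression method is meant, so least squares is an assumption, and the data below is made up so that the answer is easy to check.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Here the model is the line itself: two numbers that summarize how the target changes with the feature.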

Page 46: Overview of Machine Learning and Feature Engineering

Visualizing Feature Engineering

Page 47: Overview of Machine Learning and Feature Engineering

47

When does bag-of-words fail?

[Figure: four documents plotted on axes “puppy”, “cat”, and “have”, with marks at 1 and 2]

• I have a puppy
• I have a cat
• I have a kitten
• I have a dog and I have a pen

Task: find a surface that separates documents about dogs vs. cats

Problem: the word “have” adds fluff instead of information

Page 48: Overview of Machine Learning and Feature Engineering

48

Improving on bag-of-words
• Idea: “normalize” word counts so that popular words are discounted
• Term frequency (tf) = number of times a term appears in a document
• Inverse document frequency of word (idf) = $\log \dfrac{N}{\text{number of documents containing the word}}$
• N = total number of documents
• Tf-idf count = tf × idf

Page 49: Overview of Machine Learning and Feature Engineering

49

From BOW to tf-idf

[Figure: four documents plotted on axes “puppy”, “cat”, and “have”, with marks at 1 and 2]

• I have a puppy
• I have a cat
• I have a kitten
• I have a dog and I have a pen

idf(puppy) = log 4
idf(cat) = log 4
idf(have) = log 1 = 0

Page 50: Overview of Machine Learning and Feature Engineering

50

From BOW to tf-idf

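[Figure: tf-idf point cloud on axes “puppy”, “cat”, and “have”, with “log 4” marked on the puppy and cat axes and a decision surface drawn between the documents]

• I have a puppy
• I have a cat
• I have a dog and I have a pen
• I have a kitten

tfidf(puppy) = log 4
tfidf(cat) = log 4
tfidf(have) = 0

Tf-idf flattens uninformative dimensions in the BOW point cloud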
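
The numbers on the tf-idf slides can be reproduced with a short script. This is a sketch under assumptions: idf is taken as log(N / number of documents containing the word), which matches idf(puppy) = log 4 and idf(have) = log 1 = 0 for the four example documents; the natural logarithm, the tokenizer, and the function names are illustrative choices.

```python
import math
import re
from collections import Counter

docs = [
    "I have a puppy",
    "I have a cat",
    "I have a kitten",
    "I have a dog and I have a pen",
]


def tokenize(text):
    """Assumed tokenizer: lowercase words made of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def idf(word, tokenized_docs):
    """log(N / number of documents containing the word).
    Assumes the word appears in at least one document."""
    n_containing = sum(1 for tokens in tokenized_docs if word in tokens)
    return math.log(len(tokenized_docs) / n_containing)


def tfidf(tokens, tokenized_docs):
    """Map each word in one document to its count times its idf."""
    counts = Counter(tokens)
    return {w: tf * idf(w, tokenized_docs) for w, tf in counts.items()}


tokenized = [tokenize(d) for d in docs]
for doc, tokens in zip(docs, tokenized):
    print(doc, "->", tfidf(tokens, tokenized))
```

Because “have” occurs in all four documents, its idf is log 1 = 0, so its tf-idf weight is 0 even in the document where it appears twice. That is the flattening of the “have” dimension described above.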

Page 51: Overview of Machine Learning and Feature Engineering

51

Entry points of feature engineering
• Start from data and task
  - What’s the best text representation for classification?
• Start from modeling method
  - What kind of features does k-means assume?
  - What does linear regression assume about the data?

Page 52: Overview of Machine Learning and Feature Engineering

Dato’s Machine Learning Platform

Page 53: Overview of Machine Learning and Feature Engineering

53

Dato’s machine learning platform

Raw data → Features → Models → Predictions → Deploy in production

GraphLab Create

Dato Distributed

Dato Predictive Services

Page 54: Overview of Machine Learning and Feature Engineering

54

Data structures for feature engineering

Features SFrames

User Com.

Title Body

User Disc.

SGraphs

Page 55: Overview of Machine Learning and Feature Engineering

55

Machine learning toolkits in GraphLab Create
• Classification/regression
• Clustering
• Recommenders
• Deep learning
• Similarity search
• Data matching
• Sentiment analysis
• Churn prediction
• Frequent pattern mining
• And on…

Page 56: Overview of Machine Learning and Feature Engineering

Demo

Page 57: Overview of Machine Learning and Feature Engineering

57

Dimensionality reduction

[Figure: point cloud on axes “Feature 1” and “Feature 2”]

Flatten non-useful features

PCA: Find most non-flat linear subspace

Page 58: Overview of Machine Learning and Feature Engineering

58

PCA : Principal Component Analysis

Center data at origin

Page 59: Overview of Machine Learning and Feature Engineering

59

PCA : Principal Component Analysis

Find a line such that the average squared distance from the data points to the line is minimized.

This is the 1st Principal Component

Page 60: Overview of Machine Learning and Feature Engineering

60

PCA : Principal Component Analysis

Find a 2nd line
- at right angles to the 1st
- such that the average squared distance from the data points to the line is minimized.

This is the 2nd Principal Component

Page 61: Overview of Machine Learning and Feature Engineering

61

PCA : Principal Component Analysis

Find a 3rd line
- at right angles to the previous lines
- such that the average squared distance from the data points to the line is minimized.

… There can only be as many principal components as the dimensionality of the data.
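
The PCA steps on these slides (center the data, find the best-fitting line, then a second line at right angles) can be sketched for 2D data without any libraries. For centered data, the line through the origin that minimizes the average squared distance to the points is also the direction of greatest variance, and in 2D that direction has a closed form. The function name `pca_2d` and the toy data are assumptions; practical implementations usually rely on an eigendecomposition or SVD instead.

```python
import math


def pca_2d(points):
    """Return the mean and the two principal directions of 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Step 1: center the data at the origin.
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    # Step 2: angle of the direction with the largest variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    first = (math.cos(theta), math.sin(theta))
    # Step 3: in 2D the second component is the perpendicular direction.
    second = (-math.sin(theta), math.cos(theta))
    return (mean_x, mean_y), first, second


# Hypothetical data lying exactly on the line y = x / 2.
points = [(2.0, 1.0), (4.0, 2.0), (6.0, 3.0), (8.0, 4.0)]
mean, first, second = pca_2d(points)
print(mean, first, second)
```

With two-dimensional data there are exactly two principal components, consistent with the limit stated above. Dropping the second one projects every point onto the first line, which is the flattening used for dimensionality reduction.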

Page 62: Overview of Machine Learning and Feature Engineering

Demo

Page 63: Overview of Machine Learning and Feature Engineering

63

Coursera Machine Learning Specialization
• Learn machine learning in depth
• Build and deploy intelligent applications
• Year-long certification program
• Joint project between University of Washington + Dato
• Details: https://www.coursera.org/specializations/machine-learning

Page 64: Overview of Machine Learning and Feature Engineering

64

Next up today

[email protected] @RainyData, #StrataConf

11:30am - Intro to recommenders (Chris DuBois)

1:30pm - Intro to image search & deep learning (Piotr Teterwak)

3:30pm - Deploying ML as a predictive service (Krishna Sridhar)