Machine Learning F# and Accord.net

Machine Learning F# and Accord.net. Alena Hall: software architect, MS in Computer Science, member of the F# Software Foundation Board of Trustees, researcher.

Jan 18, 2016

Imogene Hicks
Transcript
Page 1:

Machine Learning F# and Accord.net

Page 2:

Alena Hall

• Software architect, MS in Computer Science

• Member of F# Software Foundation Board of Trustees

• Researcher in the field of mathematical theoretical abstractions possible in modern programming concepts

• Speaker and active software engineering community member

@lenadroid

Page 3:

Machine Learning

Page 4:

• Why machine learning?

• What is the data?

• How?

Questions

Page 5:

Data Questions.

Page 6:

Data reality :\

Page 7:

Path to grasping machine learning and data science…

Page 8:

Contents

• Multiple Linear Regression

• Logistic Regression Classification

• K-Means Clustering

• What’s next?

Page 9:

F# for machine learning and data science!

Page 10:

Why F#?

1. Exploratory programming, interactive environment

2. Functional programming, referential transparency

3. Data pipelines

4. Algebraic data types and pattern matching

5. Strong typing, type inference, Type Providers

6. Units of measure

7. Concurrent, distributed and cloud programming

Page 11:

Data pipelines

Page 12:

Algebraic data types

// Discriminated Union

Page 13:

Pattern matching

Page 14:

Type Providers

Page 15:

Units of measure

Page 16:
Page 17:

Linear Regression

Page 18:

How to predict?

1. Make a guess.

2. Measure how wrong the guess is.

3. Fix the error.

Page 19:

Make a guess!

Page 20:

MATH

Page 21:

Make a guess? What does it mean?...

Hypothesis / guess:

weights
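
A minimal sketch of such a guess, assuming the usual linear hypothesis: one bias weight plus one weight per predictor. The function name `hypothesis` and the sample numbers are illustrative only, and the sketch is plain Python rather than the talk's F# code.

```python
def hypothesis(weights, x):
    """Linear guess: weights[0] is the bias, weights[1:] pair up with the predictors in x."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


# Illustrative numbers: 1 + 2*10 + 3*100
guess = hypothesis([1.0, 2.0, 3.0], [10.0, 100.0])
```

Changing the weights changes the guess; learning means finding the weights whose guesses are least wrong.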

Page 22:

Find out our mistake…

Cost function / Mistake function:

… and minimize it:
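
A hedged sketch of a mistake function, assuming the common choice of half the mean squared error between guesses and known outputs. The name `cost` and the toy data are assumptions for illustration, written in plain Python.

```python
def cost(weights, xs, ys):
    """Half the mean squared error of the linear guess over all training examples."""
    total = 0.0
    for x, y in zip(xs, ys):
        guess = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
        total += (guess - y) ** 2
    return total / (2 * len(xs))


# Toy data that follows y = 1 + 2x exactly (illustrative only).
xs = [[1.0], [2.0]]
ys = [3.0, 5.0]
perfect = cost([1.0, 2.0], xs, ys)  # weights that reproduce the data
naive = cost([0.0, 0.0], xs, ys)    # always guessing zero
```

The better the weights, the smaller the value; minimizing it is what "fixing the error" refers to.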

Page 23:

Mistake function looks like…

Global minimum

Page 24:

How to reduce the mistake?

Update each slope parameter until the Mistake Function minimum is reached:

Simultaneously

Alpha: learning rate

Derivative: direction of moving
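
The update rule can be sketched as follows, assuming a linear hypothesis and half the mean squared error as the mistake function. All function names, the learning rate, the step count and the toy data are illustrative assumptions, and the code is plain Python rather than the talk's F#.

```python
def predict(weights, x):
    # Linear guess: bias weights[0] plus one weight per predictor.
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def gradient_descent_step(weights, xs, ys, alpha):
    m = len(xs)
    errors = [predict(weights, x) - y for x, y in zip(xs, ys)]
    # Partial derivatives of the (assumed) half mean squared error.
    gradients = [sum(errors) / m]
    for j in range(len(weights) - 1):
        gradients.append(sum(e * x[j] for e, x in zip(errors, xs)) / m)
    # Simultaneous update: every new weight is computed from the old weights.
    return [w - alpha * g for w, g in zip(weights, gradients)]


def fit(xs, ys, alpha=0.1, steps=2000):
    weights = [0.0] * (len(xs[0]) + 1)
    for _ in range(steps):
        weights = gradient_descent_step(weights, xs, ys, alpha)
    return weights


# Toy data generated from y = 1 + 2x (illustrative, not from the talk).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
weights = fit(xs, ys)  # ends up close to [1.0, 2.0]
```

Alpha scales the size of each move and the derivative gives its direction; if alpha is too large the updates can overshoot the minimum instead of approaching it.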

Page 25:

Fix the error

Page 26:

Multiple Linear Regression

X [ ] – Predictors: Statistical data about bike rentals for previous years or months.

Y – Output: Number of bike rentals we should expect today or on some other day in the future.

* Y is not nominal; here it is a continuous numerical range.

Page 27:

Make a guess!

Page 28:

Fix the error

Page 29:

Multiple linear regression: Bike rentals demand

“Talk is cheap. Show me the code.”

Page 30:

What to remember?

1. Simplest regression algorithm

2. Very fast: prediction time does not depend on the size of the training set

3. Good at numerical data with lots of features

4. Output from a continuous numerical range

5. Linear hypothesis

6. Uses gradient descent

Linear Regression

Page 31:

Logistic Regression

Page 32:

Hypothesis function

Estimated probability that Y = 1 on input X
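
A minimal sketch of this hypothesis, assuming the standard logistic (sigmoid) function applied to a linear combination of the inputs. The names `sigmoid` and `probability` and the sample weights are illustrative, in plain Python.

```python
import math


def sigmoid(z):
    """Logistic function; the two branches avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probability(weights, x):
    """Estimated probability that Y = 1 for input x."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


# Illustrative weights: the linear part is -3 + 2*x.
p_low = probability([-3.0, 2.0], [0.0])   # well below 0.5
p_high = probability([-3.0, 2.0], [3.0])  # well above 0.5
```

A class label follows by thresholding the probability, typically at 0.5.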

Page 33:

Mistake function

Mistake function is the cost for a single training data example

h(x)
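
A hedged sketch of the per-example cost, assuming the usual log loss: the penalty grows without bound as the predicted probability h(x) moves away from the true label. The name `example_cost` is an assumption, in plain Python.

```python
import math


def example_cost(h, y):
    """Cost of predicting probability h (0 < h < 1) when the true label y is 0 or 1."""
    return -math.log(h) if y == 1 else -math.log(1.0 - h)


confident_right = example_cost(0.9, 1)  # small penalty
confident_wrong = example_cost(0.1, 1)  # large penalty
```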

Page 34:

Full mistake function

1. Uses the principle of maximum likelihood estimation.

2. We minimize it the same way as with Linear Regression
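
A sketch of both points, assuming the full mistake function is the average log loss (the negative log-likelihood of the labels) and that it is minimized with the same gradient descent update as before. Function names, learning rate, step count and toy data are illustrative assumptions, in plain Python.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probability(weights, x):
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def full_cost(weights, xs, ys):
    """Average log loss over all training examples."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = probability(weights, x)
        total -= y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return total / len(xs)


def gradient_descent_step(weights, xs, ys, alpha):
    # Same update shape as linear regression; only the hypothesis differs.
    m = len(xs)
    errors = [probability(weights, x) - y for x, y in zip(xs, ys)]
    gradients = [sum(errors) / m]
    for j in range(len(weights) - 1):
        gradients.append(sum(e * x[j] for e, x in zip(errors, xs)) / m)
    return [w - alpha * g for w, g in zip(weights, gradients)]


# Toy data: small x values are class 0, large ones class 1 (illustrative only).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 0, 1, 1]
weights = [0.0, 0.0]
start_cost = full_cost(weights, xs, ys)
for _ in range(200):
    weights = gradient_descent_step(weights, xs, ys, alpha=0.5)
end_cost = full_cost(weights, xs, ys)
```

With all weights at zero every prediction is 0.5, so the starting cost is log 2; each descent step then lowers it.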

Page 35:

“Talk is cheap. Show me the code.”

Logistic Regression Classification Example

Page 36:

What to remember?

• Classification algorithm

• Relatively small number of predictors

• Uses the logistic function for the hypothesis

• Has a convex cost function

• Uses gradient descent for correcting the mistake

Logistic Regression

Page 37:

At this point…

Page 38:

Machine Learning

What society thinks I do…

What other programmers think I do…

Page 39:

What I really do is…

Page 40:

K-Means

Page 41:
Page 42:

Clustering

Page 43:
Page 44:

What’s next?

Page 45:
Page 46:

I’m Lena

@lenadroid

Page 47:

Thank you!

Page 48:

What if it doesn’t work?

Page 49:

Algorithm debugging tips

• Try more data

• Try more features

• Try fewer features

• Try feature combinations

• Try polynomial features

• …
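
Two of these tips can be sketched as small feature-building helpers. The function names and the sample values are hypothetical, and the code is plain Python rather than the talk's F#; it only shows what "polynomial features" and "feature combinations" could mean in practice.

```python
from itertools import combinations


def polynomial_features(x, degree):
    """Expand one numeric feature into its powers x, x^2, ..., x^degree."""
    return [x ** d for d in range(1, degree + 1)]


def with_combinations(features):
    """Append the product of every pair of features to the original features."""
    return list(features) + [a * b for a, b in combinations(features, 2)]


powers = polynomial_features(2.0, 3)
combined = with_combinations([2.0, 3.0, 5.0])
```

A linear model trained on the expanded features can fit curved relationships, at the price of more weights to learn and a higher risk of overfitting.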

Page 50:

What else can go wrong?

Page 51:

Ideally... the hypothesis will… just fit the model

Page 52:

Underfitting … Overfitting

Page 53:

• Regularization…?

• Regularization parameter too big? -> underfitting: the line is over-smoothed

• Regularization parameter too small? -> overfitting: too optimized for the training data

Try out different values for the regularization parameter.
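
A hedged sketch of that experiment, assuming L2 (ridge-style) regularization added to a linear regression trained by gradient descent, with the bias weight left unpenalized. The parameter name `lam`, the function names and the toy data are all assumptions for illustration, in plain Python.

```python
def predict(weights, x):
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def regularized_step(weights, xs, ys, alpha, lam):
    m = len(xs)
    errors = [predict(weights, x) - y for x, y in zip(xs, ys)]
    gradients = [sum(errors) / m]  # bias is not penalized (assumed convention)
    for j in range(len(weights) - 1):
        data_term = sum(e * x[j] for e, x in zip(errors, xs)) / m
        # The penalty term pulls each slope weight toward zero.
        gradients.append(data_term + lam * weights[j + 1] / m)
    return [w - alpha * g for w, g in zip(weights, gradients)]


def fit(xs, ys, lam, alpha=0.1, steps=5000):
    weights = [0.0] * (len(xs[0]) + 1)
    for _ in range(steps):
        weights = regularized_step(weights, xs, ys, alpha, lam)
    return weights


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]
plain = fit(xs, ys, lam=0.0)   # slope close to 2
shrunk = fit(xs, ys, lam=4.0)  # slope pulled toward zero
```

On this toy data the slope drops from about 2 to about 1.11 when `lam` goes from 0 to 4: a larger parameter flattens the line, which is the over-smoothing described above. A much larger `lam` would also need a smaller `alpha` to keep the updates stable.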