Machine Learning
Kan Ouivirach
Outline
• What is Machine Learning?
• Main Types of Learning
• Model Validation, Selection, and Evaluation
• Applied Machine Learning Process
• Cautions
“Field of study that gives computers the ability to learn without being explicitly programmed.”
–Arthur Samuel (1959)
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
–Tom Mitchell (1997)
Programming?
“Given a specification of a function f, implement f that meets the specification.”
Machine Learning?
“Given example (x, y) pairs, induce f such that y = f(x) for given pairs and generalizes well for unseen x.”
–Peter Norvig (2014)
Applications of Machine Learning
• Search Engines
• Medical Diagnosis
• Object Recognition
• Stock Market Analysis
• Credit Card Fraud Detection
• Speech Recognition
• etc.
Robot Localization
https://github.com/mjl/particle_filter_demo
k-Nearest Neighbors
http://bdewilde.github.io/blog/blogger/2012/10/26/classification-of-hand-written-digits-3/
Perceptron
https://datasciencelab.wordpress.com/2014/01/10/machine-learning-classics-the-perceptron/
w0x0 + w1x1
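The weighted sum w0x0 + w1x1 is the quantity a perceptron thresholds to make a prediction. A minimal sketch, assuming x0 = 1 is a constant bias input, labels are +1 or -1, and the classic perceptron update rule is used; the toy data and the names `predict` and `train` are illustrative and not from the slides:

```python
def predict(w, x):
    """Return +1 or -1 from the sign of the weighted sum w0*x0 + w1*x1 + ..."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s > 0 else -1


def train(samples, epochs=20, lr=1.0):
    """Classic perceptron rule: on a mistake, move w toward y * x."""
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if predict(w, x) != y:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return w


# Toy, linearly separable data (hypothetical); x0 = 1 is the bias input.
samples = [([1.0, 2.0], 1), ([1.0, 3.0], 1), ([1.0, -1.0], -1), ([1.0, -2.0], -1)]
w = train(samples)
print(w, [predict(w, x) for x, _ in samples])
```

The loop only stops early when the data are linearly separable; otherwise it runs for the full number of epochs.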
Probability Theory
https://seisanshi.wordpress.com/tag/probability/
Naive Bayes
[Figure: class node Ck connected to attribute nodes A1, A2, A3, …, An]
P(Ck | A1, …, An) = P(Ck) * P(A1, …, An | Ck) / P(A1, …, An)
With the independence assumption, we then have
P(Ck | A1, …, An) ∝ P(Ck) * Prod_i P(Ai | Ck)
Naive Bayes
No.  Content              Spam?
1    Party                Yes
2    Sale Discount        Yes
3    Party Sale Discount  Yes
4    Python Party         No
5    Python Programming   No
Naive Bayes
We want to find out whether “Party Programming” is spam or not.
P(Spam | Party, Programming) ∝ P(Spam) * P(Party | Spam) * P(Programming | Spam)
P(NotSpam | Party, Programming) ∝ P(NotSpam) * P(Party | NotSpam) * P(Programming | NotSpam)
We need to know
P(Spam), P(NotSpam)
P(Party | Spam), P(Party | NotSpam)
P(Programming | Spam), P(Programming | NotSpam)
Naive Bayes
P(Spam) = ? P(NotSpam) = ?
P(Party | Spam) = ? P(Party | NotSpam) = ?
P(Programming | Spam) = ? P(Programming | NotSpam) = ?
Naive Bayes
P(Spam) = 3/5 P(NotSpam) = 2/5
P(Party | Spam) = 2/3 P(Party | NotSpam) = 1/2
P(Programming | Spam) = 0 P(Programming | NotSpam) = 1/2
Naive Bayes
P(Spam | Party, Programming) ∝ 3/5 * 2/3 * 0 = 0
P(NotSpam | Party, Programming) ∝ 2/5 * 1/2 * 1/2 = 0.1
P(NotSpam | Party, Programming) > P(Spam | Party, Programming)
“Party Programming” is NOT spam.
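The arithmetic above can be reproduced by counting words per class in the five example emails. A minimal sketch, assuming each email is treated as a set of words and no smoothing is applied (as in the slide's counts); the function name `score` is illustrative:

```python
from fractions import Fraction

# Training emails from the table: (words in the content, is_spam)
emails = [
    ({"Party"}, True),
    ({"Sale", "Discount"}, True),
    ({"Party", "Sale", "Discount"}, True),
    ({"Python", "Party"}, False),
    ({"Python", "Programming"}, False),
]


def score(words, is_spam):
    """Unnormalized posterior: P(class) * product of P(word | class)."""
    in_class = [content for content, label in emails if label == is_spam]
    result = Fraction(len(in_class), len(emails))  # prior P(class)
    for word in words:
        with_word = sum(1 for content in in_class if word in content)
        result *= Fraction(with_word, len(in_class))  # P(word | class)
    return result


query = ["Party", "Programming"]
spam_score = score(query, True)       # 3/5 * 2/3 * 0
not_spam_score = score(query, False)  # 2/5 * 1/2 * 1/2
print(spam_score, not_spam_score)
print("spam" if spam_score > not_spam_score else "not spam")
```

Because “Programming” never appears in a spam email, its zero count forces the whole spam score to zero. Laplace (add-one) smoothing is a common remedy for this, but it is not part of the slide's example.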
Decision Tree
Play tennis?
Outlook
  Sunny -> Humidity
    High -> No
    Normal -> Yes
  Overcast -> Yes
  Rain -> Wind
    Strong -> No
    Weak -> Yes
Day  Outlook   Temp  Humidity  Wind    Play
D1   Sunny     Hot   High      Weak    No
D2   Sunny     Hot   High      Strong  No
D3   Overcast  Mild  High      Strong  Yes
D4   Rain      Cool  Normal    Strong  No
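A decision tree is a chain of nested tests, so it maps directly onto if statements. A minimal sketch, assuming the figure reads as: Outlook at the root, Sunny leading to Humidity (High: No, Normal: Yes), Overcast leading straight to Yes, and Rain leading to Wind (Strong: No, Weak: Yes). The function name `play_tennis` is illustrative:

```python
def play_tennis(outlook, humidity, wind):
    """Follow the tree: Outlook at the root, then Humidity or Wind."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Rain":
        return "Yes" if wind == "Weak" else "No"
    raise ValueError(f"unknown outlook: {outlook!r}")


# Rows D1-D4 from the table: (outlook, temp, humidity, wind, play)
rows = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
]

# Temp is not used by this tree, so it is ignored when predicting.
predictions = [play_tennis(o, h, w) for o, _temp, h, w, _play in rows]
print(predictions)
```

The tree agrees with the Play column on all four rows shown.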
Support Vector Machines
http://www.mblondel.org/journal/2010/09/19/support-vector-machines-in-python/
3 support vectors
Unsupervised Learning
• k-Means Clustering
• Hierarchical Clustering
• Gaussian Mixture Models (GMMs)
k-Means Clustering
http://stackoverflow.com/questions/24645068/k-means-clustering-major-understanding-issue/24645894#24645894
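k-Means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. A minimal sketch of that loop (Lloyd's algorithm), assuming 1-D points and a deterministic start from the first k points; the data and the name `k_means` are illustrative and not from the slides:

```python
def k_means(points, k, iterations=10):
    """Lloyd's algorithm on 1-D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 9.0, 1.5, 2.0, 10.0, 11.0]
centroids, clusters = k_means(points, k=2)
print(centroids, clusters)
```

Real implementations usually pick the starting centroids at random and stop once assignments no longer change; the result can depend on that starting choice.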
Anomaly Detection
http://modernfarmer.com/2013/11/farm-pop-idioms/
http://boxesandarrows.com/designing-screens-using-cores-and-paths/
Flappy Bird Hack using Reinforcement Learning
http://sarvagyavaish.github.io/FlappyBirdRL/
I’ve got a perfect classifier!
https://500px.com/photo/65907417/like-a-frog-trapped-inside-a-coconut-shell-by-ellena-susanti
Overfitting (High Variance)
[Figure: normal fit vs. overfitting]
http://blog.csdn.net/love_tea_cat/article/details/25972921
Underfitting (High Bias)
[Figure: normal fit vs. underfitting]
http://blog.csdn.net/love_tea_cat/article/details/25972921
How to Avoid Overfitting and Underfitting
• Using more data does NOT always help.
• Recommended:
    • find a good number of features;
    • perform cross-validation;
    • use regularization when overfitting is found.
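Cross-validation holds out a different slice of the data in each round, so every sample is used for validation exactly once. A minimal sketch of k-fold index splitting, assuming the data are already shuffled; the name `k_fold_splits` is illustrative:

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


splits = list(k_fold_splits(n_samples=10, k=5))
for train, validation in splits:
    print("train:", train, "validation:", validation)
```

A model would be fitted on each training part and scored on the matching validation part; the average of those scores estimates how well it generalizes.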
Metrics
• Accuracy
• True Positive, False Positive, True Negative, False Negative
• Precision and Recall
• F1 Score
• etc.
Precision and Recall
http://en.wikipedia.org/wiki/Precision_and_recall
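The metrics listed above all follow from the four counts: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). A minimal sketch for binary labels, assuming 1 marks the positive class; the labels below are made up for illustration:

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
accuracy, precision, recall, f1 = evaluate(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Here TP = 2, FP = 1, TN = 3, and FN = 2, so accuracy is 5/8, precision is 2/3, recall is 1/2, and F1 is 4/7.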
Applied Machine Learning Process
http://machinelearningmastery.com/process-for-working-through-machine-learning-problems/
Define the Problem
https://youmustdesireit.wordpress.com/2014/03/05/developing-and-nurturing-creative-problem-solving/
Prepare Data
http://vpnexpress.net/big-data-use-a-vpn-block-data-collection/
Spot Check Algorithms
https://www.flickr.com/photos/withassociates/4385364607/sizes/l/
Present Results
http://www.langevin.com/blog/2013/04/25/5-tips-for-projecting-confidence/presentation-skills-2/
http://newventurist.com/
Some Cautions
• Curse of dimensionality
• Correlation does NOT imply causation.
• Learn many models, not just ONE.
• More data beats a cleverer algorithm.
• Data alone are not enough.
A Few Useful Things to Know about Machine Learning, Pedro Domingos (2012)
Example of Feature Engineering
Width (m)  Length (m)  Cost (baht)
100        100         1,200,000
500        50          1,300,000
100        80          1,000,000
400        100         1,500,000
Are these features good for modeling the cost of the area?
Size (m²)  Cost (baht)
10,000     1,200,000
25,000     1,300,000
8,000      1,000,000
40,000     1,500,000
Engineer features.
They look better here.
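The engineered feature is just the product of the two raw columns. A minimal sketch that derives size from width and length for the four rows in the table; the variable names are illustrative:

```python
# Rows from the table: (width in m, length in m, cost in baht)
plots = [
    (100, 100, 1_200_000),
    (500, 50, 1_300_000),
    (100, 80, 1_000_000),
    (400, 100, 1_500_000),
]

# Engineered feature: size = width * length, in square metres
engineered = [(width * length, cost) for width, length, cost in plots]

# Sorted by size, the costs can be compared against a single feature.
for size, cost in sorted(engineered):
    print(f"{size:>7,} m^2 -> {cost:>9,} baht")
```

Sorted by size, cost rises with every row, a pattern that neither width nor length shows on its own in these four examples.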
Let’s get our hands dirty!
https://github.com/zkan/intro-to-machine-learning