MACHINE LEARNING FOR THE CURIOUS BUT SCARED
ELLEN KÖNIG
@ellen_koenig
SO, EXACTLY WHAT DOES IT MEAN WHEN A MACHINE “LEARNS”?
ONE WAY TO DEFINE LEARNING: LEARNING FROM EXPERIENCE
BEING ABLE TO DEAL WITH NEW SITUATIONS BASED ON THE PAST
OF HUMANS AND MACHINES
WHAT HAPPENS DURING LEARNING?
TRAINING DATA: Input data about the world
→ MACHINE LEARNING ALGORITHM: Processing by internal resources
→ MODEL FUNCTION (HYPOTHESIS): Learned representation
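The flow from training data through an algorithm to a model function can be sketched in a few lines of Python. This is only a toy illustration, not a method from the talk: the `learn` function, the nearest-neighbour strategy, and the slope/speed numbers are all assumptions made up for this example.

```python
from typing import Callable, List, Tuple

# One training example: (input, known output)
Example = Tuple[float, float]


def learn(training_data: List[Example]) -> Callable[[float], float]:
    """Toy 'learning algorithm': remember the examples and answer a new
    input with the output of the closest known input (1-nearest-neighbour)."""

    def model(x: float) -> float:
        # Find the training example whose input is closest to x
        _, nearest_output = min(training_data, key=lambda pair: abs(pair[0] - x))
        return nearest_output

    return model


# Hypothetical training data: (terrain slope in degrees, safe speed in km/h)
training_data = [(0.0, 50.0), (5.0, 40.0), (10.0, 25.0), (20.0, 10.0)]

# The algorithm turns training data into a model function (the hypothesis)
model = learn(training_data)

# The model can now be asked about a slope it has never seen
predicted_speed = model(8.0)
print(predicted_speed)
```

The point of the sketch is the shape of the process: data goes in, a function comes out, and that function handles new situations based on past examples.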
WHAT DOES THAT LOOK LIKE IN PRACTICE?
EXAMPLES
Example | Input data | Learned model
Self-driving cars | Terrain data (slope, roughness, etc.) | Function mapping terrain to speed
Price prediction engine | Customer & market attributes and past prices | Function mapping customer and market attributes to prices
Gene sequence identification | Lots and lots of genome data | Clusters of recurring gene sequence patterns
SO, EXACTLY HOW DOES A MACHINE LEARN?
COMPONENTS OF A COMPLETE MACHINE LEARNING SYSTEM
WHAT DOES A MACHINE NEED TO LEARN?
TRAINING DATA
TEST DATA
ML ALGORITHM
MODEL (HYPOTHESIS)
RESULT
FEEDBACK
TWO BASIC KINDS OF MACHINE LEARNING
SUPERVISED VS UNSUPERVISED LEARNING
Supervised
Rain | Wind | Umbrella?
heavy | light | yes
none | light | no
light | strong | no
light | light | yes
none | strong | no
Unsupervised
User tastes
User 1 likes The Clash
User 23 likes Die Ärzte
User 42 likes Helene Fischer
User 77 likes The Sex Pistols
User 99 likes Heino
LINEAR REGRESSION
A SIMPLE SUPERVISED LEARNING ALGORITHM
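As a rough illustration of linear regression with one input variable, the sketch below fits a straight line to labelled examples using the standard least-squares formulas. The function name `fit_line` and the flat-size/rent numbers are made up for this example and are not from the talk.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for a single input variable.
    Returns (slope, intercept) of the best-fitting straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labelled training data: flat size in square metres -> monthly rent
sizes = [30.0, 50.0, 70.0, 90.0]
rents = [400.0, 600.0, 800.0, 1000.0]

slope, intercept = fit_line(sizes, rents)


def predict(size: float) -> float:
    """The learned model: a function mapping size to rent."""
    return slope * size + intercept


print(slope, intercept, predict(60.0))
```

It is supervised because every training input comes with a known answer (the rent), and the algorithm looks for the line that misses those answers by as little as possible.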
K-MEANS
A SIMPLE UNSUPERVISED LEARNING ALGORITHM
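A minimal sketch of k-means on 2-D points might look like the following. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. Two simplifications are assumed here: the initial centroids are simply the first k points (implementations usually pick them randomly), and the loop runs for a fixed number of iterations. The points are invented for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(points: List[Point], k: int, iterations: int = 10) -> Tuple[List[Point], List[int]]:
    """Returns (centroids, assignments), where assignments[i] is the
    index of the cluster that points[i] belongs to."""
    # Simplification: start from the first k points instead of random ones
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Made-up, unlabelled data: two loose groups of points
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
centroids, assignments = k_means(points, k=2)
print(centroids, assignments)
```

It is unsupervised because no point carries a label. The groups emerge only from how close the points are to each other.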
SO, HOW CAN I GET STARTED IN TEACHING A MACHINE TO LEARN?
WHERE TO CONTINUE
RECOMMENDED RESOURCES FOR BEGINNERS (IN ORDER OF RECOMMENDATION)
▸ Tutorial for the “Kaggle Titanic Competition” (using R): http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r
▸ Online courses (MOOCs):
  ▸ Udacity: Intro to Machine Learning: https://www.udacity.com/course/intro-to-machine-learning--ud120 (Excellent intro to applied ML using scikit-learn and Python)
  ▸ Coursera: Machine Learning: https://www.coursera.org/learn/machine-learning (Friendly intro to the theory behind common ML algorithms)
▸ Book: Abu-Mostafa, Magdon-Ismail, Lin: Learning From Data - A Short Course (AMLbook.com) (Good intro to more academic perspectives, notation and vocabulary on ML)
▸ Toolkits:
  ▸ Scikit-Learn (Python, great online documentation): http://scikit-learn.org/stable/
  ▸ R stats package (many simple ML algorithms, pre-installed with R). Examples: http://www.statmethods.net/stats/regression.html
BONUS
A BASIC WORKFLOW FOR WORKING ON MACHINE LEARNING PROBLEMS
1. Understand the problem and context
2. Understand, clean and prepare the data
3. For supervised learning: Split into training and test data
4. Evaluate different algorithms with default parameters
5. Optimize the parameters and compute the results
6. Interpret and present the results
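Steps 3 and 4 of this workflow can be sketched in plain Python, without any toolkit. The sketch splits made-up data into training and test sets, trains two candidate algorithms on the training set only, and compares their mean squared error on the test set. All names (`split`, `train_mean`, `train_line`) and the data are assumptions for illustration. A toolkit such as scikit-learn provides ready-made equivalents.

```python
import random
from typing import Callable, Dict, List, Tuple

Example = Tuple[float, float]
Model = Callable[[float], float]


def split(data: List[Example], test_fraction: float = 0.25, seed: int = 0):
    """Step 3: shuffle, then hold back a fraction of the data for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def train_mean(train: List[Example]) -> Model:
    """Baseline: always predict the average output seen in training."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def train_line(train: List[Example]) -> Model:
    """Least-squares straight line through the training examples."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x, _ in train))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model: Model, examples: List[Example]) -> float:
    return sum((model(x) - y) ** 2 for x, y in examples) / len(examples)


# Made-up data: roughly y = 3x + 2, with a small fixed wobble
data = [(float(x), 3.0 * x + 2.0 + (0.5 if x % 2 else -0.5)) for x in range(20)]

train, test = split(data)

# Step 4: try several algorithms, judge each on data it has not seen
candidates: Dict[str, Callable[[List[Example]], Model]] = {
    "mean baseline": train_mean,
    "least-squares line": train_line,
}
scores = {name: mean_squared_error(algo(train), test)
          for name, algo in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Keeping the test data away from training is what makes the comparison honest: a model is only useful if it handles examples it has not seen before.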
LICENCE: CREATIVE COMMONS “ATTRIBUTION - SHARE ALIKE” 4.0 HTTPS://CREATIVECOMMONS.ORG/LICENSES/BY-SA/4.0/
IMAGE CREDITS
▸ Slide 2: http://www.thebluediamondgallery.com/highlighted/l/learning.html
▸ Slide 3: All https://pixabay.com/
▸ Slide 4: https://en.wikipedia.org/wiki/Consciousness#/media/File:Neural_Correlates_Of_Consciousness.jpg
▸ Slide 5: Based on https://commons.wikimedia.org/wiki/File:Machine_Learning_Technique..JPG
▸ Slide 9:
  ▸ https://commons.wikimedia.org/w/index.php?curid=11967659
  ▸ https://commons.wikimedia.org/wiki/File:Residuals_for_Linear_Regression_Fit.png
▸ Slide 10: https://commons.wikimedia.org/wiki/File:Kmeans_animation_withoutWatermark.gif