Python and Machine Learning
Presented by Xavier Arrufat
BCN Meetup - Python and AI Barcelona, September 25th, 2014
Jun 14, 2015
AI – Artificial Intelligence
“1. a branch of computer science dealing with the simulation of intelligent behavior in computers”
“2. the capability of a machine to imitate intelligent human behavior”
Merriam-Webster dictionary
Predictions on AI
• 2010: a supercomputer will have the computational capacity to emulate human intelligence
• 2020: this same capacity will be available for US$1000
• Mid-2020s: human brain scanning will contribute to an effective model of human intelligence
• 2029: these two elements will culminate in computers that can pass the Turing test
• Early 2030s: the amount of non-biological computation will exceed the "capacity of all living biological human intelligence"
• "I set the date for the Singularity—representing a profound and disruptive transformation in human capability—as 2045"
Ray Kurzweil, Director of Engineering, Google (see Wikipedia on his 2005 book "The Singularity Is Near")
Inventions: OCR, image scanners, text-to-speech synthesizer, (orchestra) synthesizer, voice recognition, reader for the blind, …
Source: Ray Kurzweil and Kurzweil Technologies, Inc.
Index
Turing Test
Python for AI Barcelona, September 25th, 2014
Blade Runner (Ridley Scott, 1982): Deckard and the Voight-Kampff machine in 2019. Inspired by Philip K. Dick's novel "Do Androids Dream of Electric Sheep?" (1968)
Passing the Turing Test: capabilities
(classic Turing Test)
Natural Language Processing - communication
Knowledge representation - knowledge storage (KS)
Automated reasoning - use KS to answer questions
Machine Learning - detect patterns, adapt
(total Turing Test)
Computer vision - perceive objects
Robotics - manipulate objects + move around
Source: "Artificial Intelligence: A Modern Approach" by Stuart Russell & Peter Norvig.
Agenda
• Basic Concepts: Machine Learning paradigm
• Machine Learning (ML) and Python
• What do I need to do ML in Python?
“Classical” decision making (explicit instructions)
Feature   Input
F0        0.8
F1        0.2
F2        0.9
F3        0.2
F4        0.0
F5        0.4
F6        0.3
F7        0.1
Output: “A”, “B” or “C”
Procedure:
    if F1 > 0.5 and F2 * F3 < 0.3:
        if (F4 - F5) / F6 < 1:
            do A
        else:
            if F7 * F0 < 0.3:
                do B
            else:
                do C
    else:
        do B
Requires ‘a priori’ knowledge
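The procedure on this slide can be written as a small Python function. This is only a sketch: the name `classical_decision` is hypothetical, the eight input values are assumed to map to F0 through F7 in order, and the nesting of the `else` branches is inferred from the slide text.

```python
def classical_decision(features):
    """Hand-written rules from the slide; returns "A", "B" or "C".

    `features` is a sequence of eight numbers, assumed to be F0..F7 in order.
    """
    f0, f1, f2, f3, f4, f5, f6, f7 = features
    if f1 > 0.5 and f2 * f3 < 0.3:
        # Raises ZeroDivisionError if F6 == 0; the slide does not cover that case.
        if (f4 - f5) / f6 < 1:
            return "A"
        if f7 * f0 < 0.3:
            return "B"
        return "C"
    return "B"


# Input vector shown on the slide.
example = [0.8, 0.2, 0.9, 0.2, 0.0, 0.4, 0.3, 0.1]
decision = classical_decision(example)  # F1 = 0.2 is not > 0.5, so the outer else applies
```

Every threshold and every branch has to be known in advance and typed in by hand, which is the 'a priori' knowledge the slide refers to.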
(some) ML methodologies
• Linear Regression
• Logistic Regression
• SVM: Support Vector Machines
• ANN: Artificial Neural Networks
• Anomaly Detection
• Nearest Neighbor
• Principal Component Analysis (PCA)
Supervised Learning
Unsupervised Learning
ML decision making
Feature   Input
F0        0.8
F1        0.2
F2        0.9
F3        0.2
F4        0.0
F5        0.4
F6        0.3
F7        0.1
Output: “A”, “B” or “C”
Procedure:
Output = MATRIX * Input (Linear Regression)
Output = g( M2 * f( M1 * Input ) ) (Neural Network with one hidden layer)
Requires no (or very little) ‘a priori’ knowledge
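Both formulas on this slide can be sketched with NumPy. Several things here are assumptions, not taken from the slides: the weights are random placeholders (a trained model would supply real ones), `f` is taken to be `tanh`, `g` a softmax, the hidden layer has 5 units, and the decision is the label with the highest score.

```python
import numpy as np

LABELS = ["A", "B", "C"]          # the three possible outputs on the slide
rng = np.random.default_rng(0)    # fixed seed so the sketch is repeatable

x = np.array([0.8, 0.2, 0.9, 0.2, 0.0, 0.4, 0.3, 0.1])  # Input, F0..F7

# Linear model: Output = MATRIX * Input (3 outputs, 8 inputs).
# Random placeholder weights, not trained values.
MATRIX = rng.normal(size=(3, 8))
linear_output = MATRIX @ x


def softmax(z):
    """Turn a score vector into positive values that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()


def one_hidden_layer(x, M1, M2, f=np.tanh, g=softmax):
    """Output = g(M2 * f(M1 * Input))."""
    return g(M2 @ f(M1 @ x))


M1 = rng.normal(size=(5, 8))   # input -> 5 hidden units (size chosen arbitrarily)
M2 = rng.normal(size=(3, 5))   # hidden units -> 3 outputs
nn_output = one_hidden_layer(x, M1, M2)

# Assumed decision rule: pick the label with the highest score.
linear_label = LABELS[int(np.argmax(linear_output))]
nn_label = LABELS[int(np.argmax(nn_output))]
```

No thresholds or branches are written by hand here. All the behaviour lives in the numbers inside the matrices, which is what training has to find.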
ML supervised learning: training cases
Case        Input                                       Target (label, expected output)
Case 0      [0.8, 0.2, 0.9, 0.2, 0.0, 0.4, 0.3, 0.1]    “C”
Case 1      [0.8, 0.2, 0.9, 0.1, 0.5, 0.6, 0.2, 0.9]    “A”
Case 2      [0.7, 0.1, 0.2, 0.8, 0.2, 0.1, 0.4, 0.0]    “B”
            […]                                         […]
Case 1000   [0.9, 0.4, 0.3, 0.3, 0.1, 0.4, 0.2, 0.2]    “A”
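A training set like this one is usually stored as two arrays: one row of inputs per case, and one row of encoded targets per case. The sketch below uses the four cases visible on the slide and leaves out the elided ones. The pairing of targets with cases is an assumption (the slide layout does not make it explicit), and `one_hot` is a hypothetical helper name.

```python
import numpy as np

LABELS = ["A", "B", "C"]

# Inputs copied from the slide, one row per training case.
X = np.array([
    [0.8, 0.2, 0.9, 0.2, 0.0, 0.4, 0.3, 0.1],  # Case 0
    [0.8, 0.2, 0.9, 0.1, 0.5, 0.6, 0.2, 0.9],  # Case 1
    [0.7, 0.1, 0.2, 0.8, 0.2, 0.1, 0.4, 0.0],  # Case 2
    [0.9, 0.4, 0.3, 0.3, 0.1, 0.4, 0.2, 0.2],  # Case 1000
])

# Assumed pairing of targets with cases, in the order the labels appear.
targets = ["C", "A", "B", "A"]


def one_hot(labels, classes):
    """Encode each label as a row with a single 1 in the column of its class."""
    encoded = np.zeros((len(labels), len(classes)))
    for row, label in enumerate(labels):
        encoded[row, classes.index(label)] = 1.0
    return encoded


T = one_hot(targets, LABELS)   # shape (4, 3): one row per case, one column per label
```

With the targets encoded as numbers, "Target - Output" becomes an ordinary array subtraction.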
ML training (when no exact solution available)
Initialization
Generate MATRIX_0 (≠ 0, stochastically generated)
Output_0 = MATRIX_0 * Input
Error_0 = Target - Output_0 => MATRIX_1 = MATRIX_0 + f(Error_0, MATRIX_0, …)
(error) Minimization
Iterate until Error_i < tolerance:
Output_i = MATRIX_i * Input
Error_i = Target - Output_i => MATRIX_{i+1} = MATRIX_i + f(Error_i, MATRIX_i, …)
=> Intensive number crunching may be required
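The initialization and minimization steps can be sketched as a short NumPy loop for the linear case. The slide leaves the update function f unspecified; this sketch assumes a plain gradient-descent step on the squared error. The name `train_linear`, the learning rate, the tolerance and the synthetic data are all illustrative choices, not values from the talk.

```python
import numpy as np


def train_linear(X, T, learning_rate=0.3, tolerance=1e-6, max_iter=20000, seed=0):
    """Iteratively fit a matrix so that X @ matrix.T approximates T.

    X: (n_cases, n_features) inputs; T: (n_cases, n_outputs) targets.
    Returns the fitted matrix and the number of iterations used.
    """
    rng = np.random.default_rng(seed)
    n_cases = X.shape[0]
    # Initialization: small random, nonzero starting matrix (MATRIX_0).
    matrix = rng.normal(scale=0.1, size=(T.shape[1], X.shape[1]))
    for i in range(max_iter):
        output = X @ matrix.T            # Output_i = MATRIX_i * Input, for all cases at once
        error = T - output               # Error_i = Target - Output_i
        if np.abs(error).max() < tolerance:
            return matrix, i
        # MATRIX_{i+1} = MATRIX_i + f(Error_i, ...): assumed gradient-descent step.
        matrix = matrix + learning_rate * (error.T @ X) / n_cases
    return matrix, max_iter


# Synthetic data (not from the slides): targets generated by a known matrix,
# so the loop has something exact to converge to.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 8))
true_matrix = rng.normal(size=(3, 8))
T = X @ true_matrix.T

fitted, iterations = train_linear(X, T)
```

Each pass multiplies all training cases by the current matrix, so the cost grows with the number of cases, features and iterations. That is where the number crunching comes from.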
ML and Python [can I do (fast) ML with Python?]
YES
… and I’d recommend it in most cases
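One common reason Python can be fast enough for ML is that the inner loops run inside compiled NumPy routines and not in the interpreter. The sketch below computes the same matrix-vector product both ways and times them with `timeit`. The function names and the 200 x 200 size are arbitrary, and the timings depend on the machine, so no figures are quoted here.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
matrix = rng.normal(size=(200, 200))
vector = rng.normal(size=200)


def matvec_loops(m, v):
    """Matrix-vector product with explicit Python loops."""
    result = [0.0] * len(m)
    for i, row in enumerate(m):
        total = 0.0
        for a, b in zip(row, v):
            total += a * b
        result[i] = total
    return result


def matvec_numpy(m, v):
    """Same product, delegated to NumPy's compiled routines."""
    return m @ v


loop_seconds = timeit.timeit(lambda: matvec_loops(matrix, vector), number=5)
numpy_seconds = timeit.timeit(lambda: matvec_numpy(matrix, vector), number=5)
print(f"loops: {loop_seconds:.4f} s, numpy: {numpy_seconds:.4f} s")
```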
Python tools for ML (just a personal recommendation for starters)
• Take Andrew Ng’s course on Coursera (uses Matlab/Octave)
• Learn some basic linear algebra
• Use numpy
• Libraries: scikit-learn, Theano, pandas
• Use gnumpy for GPU number crunching (dependency: cudamat): https://twitter.com/xavier_arrufat/status/299810134627086336 (Feb 2013)
• Play, play, play
Your brain is a more powerful pattern detector than you think!
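As a first thing to play with, basic linear algebra and numpy already go a long way. When a linear model does have a direct least-squares solution, `numpy.linalg.lstsq` finds it in one call with no training loop. The data below is synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): 50 cases, 8 features, 3 outputs.
X = rng.uniform(size=(50, 8))
true_matrix = rng.normal(size=(3, 8))
T = X @ true_matrix.T

# Least-squares solution of X @ W = T; W has shape (8, 3),
# i.e. the transpose of the (3, 8) matrix that maps one input to one output.
W, residuals, rank, singular_values = np.linalg.lstsq(X, T, rcond=None)
recovered = W.T
```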
Some additional (inspiring) info on ML:
http://yann.lecun.com/exdb/lenet/
http://www.iro.umontreal.ca/~bengioy/yoshua_en/index.html
http://www.cs.toronto.edu/~hinton/: check the video links, especially:
http://www.youtube.com/watch?v=DleXA5ADG78
http://www.youtube.com/watch?v=AyzOUbkUf3M
http://www.youtube.com/watch?v=VdIURAu1-aU
Q & A