Artificial Intelligence [ECS 801] Presentation
Subject Professor: Dr Y.N.Singh
Topic: Introduction to Machine Learning
Presented by: Akshay Kanchan (1205210006), Mohd Iqbal (1305210903)
Institute of Engineering and Technology Lucknow
In Artificial Intelligence, an intelligent machine should be able to:
1. Think and act rationally
2. Store and retrieve knowledge
3. Adapt and learn in new environments and with new data (Machine Learning)
What is machine learning?
“Field of study that gives computers the ability to learn without being explicitly programmed.”
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
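The contrast between the two diagrams can be sketched in a few lines of Python. This is only an illustration on a made-up task (flagging a temperature as "hot"); the function names, the data and the 30-degree rule are all assumptions, not part of the original slides.

```python
# Traditional programming: a human writes the program (the rule) by hand.
def is_hot_rule(temp_c):
    return temp_c >= 30  # threshold picked by the programmer (hypothetical)

# Machine learning: data and desired outputs go in, and the rule comes out.
def learn_threshold(temps, labels):
    """Pick the threshold that misclassifies the fewest training examples."""
    best_t, best_errors = None, len(temps) + 1
    for t in sorted(temps):
        errors = sum((x >= t) != y for x, y in zip(temps, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

temps = [12, 18, 25, 29, 31, 35, 40]                       # made-up data
labels = [False, False, False, False, True, True, True]    # made-up outputs
threshold = learn_threshold(temps, labels)

def is_hot_learned(temp_c):
    """The 'program' produced by learning instead of by hand."""
    return temp_c >= threshold
```

In the first function the rule is an input supplied by a person; in the second it is the output of the learning step.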
Where is Machine Learning being Used
- autonomous, self-driving cars
- predicting election results
- developing pharmaceutical drugs (combinatorial chemistry)
- predicting tastes in music (Pandora)
- predicting tastes in movies/shows (Netflix)
- search engines (Google)
- predicting interests (Facebook)
- predicting other books you might like (Amazon)
ML in our daily lives
More Places where ML is being used
Brief History
• 1950: Alan Turing creates the “Turing Test” to determine whether a computer has real intelligence.
• 1952: Arthur Samuel writes the first computer learning program, which plays the game of checkers.
• 1957: Frank Rosenblatt designs the first neural network for computers.
• 1967: The “nearest neighbour” algorithm is written, allowing computers to begin using very basic pattern recognition.
• 1979: Students at Stanford University invent the “Stanford Cart”, which can navigate obstacles in a room on its own.
• 1990s: Work on machine learning shifts from a knowledge-driven approach to a data-driven approach. Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions, or “learn”, from the results.
• 2000: Honda introduces ASIMO, a humanoid robot it designed and developed.
• 2016: Google's program AlphaGo beats a professional world Go champion by 4 games to 1.
Formal Definition
In Machine Learning, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
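To make E, T and P concrete, here is a small sketch on a made-up task. The learner, the example values and the choice of accuracy as P are all assumptions chosen for illustration.

```python
def learn_threshold(examples):
    """Place the decision threshold midway between the two classes seen so far."""
    highest_negative = max(x for x, label in examples if not label)
    lowest_positive = min(x for x, label in examples if label)
    return (highest_negative + lowest_positive) / 2

def accuracy(threshold, test_set):
    """P: fraction of test inputs classified correctly."""
    return sum((x >= threshold) == label for x, label in test_set) / len(test_set)

# T: decide whether a number is at least 50 (the learner is never told this rule).
test_set = [(x, x >= 50) for x in range(100)]

# E: labelled examples, revealed a few at a time (made-up values).
experience = [(5, False), (60, True), (20, False), (55, True),
              (40, False), (52, True), (48, False)]

# Performance P after seeing the first 2, 4, 6 and 7 examples.
scores = [accuracy(learn_threshold(experience[:n]), test_set) for n in (2, 4, 6, 7)]
```

With this data the accuracy never decreases as more examples are seen, which is what "improves with experience E" means in the definition. On noisier data the improvement would only hold on average.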
Why is Machine Learning Important?
• Some tasks cannot be defined well, except by examples (e.g., recognizing people).
• Relationships and correlations can be hidden within large amounts of data. Machine Learning may be able to find these relationships.
Areas of Influence for Machine Learning
• Statistics: How best to use samples drawn from unknown probability distributions to help decide from which distribution some new sample is drawn.
• Psychology: How to model human performance on various learning tasks.
• Economics: How to write algorithms that maximize profits.
• Evolutionary Models: How to model certain aspects of biological evolution to improve the performance of computer programs.
Steps involved in Learning:
• Prepare Data: remove noise, smoothing, feature extraction, dimensionality reduction.
• Choose an Algorithm: linear, non-linear, complexity, speed, accuracy.
• Train a Model: prevent overfitting and underfitting.
• Test the Model
• Use for Prediction
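The five steps can be sketched end to end in plain Python. This is a minimal illustration, not a recommended pipeline: the measurements are made up, and the "algorithm" is a nearest-centroid classifier chosen only because it fits in a few lines.

```python
def min_max_scale(rows, lo, hi):
    """Rescale every feature to [0, 1] using the training minimum and maximum."""
    return [[(v - l) / (h - l) for v, l, h in zip(row, lo, hi)] for row in rows]

def train_centroids(X, y):
    """'Training' here just averages the feature vectors of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest to `row`."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )

# 1. Prepare data: made-up [height_cm, weight_kg] measurements, split and scaled.
data = [([150, 50], "small"), ([155, 55], "small"), ([160, 58], "small"),
        ([180, 85], "large"), ([185, 90], "large"), ([190, 95], "large"),
        ([152, 52], "small"), ([188, 92], "large")]
train, test = data[:6], data[6:]
X_train = [x for x, _ in train]
y_train = [y for _, y in train]
lo = [min(col) for col in zip(*X_train)]
hi = [max(col) for col in zip(*X_train)]

# 2-3. Choose an algorithm (nearest centroid) and train the model.
model = train_centroids(min_max_scale(X_train, lo, hi), y_train)

# 4. Test the model on data it has not seen during training.
X_test = min_max_scale([x for x, _ in test], lo, hi)
y_test = [y for _, y in test]
accuracy = sum(predict(model, x) == y for x, y in zip(X_test, y_test)) / len(y_test)

# 5. Use the model for prediction on a new measurement.
prediction = predict(model, min_max_scale([[158, 57]], lo, hi)[0])
```

Note that the scaling limits are computed from the training data only and then reused for the test data, so no information leaks from the test set into training.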
Learning: Training and Test Data
Learning Example:
Training: Training Images → Image Features → Training (with Training Labels) → Learned model
Testing: Test Image → Image Features → Learned model → Prediction
Reinforcement Learning
Supervised learning
The correct classes of the training data are known.
Supervised Learning Algorithms
1. Naïve Bayes
2. k-Nearest Neighbours
3. Support Vector Machine
4. Decision Tree
5. Neural Network
6. Bayesian Network
7. Random Forest
etc.
K-nearest neighbor
[Figure: training points of two classes (x and o) plotted against features x1 and x2, with new points marked +]
The principle behind nearest neighbour methods is to find a predefined number of training samples closest in distance to the new point, and predict the label from these.
1-nearest neighbor
[Figure: the same scatter plot, illustrating classification with k = 1]
3-nearest neighbor
[Figure: the same scatter plot, illustrating classification with k = 3]
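The nearest-neighbour principle can be sketched as follows. The points, labels and query are made up to resemble the x/o scatter plots, and majority voting with Euclidean distance is one common choice rather than the only one.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up training data in the (x1, x2) plane.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.1, 3.0),
          (5.0, 5.0), (5.5, 4.5), (6.0, 5.5)]
labels = ["x", "x", "x", "o", "o", "o", "o"]
query = (2.8, 2.6)  # the new point, marked + in the plots

label_k1 = knn_predict(points, labels, query, k=1)
label_k3 = knn_predict(points, labels, query, k=3)
```

Here the single nearest neighbour is an "o", but two of the three nearest neighbours are "x", so the predicted label changes with k. This is why k is usually tuned on held-out data.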
Naïve Bayes
• Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the class.
• Uses a probabilistic approach to assign a label to data.
• Based on Bayes' rule: posterior = (likelihood × prior) / evidence.
• It combines the prior probability of each class with the likelihood of the observed features, normalized by the evidence, and predicts the class with the highest posterior probability.
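A minimal sketch of the calculation for categorical features is shown below. The weather data is made up, and the sketch omits smoothing, so a feature value never seen with a class gives that class probability zero.

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count how often each class and each (class, feature, value) occurs."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts

def posterior(model, features):
    """Return P(class | features) for every class, via Bayes' rule."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            # likelihood P(feature_i = value | class), treated as independent
            score *= feature_counts[(label, i)][value] / count
        scores[label] = score
    evidence = sum(scores.values())  # assumes at least one class has a non-zero score
    return {label: s / evidence for label, s in scores.items()}

# Made-up data: (outlook, windy) -> decision.
samples = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
           ("rainy", "no"), ("sunny", "no"), ("rainy", "yes")]
labels = ["play", "play", "stay", "play", "play", "stay"]

model = train_naive_bayes(samples, labels)
probs = posterior(model, ("rainy", "yes"))
predicted = max(probs, key=probs.get)
```

For the query ("rainy", "yes"), the prior favours "play" (4 of 6 samples), but the likelihoods outweigh it and the posterior favours "stay".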
Support Vector Machine
Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and outlier detection.
• Effective in high dimensional spaces.
• Still effective in cases where the number of dimensions is greater than the number of samples.
• Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
• Versatile: different kernel functions can be specified for the decision function. Common kernels are provided, but it is also possible to specify custom kernels.
SVM (cont’d)
• SVMs try to maximize the margin of the separating hyperplane.
• SVMs use kernel functions that take a low-dimensional input space and map it to a higher-dimensional space, e.g. (X, Y) → kernel → (X1, X2, X3).
• An SVM is configured through parameters such as the kernel, gamma and C.
[Figure: Kernel function]
SVM (cont’d)
1. Kernel: can be linear or non-linear (for example polynomial or RBF).
2. Gamma: describes how far the influence of a single training example reaches. For low gamma values the influence reaches far, and for high gamma values it reaches only close points.
3. C parameter: controls whether the decision boundary is smooth or follows the training points closely. It is a trade-off between bias and variance.
   Low C value: smooth decision boundary.
   High C value: more complex boundary that classifies more training points correctly.
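Two of these ideas can be checked numerically with a short sketch. It is not an SVM implementation: it only evaluates kernel functions. The degree-2 polynomial kernel and its explicit 3-D feature map are a standard textbook pairing used here as an assumed example of "(X, Y) to (X1, X2, X3)", and the gamma values are arbitrary.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel: K(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

def poly_kernel(a, b):
    """Degree-2 polynomial kernel on 2-D inputs: (a . b)^2."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2

def feature_map(p):
    """Explicit map (X, Y) -> (X1, X2, X3) whose dot product equals poly_kernel."""
    x, y = p
    return (x * x, math.sqrt(2) * x * y, y * y)

# The kernel gives the 3-D dot product without ever building the 3-D vectors.
a, b = (1.0, 2.0), (3.0, 4.0)
mapped_dot = sum(u * v for u, v in zip(feature_map(a), feature_map(b)))
kernel_value = poly_kernel(a, b)

# Effect of gamma: similarity between a training example and a point 3 units away.
low_gamma_similarity = rbf_kernel((0.0, 0.0), (3.0, 0.0), gamma=0.01)
high_gamma_similarity = rbf_kernel((0.0, 0.0), (3.0, 0.0), gamma=10.0)
```

With low gamma the distant point still looks similar to the training example (influence reaches far). With high gamma the similarity is practically zero, so only very close points are affected.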
Decision Tree
• Decision Trees (DTs) are a non-parametric supervised learning method. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
• Uses a white box model. If a given situation is observable in a model, the explanation for the condition is easily expressed in Boolean logic.
• The problem of learning an optimal decision tree is known to be NP-complete, so practical algorithms are greedy: they make a locally optimal decision at each node.
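The "locally optimal decision at each node" can be sketched as a search for the single best split. Gini impurity is assumed as the split criterion (one common choice), and the data is made up. A full tree would apply the same search recursively to each side of the split.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature index, threshold, weighted Gini) of the best rule 'x[f] <= t'."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Made-up data: feature 0 separates the classes, feature 1 does not.
rows = [(2.0, 7.0), (3.0, 1.0), (4.0, 6.0), (6.0, 2.0), (7.0, 8.0), (8.0, 3.0)]
labels = ["a", "a", "a", "b", "b", "b"]

split = best_split(rows, labels)
```

The resulting rule reads as plain Boolean logic ("feature 0 <= 4.0 means class a"), which is the white box property mentioned above.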
Regression
• Regression analysis includes many techniques for modelling and analysing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors').
• It is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships.
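As a small worked example, simple linear regression with one independent variable can be fitted by ordinary least squares. The house sizes and prices below are made up and lie exactly on a line to keep the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: house size in square metres -> price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 200, 250, 300]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept  # prediction for a 100 square metre house
```

The slope describes the form of the relationship: how much the dependent variable (price) changes per unit of the independent variable (size).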
Classification vs Regression
• Classification means assigning the output to a class.
• Example: predicting the type of a tumor, i.e. harmful or not harmful, from training data.
• If the output is a discrete/categorical variable, it is a classification problem.
• Regression means predicting a numeric output value from training data.
• Example: predicting a house price from training data.
• If the output is a real number/continuous, it is a regression problem.
Unsupervised learning
The correct classes of the training data are not known.
Clustering
• Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).
K-means clustering
• The algorithm clusters data by trying to separate samples into n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares.
• This algorithm requires the number of clusters to be specified.
• It scales well to large numbers of samples and has been used across a large range of application areas in many different fields.
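A minimal sketch of the usual iterative procedure (often called Lloyd's algorithm) is given below. The points and the starting centroids are made up, and the starting centroids are fixed by hand so the result is reproducible. Practical implementations normally choose them randomly or with a seeding heuristic.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    # Inertia: sum of squared distances from each point to its nearest centroid.
    inertia = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)
    return centroids, inertia

# Made-up data with two obvious groups; the number of clusters (2) is specified
# through the number of starting centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
final_centroids, inertia = kmeans(points, [(1.0, 1.0), (9.0, 8.0)])
```

Each iteration can only keep or lower the inertia, which is why the procedure settles, although it may settle in a local minimum that depends on the starting centroids.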
[Figure: K-means clustering example]
Reinforcement learning
The algorithm is presented with a state that depends on the input data and chooses an action; the user then rewards or punishes the algorithm for that action. This continues over time, so the algorithm learns which actions lead to reward.
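The reward-and-punish loop can be sketched with a very small agent. This is a simplified, deterministic illustration with a single state: the action names and the simulated user are hypothetical, and real reinforcement learning methods also handle multiple states and randomized exploration.

```python
def run_agent(reward_fn, actions, steps=20, learning_rate=0.5):
    """Try each action once, then keep choosing the action with the best estimate."""
    values = {a: 0.0 for a in actions}
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                  # explore: try every action once
        else:
            action = max(values, key=values.get)    # exploit the best estimate so far
        reward = reward_fn(action)
        # Move the estimate for this action towards the reward just received.
        values[action] += learning_rate * (reward - values[action])
    return values

def user_feedback(action):
    """Hypothetical user: rewards jazz recommendations, punishes everything else."""
    return 1.0 if action == "recommend_jazz" else -1.0

actions = ["recommend_rock", "recommend_jazz", "recommend_pop"]
values = run_agent(user_feedback, actions)
best_action = max(values, key=values.get)
```

After a punishment the estimate for an action drops and it is not chosen again, while the rewarded action's estimate climbs towards 1.0 as the feedback repeats over time.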
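Markov model
• A Markov model describes a system whose next state depends only on its current state; Markov decision processes are the usual formal setting for reinforcement learning. In a hidden Markov model (HMM) the states are not observed directly, only the data they emit.
• There are three fundamental problems for HMMs:
1. Given the model parameters and observed data, estimate the optimal sequence of hidden states.
2. Given the model parameters and observed data, calculate the likelihood of the data.
3. Given just the observed data, estimate the model parameters.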
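The first two problems have compact dynamic-programming solutions, sketched below: the Viterbi algorithm for problem 1 and the forward algorithm for problem 2. The two-state weather model and all its probabilities are made-up illustration values. Problem 3 is usually handled with an iterative method such as Baum-Welch and is not shown.

```python
def forward_likelihood(obs, states, start, trans, emit):
    """Problem 2: probability of the observations, summed over all hidden paths."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())

def viterbi(obs, states, start, trans, emit):
    """Problem 1: most probable hidden state sequence and its probability."""
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Best way to arrive in state s from any previous state p.
            prob, path = max(
                ((best[p][0] * trans[p][s], best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

# Made-up model: hidden weather, observed activity.
states = ("Rainy", "Sunny")
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
observations = ["walk", "shop", "clean"]

likelihood = forward_likelihood(observations, states, start, trans, emit)
best_prob, best_path = viterbi(observations, states, start, trans, emit)
```

The two functions differ only in replacing the sum over previous states with a maximum, so the likelihood of the data is always at least as large as the probability of the single best path.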
[Figure: HMM example for 2 classes; diagram labels 1 to 5]
References
1. All definitions and explanations: http://scikit-learn.org/
2. Machine Learning History: http://www.forbes.com/
3. Images: Online lectures of CMU Prof Sebastian Thrun.
Conclusion
Across many fields, the latest technologies are being replaced by smart machines: stock markets, e-commerce, personalized customer experience, and so on.
In the future, maybe presentations will be prepared and given by robots!
Thank you