CS 60050 Machine Learning
Classification: Logistic Regression
Some slides taken from course materials of Andrew Ng
Classification
Email: Spam / Not Spam?
Online Transactions: Fraudulent / Genuine?
Tumor: Malignant / Benign?
y ∈ {0, 1}
0: "Negative Class" (e.g., benign tumor)
1: "Positive Class" (e.g., malignant tumor)
(Figure: Malignant? (Yes) 1 / (No) 0 plotted against Tumor Size, with a fitted straight line)
Can we solve the problem using linear regression? E.g., fit a straight line h_θ(x) = θ^T x and define a threshold at 0.5.
Threshold classifier output h_θ(x) at 0.5:
If h_θ(x) ≥ 0.5, predict "y = 1"
If h_θ(x) < 0.5, predict "y = 0"
(Same figure, with one additional point added far along the Tumor Size axis: the refitted line shifts, and the 0.5 threshold now misclassifies some points)
Failure due to adding a new point: the fixed 0.5 threshold on a fitted straight line is not robust.
Classification: y = 0 or 1
With linear regression, h_θ(x) can be > 1 or < 0. This is another drawback of using linear regression for this problem.
What we need: 0 ≤ h_θ(x) ≤ 1
Logistic Regression: 0 ≤ h_θ(x) ≤ 1
Logistic Regression Model
Want 0 ≤ h_θ(x) ≤ 1
h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(-z))
g(z) is called the sigmoid function, or logistic function.
A useful property: it is easy to compute the derivative at any point, since g'(z) = g(z)(1 - g(z)).
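The sigmoid and its derivative property can be sketched in a few lines (this code is not from the slides, just a minimal illustration):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    """Uses the property g'(z) = g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)

print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25
```

Note that g(0) = 0.5, so thresholding h_θ(x) at 0.5 is the same as thresholding θ^T x at 0.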
Interpretation of Hypothesis Output
h_θ(x) = estimated probability that y = 1 on input x
Example: If x = [x0; x1] = [1; tumorSize] and h_θ(x) = 0.7, tell the patient that there is a 70% chance of the tumor being malignant.
h_θ(x) = P(y = 1 | x; θ), the "probability that y = 1, given x, parameterized by θ"
Separating two classes of points
• We are attempting to separate two given sets/classes of points
• Separate two regions of the feature space
• Concept of Decision Boundary
• Finding a good decision boundary => learn appropriate values for the parameters Θ
Decision Boundary
(Figure: x1 vs. x2 plane with axis ticks at 1, 2, 3; two classes of points separated by a straight line)
h_θ(x) = g(θ0 + θ1 x1 + θ2 x2)
With, e.g., θ0 = -3, θ1 = 1, θ2 = 1:
Predict "y = 1" if -3 + x1 + x2 ≥ 0, i.e., if x1 + x2 ≥ 3
How to get the parameter values — will be discussed soon
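A linear decision boundary of this kind can be sketched as follows (the parameter values θ = [-3, 1, 1] are one illustrative choice, not learned values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative parameters: predict y = 1 when -3 + x1 + x2 >= 0,
# i.e., on or above the line x1 + x2 = 3.
theta = np.array([-3.0, 1.0, 1.0])

def predict(x1, x2):
    x = np.array([1.0, x1, x2])  # prepend the intercept term x0 = 1
    return 1 if sigmoid(theta @ x) >= 0.5 else 0

print(predict(1.0, 1.0))  # 0 : below the line x1 + x2 = 3
print(predict(3.0, 3.0))  # 1 : above the line
```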
Non-linear decision boundaries
(Figure: x1 vs. x2 plane with axis ticks at -1 and 1; one class inside and one outside a circle)
We can learn more complex decision boundaries where the hypothesis function contains higher order terms (remember polynomial regression):
h_θ(x) = g(θ0 + θ1 x1 + θ2 x2 + θ3 x1^2 + θ4 x2^2)
With, e.g., θ0 = -1, θ1 = 0, θ2 = 0, θ3 = 1, θ4 = 1:
Predict "y = 1" if -1 + x1^2 + x2^2 ≥ 0, i.e., if x1^2 + x2^2 ≥ 1 (outside the unit circle)
How to get the parameter values — will be discussed soon
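The non-linear boundary works the same way, just with higher-order features appended (again, θ = [-1, 0, 0, 1, 1] is an illustrative choice, not a learned value):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative parameters for h(x) = g(t0 + t1*x1 + t2*x2 + t3*x1^2 + t4*x2^2):
# theta = [-1, 0, 0, 1, 1] gives the unit circle x1^2 + x2^2 = 1 as the boundary.
theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])

def predict(x1, x2):
    features = np.array([1.0, x1, x2, x1**2, x2**2])  # higher-order terms
    return 1 if sigmoid(theta @ features) >= 0.5 else 0

print(predict(0.0, 0.0))  # 0 : inside the unit circle
print(predict(2.0, 0.0))  # 1 : outside the unit circle
```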
Cost function
Linear regression used the squared error cost function:
J(θ) = (1/m) Σ_i (1/2) (h_θ(x^(i)) - y^(i))^2
However, this cost function is non-convex for the hypothesis of logistic regression, so gradient descent is not guaranteed to reach the global minimum.
Logistic regression instead uses a cost that is convex:
Cost(h_θ(x), y) = -log(h_θ(x)) if y = 1; -log(1 - h_θ(x)) if y = 0
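The two cases combine into the single cross-entropy expression J(θ) = -(1/m) Σ_i [ y^(i) log h_θ(x^(i)) + (1 - y^(i)) log(1 - h_θ(x^(i))) ], which can be sketched as (a minimal illustration, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost J(theta) for logistic regression.
    X: (m, n) design matrix with an intercept column; y: (m,) labels in {0, 1}."""
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Sanity check: theta = 0 gives h = 0.5 everywhere, so J = -log(0.5) ~ 0.693
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
print(round(cost(np.zeros(2), X, y), 3))  # 0.693
```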
Gradient Descent
J(θ) = -(1/m) Σ_i [ y^(i) log h_θ(x^(i)) + (1 - y^(i)) log(1 - h_θ(x^(i))) ]
Want: min_θ J(θ)
Repeat {
  θ_j := θ_j - α (1/m) Σ_i (h_θ(x^(i)) - y^(i)) x_j^(i)
} (simultaneously update all θ_j)
The algorithm looks identical to linear regression, but the hypothesis function h_θ is different for logistic regression.
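The update rule above can be sketched as batch gradient descent; the toy data, learning rate, and iteration count below are illustrative choices, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent for logistic regression.
    All theta_j are updated simultaneously via one vectorized step."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = (X.T @ (sigmoid(X @ theta) - y)) / m
        theta = theta - alpha * grad  # simultaneous update of all theta_j
    return theta

# Toy 1-D data: points with the feature >= 3 are labeled 1 (hypothetical example).
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(int)
print(preds)  # [0 0 1 1]
```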
Output
Thus we can use gradient descent to learn the parameter values, and hence compute h_θ(x) for a new input x.
To make a prediction given a new x:
Output h_θ(x) = 1 / (1 + e^(-θ^T x)) = estimated probability that y = 1 on input x
How to use the estimated probability?
• Refraining from classifying unless confident
• Ranking items
• Multi-class classification
Multiclass classification
News article tagging: Politics, Sports, Movies, Religion, …
Medical diagnosis: Not ill, Cold, Flu, Fever
Weather: Sunny, Cloudy, Rain, Snow
One-vs-all
Train a logistic regression classifier h_θ^(i)(x) for each class i to predict the probability that y = i.
On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
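The one-vs-all scheme can be sketched as follows: one binary classifier per class (relabeling that class as 1 and the rest as 0), then predicting the class with the highest probability. The training routine and toy data are illustrative, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_binary(X, y, alpha=0.1, iters=3000):
    """Gradient descent for one binary logistic regression classifier."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * (X.T @ (sigmoid(X @ theta) - y)) / len(y)
    return theta

def one_vs_all(X, y, classes):
    # One classifier per class: its label is 1 for that class, 0 for all others.
    return {c: train_binary(X, (y == c).astype(float)) for c in classes}

def predict(models, x):
    # Pick the class whose classifier outputs the highest probability.
    return max(models, key=lambda c: sigmoid(models[c] @ x))

# Toy 1-D data with three classes (hypothetical):
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 4.0], [1.0, 5.0], [1.0, 8.0], [1.0, 9.0]])
y = np.array([0, 0, 1, 1, 2, 2])
models = one_vs_all(X, y, classes=[0, 1, 2])
print(predict(models, np.array([1.0, 0.5])))  # 0
print(predict(models, np.array([1.0, 8.5])))  # 2
```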