Recitation - Penn Engineeringcis520/lectures/5_recitation.pdf · 2020. 10. 7. · Recitation Lyle Ungar Computer and information Science Learning Objectives Generalized linear models

Recitation
Lyle Ungar

Computer and Information Science

Learning Objectives
- Generalized linear models and RBFs
- Loss functions for non-parametric methods
- Selection of loss functions
- Selection of regression penalties

Non-parametric loss function
- When doing k-nn with y a real number, what is the loss function L(y, ŷ) being minimized?
- When doing decision trees with y a Boolean, what is the loss function being minimized?

Breakout

Non-parametric loss function
- When doing k-nn with y a real number, what is the loss function L(y, ŷ) being minimized?
  - K-nn doesn’t really have a loss function that it is minimizing. It is just an algorithm.
  - There is no learning/optimization, so no gradient descent.
- When doing decision trees with y a Boolean, what is the loss function being minimized?
  - The conditional entropy of y given the features x.
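To make the decision-tree answer concrete, here is a minimal sketch of the conditional entropy H(y | x) for a Boolean label and a single candidate split feature. The toy data and the function names are assumptions made for illustration; they are not from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(feature, labels):
    """H(y | x): entropy of y within each feature value, weighted by how often that value occurs."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

# Hypothetical toy data: a Boolean label and two candidate Boolean features.
y = [True, True, True, False, False, False]
informative = [1, 1, 1, 0, 0, 0]  # separates the two classes perfectly
useless = [1, 0, 1, 0, 1, 0]      # leaves both branches mixed

print("H(y)               =", entropy(y))
print("H(y | informative) =", conditional_entropy(informative, y))
print("H(y | useless)     =", conditional_entropy(useless, y))
```

A greedy tree learner would prefer the split with the lower conditional entropy, which here is the `informative` feature.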

Model complexity
- Increasing k in K-nn yields a better-fitting, more complex model.

False; it gives a simpler model.
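A small sketch of why a larger k gives a simpler model, assuming 1-D inputs and plain averaging of the neighbors' targets (the data and the `knn_predict` helper are made up for illustration). With k = 1 the predictions reproduce the training targets exactly; with k equal to the number of training points every prediction collapses to the global mean.

```python
def knn_predict(x_query, xs, ys, k):
    """Predict y at x_query as the mean of the k nearest training targets (1-D inputs)."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x_query))[:k]
    return sum(ys[i] for i in nearest) / k

# Hypothetical training set: y = x^2 on a small grid.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

for k in (1, 3, len(xs)):
    preds = [knn_predict(x, xs, ys, k) for x in xs]
    print(f"k={k}: {preds}")
```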

Which model to use?
$y = x^\top w$

Predict income based on age, sex, and the country you were born in.
What exactly are x and y?

y: income
x: age, sex, and a “one-hot” vector indicating birth country
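The feature vector described above could be built roughly as follows. This is only a sketch: the country list, the binary coding of sex, and the weights are all invented for illustration and are not part of the slides.

```python
def encode(age, sex, country, countries):
    """Build x = [age, sex indicator, one-hot birth country]."""
    one_hot = [1.0 if country == c else 0.0 for c in countries]
    # Assumption: sex is coded as a single 0/1 indicator.
    return [float(age), 1.0 if sex == "F" else 0.0] + one_hot

countries = ["Canada", "India", "Mexico"]        # hypothetical vocabulary
x = encode(34, "F", "India", countries)

w = [500.0, 1000.0, 30000.0, 20000.0, 25000.0]   # made-up weights
y_hat = sum(xi * wi for xi, wi in zip(x, w))     # y = x^T w

print(x)
print(y_hat)
```

Exactly one entry of the one-hot block is nonzero, so each country effectively gets its own additive offset in the predicted income.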

Which loss function to use?
$\|y - Xw\|_p$

a) p=0
b) p=1
c) p=2

L1: data not Gaussian?
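One way to see why L1 can suit heavy-tailed, non-Gaussian targets such as income: for a constant prediction, the L1 loss is minimized by the median and the L2 loss by the mean, and a single outlier drags the mean much further than the median. The incomes below are invented for illustration.

```python
import statistics

def l1_loss(c, ys):
    """Sum of absolute errors for the constant prediction c."""
    return sum(abs(y - c) for y in ys)

def l2_loss(c, ys):
    """Sum of squared errors for the constant prediction c."""
    return sum((y - c) ** 2 for y in ys)

# Hypothetical incomes in thousands, with one very large value.
incomes = [30, 35, 40, 45, 1000]

mean = statistics.mean(incomes)
median = statistics.median(incomes)

print("mean:", mean, "median:", median)
print("L1 loss at median vs mean:", l1_loss(median, incomes), l1_loss(mean, incomes))
print("L2 loss at median vs mean:", l2_loss(median, incomes), l2_loss(mean, incomes))
```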

Which loss function to use?
You are building a model to estimate the cost, y, of a software project that you are bidding on as a contractor (as a function of lots of features of the project, including estimates of lines of code, hours of meetings, complexity of specifications).

a) p=0
b) p=1
c) p=2

L1? The true cost is linear, not quadratic

Which loss function to use?
You are writing a search algorithm that returns web pages as a function of the search query, the words on the web page the person is searching from, and the search history of that user.

You only care about getting a right answer among the top few. We’ll cover this later in the course.

Which regression penalty to use?
$\text{Error} + \lambda_2 \|w\|_2^2 + \lambda_1 \|w\|_1 + \lambda_0 \|w\|_0$

- If you want the model to be scale invariant?
- If you want to have a small model?
- If you want a convex optimization problem?

a) p=0
b) p=1
c) p=2

Which regression penalty to use?
$\text{Error} + \lambda_2 \|w\|_2^2 + \lambda_1 \|w\|_1 + \lambda_0 \|w\|_0$

- If you want the model to be scale invariant? L0
- If you want to have a small model? L0 or L1
- If you want a convex optimization problem? L1 and/or L2
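The scale-invariance answer can be checked numerically. If feature j is multiplied by a factor s, the weight that gives the same predictions is divided by s. The sketch below, with made-up weights and scale factors, shows that the L0 penalty is unchanged by such a rescaling while the L1 and squared L2 penalties are not.

```python
def l0(w):
    """Number of nonzero weights."""
    return sum(1 for wi in w if wi != 0)

def l1(w):
    """Sum of absolute weights."""
    return sum(abs(wi) for wi in w)

def l2_squared(w):
    """Sum of squared weights."""
    return sum(wi * wi for wi in w)

w = [0.0, 2.0, -0.5, 0.0]            # hypothetical weight vector
scale = [1.0, 1000.0, 0.01, 1.0]     # hypothetical change of units per feature
w_rescaled = [wi / s for wi, s in zip(w, scale)]  # same predictions on rescaled features

for name, penalty in (("L0", l0), ("L1", l1), ("L2^2", l2_squared)):
    print(name, penalty(w), penalty(w_rescaled))
```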

- Your training error for ridge regression is substantially lower than your testing error.
- You should
  a) increase λ
  b) decrease λ
  c) no change in λ

a)

- Your training error for ridge regression is the same as your testing error.
- You should
  a) increase λ
  b) decrease λ
  c) no change in λ

c)
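A minimal sketch of what increasing the ridge penalty does, assuming a single feature and no intercept so that the minimizer of sum((y - w x)^2) + λ w^2 has the closed form w = Σxy / (Σx² + λ). The data are invented. A larger λ shrinks the weight and raises the training error, which is the trade one accepts when training error is much lower than testing error.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form ridge solution for one feature, no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical training data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.8]

for lam in (0.0, 1.0, 10.0, 100.0):
    w = ridge_weight(xs, ys, lam)
    print(f"lambda={lam:6.1f}  w={w:.3f}  training MSE={mse(w, xs, ys):.3f}")
```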

What you should know
- Loss functions depend on the problem
- Basis functions allow one to fit a nonlinear function using linear regression
- Link functions give a nonlinear regression
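The basis-function point can be illustrated with a very small sketch. The target below is y = 3x², and the single basis function φ(x) = x² is an assumption chosen for illustration (the slides mention RBFs; a quadratic is used here only to keep the arithmetic easy to follow). Least squares on the raw input finds no linear trend, while the same least-squares formula applied to φ(x) recovers the curve.

```python
def fit_one_weight(features, ys):
    """Least-squares weight for y ≈ w * feature (one feature, no intercept)."""
    return sum(f * y for f, y in zip(features, ys)) / sum(f * f for f in features)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3.0 * x * x for x in xs]     # hypothetical nonlinear target

w_raw = fit_one_weight(xs, ys)     # linear regression on x itself

phi = [x * x for x in xs]          # basis function phi(x) = x^2
w_basis = fit_one_weight(phi, ys)  # still linear regression, now on phi(x)

print("weight on raw x:    ", w_raw)
print("weight on phi(x)=x^2:", w_basis)
```

The model stays linear in the weight w; only the inputs were transformed.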

Gather.town
- https://gather.town/aQMGI0l1R8DP0Ovv/penn-cis
