14.1 Inference for Regression


Jan 03, 2016


plato-coleman

Transcript
Page 1: 14.1  Inference for Regression

14.1 Inference for Regression

Page 2: 14.1  Inference for Regression

Learning Objective:

- Perform a linear regression t-test, and calculate and interpret a confidence interval for the regression slope.

Page 3: 14.1  Inference for Regression

Estimating Parameters (we need to denote population values differently from sample values)

a = y-intercept of our sample data

b = slope of our sample data

Let: α = true population y-intercept

β = true population slope

Page 4: 14.1  Inference for Regression

Step 1: Create a scatter plot so you can see what the data look like. Think about which variable is the explanatory variable and which is the response variable.

Page 5: 14.1  Inference for Regression

Suppose a local restaurant wanted to predict the amount of tip left based on the amount of the customer’s bill.

Find the LSRL in your calculator:

ŷ = -0.7367 + 0.164x

x = amount of bill

y = amount of tip

(Don’t forget to define your variables!)

Page 6: 14.1  Inference for Regression

Whenever there is a linear regression test on the AP exam, you will be given computer output with the numbers already crunched for you! The first step in a linear regression t-test or t-interval is learning how to read that computer output!

So this is what you would get!

Page 7: 14.1  Inference for Regression

Let’s start off with the simple part:

Notice it’s the same equation we got when typing it in our calculator earlier.

Page 8: 14.1  Inference for Regression

After you get your LSRL, we don’t need any more data from the top row so cross it out!

(leave your y-intercept: -0.7367)

Page 9: 14.1  Inference for Regression

Our question of interest: Using a 5% significance level, is there evidence of a linear relationship between the amount of a bill and the amount that was tipped? (Assume the conditions for inference are met.)

Remember: If they ask you “is there evidence”, you have to complete a test.

We will use a linear regression t-test, since we are determining if there is a relationship between 2 quantitative variables.

(** The chi-squared test of independence is for when we have categorical data.)

Page 10: 14.1  Inference for Regression

To show a linear relationship, we test whether the slope differs from zero, in either the positive or the negative direction (zero slope = no linear association).

Since the sample data give us the slope “b”, we denote the population slope by “β”.

β = true slope of y per x (in context of the problem)

H₀: β = 0 (this really means no association)

Hₐ: β ≠ 0 (this really means there is an association)

PHATACDS template for Linear Regression t-test

Page 11: 14.1  Inference for Regression

Assumptions: If you have linear regression output on the AP exam, the problem will always state that the conditions for inference are met. (So don’t worry about them!)

Test Name: Linear Regression T-test

Alpha: 0.05

Page 12: 14.1  Inference for Regression

Calculations: 2P(t > ___) = p-value (two-sided, to match Hₐ: β ≠ 0)

Degrees of Freedom: n − 2 (we estimate two parameters, the intercept and the slope, so we use n − 2, not n − 1)

Decision and Statement: Since p<α, …….SAME THING WE’VE BEEN DOING!!
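The calculation step of the template can be written out as a formula. This is a sketch in the notation of these slides, where SE_b is the standard error of the slope reported in the computer output and T with subscript n − 2 (a symbol introduced here only for illustration) stands for a t-distributed variable with n − 2 degrees of freedom:

```latex
t = \frac{b - 0}{SE_b}, \qquad df = n - 2, \qquad
\text{p-value} = 2\,P\!\left(T_{n-2} > |t|\right)
```

The "− 0" comes from the null hypothesis β = 0, and the factor of 2 is there because the alternative is two-sided.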

Page 13: 14.1  Inference for Regression

So let’s look at the output again:

Page 14: 14.1  Inference for Regression

β = true slope of amount tipped per amount of the bill

H₀: β = 0   Hₐ: β ≠ 0

Assumptions: stated in problem they are met.

Linear Regression T-test α = 0.05

Calculations (given in the table): 2P(t > 9.18) = 0.0027 Degrees of Freedom: n − 2 = 3

Decision and Statement: Since p < α, the result is statistically significant, so we reject H₀. There is enough evidence to suggest a linear relationship between the amount of a bill and the amount tipped.
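If you want to check the p-value from the output yourself, the sketch below recomputes 2P(t > 9.18) with 3 degrees of freedom using only the Python standard library. The helper names `t_pdf` and `two_sided_p` are made up for this illustration, and the area is found by numerical integration (Simpson's rule), not by the method the software in the slides uses.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=20000):
    """Return 2 * P(T > |t_stat|) via Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return 1 - central_area

# Values read from the computer output in the tip example
p_value = two_sided_p(9.18, 3)
alpha = 0.05
print(round(p_value, 4), p_value < alpha)
```

The result agrees with the 0.0027 in the output, and since it is below 0.05 the decision to reject H₀ stands.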

Page 15: 14.1  Inference for Regression

Example: The following data was taken from 50 students in an AP Environmental class.

Page 16: 14.1  Inference for Regression

What is the slope? Interpret?

On average, for each one-point increase in a student’s quiz grade, the predicted final grade increases by 0.75 points.

What % of the variation in final grade can be explained by the least-squares regression line of final grade on quiz grade?

r²=37%

What is the correlation? Interpret?

r = 0.61 (It is positive b/c the slope is positive.) There is a moderate positive linear relationship between quiz grade and final grade.
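The step from r² to r can be sketched in a couple of lines: take the square root of r² and give it the sign of the slope. The variable names below are just for illustration; the two numbers are the ones from this slide.

```python
import math

r_squared = 0.37  # r² from the slide (37%)
slope = 0.75      # slope from the slide

# r has the same sign as the slope
r = math.copysign(math.sqrt(r_squared), slope)
print(round(r, 2))
```

With a negative slope the same code would return a negative r.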

Page 17: 14.1  Inference for Regression

Is there evidence of an association between a student’s quiz grade and their final grade?

β = true slope of final grade per quiz grade

H₀: β = 0   Hₐ: β ≠ 0

Assumptions: stated in problem they are met.

Linear Regression T-test α = 0.05

Calculations (given in the table): 2P(t > 5.31) = 0.000 (to three decimal places) Degrees of Freedom: 50 − 2 = 48

Decision and Statement: Since p < α, the result is statistically significant, so we reject H₀. There is enough evidence to suggest a relationship between a student’s quiz grade and final grade.

Page 18: 14.1  Inference for Regression

Confidence Intervals:

A level C confidence interval for the slope β of the true regression line is:

$b \pm t^{*}\,SE_b$

where $SE_b$ = standard error of the slope.

We find $t^{*}$ in the table in the back of your book (use the degrees of freedom and the confidence level to find it).

Page 19: 14.1  Inference for Regression

Ex: Compute a 95% confidence interval for the true slope of amount tipped per cost of bill.

Page 20: 14.1  Inference for Regression

Name: Linear Regression t-interval

Assumptions: Stated in the problem they are met

Calculations: First look up the t* value: go to 95% confidence with df = 3, which gives t* = 3.182.

Statement: We are 95% confident that the true slope of amount tipped per cost of bill is between 0.107 and 0.221.
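The arithmetic behind this interval can be sketched as follows. The standard error of the slope is not shown in this transcript, so as an assumption it is recovered here from the reported slope and t statistic (t = b / SE_b); on the exam you would read it straight from the output. The value of t* is the table value for 95% confidence with df = 3.

```python
b = 0.164        # sample slope from the output
t_stat = 9.18    # t statistic for the slope from the output
t_star = 3.182   # t* from a t table: 95% confidence, df = 3

# Assumption: SE_b is backed out from t = b / SE_b, since the
# transcript does not show the standard error column.
se_b = b / t_stat
margin = t_star * se_b
lower, upper = b - margin, b + margin
print(f"SE_b = {se_b:.4f}, interval = ({lower:.3f}, {upper:.3f})")
```

The endpoints match the interval in the statement, 0.107 to 0.221. The interval does not contain 0, which agrees with rejecting H₀ at the 5% level.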

Page 21: 14.1  Inference for Regression

Example:

How well do golfers’ scores in the first round of a two-round tournament predict their scores in the second round? The data for 12 members of a college’s women’s golf team in a recent tournament are listed below. Is there good evidence that there is an association between first and second round scores? (Assume conditions for inference are met.)

Golfer    1   2   3   4   5   6    7    8   9  10  11  12
Round A  89  90  87  95  86  81  102  105  83  88  91  79
Round B  94  85  89  89  81  76  107   89  87  91  88  80

Page 22: 14.1  Inference for Regression

β = true slope of score on round B per score on round A

H₀: β = 0   Hₐ: β ≠ 0

Assumptions: stated in problem they are met.

Linear Regression T-test α = 0.05

Calculations (given in the table): 2P(t > 2.99) = 0.0136 Degrees of Freedom: 12 − 2 = 10

Decision and Statement: Since p < α, the result is statistically significant, so we reject H₀. There is enough evidence to suggest a relationship between the score on round A and the score on round B.
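Because the golf data are given in full, the t statistic in the table can be reproduced from scratch. The sketch below uses the usual least-squares formulas with only the Python standard library; the variable names are chosen for this illustration and are not from the slides.

```python
import math

round_a = [89, 90, 87, 95, 86, 81, 102, 105, 83, 88, 91, 79]  # x
round_b = [94, 85, 89, 89, 81, 76, 107, 89, 87, 91, 88, 80]   # y

n = len(round_a)
mean_x = sum(round_a) / n
mean_y = sum(round_b) / n

# Sums of squares and cross products about the means
sxx = sum((x - mean_x) ** 2 for x in round_a)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(round_a, round_b))

b = sxy / sxx              # sample slope
a = mean_y - b * mean_x    # sample y-intercept

# Standard deviation of the residuals, using n - 2 degrees of freedom
residuals = [y - (a + b * x) for x, y in zip(round_a, round_b)]
df = n - 2
s = math.sqrt(sum(e ** 2 for e in residuals) / df)

se_b = s / math.sqrt(sxx)  # standard error of the slope
t_stat = b / se_b          # test statistic for H0: beta = 0

print(f"b = {b:.4f}, SE_b = {se_b:.4f}, t = {t_stat:.2f}, df = {df}")
```

The slope, t statistic, and degrees of freedom agree with the values used in the test above (0.6877, 2.99, and 10).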

Page 23: 14.1  Inference for Regression

Give a 95% confidence interval for the true slope, the rate at which round B scores increase per point of round A score.

Linear Regression t-interval

Assumptions: Stated in the problem they are met

Calculations: df = 10, so t* = 2.228

Statement: We are 95% confident that the true slope of score on round B per score on round A is between 0.1753 and 1.200.
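The interval can be checked by hand with b ± t*·SE_b. The standard error is not shown in this transcript, so as an assumption it is recovered below from the slope and t statistic reported in the golf example (t = b / SE_b); t* = 2.228 is the table value for 95% confidence with df = 10.

```latex
SE_b = \frac{b}{t} = \frac{0.6877}{2.99} \approx 0.2300

b \pm t^{*}\,SE_b = 0.6877 \pm 2.228\,(0.2300)
                 \approx 0.6877 \pm 0.5124
                 = (0.1753,\ 1.2001)
```

Since 0 is not in the interval, this agrees with the decision to reject H₀: β = 0 at the 5% level.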

Page 24: 14.1  Inference for Regression

What is the line of best fit? Define any variables.

ŷ = 26.332 + 0.6877x

x = score on Round A

y = score on Round B

Interpret the slope: b = 0.6877. On average, for each one-point increase in the Round A score, we expect the Round B score to increase by 0.6877 points.

Interpret the y-intercept: a = 26.332. When the score on Round A is 0, we predict the score on Round B to be 26.332.
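The slope and y-intercept above can also be found from the summary statistics, using b = r·(s_y / s_x) and a = ȳ − b·x̄. The sketch below applies those formulas to the golf data with Python's standard `statistics` module; it is one way to get the line, not necessarily how the calculator does it.

```python
import statistics

round_a = [89, 90, 87, 95, 86, 81, 102, 105, 83, 88, 91, 79]  # x
round_b = [94, 85, 89, 89, 81, 76, 107, 89, 87, 91, 88, 80]   # y

n = len(round_a)
mean_x, mean_y = statistics.mean(round_a), statistics.mean(round_b)
s_x, s_y = statistics.stdev(round_a), statistics.stdev(round_b)

# Correlation r: sum of products of z-scores, divided by n - 1
r = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
        for x, y in zip(round_a, round_b)) / (n - 1)

b = r * s_y / s_x        # slope
a = mean_y - b * mean_x  # y-intercept
print(f"y-hat = {a:.3f} + {b:.4f}x")
```

Note that x = 0 is far outside the range of Round A scores (79 to 105), so the y-intercept is an extrapolation.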

Page 25: 14.1  Inference for Regression

Extra Problem: 95% confidence interval

Page 26: 14.1  Inference for Regression

Linear Regression t-interval

Assumptions: Stated in the problem they are met

df = 10 − 2 = 8

We are 95% confident that the true slope of fuel consumption per number of railcars is between 1.889 and 2.409.