May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 15
Multiple Regression

Learning Objectives

1. Understand how multiple regression analysis can be used to develop relationships involving one dependent variable and several independent variables.
2. Be able to interpret the coefficients in a multiple regression analysis.
3. Know the assumptions necessary to conduct statistical tests involving the hypothesized regression model.
4. Understand the role of computer packages in performing multiple regression analysis.
5. Be able to interpret and use computer output to develop the estimated regression equation.
6. Be able to determine how good a fit is provided by the estimated regression equation.
7. Be able to test for the significance of the regression equation.
8. Understand how multicollinearity affects multiple regression analysis.
9. Know how residual analysis can be used to make a judgment as to the appropriateness of the model, identify outliers, and determine which observations are influential.
10. Understand how logistic regression is used for regression analyses involving a binary dependent variable.
An estimate of y when x1 = 45 and x2 = 15 is y = -18.4 + 2.01(45) + 4.738(15) = 143.12
3. a. b1 = 3.8 is an estimate of the change in y corresponding to a 1 unit change in x1 when x2, x3, and x4 are held constant.
b2 = -2.3 is an estimate of the change in y corresponding to a 1 unit change in x2 when x1, x3, and x4 are held constant.
b3 = 7.6 is an estimate of the change in y corresponding to a 1 unit change in x3 when x1, x2, and x4 are held constant.
b4 = 2.7 is an estimate of the change in y corresponding to a 1 unit change in x4 when x1, x2, and x3 are held constant.
b. y = 17.6 + 3.8(10) - 2.3(5) + 7.6(1) + 2.7(2) = 57.1
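The point estimate in part (b) is the estimated regression equation evaluated at the given values of the independent variables. A minimal Python sketch of that arithmetic follows; the helper name `predict` is hypothetical and not part of the text.

```python
def predict(intercept, coefs, xs):
    """Evaluate an estimated regression equation b0 + b1*x1 + ... + bp*xp."""
    if len(coefs) != len(xs):
        raise ValueError("need exactly one x value per coefficient")
    return intercept + sum(b * x for b, x in zip(coefs, xs))

# Exercise 3(b): b0 = 17.6, (b1, b2, b3, b4) = (3.8, -2.3, 7.6, 2.7),
# evaluated at (x1, x2, x3, x4) = (10, 5, 1, 2).
y_hat = predict(17.6, [3.8, -2.3, 7.6, 2.7], [10, 5, 1, 2])
print(round(y_hat, 1))
```

The same helper applies to any of the estimated regression equations in this chapter, since each one is a constant plus a sum of coefficient times value.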
4. a. y = 25 + 10(15) + 8(10) = 255; sales estimate: $255,000
b. Sales can be expected to increase by $10 for every dollar increase in inventory investment when
advertising expenditure is held constant. Sales can be expected to increase by $8 for every dollar increase in advertising expenditure when inventory investment is held constant.
5. a. Partial Minitab output follows:
Analysis of Variance
Source                            DF  Adj SS  Adj MS  F-Value  P-Value
Regression                         1  16.640  16.640    11.27    0.015
  Televison_Advertising_($1000s)   1  16.640  16.640    11.27    0.015
Error                              6   8.860   1.477
  Lack-of-Fit                      4   6.360   1.590     1.27    0.485
  Pure Error                       2   2.500   1.250
Total                              7  25.500

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
1.21518  65.26%     59.46%      28.39%

Coefficients
Term                             Coef  SE Coef  T-Value  P-Value   VIF
Constant                        88.64     1.58    56.02    0.000
Televison_Advertising_($1000s)  1.604    0.478     3.36    0.015  1.00

Regression Equation
Weekly Gross_Revenue_($1000s) = 88.64 + 1.604 Televison_Advertising_($1000s)
c. No, it is 1.60 in part (a) and 2.29 above. In part (b) it represents the marginal change in revenue due to an increase in television advertising with newspaper advertising held constant.
d. Revenue = 83.2 + 2.290(3.5) + 1.301(1.8) = 93.5568 or $93,556.80
6. a. Partial Minitab output follows:
Analysis of Variance
Source         DF  Adj SS  Adj MS  F-Value  P-Value
Regression      1  4814.3  4814.3    19.11    0.001
  Yds/Att       1  4814.3  4814.3    19.11    0.001
Error          14  3527.4   252.0
  Lack-of-Fit  13  3037.6   233.7     0.48    0.829
  Pure Error    1   489.8   489.8
Total          15  8341.7

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
15.8732  57.71%     54.69%      44.88%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  -58.8     26.2    -2.25    0.041
Yds/Att   16.39     3.75     4.37    0.001  1.00
Regression Equation
Win% = -58.8 + 16.39 Yds/Att

Fits and Diagnostics for Unusual Observations
Obs   Win%    Fit  Resid  Std Resid
 14  81.30  47.77  33.53       2.19  R
R  Large residual
b. Partial Minitab output follows:
Analysis of Variance
Source         DF  Adj SS  Adj MS  F-Value  P-Value
Regression      1    3653  3652.8    10.91    0.005
  Int/Att       1    3653  3652.8    10.91    0.005
Error          14    4689   334.9
  Lack-of-Fit  11    3536   321.4     0.84    0.644
  Pure Error    3    1153   384.4
Total          15    8342

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
18.3008  43.79%     39.77%      26.48%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant   97.5     13.9     7.04    0.000
Int/Att   -1600      485    -3.30    0.005  1.00

Regression Equation
Win% = 97.5 - 1600 Int/Att

Fits and Diagnostics for Unusual Observations
Obs   Win%    Fit   Resid  Std Resid
  8  12.50  55.93  -43.43      -2.45  R
R  Large residual
c. Partial Minitab output follows:
Analysis of Variance
Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression   2    6277  3138.5    19.76    0.000
  Yds/Att    1    2624  2624.2    16.52    0.001
  Int/Att    1    1463  1462.8     9.21    0.010
Error       13    2065   158.8
Total       15    8342
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
12.6024  75.25%     71.44%      60.51%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant   -5.8     27.1    -0.21    0.835
Yds/Att   12.95     3.19     4.06    0.001  1.15
Int/Att   -1084      357    -3.03    0.010  1.15

Regression Equation
Win% = -5.8 + 12.95 Yds/Att - 1084 Int/Att

Fits and Diagnostics for Unusual Observations
Obs   Win%    Fit   Resid  Std Resid
  8  12.50  38.57  -26.07      -2.28  R
R  Large residual
d. The predicted value of Win% for the Kansas City Chiefs is
Win% = -5.8 + 12.95(6.2) - 1084(.036) = 35.47%
With 7 wins and 9 losses, the Kansas City Chiefs won 43.75% of the games they played. The predicted value is somewhat lower than the actual value.
7. a. Partial Minitab output follows:
Analysis of Variance
Source            DF  Adj SS  Adj MS  F-Value  P-Value
Regression         1   66.34  66.343     9.87    0.014
  Contrast Ratio   1   66.34  66.343     9.87    0.014
Error              8   53.76   6.720
  Lack-of-Fit      7   41.26   5.894     0.47    0.811
  Pure Error       1   12.50  12.500
Total              9  120.10

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
2.59221  55.24%     49.65%      37.16%

Coefficients
Term              Coef  SE Coef  T-Value  P-Value   VIF
Constant         69.89     3.85    18.17    0.000
Contrast Ratio  0.1699   0.0541     3.14    0.014  1.00

Regression Equation
Overall Rating = 69.89 + 0.1699 Contrast Ratio
Regression Equation Total Distance = 124.72 + 1.3943 Club Head Speed
b. Partial Minitab output follows:
Analysis of Variance
Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       1  6591.4  6591.42   557.21    0.000
  Ball Speed     1  6591.4  6591.42   557.21    0.000
Error          188  2223.9    11.83
  Lack-of-Fit  180  2106.8    11.70     0.80    0.728
  Pure Error     8   117.1    14.64
Total          189  8815.3

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
3.43937  74.77%     74.64%      74.13%

Coefficients
Term          Coef  SE Coef  T-Value  P-Value   VIF
Constant    117.14     7.02    16.68    0.000
Ball Speed  0.9876   0.0418    23.61    0.000  1.00

Regression Equation
Total Distance = 117.14 + 0.9876 Ball Speed
c. The following scatter diagram illustrates the relationship between the two variables.
The scatter diagram shows a very strong linear relationship between the two variables. In fact, for these data the coefficient of determination is approximately .99. As a result, using both variables in the same model is not recommended because, once the linear effect of one variable is accounted for, the other variable will be of little additional value. This situation, referred to as multicollinearity, is discussed later in the chapter in the section on testing for significance.
d. Using the estimated regression equation in part (c) we obtain
R/IP = 0.5365 - 0.2483 SO/IP + 1.032 HR/IP = 0.5365 - 0.2483(.91) + 1.032(.16) = .4757
The predicted value for R/IP was less than the actual value.
e. This suggestion does not make sense. If a pitcher gives up more runs per inning pitched, this pitcher’s earned run average also has to increase. For these data the sample correlation coefficient between ERA and R/IP is .964. The following partial Minitab output shows the results for part (c) using ERA as the dependent variable.
Analysis of Variance
Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression   2   5.174  2.5870    14.17    0.000
  SO/IP      1   1.905  1.9052    10.44    0.005
  HR/IP      1   2.190  2.1901    12.00    0.003
Error       17   3.103  0.1825
Total       19   8.276

Model Summary
       S    R-sq  R-sq(adj)  R-sq(pred)
0.427204  62.51%     58.10%      50.29%

Coefficients
Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   3.878    0.647     6.00    0.000
SO/IP     -1.843    0.570    -3.23    0.005  1.05
HR/IP      11.99     3.46     3.46    0.003  1.05

Regression Equation
ERA = 3.878 - 1.843 SO/IP + 11.99 HR/IP
11. a. SSE = SST - SSR = 6,724.125 - 6,216.375 = 507.75
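b. \( R^2 = \dfrac{\mathrm{SSR}}{\mathrm{SST}} = \dfrac{6{,}216.375}{6{,}724.125} = .924 \)
c. \( R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1} = 1 - (1 - .924)\dfrac{10 - 1}{10 - 2 - 1} = .902 \)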
d. The estimated regression equation provided an excellent fit.
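The two coefficients above can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names; it follows the text in rounding R-sq to three decimals before computing the adjusted value, which is why the result is .902 rather than the slightly larger figure obtained from the unrounded ratio.

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination, SSR / SST."""
    return ssr / sst

def adjusted_r_squared(r2, n, p):
    """Adjusted R-sq for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Exercise 11: SSR = 6,216.375, SST = 6,724.125, n = 10, p = 2.
r2 = round(r_squared(6216.375, 6724.125), 3)  # rounded as in the text
r2_adj = adjusted_r_squared(r2, n=10, p=2)
print(r2, round(r2_adj, 3))
```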
c. Yes; after adjusting for the number of independent variables in the model, we see that 90.5% of the variability in y has been accounted for.
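13. a. \( R^2 = \dfrac{\mathrm{SSR}}{\mathrm{SST}} = \dfrac{1760}{1805} = .975 \)
b. \( R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1} = 1 - (1 - .975)\dfrac{30 - 1}{30 - 4 - 1} = .971 \)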
c. The estimated regression equation provided an excellent fit.
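14. a. \( R^2 = \dfrac{\mathrm{SSR}}{\mathrm{SST}} = \dfrac{12{,}000}{16{,}000} = .75 \)
b. \( R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1} = 1 - .25\left(\dfrac{9}{7}\right) = .68 \)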
c. The adjusted coefficient of determination shows that 68% of the variability has been explained by
the two independent variables; thus, we conclude that the model does not explain a large amount of variability.
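15. a. \( R^2 = \dfrac{\mathrm{SSR}}{\mathrm{SST}} = \dfrac{23.435}{25.5} = .919 \)
\( R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1} = 1 - (1 - .919)\dfrac{8 - 1}{8 - 2 - 1} = .887 \)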
b. Multiple regression analysis is preferred since both \( R^2 \) and \( R_a^2 \) show an increased percentage of the variability of y explained when both independent variables are used.
16. a. \( r^2 = .577 \). Thus, the average number of passing yards per attempt is able to explain 57.7% of the variability in the percentage of games won. Considering the nature of the data and all the other factors that might be related to the number of games won, this is not too bad a fit.
b. The value of the coefficient of determination increased to \( R^2 = .752 \), and the adjusted coefficient of determination is \( R_a^2 = .714 \). Thus, using both independent variables provides a much better fit.
17. a. A portion of the Minitab output from part (d) of exercise 9 follows:
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
2.84872  82.79%     82.60%      82.16%
The value of R-sq = 82.79% and the value of R-sq(adj) = 82.60% indicate that the estimated regression equation provided a very good fit.
b. A portion of the Minitab output from part (b) of exercise 9 follows:
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
3.43937  74.77%     74.64%      74.13%
The value of R-sq = 74.77% indicates that using just ball speed can account for 74.77% of the variability in total distance. The addition of launch angle increases the percentage to almost 83%. Therefore, the estimated regression equation using both ball speed and launch angle will provide better predictions.
18. a. A portion of the Minitab output follows:
Model Summary
        S    R-sq  R-sq(adj)  R-sq(pred)
0.0537850  56.35%     51.21%      41.25%
The Minitab output in part (c) of exercise 10 shows that R-sq = .5635 and R-sq(adj) = .5121.
b. The fit is not great, but considering the nature of the data, being able to explain slightly more than 50% of the variability in the number of runs given up per inning pitched using just two independent variables is not too bad.
c. Partial Minitab output using ERA as the dependent variable follows.
Analysis of Variance
Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression   2   5.174  2.5870    14.17    0.000
  SO/IP      1   1.905  1.9052    10.44    0.005
  HR/IP      1   2.190  2.1901    12.00    0.003
Error       17   3.103  0.1825
Total       19   8.276

Model Summary
       S    R-sq  R-sq(adj)  R-sq(pred)
0.427204  62.51%     58.10%      50.29%

Coefficients
Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   3.878    0.647     6.00    0.000
SO/IP     -1.843    0.570    -3.23    0.005  1.05
HR/IP      11.99     3.46     3.46    0.003  1.05

Regression Equation
ERA = 3.878 - 1.843 SO/IP + 11.99 HR/IP
The Minitab output shows that R-sq = .6251 and R-sq(adj) = .5810. Approximately 60% of the variability in the ERA can be explained by the linear effect of HR/IP and SO/IP. This is not too bad considering the complexity of predicting pitching performance.
19. a. MSR = SSR/p = 6,216.375/2 = 3,108.188
b. F = MSR/MSE = 3,108.188/72.536 = 42.85
Using F table (2 degrees of freedom numerator and 7 denominator), p-value is less than .01
Actual p-value = .0001
Because p-value ≤ α = .05, the overall model is significant.
c. t = .5906/.0813 = 7.26
Using t table (7 degrees of freedom), area in tail is less than .005; p-value is less than .01
Actual p-value = .0002
Because p-value ≤ α, β1 is significant.
d. t = .4980/.0567 = 8.78
Using t table (7 degrees of freedom), area in tail is less than .005; p-value is less than .01
Actual p-value = .0001
Because p-value ≤ α, β2 is significant.
20. A portion of the Minitab output follows.
Analysis of Variance
Source      DF  Adj SS   Adj MS  F-Value  P-Value
Regression   2   14052   7026.1    43.50    0.000
  X1         1   10689  10688.7    66.17    0.000
  X2         1    4031   4030.9    24.95    0.002
Error        7    1131    161.5
Total        9   15183

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
12.7096  92.55%     90.42%      87.95%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  -18.4     18.0    -1.02    0.341
X1        2.010    0.247     8.13    0.000  1.00
X2        4.738    0.948     5.00    0.002  1.00

Regression Equation
Y = -18.4 + 2.010 X1 + 4.738 X2
a. Since the p-value corresponding to F = 43.50 is .000 < α = .05, we reject H0: β1 = β2 = 0; there is a significant relationship.
b. Since the p-value corresponding to t = 8.13 is .000 < α = .05, we reject H0: β1 = 0; β1 is significant.
c. Since the p-value corresponding to t = 5.00 is .002 < α = .05, we reject H0: β2 = 0; β2 is significant.
21. a. In the two independent variable case the coefficient of x1 represents the expected change in y corresponding to a one unit increase in x1 when x2 is held constant. In the single independent variable case the coefficient of x1 represents the expected change in y corresponding to a one unit increase in x1.
b. Yes. If x1 and x2 are correlated one would expect a change in x1 to be accompanied by a change in x2.
22. a. SSE = SST - SSR = 16000 - 12000 = 4000
\( s^2 = \dfrac{\mathrm{SSE}}{n - p - 1} = \dfrac{4000}{7} = 571.43 \)
\( \mathrm{MSR} = \dfrac{\mathrm{SSR}}{p} = \dfrac{12000}{2} = 6000 \)
b. F = MSR/MSE = 6000/571.43 = 10.50
Using F table (2 degrees of freedom numerator and 7 denominator), p-value is less than .01
Actual p-value = .008
Because p-value ≤ α, we reject H0. There is a significant relationship among the variables.
23. a. F = 28.38
Using F table (2 degrees of freedom numerator and 5 denominator), p-value is less than .01
Actual p-value = .002
Because p-value ≤ α, there is a significant relationship.
b. t = 7.53
Using t table (5 degrees of freedom), area in tail is less than .005; p-value is less than .01
Actual p-value = .001
Because p-value ≤ α, β1 is significant and x1 should not be dropped from the model.
c. t = 4.06
Actual p-value = .010
Because p-value ≤ α, β2 is significant and x2 should not be dropped from the model.
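The F tests in exercises 19, 22, and 23 all have two numerator degrees of freedom. In that special case the upper-tail area of the F distribution has a simple closed form, P(F > f) = (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. The Python sketch below uses that identity to reproduce the "actual p-value" figures; the function names are hypothetical, and the shortcut does not apply to other numerator degrees of freedom, where a statistics package would be needed.

```python
def f_statistic(ssr, sse, n, p):
    """Overall F statistic: MSR / MSE with p and n - p - 1 degrees of freedom."""
    msr = ssr / p
    mse = sse / (n - p - 1)
    return msr / mse

def f_pvalue_two_numerator_df(f, df_denominator):
    """Upper-tail area P(F > f); valid ONLY for 2 numerator degrees of freedom."""
    return (1 + 2 * f / df_denominator) ** (-df_denominator / 2)

# Exercise 22: SSR = 12000, SSE = 4000, n = 10, p = 2.
f = f_statistic(ssr=12000, sse=4000, n=10, p=2)
p_value = f_pvalue_two_numerator_df(f, df_denominator=7)
print(round(f, 2), round(p_value, 3))

# Exercise 23(a): F = 28.38 with 2 and 5 degrees of freedom.
print(round(f_pvalue_two_numerator_df(28.38, 5), 3))
```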
b. Because the p-value for the F test = .000 < α = .05, there is a significant relationship.
c. For OffPassYds/G: Because the p-value = .000 < α = .05, OffPassYds/G is significant.
For DefYds/G: Because the p-value = .011 < α = .05, DefYds/G is significant.
25. a. Partial Minitab output follows.
Analysis of Variance
Source                  DF   Adj SS  Adj MS  F-Value  P-Value
Regression               3   92.352  30.784    15.98    0.000
  Itineraries/Schedule   1    1.398   1.398     0.73    0.407
  Shore Excursions       1   61.261  61.261    31.81    0.000
  Food/Dining            1   30.539  30.539    15.86    0.001
Error                   16   30.813   1.926
Total                   19  123.166

Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
1.38775  74.98%     70.29%      58.09%
b. Because the p-value corresponding to F = 15.98, 0.000, is less than .05, the level of significance,
overall there is a significant relationship. c. Because the p-value for Itineraries/Schedule (.407) is greater than the level of significance (.05),
Itineraries/Schedule is not significant. Shore Excursions (p-value = .000) and Food/Dining (p-value = .001) are both significant because the p-value for each of these independent variables is less than the level of significance (.05).
d. After removing Itineraries/Schedule from the model, we obtained the following Minitab output.
With Itineraries/Schedule in the model, the R-sq was .7498, while the R-sq after Itineraries/Schedule
was removed from the model was .7385. Removing Itineraries/Schedule from the model resulted in almost no loss in the model’s ability to explain variability in the Overall Score.
a. The p-value associated with F = 10.97 is .001. Because the p-value < .05, there is a significant
overall relationship. b. For SO/IP, the p-value associated with t = -3.46 is .003. Because the p-value < .05, SO/IP is
significant. For HR/IP, the p-value associated with t = 2.37 is .030. Because the p-value < .05, HR/IP is also significant.
27. a. y = 29.1270 + .5906(180) + .4980(310) = 289.8150
b. The point estimate for an individual value is y = 289.8150, the same as the point estimate of the
mean value. 28. Partial Minitab output follows:
Regression Equation
Y = -18.4 + 2.010 X1 + 4.738 X2

Variable  Setting
X1             45
X2             15

    Fit   SE Fit              95% CI              95% PI
143.157  4.64909  (132.164, 154.151)  (111.156, 175.158)
a. The 95% confidence interval is 132.164 to 154.151.
b. The 95% prediction interval is 111.156 to 175.158.
29. a. y = 83.2 + 2.29(3.5) + 1.30(1.8) = 93.555 or $93,555
Note: In Exercise 5b, the Minitab output also shows that b0 = 83.23, b1 = 2.290, and b2 = 1.301; hence, y = 83.23 + 2.290x1 + 1.301x2. Using this estimated regression equation, we obtain
y = 83.23 + 2.29(3.5) + 1.301(1.8) = 93.5868 or $93,586.80
The difference ($93,586.80 - $93,555 = $31.80) is simply due to the fact that additional significant
digits are used in the computations. From a practical point of view, however, the difference is not enough to be concerned about. In practice, a computer software package is always used to perform the computations and this will not be an issue.
b. Partial Minitab output follows:
Regression Equation
Weekly Gross_Revenue_($1000s) = 83.23 + 2.290 Televison_Advertising_($1000s) + 1.301 Newspaper_Advertising_($1000s)

Variable                        Setting
Televison_Advertising_($1000s)      3.5
Newspaper_Advertising_($1000s)      1.8

    Fit    SE Fit              95% CI              95% PI
93.5875  0.290886  (92.8398, 94.3353)  (91.7743, 95.4007)
Confidence interval estimate: 92.8398 to 94.3353 or $92,839.80 to $94,335.30
c. From the partial Minitab output provided in part (b)
Prediction interval estimate: 91.7743 to 95.4007 or $91,774.30 to $95,400.70
30. a. Partial Minitab output follows:
Analysis of Variance
Source          DF  Seq SS  Contribution  Adj SS  Adj MS  F-Value  P-Value
Regression       2    6179        47.62%    6179  3089.6    13.18    0.000
  OffPassYds/G   1    4466        34.42%    6079  6079.5    25.94    0.000
  DefYds/G       1    1713        13.20%    1713  1712.6     7.31    0.011
Error           29    6797        52.38%    6797   234.4
Total           31   12976       100.00%

Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
15.3096  47.62%     44.01%  8636.13      33.45%
y = -0.783 + 0.558 Trade Price + 0.734 Speed of Execution
b. Satisfaction Electronic Trades = -0.783 + 0.558(3) + 0.734(3) = 3.093
c./d. A portion of the Minitab output follows.
Regression Equation
Satisfaction Electronic Trades = -0.783 + 0.558 Trade Price + 0.734 Speed of Execution

Variable            Setting
Trade Price               3
Speed of Execution        3

    Fit    SE Fit              95% CI              95% PI
3.09292  0.111486  (2.84754, 3.33830)  (2.15596, 4.02989)
For part (c) the 95% confidence interval is 2.84754 to 3.33830.
For part (d) the 95% prediction interval is 2.15596 to 4.02989; but, because the highest possible rating is 4, the upper end of the prediction interval is treated as 4.
32. a. E(y) = β0 + β1x1 + β2x2 where x2 = 0 if level 1 and 1 if level 2
b. E(y) = β0 + β1x1 + β2(0) = β0 + β1x1
c. E(y) = β0 + β1x1 + β2(1) = β0 + β1x1 + β2
d. β2 = E(y | level 2) - E(y | level 1)
β1 is the change in E(y) for a 1 unit change in x1 holding x2 constant.
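The interpretation of the dummy-variable coefficient in exercise 32 can be illustrated numerically. In the Python sketch below the parameter values are hypothetical, chosen only for illustration since the exercise gives none; the point is that the difference between the level 2 and level 1 expected values equals the coefficient of x2 whatever the value of x1.

```python
def expected_y(b0, b1, b2, x1, level):
    """E(y) = b0 + b1*x1 + b2*x2, with dummy x2 = 0 for level 1 and 1 for level 2."""
    x2 = 0 if level == 1 else 1
    return b0 + b1 * x1 + b2 * x2

# Hypothetical parameter values, chosen only for illustration (not from the exercise).
b0, b1, b2 = 10.0, 2.0, 5.0

gaps = []
for x1 in (0.0, 3.0, 7.5):
    gap = expected_y(b0, b1, b2, x1, level=2) - expected_y(b0, b1, b2, x1, level=1)
    gaps.append(gap)

print(gaps)  # every entry equals b2, regardless of x1
```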
b. The estimated regression equation did not provide a good fit. In fact, the p-value of .408 shows that the relationship is not significant for any reasonable value of α.
c. Let Person = 0 if Bob Jones performed the service
Person = 1 if Dave Newton performed the service
Partial Minitab output follows:
Analysis of Variance
Source      DF  Seq SS  Contribution  Adj SS  Adj MS  F-Value  P-Value
Regression   1   6.400        61.09%   6.400  6.4000    12.56    0.008
  Person     1   6.400        61.09%   6.400  6.4000    12.56    0.008
Error        8   4.076        38.91%   4.076  0.5095
Total        9  10.476       100.00%

Model Summary
       S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
0.713793  61.09%     56.23%  6.36875      39.21%

Coefficients
Term        Coef  SE Coef            95% CI  T-Value  P-Value   VIF
Constant   4.620    0.319  ( 3.884,  5.356)    14.47    0.000
Person    -1.600    0.451  (-2.641, -0.559)    -3.54    0.008  1.00

Regression Equation
Repair Time_(hours) = 4.620 - 1.600 Person
d. We see that 61.09% of the variability in repair time has been explained by the repair person that
performed the service; an acceptable, but not good, fit. 36. a. The Minitab output follows:
Analysis of Variance
Source         DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
Regression      3   9.4305        90.02%  9.43049  3.14350    18.04    0.002
  Months        1   5.5960        53.42%  2.11783  2.11783    12.15    0.013
  Type          1   3.4049        32.50%  2.30138  2.30138    13.21    0.011
  Person        1   0.4296         4.10%  0.42957  0.42957     2.47    0.167
Error           6   1.0455         9.98%  1.04551  0.17425
  Lack-of-Fit   5   1.0455         9.98%  1.04551  0.20910        *        *
  Pure Error    1   0.0000         0.00%  0.00000  0.00000
Total           9  10.4760       100.00%

Model Summary
       S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
0.417434  90.02%     85.03%  3.38309      67.71%
Coefficients
Term        Coef  SE Coef            95% CI  T-Value  P-Value   VIF
Constant   1.860    0.729  ( 0.077,  3.643)     2.55    0.043
Months    0.2914   0.0836  (0.0869, 0.4960)     3.49    0.013  2.43
Type       1.102    0.303  ( 0.360,  1.845)     3.63    0.011  1.27
Person    -0.609    0.388  (-1.558,  0.340)    -1.57    0.167  2.16

Regression Equation
Repair Time_(hours) = 1.860 + 0.2914 Months + 1.102 Type - 0.609 Person
b. Since the p-value corresponding to F = 18.04 is .002 < α = .05, the overall model is statistically significant.
c. The p-value corresponding to t = -1.57 is .167 > α = .05; thus, the addition of Person is not statistically significant. Person is highly correlated with Months (the sample correlation coefficient is -.691); thus, once the effect of Months has been accounted for, Person will not add much to the model.
37. a. A portion of the Minitab output follows:
Analysis of Variance
Source         DF  Seq SS  Contribution  Adj SS  Adj MS  F-Value  P-Value
Regression      1   91.29        34.42%   91.29  91.290     9.97    0.005
  Price ($)     1   91.29        34.42%   91.29  91.290     9.97    0.005
Error          19  173.95        65.58%  173.95   9.155
  Lack-of-Fit  10   96.45        36.36%   96.45   9.645     1.12    0.437
  Pure Error    9   77.50        29.22%   77.50   8.611
Total          20  265.24       100.00%

Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
3.02575  34.42%     30.97%  214.718      19.05%

Coefficients
Term        Coef  SE Coef          95% CI  T-Value  P-Value   VIF
Constant   69.28     3.40  (62.16, 76.39)    20.37    0.000
Price ($)  0.559    0.177  (0.188, 0.929)     3.16    0.005  1.00

Regression Equation
Score = 69.28 + 0.559 Price ($)
b. Because the p-value = .005 < α = .05, there is a significant relationship. c. Let Type_Italian = 1 if the restaurant is an Italian restaurant; 0 otherwise
Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
5.75657  87.35%     84.98%  799.476      80.92%

Coefficients
Term        Coef  SE Coef            95% CI  T-Value  P-Value   VIF
Constant   -91.8     15.2  (-124.0,  -59.5)    -6.03    0.000
Age        1.077    0.166  ( 0.725,  1.429)     6.49    0.000  1.46
Pressure  0.2518   0.0452  (0.1559, 0.3477)     5.57    0.000  1.25
Smokers     8.74     3.00  (  2.38,  15.10)     2.91    0.010  1.36

Regression Equation
Risk = -91.8 + 1.077 Age + 0.2518 Pressure + 8.74 Smokers
b. Since the p-value corresponding to t = 2.91 is .010 < α = .05, smoking is a significant factor.
c. Partial Minitab output follows:
Regression Equation
Risk = -91.8 + 1.077 Age + 0.2518 Pressure + 8.74 Smokers

Variable  Setting
Age            68
Pressure      175
Smokers         1

    Fit   SE Fit              95% CI              95% PI
34.2661  1.99785  (30.0309, 38.5014)  (21.3487, 47.1836)
The point estimate is 34.2661; the 95% prediction interval is 21.3487 to 47.1836. Thus, the
probability of a stroke (.213487 to .471836 at the 95% confidence level) appears to be quite high. The physician would probably recommend that Art quit smoking and begin some type of treatment designed to reduce his blood pressure.
39. a. Partial Minitab output follows:
Analysis of Variance
Source      DF  Seq SS  Contribution  Adj SS  Adj MS  F-Value  P-Value
Regression   1   67.60        84.50%   67.60  67.600    16.35    0.027
  x          1   67.60        84.50%   67.60  67.600    16.35    0.027
Error        3   12.40        15.50%   12.40   4.133
Total        4   80.00       100.00%

Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.03306  84.50%     79.33%  23.8635      70.17%
c. Using Minitab, we obtained the following values:

xi  yi  Studentized Deleted Residual
 1   3    .13
 2   7    .91
 3   5  -4.42
 4  11    .19
 5  14    .54
t.025 = 4.303 (n - p - 2 = 5 - 1 - 2 = 2 degrees of freedom)
Since the studentized deleted residual for (3, 5) is -4.42 < -4.303, we conclude that the third observation is an outlier.
40. a. Partial Minitab output follows:
Analysis of Variance
Source      DF   Seq SS  Contribution   Adj SS   Adj MS  F-Value  P-Value
Regression   1  1934.42        98.76%  1934.42  1934.42   238.03    0.001
  x          1  1934.42        98.76%  1934.42  1934.42   238.03    0.001
Error        3    24.38         1.24%    24.38     8.13
Total        4  1958.80       100.00%

Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.85073  98.76%     98.34%  243.374      87.58%

Coefficients
Term        Coef  SE Coef            95% CI  T-Value  P-Value   VIF
Constant  -53.28     5.79  (-71.69, -34.87)    -9.21    0.003
x          3.110    0.202  ( 2.468,  3.752)    15.43    0.001  1.00

Regression Equation
y = -53.28 + 3.110 x
b. Using Minitab we obtained the following values:
With the relatively few observations, it is difficult to determine if any of the assumptions regarding
the error term have been violated. For instance, an argument could be made that there does not appear to be any pattern in the plot; alternatively an argument could be made that there is a curvilinear pattern in the plot.
c. The values of the standardized residuals are greater than -2 and less than +2; thus, using this test, there are no outliers. As a further check for outliers, we used Minitab to compute the following studentized deleted residuals:
t.025 = 2.776 (n - p - 2 = 8 - 2 - 2 = 4 degrees of freedom) Since none of the studentized deleted residuals is less than -2.776 or greater than 2.776, we conclude
that there are no outliers in the data. d. Using Minitab we obtained the following values:
Since none of the values exceed 1.125, we conclude that there are no influential observations. However, using Cook’s distance measure, we see that D1 > 1 (rule of thumb critical value); thus, we conclude the first observation is influential.
Final Conclusion: observation 1 is an influential observation.
42. a. Partial Minitab output follows:
Analysis of Variance
Source            DF  Seq SS  Contribution  Adj SS   Adj MS  F-Value  P-Value
Regression         2  915.66        91.94%  915.66  457.828    74.12    0.000
  Price ($1000s)   1  406.39        40.80%   46.22   46.222     7.48    0.017
  Horsepower       1  509.27        51.13%  509.27  509.266    82.45    0.000
Error             13   80.30         8.06%   80.30    6.177
Total             15  995.95       100.00%

Model Summary
      S    R-sq  R-sq(adj)    PRESS  R-sq(pred)
2.48532  91.94%     90.70%  142.697      85.67%
[Figure: standardized residual plot. Vertical axis: Standardized Residual (-2 to 2); horizontal axis: Fitted Value (90 to 120).]
c. The Minitab output shown in part (a) did not identify any observations with a large standardized residual; thus, there do not appear to be any outliers in the data.
d. The Minitab output shown in part (a) identifies observation 2 as an influential observation.
43. a. The Minitab output follows:
Analysis of Variance
Source            DF   Adj SS   Adj MS  F-Value  P-Value
Regression         2  229.676  114.838   969.26    0.000
  Greens in Reg.   1  151.431  151.431  1278.11    0.000
  Putting Avg.     1  123.274  123.274  1040.46    0.000
Error            131   15.521    0.118
  Lack-of-Fit    130   15.483    0.119     3.10    0.429
  Pure Error       1    0.038    0.038
Total            133  245.197

Model Summary
       S    R-sq  R-sq(adj)  R-sq(pred)
0.344209  93.67%     93.57%      93.26%

Coefficients
Term               Coef  SE Coef  T-Value  P-Value   VIF
Constant         57.148    0.989    57.76    0.000
Greens in Reg.  -23.106    0.646   -35.75    0.000  1.04
Putting Avg.     1.0320   0.0320    32.26    0.000  1.04

Regression Equation
Scoring Avg. = 57.148 - 23.106 Greens in Reg. + 1.0320 Putting Avg.
Fits and Diagnostics for Unusual Observations
Obs  Scoring Avg.      Fit    Resid  Std Resid
 11       73.8000  72.8612   0.9388       2.74  R
 25       72.1250  72.9445  -0.8195      -2.41  R
 34       74.7500  73.7888   0.9612       2.81  R
 36       75.5000  75.2114   0.2886       0.88     X
 56       73.2500  73.8035  -0.5535      -1.67     X
 59       74.8330  74.4215   0.4115       1.24     X
 62       74.1670  73.4551   0.7119       2.10  R
 64       75.7500  76.1522  -0.4022      -1.22     X
102       74.0000  75.1150  -1.1150      -3.33  R
122       73.6250  72.2792   1.3458       4.01  R
129       73.2780  72.5712   0.7068       2.09  R
R  Large residual
X  Unusual X
b. The standardized residual plot follows:
The standardized residual plot does not support the assumption about ε. There are several unusual observations and the variance of the residuals appears to be increasing for larger values of ŷ.
c. The Minitab output in part (a) identified seven outliers: observations 11, 25, 34, 62, 102, 122, and 129.
Observations 25 and 102 correspond to Charley Hull and P.K. Kongkraphan, respectively; their scoring averages were much lower than other golfers with a similar percentage of time hitting the green in regulation and average number of putts taken on greens hit in regulation.
Observations 11, 34, 62, 122, and 129 correspond to Ashleigh Simon, Dori Carter, Karin Sjodin, Sophia Popov, and Tiffany Joh, respectively; their scoring averages were each much higher than other golfers with similar percentage of time hitting the green in regulation and average number of putts taken on greens hit in regulation.
d. The Minitab output in part (a) identified four influential observations: observations 36, 56, 59, and 64. Observation 36 corresponds to Garrett Phillips, observation 56 corresponds to Jing Yan, observation 59 corresponds to Ju Young Park, and observation 64 corresponds to Karlin Beck.
44. a. \[ E(y) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \]
b. It is an estimate of the probability that a customer that does not have a Simmons credit card will
make a purchase. c. A portion of the Minitab binary logistic regression output follows:
Coefficients
Term        Coef  SE Coef   VIF
Constant  -0.944    0.315
Card       1.025    0.423  1.00

Odds Ratios for Continuous Predictors
      Odds Ratio            95% CI
Card      2.7857  (1.2147, 6.3886)

Regression Equation
P(1) = exp(Y')/(1 + exp(Y'))
Y' = -0.944 + 1.025 Card
Thus, the estimated logit is \( \hat{g}(x) = -0.944 + 1.025x \)
d. For customers that do not have a Simmons credit card (x = 0)
\( \hat{g}(0) = -0.945 + 1.025(0) = -0.945 \)
and
\[ \hat{y} = \frac{e^{\hat{g}(0)}}{1 + e^{\hat{g}(0)}} = \frac{e^{-0.945}}{1 + e^{-0.945}} = \frac{0.38868}{1 + 0.38868} = 0.280 \]
For customers that have a Simmons credit card (x = 1)
\( \hat{g}(1) = -0.945 + 1.025(1) = 0.0800 \)
and
\[ \hat{y} = \frac{e^{\hat{g}(1)}}{1 + e^{\hat{g}(1)}} = \frac{e^{0.08}}{1 + e^{0.08}} = \frac{1.0833}{1 + 1.0833} = 0.52 \]
e. Using the Minitab output shown in part (c), the estimated odds ratio is 2.7857. We can conclude that
the estimated odds of making a purchase for customers who have a Simmons credit card are 2.7857 times greater than the estimated odds of making a purchase for customers that do not have a Simmons credit card.
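The estimated probabilities in part (d) and the odds ratio in part (e) can be reproduced from the rounded coefficients in the Minitab output. The Python sketch below is illustrative only and the function name is hypothetical. Because it uses the rounded coefficient 1.025, the odds ratio comes out near 2.787 rather than Minitab's 2.7857, which is based on unrounded estimates.

```python
import math

def logistic_probability(b0, b1, x):
    """Estimated P(y = 1) from the logit g(x) = b0 + b1*x."""
    g = b0 + b1 * x
    return math.exp(g) / (1 + math.exp(g))

b0, b1 = -0.944, 1.025  # rounded coefficients from the Minitab output in part (c)

p_no_card = logistic_probability(b0, b1, 0)  # customer without a Simmons card
p_card = logistic_probability(b0, b1, 1)     # customer with a Simmons card
odds_ratio = math.exp(b1)                    # odds ratio for a 0/1 predictor

print(round(p_no_card, 2), round(p_card, 2), round(odds_ratio, 2))
```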
45. a. \( \text{odds} = \frac{.3148}{1 - .3148} = .4594 \)
b. \( \text{odds}_1 = \frac{.5796}{1 - .5796} = 1.3787 \)
\( \text{odds}_0 = .4594 \) (from part (a))
\( \text{odds ratio} = \frac{\text{odds}_1}{\text{odds}_0} = \frac{1.3787}{.4594} = 3.00 \)
c. The odds ratio for x2 computed holding annual spending constant at $4000 is also 3.00. This shows
that the odds ratio for x2 is independent of the value of x1.
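The arithmetic in parts (a) and (b) can be reproduced with a few lines of Python. This is an illustrative sketch; the helper name `odds` is hypothetical, and the two probabilities are the values that appear in the solution above.

```python
def odds(p):
    """Odds in favor of an event that has probability p."""
    return p / (1.0 - p)

odds0 = odds(0.3148)  # probability used in part (a)
odds1 = odds(0.5796)  # probability used in part (b)
odds_ratio = odds1 / odds0

print(round(odds0, 4), round(odds1, 4), round(odds_ratio, 2))
```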
46. a. \( E(y) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \)
b. A portion of the Minitab binary logistic regression output follows:
Coefficients
Term        Coef  SE Coef   VIF
Constant  -2.633    0.799
Balance   0.2202   0.0900  1.00

Odds Ratios for Continuous Predictors
         Odds Ratio  95% CI
Balance      1.2463  (1.0447, 1.4868)

Regression Equation
P(1) = exp(Y')/(1 + exp(Y'))
Y' = -2.633 + 0.2202 Balance
Thus, the estimated logistic regression equation is \( \hat{y} = \frac{e^{-2.633 + 0.2202x}}{1 + e^{-2.633 + 0.2202x}} \)
c. A portion of the Minitab binary logistic regression output follows:
Deviance Table
Source      DF  Adj Dev  Adj Mean  Chi-Square  P-Value
Regression   1    9.460     9.460        9.46    0.002
  Balance    1    9.460     9.460        9.46    0.002
Error       48   51.626     1.076
Total       49   61.086
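The p-value in the deviance table can be cross-checked without statistical software. For 1 degree of freedom, the upper-tail area of the chi-square distribution equals erfc(sqrt(x/2)), a standard identity; the Python sketch below applies it to the reported statistic. The function name is a hypothetical helper.

```python
import math

def chi2_upper_tail_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

g_stat = 9.46  # Chi-Square value for Regression in the deviance table
p_value = chi2_upper_tail_1df(g_stat)

print(f"p-value = {p_value:.3f}")
```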
Significant result: the p-value corresponding to the χ² test statistic is 0.002.
d. For an average monthly balance of $1000, x = 10
\[ E(y) = \frac{e^{-2.633 + 0.2202x}}{1 + e^{-2.633 + 0.2202x}} = \frac{e^{-2.633 + 0.2202(10)}}{1 + e^{-2.633 + 0.2202(10)}} = \frac{e^{-0.431}}{1 + e^{-0.431}} = \frac{0.6499}{1.6499} = 0.3939 \]
Thus, an estimate of the probability that customers with an average monthly balance of $1000 will sign up for direct payroll deposit is 0.3939.
e. Repeating the calculations in part (d) using various values for x, a value of x = 12, or an average monthly balance of approximately $1200, is required to achieve this level of probability.
f. Using the Minitab output shown in part (b), the estimated odds ratio is 1.2463. Because values of x are measured in hundreds of dollars, the estimated odds of signing up for payroll direct deposit for customers that have an average monthly balance of $600 are 1.2463 times greater than the estimated odds of signing up for payroll direct deposit for customers that have an average monthly balance of $500. Moreover, this interpretation is true for any one hundred dollar increment in the average monthly balance.
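The search in part (e) can be sketched in Python. The target probability is not restated in this solution, so the value 0.50 below is an assumption chosen for illustration; the helper names are hypothetical. The last line also checks that exp(0.2202) reproduces the odds ratio quoted in part (f).

```python
import math

B0, B1 = -2.633, 0.2202  # coefficients from the Minitab output in part (b)

def signup_probability(x):
    """Estimated probability for balance x, measured in hundreds of dollars."""
    logit = B0 + B1 * x
    return math.exp(logit) / (1.0 + math.exp(logit))

TARGET = 0.50  # assumed target level; not given in the solution text

# Try whole-number balances (in $100s) until the target is first reached.
first_x = next(x for x in range(0, 51) if signup_probability(x) >= TARGET)

print(round(signup_probability(10), 4))  # part (d): x = 10
print(first_x)                           # smallest whole x meeting the assumed target
print(round(math.exp(B1), 4))            # odds ratio per $100 increase
```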
47. a. \( E(y) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}} \)
b. For a given GPA, it is an estimate of the probability that a student who did not attend the orientation program will return to Lakeland for the sophomore year.
c. A portion of the Minitab binary logistic regression output follows:
Coefficients
Term       Coef  SE Coef   VIF
Constant  -6.89     1.75
GPA       2.539    0.673  1.01
Program   1.561    0.563  1.01

Odds Ratios for Continuous Predictors
         Odds Ratio  95% CI
GPA         12.6644  (3.3872, 47.3515)
Program      4.7624  (1.5794, 14.3607)

Regression Equation
P(1) = exp(Y')/(1 + exp(Y'))
Y' = -6.89 + 2.539 GPA + 1.561 Program
Thus, the estimated logit is \( \hat{g}(x_1, x_2) = -6.89 + 2.539x_1 + 1.561x_2 \)
d. A portion of the Minitab binary logistic regression output follows:
Deviance Table
Source      DF  Adj Dev  Adj Mean  Chi-Square  P-Value
Regression   2   47.869   23.9347       47.87    0.000
  GPA        1   20.966   20.9663       20.97    0.000
  Program    1    7.862    7.8616        7.86    0.005
Error       97   80.338    0.8282
Total       99  128.207
Significant result: the p-value corresponding to the χ² test statistic is 0.000.
e. From the portion of the Minitab binary logistic regression output shown in the solution to part (d), both variables are significant at α = .01: the p-value for x1 is 0.000 and the p-value for x2 is 0.005.
f. For x1 = 2.5 and x2 = 0, \( \hat{g}(2.5, 0) = -6.89 + 2.539(2.5) + 1.561(0) = -0.5425 \)
and
\[ \hat{y} = \frac{e^{\hat{g}(2.5,0)}}{1 + e^{\hat{g}(2.5,0)}} = \frac{e^{-0.5425}}{1 + e^{-0.5425}} = \frac{0.5813}{1 + 0.5813} = 0.3676 \]
For x1 = 2.5 and x2 = 1, \( \hat{g}(2.5, 1) = -6.89 + 2.539(2.5) + 1.561(1) = 1.0185 \)
and
\[ \hat{y} = \frac{e^{\hat{g}(2.5,1)}}{1 + e^{\hat{g}(2.5,1)}} = \frac{e^{1.0185}}{1 + e^{1.0185}} = \frac{2.769}{1 + 2.769} = 0.7347 \]
g. From the Minitab output in part (c) we see that the estimated odds ratio is 4.7624 for the orientation
program. This means that the odds of students who attended the orientation program continuing are 4.7624 times greater than for students who did not attend the program.
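Parts (f) and (g) are linked: the ratio of the odds at x2 = 1 to the odds at x2 = 0 equals exp(b2) whatever the GPA. The Python sketch below illustrates this with the rounded coefficients from part (c); function and variable names are hypothetical, and the small gap from Minitab's 4.7624 reflects coefficient rounding.

```python
import math

B0, B_GPA, B_PROGRAM = -6.89, 2.539, 1.561  # rounded coefficients from part (c)

def return_probability(gpa, program):
    """Estimated probability of returning; program is 1 if attended, else 0."""
    logit = B0 + B_GPA * gpa + B_PROGRAM * program
    return math.exp(logit) / (1.0 + math.exp(logit))

def odds(p):
    """Odds in favor of an event that has probability p."""
    return p / (1.0 - p)

p_no_program = return_probability(2.5, 0)
p_program = return_probability(2.5, 1)

# Odds ratio for the program at two different GPA values.
ratio_at_2_5 = odds(p_program) / odds(p_no_program)
ratio_at_3_5 = odds(return_probability(3.5, 1)) / odds(return_probability(3.5, 0))

print(round(p_no_program, 4), round(p_program, 4))
print(round(ratio_at_2_5, 4), round(ratio_at_3_5, 4), round(math.exp(B_PROGRAM), 4))
```

The two ratios agree with each other and with exp(1.561), which is why a single odds ratio can be reported for the program variable.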
h. We recommend making the orientation program required. From part (g), we see that the odds of continuing are much higher for students who have attended the orientation program.
b. A portion of the Minitab binary logistic regression output follows:
Coefficients
Term       Coef  SE Coef   VIF
Constant  -39.5     12.5
Wet        3.37     1.26  1.03
Noise     1.816    0.831  1.03

Odds Ratios for Continuous Predictors
       Odds Ratio  95% CI
Wet       29.2095  (2.4521, 347.9413)
Noise      6.1489  (1.2059, 31.3540)

Regression Equation
P(1) = exp(Y')/(1 + exp(Y'))
Y' = -39.5 + 3.37 Wet + 1.816 Noise
Thus, the estimated logit is \( \hat{g}(x) = -39.5 + 3.37\,\text{Wet} + 1.816\,\text{Noise} \)
c. For tires that have a Wet performance rating of 8 and a Noise performance rating of 8,
\( \hat{g}(x) = -39.5 + 3.37\,\text{Wet} + 1.816\,\text{Noise} \)
\( \hat{g}(x) = -39.5 + 3.37(8) + 1.816(8) = 1.988 \)
\[ \hat{y} = \frac{e^{1.988}}{1 + e^{1.988}} = \frac{7.30092}{1 + 7.30092} = 0.8795 \]
The probability that a customer will probably or definitely purchase a particular tire again with these performance characteristics is .8795.
d. For tires that have a Wet performance rating of 7 and a Noise performance rating of 7,
\( \hat{g}(x) = -39.5 + 3.37\,\text{Wet} + 1.816\,\text{Noise} \)
\( \hat{g}(x) = -39.5 + 3.37(7) + 1.816(7) = -3.198 \)
\[ \hat{y} = \frac{e^{-3.198}}{1 + e^{-3.198}} = \frac{.04084}{1 + .04084} = 0.0392 \]
The probability that a customer will probably or definitely purchase a particular tire again with these performance characteristics is .0392.
e. Wet and Noise performance ratings of 7 are both considered Excellent performance ratings using the Tire Rack performance scale. Nonetheless, the probability that the customer will repurchase a tire with these characteristics is very low. But a one-point increase in both ratings increases the probability to .8795. So, achieving the highest possible levels of performance is essential if the manufacturer wants to have the greatest chance of having an existing customer buy their tire again.
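The sensitivity described in part (e) can be seen by evaluating the estimated model at both rating levels. The sketch below is illustrative only; `repurchase_probability` is a hypothetical helper, and it uses the algebraically equivalent form 1/(1 + e^(-g)) of the logistic function.

```python
import math

def repurchase_probability(wet, noise):
    """Estimated probability from the logit -39.5 + 3.37 Wet + 1.816 Noise."""
    logit = -39.5 + 3.37 * wet + 1.816 * noise
    return 1.0 / (1.0 + math.exp(-logit))  # same value as e^g / (1 + e^g)

for rating in (7, 8):
    p = repurchase_probability(rating, rating)
    print(f"Wet = Noise = {rating}: probability = {p:.4f}")
```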
49. a. The expected increase in final college grade point average corresponding to a one point increase in high school grade point average is .0235 when SAT mathematics score does not change. Similarly, the expected increase in final college grade point average corresponding to a one point increase in the SAT mathematics score is .00486 when the high school grade point average does not change.
b. \( \hat{y} = -1.41 + .0235(84) + .00486(540) = 3.19 \)
50. a. Job satisfaction can be expected to decrease by 8.69 units with a one unit increase in length of
service if the wage rate does not change. A dollar increase in the wage rate is associated with a 13.5 point increase in the job satisfaction score when the length of service does not change.
b. \( \hat{y} = 14.4 - 8.69(4) + 13.5(13) = 155.14 \)
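Point predictions such as those in 49(b) and 50(b) are the intercept plus the sum of coefficient-times-value terms. A small Python sketch of that calculation follows; the `predict` helper and the variable names are hypothetical.

```python
def predict(intercept, coefficients, values):
    """Evaluate an estimated regression equation at the given x values."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

# Exercise 49(b): x1 = 84, x2 = 540
gpa_hat = predict(-1.41, [0.0235, 0.00486], [84, 540])

# Exercise 50(b): length of service 4, wage rate 13
satisfaction_hat = predict(14.4, [-8.69, 13.5], [4, 13])

print(round(gpa_hat, 2), round(satisfaction_hat, 2))
```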
51. a. The computer output with the missing values filled in is as follows:
Analysis of Variance
Source      DF    Adj SS   Adj MS  F-Value  P-Value
Regression   2  1612      806       71.820    0.000
  x1         1   146.366  146.366   13.042    0.004
  x2         1   289.047  289.047   25.756    0.000
Error       12   134.67    11.223
Total       14  1746.67

Model Summary
   S    R-sq  R-sq(adj)  R-sq(pred)
3.35  92.30%     91.02%      85.12%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  8.103    2.667     3.04    0.010
x1        7.602    2.105     3.61    0.004  1.62
x2        3.111    0.613     5.08    0.000  1.62

Regression Equation
y = 8.103 + 7.602 X1 + 3.111 X2
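The filled-in entries can be cross-checked from the total and error sums of squares, the degrees of freedom, and the coefficient table. The Python sketch below is one way to do this; the variable names are ours, and small differences from the printed output come from rounding.

```python
import math

sst, sse = 1746.67, 134.67   # Total and Error sums of squares from the table
df_reg, df_err = 2, 12       # degrees of freedom

ssr = sst - sse              # regression sum of squares
msr = ssr / df_reg           # mean square regression
mse = sse / df_err           # mean square error
f_value = msr / mse
s = math.sqrt(mse)           # standard error of the estimate
r_sq = ssr / sst

# Each t-value is the coefficient divided by its standard error.
t_x1 = 7.602 / 2.105
t_x2 = 3.111 / 0.613

print(round(ssr, 2), round(msr, 2), round(f_value, 2))
print(round(s, 2), round(r_sq, 4), round(t_x1, 2), round(t_x2, 2))
```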
b. F.05 = 3.89
F = 71.82 > F.05; significant relationship
Actual p-value = .000
Because p-value ≤ α = .05, the overall relationship is significant.
c. Using t table (12 degrees of freedom), area in tail corresponding to t = 3.61 is less than .005; p-value is less than .01
Actual p-value = .004
Because p-value ≤ α, reject H0: β1 = 0
Using t table (12 degrees of freedom), area in tail corresponding to t = 5.08 is less than .005; p-value is less than .01
Actual p-value = .000
Because p-value ≤ α, reject H0: β2 = 0
d. See computer output.
e. \( R_a^2 = 1 - (1 - .9230)\frac{14}{12} = .9102 \)
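The adjusted coefficient of determination in part (e) follows the usual formula R_a^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1). A minimal Python sketch, with a hypothetical helper name, applied to the values in this exercise (n = 15 observations, p = 2 independent variables):

```python
def adjusted_r_squared(r_sq, n, p):
    """Adjusted R-squared for n observations and p independent variables."""
    return 1.0 - (1.0 - r_sq) * (n - 1) / (n - p - 1)

r_a_sq = adjusted_r_squared(0.9230, n=15, p=2)
print(round(r_a_sq, 4))
```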
52. a. The computer output with the missing values filled in is as follows:
Analysis of Variance
Source      DF   Adj SS   Adj MS  F-Value  P-Value
Regression   2  1.76209  0.88105    52.29    0.000
  X1         1  0.12389  0.12389     7.35    0.030
  X2         1  0.34308  0.34308    20.36    0.003
Error        7  0.11794  0.01685
Total        9  1.88003

Model Summary
     S    R-sq  R-sq(adj)  R-sq(pred)
0.1298  93.73%     91.93%      74.53%

Coefficients
Term         Coef  SE Coef  T-Value  P-Value   VIF
Constant    -1.41   0.4848    -2.91    0.023
x1         0.0235   0.0087     2.71    0.030  1.54
x2        0.00486   0.0011     4.51    0.003  1.54

Regression Equation
y = -1.41 + 0.0235 X1 + 0.00486 X2
b. F.05 = 4.74
F = 52.29 > F.05; significant relationship
Actual p-value = .000
Because p-value ≤ α = .05, the overall relationship is significant.
c. For β1: p-value = .030; reject H0: β1 = 0
Source         DF  Adj SS   Adj MS  F-Value  P-Value
Regression      1  60.787  60.7866    85.93    0.000
  Steering      1  60.787  60.7866    85.93    0.000
Error          16  11.318   0.7074
  Lack-of-Fit  11  10.092   0.9174     3.74    0.078
  Pure Error    5   1.227   0.2453
Total          17  72.105

Model Summary
       S    R-sq  R-sq(adj)  R-sq(pred)
0.841071  84.30%     83.32%      81.08%

Coefficients
Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  -7.52     1.47    -5.13    0.000
Steering  1.815    0.196     9.27    0.000  1.00

Regression Equation
Buy Again = -7.52 + 1.815 Steering
Because the p-value = .000 < α = .05, there is a significant relationship.
b. The estimated regression equation provided a good fit; 84.3% of the variability in the Buy Again rating was explained by the linear effect of the Steering rating.
c. A portion of the Minitab output follows:
Analysis of Variance
Source        DF  Adj SS   Adj MS  F-Value  P-Value
Regression     2  67.185  33.5924   102.41    0.000
  Steering     1   1.888   1.8880     5.76    0.030
  Tread Wear   1   6.398   6.3982    19.51    0.001
Error         15   4.920   0.3280
Total         17  72.105

Model Summary
       S    R-sq  R-sq(adj)  R-sq(pred)
0.572723  93.18%     92.27%      90.18%

Coefficients
Term         Coef  SE Coef  T-Value  P-Value   VIF
Constant    -5.39     1.11    -4.86    0.000
Steering    0.690    0.288     2.40    0.030  4.65
Tread Wear  0.911    0.206     4.42    0.001  4.65

Regression Equation
Buy Again = -5.39 + 0.690 Steering + 0.911 Tread Wear
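The VIF of 4.65 in this output indicates that Steering and Tread Wear are strongly correlated, which is consistent with the drop in the Steering coefficient from 1.815 to 0.690 once Tread Wear enters the model. With exactly two independent variables, VIF = 1/(1 - r^2), where r is the sample correlation between them, so the size of r can be recovered from the output. A brief Python sketch under that two-variable assumption:

```python
import math

vif = 4.65  # VIF reported for both Steering and Tread Wear

# With two independent variables, VIF = 1 / (1 - r^2), so r^2 = 1 - 1 / VIF.
r_squared_between = 1.0 - 1.0 / vif
r_between = math.sqrt(r_squared_between)  # magnitude only; the sign is not recoverable

print(round(r_squared_between, 3), round(r_between, 3))
```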
e. Since the p-value corresponding to F = 207.3108 is .0000 < α = .05, there is a significant overall relationship. Because the p-values for each independent variable are also < α = .05, each of the independent variables is significant.
56. a. Type of Fund is a categorical variable with three levels. Let FundDE = 1 for a domestic equity fund and FundIE = 1 for an international fund. The Excel output follows:
Since the p-value corresponding to F = 33.4584 is .0000 < α = .05, there is a significant relationship.
b. R Square = .6144. A reasonably good fit using only Type of Fund.
c. The Excel output follows:
Regression Statistics
Multiple R          0.8135
R Square            0.6617
Adjusted R Square   0.6279
Standard Error      5.3726
Observations        45

ANOVA
            df         SS        MS        F  Significance F
Regression   4  2258.3432  564.5858  19.5598     5.48647E-09
Residual    40  1154.5827   28.8646
                     Coefficients  Standard Error  t Stat    P-value  Lower 95%  Upper 95%
Intercept                  1.1899          2.3781  0.5004     0.6196    -3.6164     5.9961
FundDE                     6.8969          2.7651  2.4942     0.0169     1.3083    12.4854
FundIE                    17.6800          3.3161  5.3315  4.096E-06    10.9778    24.3821
Net Asset Value ($)        0.0265          0.0670  0.3950     0.6950    -0.1089     0.1619
Expense Ratio (%)          6.4564          2.7593  2.3399     0.0244     0.8798    12.0331

Since the p-value corresponding to F = 19.5598 is .0000 < α = .05, there is a significant relationship. For Net Asset Value ($), the p-value corresponding to t = .3950 is .6950 > α = .05; Net Asset Value ($) is not significant and can be deleted from the model.
d. Morningstar Rank is a categorical variable. The data set only contains funds with four ranks (2-Star through 5-Star), so three dummy variables are needed. Let 3StarRank = 1 for a 3-Star rank, 4StarRank = 1 for a 4-Star rank, and 5StarRank = 1 for a 5-Star rank. The Excel output follows:
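As an aside on the coding in part (d), the three dummy variables can be sketched in Python as follows. This is an illustration rather than the workbook's actual setup: the function name and the sample rank values are hypothetical, and the 2-Star rank serves as the baseline with all three dummies equal to 0.

```python
def rank_dummies(rank):
    """Code a Morningstar rank (2 to 5 stars) as three 0/1 dummy variables."""
    if rank not in (2, 3, 4, 5):
        raise ValueError("rank must be 2, 3, 4, or 5")
    return {
        "3StarRank": 1 if rank == 3 else 0,
        "4StarRank": 1 if rank == 4 else 0,
        "5StarRank": 1 if rank == 5 else 0,
    }

# Hypothetical sample of ranks, for illustration only.
for rank in (2, 3, 5):
    print(rank, rank_dummies(rank))
```

Using one fewer dummy variable than the number of ranks avoids a redundant column, since the baseline rank is identified by all dummies being 0.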
e. Hourly ($1000s): Significant because the p-value = .000 < α = .05
Size-Midsize: Not significant because the p-value = .802 > α = .05
Size-Small: Significant because the p-value = .003 < α = .05
f. A portion of the Minitab output using Hourly ($1000s) and Size-Small as the independent variables