Diploma in Statistics: Introduction to Regression, Lecture 2.2
Introduction to Regression, Lecture 2.2
1. Review of Lecture 2.1
– Homework
– Multiple regression
– Job times case study
2. Job times continued
– residual analysis
– model fitting and testing
3. Model fitting and testing procedure
4. t-tests
5. Analysis of Variance
Update: Accessing data files
• Access the data in mstuart's get folder:
– in ISS Public Access labs, click Start, then Network Shortcuts, open Get
– on your own computer with TCD network access, navigate to Ntserver-usr / get
– once in get, type ms, open mstuart, Diploma
Homework 2.1.1
The shelf life of packaged foods depends on many factors. Dry cereal (such as corn flakes) is considered to be a moisture-sensitive product, with the shelf life determined primarily by moisture. In a study of the shelf life of one brand of cereal, packets of cereal were stored in controlled conditions (23°C and 50% relative humidity) for a range of times, and moisture content was measured. The results were as follows.
Draw a scatter diagram. Comment. What action is suggested? Why?
2 exceptional cases; delete and investigate
[Figure: Scatterplot of Moisture Content vs Storage Time. x-axis: Storage Time (0 to 40); y-axis: Moisture Content (3.0 to 5.0).]
Following appropriate action, the following regression was computed.
The regression equation is
Moisture = 2.86 + 0.0417 Storage

Predictor   Coef       SE Coef    T        P
Constant    2.86122    0.02488    115.01   0.000
Storage     0.041660   0.001177   35.40    0.000

S = 0.0493475
Calculate a 95% confidence interval for the daily change in moisture content; show details.
$\hat{\beta} \pm 2\,\mathrm{SE}(\hat{\beta}) = 0.04166 \pm 0.00235 = (0.03931,\ 0.04401)$
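The arithmetic behind this interval can be checked with a few lines of Python. This is a minimal sketch using the Storage coefficient and its standard error from the regression output above; the variable names are illustrative only.

```python
# Coefficient and standard error for Storage, copied from the regression output above.
coef = 0.041660
se_coef = 0.001177

# Approximate 95% interval: estimate plus or minus 2 standard errors.
margin = 2 * se_coef
lower, upper = coef - margin, coef + margin

print(f"margin = {margin:.5f}")
print(f"95% CI = ({lower:.5f}, {upper:.5f})")
```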
Was the action you suggested on studying the scatter diagram in part (a) justified? Explain.
Predict the moisture content of a packet of cereal stored under these conditions for 5 weeks; calculate a prediction interval.
What would be the effect on your interval of not taking the action you suggested on studying the scatter diagram? Why?
Taste tests indicate that this brand of cereal is unacceptably soggy when the moisture content exceeds 4. Based on your prediction interval, do you think that a box of cereal that has been on the shelf for 5 weeks will be acceptable? Explain.
What about 4 weeks? 3 weeks? What is acceptable?
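The predictions asked for above can be sketched as follows. The sketch assumes that storage time is measured in days (the earlier question refers to the daily change in moisture content) and that the approximate prediction interval is the prediction plus or minus 2s, which ignores the uncertainty in the estimated coefficients. The function name is hypothetical.

```python
# Values copied from the regression output above.
intercept = 2.86122
slope = 0.041660      # change in moisture content per day of storage
s = 0.0493475         # residual standard deviation

def predict_with_interval(weeks):
    """Return (prediction, lower, upper) using the approximate rule: prediction +/- 2s."""
    days = 7 * weeks  # assumption: storage time is recorded in days
    fit = intercept + slope * days
    return fit, fit - 2 * s, fit + 2 * s

for weeks in (5, 4, 3):
    fit, lower, upper = predict_with_interval(weeks)
    print(f"{weeks} weeks: prediction {fit:.2f}, interval ({lower:.2f}, {upper:.2f})")
```

Comparing each interval with the taste-test limit of 4 shows whether the whole interval lies above the limit, below it, or straddles it.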
Example 5: A production prediction problem
Erie Metal Products: The problem
Metal products fabrication:
customers order varying quantities of products of varying complexity;
customers demand accurate and precise order delivery times.
Table 8.1 Times, in hours, to complete jobs with varying numbers of units, numbers of operations per unit and priority status (normal or rushed)

Order number   Jobtime (hours)   Units   Operations per unit   Normal (0) or Rushed (1)?
1              153               100     6                     0
Original fit   77   –0.15   7.2   0.11    –25
Revised fit    42   –0.08   10    0.11    –38
Final fit      44   –0.07   9.8   0.11    –38
Final s.e.     9    0.03    0.9   0.004   4
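One way to read the last two rows of this table is to divide each final-fit coefficient by its standard error. The sketch below does this with the rounded values shown, so the ratios are only approximate. The labels are assumptions: the first four follow the order of terms in the first-fit equation given later in the lecture, and the fifth term is not named in this excerpt.

```python
# Final-fit coefficients and standard errors, copied (as rounded) from the table above.
# The labels are assumptions: the table itself has no column headers.
labels = ["Constant", "Units", "Ops", "T_Ops", "fifth term"]
final_fit = [44, -0.07, 9.8, 0.11, -38]
final_se = [9, 0.03, 0.9, 0.004, 4]

# t-ratio = coefficient / standard error; |t| > 2 is used here as a rough guide.
t_ratios = [coef / se for coef, se in zip(final_fit, final_se)]

for label, t in zip(labels, t_ratios):
    flag = "exceeds 2" if abs(t) > 2 else "does not exceed 2"
    print(f"{label:>10}: t = {t:6.2f} ({flag})")
```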
Homework 2.2.1
Extend the table of predictions for small, medium and large jobs to include predictions based on the final fit.
Compare and contrast.
The model fitting and testing procedure
• Step 1: Initial data analysis:
• Step 2: Least squares fit and interpretation:
• Step 3: Diagnostic analysis of residuals:
• Step 4: Iterate fit and check:
Step 1: Initial data analysis
• standard single variable summaries
– to determine extent of variation
– possible exceptional values;
• scatter plot matrix
– to view pairwise relationships between the response and the explanatory variables, and
– to view pairwise relationships between the explanatory variables themselves.
Step 2: Least squares fit and interpretation
• calculate the best fitting regression coefficients
– check meaningfulness and statistical significance;
• calculate s
– check its usefulness for prediction
– its usefulness relative to alternative estimates of standard deviation.
Step 3: Diagnostic analysis of residuals
• diagnostic plot
– check for exceptional residuals or patterns of residuals,
– possible explanations in terms of the fitted values;
• Normal plot
– check for exceptional residuals or non-linear patterns in the residuals
Step 4: Iterate fit and check
• determine cases for deletion
– repeat steps 2 and 3 until checks are passed.
Homework 2.2.2
You have been asked to comment, as a statistical consultant, on a prediction formula for forecasting job completion times prepared by a former employee. The formula is, effectively, the one derived from the first fit discussed above. Write a report for management. Your report should refer to
(i) the practical usefulness of the employee's prediction formula, from a customer's perspective,
(ii) the significance of the exceptional cases from the customer's and management's perspectives, and
(iii) your recommended formula, with its relative advantages.
t-tests
First fit
The regression equation is
Jobtime = 77.2 – 0.151 Units + 7.15 Ops + 0.115 T_Ops