MMT BATCH 36
QUANTITY DEMAND ANALYSIS
Joseph Winthrop B. Godoy
INTRODUCTION
- Shows how a manager can use elasticities of demand as a quantitative forecasting tool
- Describes regression analysis, the technique economists use to estimate the parameters of demand functions
THE ELASTICITY CONCEPT
Elasticity analysis is the primary tool used to determine the magnitude of a change in one variable caused by a change in another. Elasticity measures the responsiveness of one variable to changes in another variable.
Two aspects of an elasticity matter:
(1) whether it is positive or negative;
(2) whether it is greater than 1 or less than 1 in absolute value.
OWN PRICE ELASTICITY OF DEMAND
Measures the responsiveness of quantity demanded to a change in price: the percentage change in quantity demanded divided by the percentage change in the price of the good.
ELASTIC DEMAND
Demand is said to be elastic if the absolute value of the own price elasticity is greater than 1.
INELASTIC DEMAND
Demand is said to be inelastic if the absolute value of the own price elasticity is less than 1.
UNITARY ELASTIC
Demand is said to be unitary elastic if the absolute value of the own price elasticity is equal to 1.
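The three cases above can be wrapped in a small helper. This is a minimal sketch; the function names and example percentages are ours, for illustration only:

```python
def own_price_elasticity(pct_change_qty, pct_change_price):
    """Percentage change in quantity demanded divided by the
    percentage change in the good's own price."""
    return pct_change_qty / pct_change_price

def classify_demand(elasticity):
    """Classify demand by the absolute value of the own price elasticity."""
    e = abs(elasticity)
    if e > 1:
        return "elastic"
    if e < 1:
        return "inelastic"
    return "unitary elastic"

# Hypothetical example: a 10% price increase cuts quantity demanded by 25%.
e = own_price_elasticity(-25.0, 10.0)   # -2.5
print(classify_demand(e))               # elastic
```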
ELASTICITY & TOTAL REVENUE

     Price of    Quantity of      Own Price    Total
     Software    Software Sold    Elasticity   Revenue
A    $ 0         80               0.00         $ 0
B    5           70               -0.14        350
C    10          60               -0.33        600
D    15          50               -0.60        750
E    20          40               -1.00        800
F    25          30               -1.67        750
G    30          20               -3.00        600
H    35          10               -7.00        350
I    40          0                             0
TOTAL REVENUE TEST
If demand is elastic, an increase (decrease) in price will lead to a decrease (increase) in total revenue. If demand is inelastic, an increase (decrease) in price will lead to an increase (decrease) in total revenue. Finally, total revenue is maximized at the point where demand is unitary elastic.
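A quick sketch that reproduces the software pricing table above and confirms the total revenue test: revenue peaks at the unitary elastic price of $20.

```python
# Price and quantity columns from the software table above.
prices     = [0, 5, 10, 15, 20, 25, 30, 35, 40]
quantities = [80, 70, 60, 50, 40, 30, 20, 10, 0]

revenues = [p * q for p, q in zip(prices, quantities)]
# Revenue rises while demand is inelastic, peaks at 800 where demand is
# unitary elastic (P = 20), and falls where demand is elastic.
best_price = prices[revenues.index(max(revenues))]
print(revenues)     # [0, 350, 600, 750, 800, 750, 600, 350, 0]
print(best_price)   # 20
```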
THE ELASTICITY CONCEPT
Perfectly Elastic: the own price elasticity of demand is infinite in absolute value.
Perfectly Inelastic: the own price elasticity of demand is zero.
FACTORS AFFECTING THE OWN PRICE ELASTICITY
1. Available Substitutes
2. Time
3. Expenditure Share
Available Substitutes
One key determinant of the elasticity of demand for a good is the number of close substitutes for the good. The more substitutes available for the good, the more elastic its demand.
Time
The more time consumers have to react to a price change, the more elastic the demand for the good. Time allows the consumer to seek out available substitutes.
Expenditure Share
Goods that comprise a relatively small share of consumers' budgets tend to be more inelastic than goods for which consumers spend a sizable portion of their incomes. When a good comprises only a small portion of the budget, the consumer can reduce the consumption of other goods when the price of that good increases.
MARGINAL REVENUE AND THE OWN PRICE ELASTICITY OF DEMAND
Marginal Revenue is the change in total revenue due to a change in output. To maximize profits, a firm should produce where marginal revenue equals marginal cost.
Cross-Price Elasticity
A measure of the responsiveness of the demand for a good to changes in the price of a related good: the percentage change in the quantity demanded of one good divided by the percentage change in the price of the related good.
Whenever goods X and Y are substitutes, an increase in the price of Y leads to an increase in the demand for X. When goods X and Y are complements, an increase in the price of Y leads to a decrease in the demand for X.
ExampleIf the cross-price elasticity of demand between Corel WordPerfect and Microsoft Word processing software is 3, a 10% hike in the price of Word will increase the demand for WordPerfect by 30 percent, since 30%/10% = 3.
This demand increase for WordPerfect occurs because consumers substitute away from Word and toward WordPerfect, due to the price increase.
Cross-price elasticities play an important role in the pricing decisions of firms that sell multiple products.
Example: In fast-food chains, hamburgers and sodas are complements: when customers buy hamburgers, they buy sodas as well. If the fast-food chain decides to lower the price of hamburgers, the chain's revenues from both hamburgers and sodas are affected. In addition, reducing the price of hamburgers increases the quantity demanded of sodas, thus increasing soda revenues.
Income Elasticity
A measure of the responsiveness of consumer demand to changes in income. When good X is a normal good, an increase in income leads to an increase in the consumption of X. When X is an inferior good, an increase in income leads to a decrease in the consumption of X.
The formula for income elasticity is:
Income Elasticity = (% change in quantity demanded) / (% change in income)
Example 1: A product with positive income elasticity could be Ferraris. Let's say the economy is booming and everyone's income rises by 400%. Because people have extra money, the quantity of Ferraris demanded increases by 15%. We can use the formula to figure out the income elasticity for this Italian sports car:
Income Elasticity = 15% / 400% = 0.0375
Example 2: A good with negative income elasticity could be cheap shoes. Let's again assume the economy is doing well and everyone's income rises by 30%. Because people have extra money and can afford nicer shoes, the quantity of cheap shoes demanded decreases by 10%. The income elasticity of cheap shoes is:
Income Elasticity = -10% / 30% = -0.33
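Both examples reduce to a single division; a minimal sketch (the function name is ours):

```python
def income_elasticity(pct_change_qty, pct_change_income):
    """% change in quantity demanded / % change in income."""
    return pct_change_qty / pct_change_income

ferrari = income_elasticity(15.0, 400.0)      # 0.0375, a normal good
cheap_shoes = income_elasticity(-10.0, 30.0)  # about -0.33, an inferior good
```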
Log-Linear Demand
Demand is log-linear if the logarithm of demand is a linear function of the logarithms of prices, income, and other variables.
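As a sketch, the standard log-linear demand form can be written as follows (the β symbols are generic placeholders, not notation from the slides):

```latex
\ln Q_d = \beta_0 + \beta_P \ln P + \beta_M \ln M + \beta_H \ln H
```

where P is the good's own price, M is income, and H stands for any other variable affecting demand. A convenient property of this form is that each coefficient is the corresponding elasticity; for example, βP is the own price elasticity of demand.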
ECONOMETRICS & REGRESSION ANALYSIS
Introduction
Managers may obtain estimates of demand and elasticity from published studies available in the library, or from a consultant hired to estimate the demand function based on the specifics of their company's product. The primary job of a manager is to use that information to make decisions.
Regardless of how the manager obtains the estimates, it is useful to have a general understanding of how demand functions are estimated and what the various diagnostic statistics that accompany the reported output mean. This entails knowledge of a branch of economics called econometrics. Econometrics is simply the statistical analysis of economic phenomena.
Econometrics
Let's briefly examine the basic ideas underlying the estimation of the demand for a product. Suppose there is some underlying data on the relation between a dependent variable, Y, and some explanatory variable, X. Suppose that when the values of X and Y are plotted, they appear as points A, B, C, D, E, and F in Figure 34.
Clearly, the points do not lie on a straight line, or even a smooth curve (try alternative ways of connecting the dots if you are not convinced). The job of the econometrician is to find a smooth curve or line that does a good job of approximating the points.
For example, suppose the econometrician believes that, on average, there is a linear relation between Y and X, but there is also some random variation in the relationship. Mathematically, this would imply that the true relationship between Y and X is

Y = a + bX + e

where a and b are unknown parameters and e is a random variable (an error term) with zero mean. Because the parameters that determine the expected relation between Y and X are unknown, the econometrician must estimate the values of a and b.
Note that for any line drawn through the points, there will be some discrepancy between the actual points and the line. For example, consider the line in slide 6 (Figure 34), which does a reasonable job of fitting the data.
If a manager used the line to approximate the true relation, there would be some discrepancy between the actual data and the line. For example, points A and D actually lie above the line, while points C and E lie below it.
The deviations between the actual points and the line are given by the lengths of the dashed lines in Figure 34 at points A, C, D, and E. Since the line represents the expected, or average, relation between Y and X, these deviations are analogous to the deviations from the mean used to calculate the variance of a random variable.
The econometrician uses a regression software package to find the values of a and b that minimize the sum of the squared deviations between the actual points and the line. In essence, the regression line is the line that minimizes the squared deviations between the line (the expected relation) and the actual data points. These values of a and b, frequently denoted â and b̂, are called parameter estimates, and the corresponding line is called the least squares regression line.
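The least squares calculation just described can be sketched in a few lines of Python; the (X, Y) points below are invented for illustration and are not the A-F points of Figure 34:

```python
def least_squares(xs, ys):
    """Return estimates (a_hat, b_hat) for Y = a + bX + e that
    minimize the sum of squared deviations from the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b_hat = sxy / sxx                # slope estimate
    a_hat = mean_y - b_hat * mean_x  # intercept estimate
    return a_hat, b_hat

# Made-up sample data:
a_hat, b_hat = least_squares([0, 1, 2, 3], [1, 3, 4, 8])
# a_hat is about 0.7 and b_hat is 2.2
```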
Regression Output in Excel

Data sheet: data for developing a regression model to predict the consumption of heating oil by single-family homes during January. In the worksheet, Heating Oil is in A2:A16 (in gallons), Temperature in B2:B16 (average daily, in degrees F), and Insulation in C2:C16 (in inches).

Heating Oil   Temperature   Insulation
275.3         40            3
363.8         27            3
164.3         40            10
40.8          73            6
94.3          64            6
230.9         34            6
366.7         9             6
300.6         8             10
237.8         23            10
121.4         63            3
31.4          65            10
203.5         41            6
441.1         21            3
323.0         38            3
52.5          58            10
Correlation

              Heating Oil   Temperature   Insulation
Heating Oil   1
Temperature   -0.8697       1
Insulation    -0.4651       0.0089        1

[Scatter plots: Heating Oil vs. Temperature and Heating Oil vs. Insulation, each with a fitted regression line]
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9827
R Square             0.9656
Adjusted R Square    0.9599
Standard Error       26.0138
Observations         15

ANOVA
             df    SS            MS            F          Significance F
Regression   2     228,014.63    114,007.31    168.47     0.0000000017
Residual     12    8,120.60      676.72
Total        14    236,135.23

              Coefficients   Standard Error   t Stat     P-value        Lower 95%   Upper 95%
Intercept     562.1510       21.0931          26.6509    0.0000000000   516.1931    608.1089
Temperature   -5.4366        0.3362           -16.1699   0.0000000016   -6.1691     -4.7040
Insulation    -20.0123       2.3425           -8.5431    0.0000019073   -25.1162    -14.9084
Correlation coefficients measure the magnitude and direction of the linear relationship between any two variables. You want high correlation between the independent and dependent variables, and low correlations among the independent variables. The two scatter plots above demonstrate the correlation idea.

The estimated equation is:
Estimated Heating Oil = 562.15 - 5.436 (Temperature) - 20.012 (Insulation)

In general, Y = B0 + B1X1 + B2X2 + B3X3 + ... + Error; that is, Total = Estimated/Predicted + Error.

Reading the Regression Statistics block:
- Multiple R: the correlation between the actual y and the predicted y.
- R Square: the proportion of variance in Y that is explained by the model (the independent variables).
- Adjusted R Square: R Square adjusted for the number of independent variables; useful when comparing models with different independent variables, or different numbers of them.
- Standard Error: a measure of variability around the regression line (the difference between observed and predicted values).
Reading the ANOVA block:
- df: degrees of freedom. Regression df (DFR) = the number of independent variables (k); error df (DFE) = (n - 1) - k; total df (DFT) = n - 1.
- SS: sums of squared deviations from the mean. The Regression Sum of Squares (SSR) is the sum of squared deviations of the predicted values from the mean; the Sum of Squares of Error (SSE) is the difference between the total and regression sums of squares; the Total Sum of Squares (SST) is the sum of squared deviations of the actual y from the mean.
- MS: mean squares. MSR = SSR / DFR is a measure of the variance of the predicted part; MSE = SSE / DFE is a measure of the error variance.
- F: the variance ratio, indicating the variance explained by the model per unit of error.
- Significance F: the p-value of F, the probability of a Type I error (rejecting a true null). The null hypothesis: none of the independent variables is a significant predictor of y (all coefficients = 0).
Reading the coefficient table:
- Coefficients: the estimate of the coefficient of the variable in the regression equation.
- Standard Error: the standard error of the coefficient.
- t Stat: the coefficient divided by its standard error.
- P-value: the probability of a Type I error (rejecting a true null). The null hypothesis: the variable has no impact on the dependent variable (b = 0).
- Lower 95% / Upper 95%: the lower and upper limits of the 95% confidence interval of the estimate.
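As a cross-check (not part of the slides), the Excel coefficients above can be reproduced by solving the least squares normal equations (XᵀX)β = Xᵀy directly. A minimal sketch in pure Python, using the 15 heating-oil observations:

```python
# Heating oil data transcribed from the table above.
oil   = [275.3, 363.8, 164.3, 40.8, 94.3, 230.9, 366.7, 300.6,
         237.8, 121.4, 31.4, 203.5, 441.1, 323.0, 52.5]
temp  = [40, 27, 40, 73, 64, 34, 9, 8, 23, 63, 65, 41, 21, 38, 58]
insul = [3, 3, 10, 6, 6, 6, 6, 10, 10, 3, 10, 6, 3, 3, 10]

# Design matrix with an intercept column.
X = [[1.0, t, i] for t, i in zip(temp, insul)]

def solve_normal_equations(X, y):
    """Solve (X'X) beta = X'y by Gaussian elimination with pivoting."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta

beta = solve_normal_equations(X, oil)
# beta should be close to Excel's 562.15, -5.44, -20.01
```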
Regression Analysis
Introduction
Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis is a statistical technique that is very useful for these types of problems. For example, in a chemical process, suppose that the yield of the product is related to the process-operating temperature. Regression analysis can be used to build a model to predict yield at a given temperature level.
Regression Analysis
- The study of the relationship between variables
- One of the most commonly used tools for business analysis
- Easy to use, and applies to many situations
Regression Modeling Philosophy
Understand the nature of the relationships, then follow the model-building procedure:
1. Determine the dependent variable (y)
2. Determine potential independent variables (x)
3. Collect relevant data
4. Hypothesize the model form
5. Fit the model
6. Diagnostic check: test for significance
Basic idea: use data to identify relationships among variables, and use these relationships to make predictions.
Linear Regression
Focus:
- Gain some understanding of the mechanics: the regression line and regression error
- Learn how to interpret and use the results
- Learn how to set up a regression analysis
Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables. Regression is thus an explanation of causation. If the independent variable(s) sufficiently explain the variation in the dependent variable, the model can be used for prediction.
Linear dependence: a constant rate of increase of one variable with respect to another (as opposed to, e.g., diminishing returns). Regression analysis describes the relationship between two (or more) variables. Examples:
- Income and educational level
- Demand for electricity and the weather
- Home sales and interest rates
Regression Analysis
- Simple regression: a single explanatory variable
- Multiple regression: includes any number of explanatory variables
- Linear regression: a straight-line relationship of the form y = mx + b (a linear equation)
- Non-linear regression: implies curved relationships, for example logarithmic or curvilinear relationships
Scatter Plots
Regression analysis requires interval- and ratio-level data. To see if your data fit the models of regression, it is wise to conduct a scatter plot analysis. The reason? Regression analysis assumes a linear relationship: if you have a curvilinear relationship or no relationship, regression analysis is of little use.
Example of a linear, positive relationship: as the population with BAs increases, so does the personal income per capita.
Regression Line
The regression line is the best straight-line description of the plotted points, and you can use it to describe the association between the variables. If all the points fall exactly on the line, the error is 0 and you have a perfect relationship.
Types of Lines
[Scatter plots of data with various correlation coefficients: r = -1, r = -0.6, r = 0, r = +0.3, r = +1]
Linear Correlation
[Scatter plots contrasting linear and curvilinear relationships, strong and weak relationships, and no relationship]
Things to Remember
Regression still focuses on association, not causation. Association is a necessary prerequisite for inferring causation, but also:
- The independent variable must precede the dependent variable in time.
- The two variables must be plausibly linked by a theory.
- Competing independent variables must be eliminated.
Also note that the regression coefficient is not a good indicator of the strength of the relationship: two scatter plots with very different dispersions could produce the same regression line.
Simple Linear Regression
The output of a regression is a function that predicts the dependent variable based upon values of the independent variables. Simple regression fits a straight line to the data:
y = b0 + b1x
where b0 is the y intercept and b1 is the slope (Δy / Δx).
The function makes a prediction for each observed data point. The observation is denoted by y and the prediction is denoted by ŷ.
For each observation, the variation can be described as:
y = ŷ + ε   (Actual = Explained + Error)
where ε is the prediction error.
SIMPLE REGRESSION
Relationship: Y is the dependent variable and X the independent variable, related by the linear equation y = mx + b, written in regression notation as
ŷ = b0 + b1x
where b0 is the y-intercept and b1 is the slope, with
b1 = (nΣXY - ΣXΣY) / (nΣX² - (ΣX)²) and b0 = ȳ - b1x̄

Worked example (n = 5):
X    Y    X²    Y²    XY
1    2    1     4     2
2    4    4     16    8
3    5    9     25    15
4    7    16    49    28
5    8    25    64    40
ΣX = 15, ΣY = 26, ΣX² = 55, ΣY² = 158, ΣXY = 93; x̄ = 3, ȳ = 5.2
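The worked table plugs straight into the sum formulas above; a quick sketch to check the arithmetic:

```python
# X and Y columns from the worked table above.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 7, 8]
n = len(X)

SX, SY = sum(X), sum(Y)                  # 15, 26
SX2 = sum(x * x for x in X)              # 55
SXY = sum(x * y for x, y in zip(X, Y))   # 93

b1 = (n * SXY - SX * SY) / (n * SX2 - SX ** 2)  # slope
b0 = SY / n - b1 * (SX / n)                     # intercept = y_bar - b1 * x_bar
# b1 = 1.5 and b0 = 0.7
```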
Calculating SSE
The line that minimizes the sum of squared deviations between the line and the actual data points is the least squares regression line. A least squares regression selects the line with the lowest total sum of squared prediction errors. This value is called the Sum of Squares of Error, or SSE.

Calculating SSR
The Sum of Squares Regression (SSR) is the sum of the squared differences between the prediction for each observation and the population mean ȳ.

Calculating SST
The Total Sum of Squares (SST) is equal to SSR + SSE. Mathematically:
SSR = Σ(ŷ - ȳ)²   (measure of explained variation)
SSE = Σ(y - ŷ)²   (measure of unexplained variation)
SST = SSR + SSE = Σ(y - ȳ)²   (measure of total variation in y)
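Continuing the five-point worked example (b0 = 0.7, b1 = 1.5), a sketch that verifies the SST = SSR + SSE identity:

```python
# Data and fitted coefficients from the worked example.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 7, 8]
b0, b1 = 0.7, 1.5

y_bar = sum(Y) / len(Y)            # 5.2
y_hat = [b0 + b1 * x for x in X]   # predictions

SSE = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))   # unexplained, 0.30
SSR = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained, 22.5
SST = sum((y - y_bar) ** 2 for y in Y)                # total, 22.8

r_squared = SSR / SST   # about 0.987: the line explains nearly all variation
```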
Regression Coefficient
The regression coefficient is the slope of the regression line and tells you the nature of the relationship between the variables: how much change in the independent variable is associated with how much change in the dependent variable. The larger the regression coefficient, the greater the change.

The Coefficient of Determination
R² = SSR / SST, the proportion of the total variation in y that is explained by the regression.

Standard Error of Regression
The Standard Error of a regression is a measure of its variability. It can be used in a manner similar to the standard deviation, allowing for prediction intervals: ŷ ± 2 standard errors provides approximately 95% accuracy, and ± 3 standard errors provides approximately a 99% confidence interval. The Standard Error is calculated by taking the square root of the average prediction error:
Standard Error = √(SSE / (n - k))
where n is the number of observations in the sample and k is the total number of variables in the model.

The output of a simple regression is the coefficient β and the constant A. The equation is then:
y = A + βx + ε
where ε is the residual error. β is the per-unit change in the dependent variable for each unit change in the independent variable. Mathematically, β = Δy / Δx.
Multiple Linear Regression
More than one independent variable can be used to explain variance in the dependent variable, as long as the independent variables are not linearly related. A multiple regression takes the form:
y = A + β1X1 + β2X2 + ... + βkXk + ε
where k is the number of variables (parameters).
Multicollinearity
Multicollinearity is a condition in which at least two independent variables are highly linearly correlated. It will often crash computers.
A correlations table can suggest which independent variables may be significant. Generally, an independent variable that has more than a 0.3 correlation with the dependent variable and less than a 0.7 correlation with any other independent variable can be included as a possible predictor.

Example table of correlations:
      Y       X1      X2
Y     1.000
X1    0.802   1.000
X2    0.848   0.578   1.000
Nonlinear Regression
Nonlinear functions can also be fit as regressions. Common choices include power, logarithmic, exponential, and logistic functions, but any continuous function can be used.

Some Applications
Sample: 15 houses from the region.

House   Y: Actual Selling Price ($1,000s)   X: House Size (100s ft²)
1       89.5     20.0
2       79.9     14.8
3       83.1     20.5
4       56.9     12.5
5       66.6     18.0
6       82.5     14.3
7       126.3    27.5
8       79.3     16.5
9       119.9    24.3
10      87.6     20.2
11      112.6    22.0
12      120.8    19.0
13      78.5     12.3
14      74.3     14.0
15      74.8     16.7
Averages: 88.84, 18.17
Simple Regression Model
y = a + bx + e   (note the same form as y = mx + b)
The coefficients are a and b: a is the y intercept and b is the slope of the line.
Precision: the accepted measure of accuracy is the mean squared error, the average squared difference between the actual and forecast values. Squaring makes each difference positive, and the severity of large errors is emphasized.
The error (residual) is the difference between the actual data point and the forecasted value of the dependent variable y given the explanatory variable x.
Simple Regression Model (y = mx + b form):
Y = a + bX + e = 56,104 + 63.11(Sq ft) + e
If X = 2,500 square feet, then
$213,879 = 56,104 + 63.11(2,500)
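The fitted housing model as a function (the coefficients are the rounded values from the slide, so predictions are approximate):

```python
def predicted_cost(square_feet):
    """Cost predicted by the fitted line Y = 56,104 + 63.11 * (sq ft)."""
    return 56104 + 63.11 * square_feet

cost = predicted_cost(2500)   # about 213,879, matching the slide
```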
Simple Regression Model: Linearity
[Chart: Square Feet Line Fit Plot, plotting Cost and Predicted Cost against Square Feet]
Source data:
Square Feet   Cost
2,885         231,000
2,230         170,000
2,377         217,000
2,551         209,000
2,592         218,000
3,341         218,000
2,632         310,000
3,034         248,000
2,052         186,000
2,066         204,000
2,193         185,000
2,392         216,000
2,060         146,000
3,457         293,000
2,450         234,000
2,323         219,000
3,279         267,000
3,308         297,000
2,140         192,000
2,029         217,000
2,879         234,000
2,408         177,000
3,228         234,000
2,560         249,000
2,500         206,000
2,248         112,000
2,696         283,000
2,400         153,000
2,196         251,000
2,880         191,000
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.5990
R Square             0.3589
Adjusted R Square    0.3360
Standard Error       36,878.61
Observations         30

ANOVA
             df    SS                  MS                  F         Significance F
Regression   1     21,313,807,694.47   21,313,807,694.47   15.6716   0.000469
Residual     28    38,080,892,305.53   1,360,031,868.05
Total        29    59,394,700,000

              Coefficients   Standard Error   t Stat   P-value    Lower 95%    Upper 95%
Intercept     56,103.66      41,670.92        1.3464   0.1890     -29,255.45   141,462.77
Square Feet   63.11          15.94            3.9587   0.000469   30.45        95.77
[Residual output: predicted cost, residual, and standard residual for each of the 30 observations, followed by the Square Feet Residual Plot]
Correlation between Square Feet and Cost: 0.5990

Descriptive statistics:
                      Square Feet    Cost
Mean                  2,579.53       218,900
Standard Error        78.43          8,263
Median                2,475.00       217,500
Mode                  (none)         234,000
Standard Deviation    429.56         45,256
Sample Variance       184,525.36     2,048,093,103
Kurtosis              -0.69          0.22
Skewness              0.64           -0.04
Range                 1,428.00       198,000
Minimum               2,029.00       112,000
Maximum               3,457.00       310,000
Sum                   77,386.00      6,567,000
Count                 30             30

[Scatter Plot of Housing Cost vs. Square Footage]
Simple Regression Model: Linearity
[Square Feet Residual Plot: residuals plotted against square feet]
Simple Regression Model: Independence
- Errors must not correlate
- Trials must be independent
Simple Regression Model: Homoscedasticity
- Constant variance: the scatter of the errors does not change from trial to trial
- Violations lead to misspecification of the uncertainty in the model, specifically with a forecast; it is possible to underestimate the uncertainty
- Remedy: try the square root, logarithm, or reciprocal of y
Simple Regression Model: Normality
- Errors should be normally distributed
- Plot a histogram of the residuals
Multiple Regression Model
Y = β0 + β1X1 + ... + βkXk + ε

Example: An Empirical Model
Empirical Model
Figure 1: Scatter diagram of oxygen purity versus hydrocarbon level from Table 11-1.
Based on the scatter diagram, it is probably reasonable to assume that the mean of the random variable Y is related to x by the following straight-line relationship:
E(Y | x) = β0 + β1x
where the slope β1 and intercept β0 of the line are called regression coefficients. The simple linear regression model is given by
Y = β0 + β1x + ε
where ε is the random error term.
We think of the regression model as an empirical model. Suppose that the mean and variance of ε are 0 and σ², respectively. Then the mean of Y given x is
E(Y | x) = β0 + β1x
and the variance of Y given x is
V(Y | x) = σ²
The true regression model is a line of mean values:
μY|x = β0 + β1x
where β1 can be interpreted as the change in the mean of Y for a unit change in x. Also, the variability of Y at a particular value of x is determined by the error variance σ². This implies there is a distribution of Y-values at each x and that the variance of this distribution is the same at each x.
Figure 2: The distribution of Y for a given value of x for the oxygen purity-hydrocarbon data.
Simple Linear Regression
The case of simple linear regression considers a single regressor or predictor x and a dependent or response variable Y. The expected value of Y at each level of x is
E(Y | x) = β0 + β1x
We assume that each observation, Y, can be described by the model
Y = β0 + β1x + ε
Suppose that we have n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn).
Figure 3: Deviations of the data from the estimated regression model.
The method of least squares is used to estimate the parameters β0 and β1 by minimizing the sum of the squares of the vertical deviations in Figure 3.
Using the model above, the n observations in the sample can be expressed as
yi = β0 + β1xi + εi,   i = 1, 2, ..., n
The sum of the squares of the deviations of the observations from the true regression line is
L = Σ εi² = Σ (yi - β0 - β1xi)²
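Setting the partial derivatives of L to zero (a standard step the slides stop short of) gives the least squares normal equations, sketched here in the same notation:

```latex
\frac{\partial L}{\partial \beta_0} = -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right) = 0,
\qquad
\frac{\partial L}{\partial \beta_1} = -2\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)x_i = 0
```

Solving the two equations simultaneously yields β̂1 = Sxy / Sxx and β̂0 = ȳ - β̂1x̄, matching the slope and intercept formulas used in the SIMPLE REGRESSION section earlier.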