PPOL Solution

Jun 01, 2018

Vivek Agarwal
  • 8/9/2019 PPOL Solution

    PPOL 502: Problem Set #2

    Vivek Agarwal

    February 11, 2015

    Time Taken: 2.5 hours

    Solution 1

    1. A 1% increase in the distance of a house from a recently built garbage incinerator is associated with an increase of 0.312% in the sale price of the house. One would expect the price of a house to go up as its distance from a garbage incinerator increases. Harmful gases released by an incinerator are more likely to reach houses closer to it and are therefore likely to reduce the demand for such houses. Hence, the positive sign of the coefficient on log(dist) is expected.

    2. No. Several other factors have not been controlled for. It is likely that local authorities site incinerators based on other criteria, such as the availability of land and the distance from water sources (to prevent contamination, etc.). These factors also determine house prices, so ignoring them biases the estimate.

    3. As discussed above, proximity to a clean water body (lake, sea, etc.) can increase the price of a house dramatically. Further, incinerators are usually not installed close to water bodies, to prevent contamination. Therefore, proximity to a water body raises the price directly and is also correlated with a greater distance from the incinerator.
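
    The percentage interpretation in part 1 can be sketched as a short derivation. It assumes the underlying model is log-log in price and distance; the intercept and error term symbols are placeholders not given in the text.

```latex
\[
\log(price) = \beta_0 + \beta_1 \log(dist) + u
\quad\Longrightarrow\quad
\beta_1 = \frac{\partial \log(price)}{\partial \log(dist)}
\approx \frac{\%\Delta price}{\%\Delta dist}
\]
\[
\hat{\beta}_1 = 0.312
\quad\Longrightarrow\quad
\%\Delta \widehat{price} \approx 0.312 \times \%\Delta dist
\]
```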

    Solution 2

    1. The average salary is $957.9455, while the average IQ in the sample is 101.2824. The standard deviation of IQ in the sample is 15.05264. See Figure 1.

    2. The simple regression model is presented in Figure 2. The estimated equation is:

    \[ \widehat{wage} = 116.9916 + 8.303064 \, IQ \tag{1} \]

    Figure 1: Summary Results

    Figure 2: Regression Results

    Figure 3: Regression Results

    Further, an increase of 15 points in IQ is predicted to increase wage by 15 times the coefficient on IQ, i.e. 15 × 8.303064 = $124.54596.

    The R² value tells us how much of the variation in the dependent variable is explained by the independent variable. Here, the R² value is 0.0955. Hence, only 9.55%, a very small part, of the variation in wage is explained by IQ.

    3. The regression model is presented in Figure 3. The estimated equation is:

    \[ \widehat{\ln(wage)} = 5.886994 + 0.0088072 \, IQ \tag{2} \]

    This model predicts that a one-unit increase in IQ will result in a change in wage of approximately 0.88072% (= 100 × 0.0088072). Further, an increase of 15 points in IQ will result in an increase of 15 times that, i.e. 13.2108%. However, note that this is an approximation that becomes less accurate for larger changes; the exact figure is 100 × (exp(15 × 0.0088072) − 1)%.
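
    The two 15-point calculations above can be reproduced with a minimal sketch. The coefficients are copied from Equations (1) and (2); the function names are illustrative and not part of the original solution.

```python
import math

# Coefficients copied from Equations (1) and (2) in the text.
B_IQ_LEVEL = 8.303064   # wage = b0 + b1 * IQ
B_IQ_LOG = 0.0088072    # ln(wage) = b0 + b1 * IQ


def level_change(beta, delta_x):
    """Predicted change in y, in units of y, for a change delta_x in x."""
    return beta * delta_x


def approx_pct_change(beta, delta_x):
    """Log-level approximation: percentage change in y ~ 100 * beta * delta_x."""
    return 100 * beta * delta_x


def exact_pct_change(beta, delta_x):
    """Exact percentage change in y implied by a log-level model."""
    return 100 * (math.exp(beta * delta_x) - 1)


print(level_change(B_IQ_LEVEL, 15))      # dollars, level-level model
print(approx_pct_change(B_IQ_LOG, 15))   # percent, approximation
print(exact_pct_change(B_IQ_LOG, 15))    # percent, exact
```

    The exact change comes out at roughly 14.1%, somewhat above the 13.2108% approximation, which is why the approximation is best reserved for small changes.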

    Solution 3

    1. A lower value of rank reflects higher academic standing of the school. Higher academic standing is expected to produce students with higher abilities, a valued asset for employers, leading to higher starting salaries. Therefore, rank is inversely related to salary. This results in the expectation that β₅ ≤ 0.

    2. A higher LSAT score reflects a greater ability for better job performance and consequently would result in higher salaries. Therefore, one would expect β₁ ≥ 0. Similarly, a higher GPA would result in higher salaries. Therefore, one would expect β₂ ≥ 0.

    The higher the number of volumes in a law school library, libvol, the greater the opportunities for students to gain expertise. This expertise is of value to employers and is expected to result in higher starting salaries. Therefore, one would expect β₃ ≥ 0.

    Students typically perform a cost-benefit analysis when choosing a college. A college with a higher cost would be favored only if the future payoffs, including starting salary, are high. It could also be argued that schools that charge more have a greater capacity to deliver better education, which would result in better academic outcomes for the students. This is again of value to employers and leads to the same conclusion. Therefore, one would expect β₄ ≥ 0.

    3. A one-unit increase in GPA results in a 100 × β₂% change in salary, i.e. a 24.8% increase in salary.

    4. A 1% increase in the value of libvol results in a β₃% increase in salary. The estimated equation suggests a 0.095% increase in salary for every 1% increase in the value of libvol.

    5. For every one-place improvement in rank, one can expect an average increase of 0.33% (= −β₅ × 100) in salary. Therefore, an improvement in ranking of 20 places is expected to result in a 6.6% (= 20 × 0.33%) increase in salary.

    Whether the 0.33% increase in salary per one-place improvement in rank is substantively important has to be judged against the trade-offs involved in raising salary through the other variables. Overall, however, choosing a ‘better’ ranked (lower value) college is expected to result in higher salaries.
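
    The interpretations in parts 3 to 5 mix semi-elasticities (GPA, rank) with an elasticity (libvol). A minimal sketch of how they combine is given below, assuming the dependent variable is log(salary) and libvol enters in logs. The coefficient values are backed out from the percentages quoted above, and the function names are hypothetical.

```python
import math

# Coefficient values implied by the percentages quoted in the text (assumed).
B_GPA = 0.248          # semi-elasticity: effect of one GPA point
B_LOG_LIBVOL = 0.095   # elasticity with respect to libvol
B_RANK = -0.0033       # semi-elasticity: effect of one rank place


def approx_pct_change_salary(d_gpa=0.0, pct_libvol=0.0, d_rank=0.0):
    """Approximate % change in salary, holding all other regressors fixed."""
    level_part = 100 * (B_GPA * d_gpa + B_RANK * d_rank)
    elasticity_part = B_LOG_LIBVOL * pct_libvol
    return level_part + elasticity_part


def exact_pct_change_salary(d_gpa=0.0, pct_libvol=0.0, d_rank=0.0):
    """Exact % change in salary implied by the change in log(salary)."""
    d_log = (B_GPA * d_gpa + B_RANK * d_rank
             + B_LOG_LIBVOL * math.log(1 + pct_libvol / 100))
    return 100 * (math.exp(d_log) - 1)


print(approx_pct_change_salary(d_gpa=1))        # one more GPA point
print(approx_pct_change_salary(pct_libvol=1))   # 1% more library volumes
print(approx_pct_change_salary(d_rank=-20))     # rank improves by 20 places
print(exact_pct_change_salary(d_rank=-20))      # exact counterpart
```

    A negative d_rank denotes an improvement in rank, since lower values of rank are better; the exact figure for a 20-place improvement is about 6.8% rather than 6.6%.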

    Solution 4

    1. The estimated equation is presented below. Also, see Figure ??.

    \[ \widehat{price} = -19.315 + 0.1284362 \, sqrft + 15.19819 \, bdrms \tag{3} \]

    2. Holding square footage constant, the estimated increase in price from one more bedroom is $15,198 (= 15.19819 × 1000). However, the coefficient on bdrms is not statistically significant even at the 10% significance level, so this effect is statistically indistinguishable from 0.

    3. The estimated increase in price for a house with an additional bedroom that is 140 square feet in size is $33,179 (= (0.1284362 × 140 + 15.19819 × 1) × 1000).

    Figure 4: Regression Results

    Figure 5: Summary Results

    4. The R-squared value gives the share of the variation in the dependent variable that is explained by the independent variables. The R-squared value for this estimated model is 0.6319, or 63.19%. Therefore, 63.19% of the variation in price is explained by the variation in square footage and number of bedrooms.

    5. Substituting sqrft = 2438 and bdrms = 4 in Equation 3, we get a predicted price of 354.60522. Hence, the predicted price is $354,605.

    6. The residual is the difference between the actual value (price) and the estimated value (the predicted price). Therefore, the residual here is −$54,605. Hence, according to the model's predictions, the buyer underpaid by $54,605.
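
    The arithmetic in parts 3, 5 and 6 can be checked with a short sketch built on Equation (3). The actual sale price of 300 (thousand dollars) is an assumption backed out from the reported prediction and residual, and the helper names are illustrative.

```python
# Coefficients from Equation (3); price is measured in thousands of dollars.
INTERCEPT = -19.315
B_SQRFT = 0.1284362
B_BDRMS = 15.19819


def predict_price(sqrft, bdrms):
    """Fitted price (in $1000s) for a house with the given characteristics."""
    return INTERCEPT + B_SQRFT * sqrft + B_BDRMS * bdrms


def price_change(d_sqrft=0.0, d_bdrms=0.0):
    """Predicted change in price (in $1000s); the intercept cancels out."""
    return B_SQRFT * d_sqrft + B_BDRMS * d_bdrms


def residual(actual_price, sqrft, bdrms):
    """Actual minus fitted price (in $1000s)."""
    return actual_price - predict_price(sqrft, bdrms)


print(price_change(d_sqrft=140, d_bdrms=1))   # part 3
print(predict_price(2438, 4))                 # part 5
print(residual(300, 2438, 4))                 # part 6, assuming price = 300
```

    A negative residual means the house sold for less than the model predicts, which is the sense in which the buyer underpaid.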

    Solution 5

    1. The average value of prpblck is 0.1134864, and the average value of income is $47053.78. See Figure 5. prpblck is a unitless proportion measured on a ratio scale, while income is measured in dollars per year, also on a ratio scale.

    Figure 6: Regression Results

    2. The estimated equation is presented below (see Equation 4). Also, see Figure 6. The R-squared is 0.0642 and the sample size is 401.

    \[ \widehat{psoda} = 0.9563196 + 0.1149882 \, prpblck + 0.0000016 \, income \tag{4} \]

    A one-unit increase in prpblck, i.e. moving from a 0% to a 100% black population, increases the predicted price of a medium soda by $0.1149882; a one percentage point increase therefore raises it by about $0.0011. The full-range increase is about 11% of the mean of psoda (1.044876) and greater than its standard deviation (0.0886873). Therefore, the effect can be considered substantively significant over large differences in prpblck.

    3. The estimated equation (without controlling for income) is presented below (see Equation 5). Also, see Figure 7.

    \[ \widehat{psoda} = 1.037399 + 0.0649269 \, prpblck \tag{5} \]

    The previous model estimates a larger effect: a $0.1149882 increase in the price of soda for a one-unit increase in prpblck, compared to $0.0649269 predicted by the modified model. The discrimination effect is larger when income is controlled for in the model.

    4. For a one-unit increase in prpblck, the price of soda increases by approximately β₁ × 100, i.e. 12.15803%. See Figure 8. Therefore, an increase of 0.20 in prpblck (20 percentage points) will increase the price of soda by about 2.431606% (= 12.15803% × 0.20).

    5. On adding the variable prppov to the regression model, we notice that the estimated coefficient on prpblck decreases from 0.1215803 to 0.0728072. See Figures 8 & 9.

    Figure 7: Regression Results

    Figure 8: Regression Results

    Figure 9: Regression Results

    Figure 10: Correlation Results

    6. The correlation between log(income) and prppov is found to be -0.8385. See Figure 10. A negative correlation was expected because median family income, income, is roughly expected to be low in poor areas. However, it must be noted that areas with a low poverty proportion might still have low median family incomes.

    7. The statement “because log(income) and prppov are so highly correlated, they have no business being in the same regression” is not fully justified. log(income) and prppov are not perfectly collinear; they capture related aspects of economic well-being using different measures. Therefore, although including both may lead to multicollinearity, it will not lead to perfect collinearity, and they can be included in the same regression.
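
    The comparison in part 3, where the prpblck coefficient shrinks once income is dropped, is an instance of omitted variable bias. The sketch below illustrates the in-sample identity (short slope = long slope + coefficient on the omitted variable × slope of the omitted variable on the included one) on simulated data. It does not use the original dataset; the variable names and all simulated parameter values are assumptions chosen only to mimic the signs in the text.

```python
import random


def cross_sum(a, b):
    """Sum of cross-products of a and b after removing their means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def short_slope(y, x1):
    """OLS slope from regressing y on x1 (plus an intercept)."""
    return cross_sum(x1, y) / cross_sum(x1, x1)


def long_slopes(y, x1, x2):
    """OLS slopes from regressing y on x1 and x2 (plus an intercept)."""
    s11, s22, s12 = cross_sum(x1, x1), cross_sum(x2, x2), cross_sum(x1, x2)
    s1y, s2y = cross_sum(x1, y), cross_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated stand-ins: x1 plays the role of prpblck, x2 of an income measure
# that is negatively related to x1 and positively related to the price y.
rng = random.Random(0)
x1 = [rng.random() for _ in range(401)]
x2 = [10.5 - 1.0 * a + rng.gauss(0, 0.3) for a in x1]
y = [0.12 * a + 0.10 * b + rng.gauss(0, 0.08) for a, b in zip(x1, x2)]

b1_long, b2_long = long_slopes(y, x1, x2)
b1_short = short_slope(y, x1)
delta = short_slope(x2, x1)   # slope of the omitted x2 on the included x1

print(b1_short, b1_long + b2_long * delta)   # equal up to rounding error
```

    Because the omitted variable here has a positive coefficient and is negatively related to x1, leaving it out pushes the x1 slope downward, the same direction as the drop from 0.1149882 to 0.0649269 reported above.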

    Solution 6

    1. MLR1: Although the few points corresponding to higher values of ln(income) and prpblck tend to skew the best-fit line, a generally linear trend can be observed. See Figure 11.

    Figure 11: Matrix Plot

    2. MLR2: No information on random sampling has been provided. However, for the purposes of this exercise it has been assumed that random sampling was in fact adhered to.

    3. MLR3: From Figure 11 it is clear that neither of the independent variables (ln(income) and prpblck) is constant. Also, the number of observations (401) is far greater than the number of independent variables (2). In addition, STATA doesn't report a problem with perfect collinearity. Hence, it can be comfortably concluded that MLR3 is satisfied.

    4. MLR4: This condition essentially requires the rvp (residual-versus-predictor) plots for all independent variables to be roughly symmetric about the residual = 0 line, i.e. a mirror image across it (assuming one dot represents one observation).

    Studying the rvp plot for prpblck, we can conclude that the residuals add up to approximately zero across almost all values of prpblck. See Figure 12. Similarly, on analyzing the rvp plot for ln(income), we notice that although the residuals are net negative for a few low values of ln(income), they generally cancel each other out for most other values of ln(income). See Figure 13.

    Figure 12: Residual versus Predictor Plot for  prpblck 

    This can also be seen from the rvf (residual-versus-fitted) plot. Except for the lowest and highest fitted values, the residuals approximately cancel each other out. This deviation at the low and high ends is not surprising and was expected based on the interpretation of the rvp plots. See Figure 14.

    5. Unbiasedness: Since MLR1-4 are satisfied, it can be concluded that the estimated equation is unbiased. Further, we can now explore the efficiency of the estimated equation by evaluating MLR 5.

    6. MLR5: This condition essentially requires that the envelope of points be rectangular in shape and symmetric around the residuals = 0 line for all independent variables.

    Studying the rvp plot for prpblck, we see that the residuals are boxed in between -0.2 and 0.2 for values of prpblck between 0.1 and 0.7. For other values of prpblck, although some residuals fall above or below this rectangle, they are very few in number. Therefore, we can conclude that MLR 5 is almost satisfied for prpblck. See Figure 12. Similarly, on analyzing the rvp plot for ln(income), we notice that for values of ln(income) between 10.25 and 11.25 the residuals are boxed between -0.2 and 0.2. However, for the other values they are much lower than 0.2. Therefore, we can conclude that MLR 5 is not satisfied for ln(income). See Figure 13.

    Figure 13: Residual versus Predictor Plot for ln(income)

    Figure 14: Residual versus Fitted Value Plot

    Figure 15: Kernel Density Plot

    This can also be seen from the rvf plot. Though the residuals are boxed between -0.2 and 0.2 for fitted values between 0 and 0.07, they are much lower for higher and lower fitted values. This deviation at the low and high ends is not surprising and was expected based on the interpretation of the rvp plots. See Figure 14.

    7. Efficiency: Since MLR 5 is violated overall, we can conclude that OLS is not an efficient estimator here.

    8. MLR6: This condition requires the population error u to be independent of the independent variables and to be normally distributed with zero mean and constant variance. This can be analyzed from the plot in Figure 15. Clearly, the residuals do not follow a normal distribution. Therefore, it can be concluded that MLR 6 is not satisfied.

    9. Reliability of Standard Errors: Since MLR 6 is not satisfied (and MLR 5 is violated as well), we can conclude that the usual OLS standard errors are not reliable.
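
    The visual checks used above for MLR 4 and MLR 5 have a simple numerical counterpart: sort the residuals by a predictor, split them into bins, and compare the mean and variance of the residuals in each bin. The sketch below does this on simulated data whose error spread grows with the predictor. The data, the parameter values and the function names are assumptions for illustration; this is not the STATA procedure used in the original solution.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Intercept and slope from an OLS regression of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def binned_residual_stats(x, resid, n_bins=4):
    """Sort by x, split into equal-sized bins, return (mean, variance) per bin."""
    pairs = sorted(zip(x, resid))
    size = len(pairs) // n_bins
    stats = []
    for k in range(n_bins):
        chunk = [r for _, r in pairs[k * size:(k + 1) * size]]
        stats.append((sum(chunk) / len(chunk), statistics.pvariance(chunk)))
    return stats


# Simulated data with heteroskedastic errors: the error standard deviation
# is proportional to x, so the residual envelope is not rectangular.
rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(2000)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 0.2 * xi) for xi in x]

intercept, slope = fit_simple_ols(x, y)
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
stats = binned_residual_stats(x, resid)

for bin_mean, bin_var in stats:
    print(round(bin_mean, 3), round(bin_var, 3))
```

    Bin means close to zero correspond to the residuals "cancelling out" in the rvp plots, while a variance that changes markedly from bin to bin corresponds to an envelope that is not rectangular. Note that OLS residuals sum to zero over the full sample by construction, so only the bin-level pattern is informative.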
