Student Number 100799832 A CROSS-COUNTRY ANALYSIS OF THE EFFECT OF “CO2 EMISSION”, “IMPROVED WATER SOURCE” AND “IMPROVED SANITATION FACILITIES” “ON LIFE EXPECTANCY” Word count: 2614 Carlo Armillis
Final Project Paper Ec2203

Apr 13, 2017

Introduction

Life expectancy can be a suitable measure of the overall quality of life in a country. In the last century, life expectancy increased dramatically throughout the world. The World Health Organization (2014) states that since 1990 life expectancy has increased by 6 years overall, reaching a global average of 71 years in 2013 for both sexes. The National Institute on Aging (2011) reports that life expectancy now exceeds 83 years in Japan and is at least 81 in several other countries. My hypothesis is that longevity is mainly explained by the percentage of the population with access to an improved water source. An article by Africa Health, Human & Social Development (2014) on the 25 countries with the least access to clean water sources shows that those nations have an average life expectancy of 55 years, with some countries below 50. The same article also shows the influence of another variable, access to “improved sanitation facilities”, and suggests that lack of sanitation has high explanatory power for adult life expectancy. I therefore chose to investigate further the relationship between these two variables and expected years of life, and I decided to add a third independent variable, access to electricity, which I thought was needed to explain a person's longevity. To do this, the paper adopts the Ordinary Least Squares (OLS) method, which provides the best linear unbiased estimates when the Gauss-Markov assumptions hold.

Theoretical Framework

With this model I expect to show a linear relationship between the dependent variable (Life Expectancy) and the independent variables, also called regressors: Access to Electricity (percentage of population with access), Improved Water Source (percentage of population with access) and Improved Sanitation Facilities (percentage of population using improved sanitation facilities). My aim is to show that these three variables have positive coefficients, which would let us conclude that an increase in any of them is associated with an increase in life expectancy. The model is the following:

Life Expectancy = 𝛽0+ 𝛽1 Access to Electricity + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities

Where 𝛽0 is the constant (the expected Life Expectancy when all three explanatory variables equal zero), and 𝛽1, 𝛽2 and 𝛽3 are respectively the coefficients of Access to Electricity, Improved Water Source and Improved Sanitation Facilities. Each coefficient shows the change in the dependent variable following a one-unit change in the corresponding independent variable, holding the others constant.

Data

The data for all my variables were taken from http://databank.worldbank.org/. I chose the World DataBank because I find it one of the most reliable and complete collections of data on the web. This allowed me to analyse information for 165 countries, which I believe is a large sample size for a cross-country study. For full information about my data, see Appendix A.

The following table is a summary of the data that will be tested with several significance tests.

Econometric Method

Regression:

We can then rewrite the model as the following.

Life Expectancy = 47.08574 + 0.0509704 (Electricity) + 0.102448 (Water) + 0.1508989 (Sanitation)

If Access to Electricity increases by 1 percentage point, Life Expectancy increases by 0.0509704 years, holding the other variables constant.

If Access to an Improved Water Source increases by 1 percentage point, Life Expectancy increases by 0.102448 years.

If Access to Improved Sanitation Facilities increases by 1 percentage point, Life Expectancy increases by 0.1508989 years.

In the absence of Electricity, Water Source and Sanitation (all three at zero), Life Expectancy would average 47.08574 years.
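The interpretation above can be checked by plugging values into the fitted equation. The following is a minimal sketch, not part of the original analysis: the function name `predict_life_expectancy` and the example inputs (0%, 50% and 100% access) are illustrative assumptions, and the constant is the one stated in the interpretation (47.08574).

```python
# Coefficients of the fitted Model 1 as reported in this paper.
CONSTANT = 47.08574        # constant used in the interpretation above
B_ELECTRICITY = 0.0509704  # years per percentage point of access
B_WATER = 0.102448
B_SANITATION = 0.1508989


def predict_life_expectancy(electricity, water, sanitation):
    """Fitted life expectancy in years; inputs are % of population (0-100)."""
    return (CONSTANT
            + B_ELECTRICITY * electricity
            + B_WATER * water
            + B_SANITATION * sanitation)


# Hypothetical country with no access at all: the prediction is the constant.
baseline = predict_life_expectancy(0, 0, 0)

# Hypothetical country with full access to all three services.
full_access = predict_life_expectancy(100, 100, 100)

# Effect of one extra percentage point of sanitation, others held fixed.
sanitation_step = (predict_life_expectancy(50, 50, 51)
                   - predict_life_expectancy(50, 50, 50))

print(round(baseline, 5), round(full_access, 5), round(sanitation_step, 7))
```

The one-point step reproduces the sanitation coefficient, which is exactly what "holding the other variables constant" means in a linear model.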

We compute our first tests, the t tests, which assess the individual significance of each explanatory variable for the dependent variable. For each coefficient, we want to reject the null hypothesis that it equals zero, i.e. that the variable has no explanatory power.

The critical value of t at the 5% level of significance (two-tailed) with N - K = 165 - 4 = 161 degrees of freedom is

Tcrit = 1.9748

For Electricity: |t| = 2.83 > 1.9748, so we reject the null hypothesis that 𝛽1 equals 0. We can say that Access to Electricity has a statistically significant influence on Life Expectancy.

For Water: |t| = 3.08 > 1.9748, so we reject the null hypothesis that 𝛽2 equals 0. We can say that Access to an Improved Water Source has a statistically significant influence on Life Expectancy.

For Sanitation: |t| = 7.67 > 1.9748, so we reject the null hypothesis that 𝛽3 equals 0. We can again conclude that Access to Improved Sanitation has a statistically significant influence on Life Expectancy.
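The t statistics quoted above are each coefficient divided by its standard error. As a minimal sketch, the snippet below recomputes them from the coefficients and standard errors reported in the results table of this paper; the dictionary layout and variable names are illustrative assumptions.

```python
# Coefficients and standard errors of Model 1, taken from the results table.
estimates = {
    "Electricity": (0.0509704, 0.0179838),
    "Water": (0.102448, 0.0333132),
    "Sanitation": (0.1508989, 0.0196812),
}
T_CRIT = 1.9748  # 5%, two-tailed, 161 degrees of freedom (value from the text)

t_stats = {}
for name, (coef, se) in estimates.items():
    t = coef / se  # t statistic for the null hypothesis "coefficient = 0"
    t_stats[name] = t
    decision = "reject H0" if abs(t) > T_CRIT else "fail to reject H0"
    print(f"{name}: t = {t:.2f} -> {decision}")
```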

The Rsquared of our regression is 0.8052. It indicates the proportion of the variation in the dependent variable that is explained by the regression; in other words, the regression explains 80.52% of the variation in Life Expectancy. A very high Rsquared is sometimes treated, as a rule of thumb, as a warning sign worth checking, although a high Rsquared does not by itself show that the model is mis-specified. Since our Rsquared of 0.8052 is only slightly above 0.80, we will not consider this a problem and we will proceed with the F test.

The F test is similar to the t test but considers the joint rather than the individual explanatory power of the independent variables (regressors). Again we want to reject the null hypothesis that the model has no explanatory power.

The F statistic formula is the following:

F = (Rsquared / (k - 1)) / ((1 - Rsquared) / (n - k))

where n is the number of observations and k the number of parameters (including the constant).

In Model 1, F(3,161) = 221.89, which is higher than the critical value Fcrit = 2.66. We therefore reject the null hypothesis that the model has no explanatory power and conclude that the regressors have joint explanatory power for Life Expectancy.
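As a rough cross-check of the reported F value, the sketch below uses the standard expression of the overall-significance F statistic in terms of Rsquared, with k counting the constant. It assumes n = 165 and k = 4 (so the degrees of freedom are 3 and 161, as reported) and uses the rounded Rsquared of 0.8052, so it only approximately reproduces the reported 221.89.

```python
def f_statistic(r_squared, n, k):
    """Overall F statistic from R-squared, n observations and k parameters."""
    explained = r_squared / (k - 1)          # explained share per restriction
    unexplained = (1 - r_squared) / (n - k)  # unexplained share per residual df
    return explained / unexplained


F_CRIT = 2.66  # 5% critical value for (3, 161) degrees of freedom, from the text

f_value = f_statistic(0.8052, n=165, k=4)
print(round(f_value, 2), f_value > F_CRIT)
```

The result is about 221.8; the small gap to the reported statistic is consistent with Rsquared having been rounded to four decimals.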

Among the tests suggested by the Gauss-Markov theorem and its assumptions, we have to check whether or not our model suffers from heteroskedasticity, using the Breusch-Pagan/Cook-Weisberg test. The null hypothesis is constant variance (homoskedasticity), so here we hope to fail to reject it.

To carry out the Breusch-Pagan test, we square the residuals and run a regression with the squared residuals on the left-hand side, keeping all the right-hand-side variables the same.

The chi-squared statistic is then computed as the sample size (165) times the Rsquared of this auxiliary regression (0.0223).

The test, which can be found in Appendix C, yields chi2 = 3.6795, which lies below the critical value of 7.815 for 3 degrees of freedom at the 5% level. Therefore, we fail to reject the null hypothesis of constant variance and find no evidence of heteroskedasticity. Had we found heteroskedasticity, it could have been addressed by using robust standard errors.
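The Breusch-Pagan arithmetic can be reproduced in a few lines. This is a sketch under stated assumptions: the statistic is taken as n times the auxiliary Rsquared with n = 165 (which reproduces the reported 3.6795), and the p-value uses the closed-form chi-squared CDF that holds specifically for 3 degrees of freedom. The function names are illustrative, not taken from the paper or from any statistics package.

```python
import math


def chi2_cdf_3df(x):
    """CDF of the chi-squared distribution with exactly 3 degrees of freedom."""
    if x <= 0:
        return 0.0
    return (math.erf(math.sqrt(x / 2))
            - math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


def breusch_pagan_lm(n, aux_r_squared):
    """LM statistic: sample size times R-squared of the auxiliary regression."""
    return n * aux_r_squared


CHI2_CRIT = 7.815  # 5% critical value for 3 degrees of freedom, from the text

lm = breusch_pagan_lm(165, 0.0223)
p_value = 1 - chi2_cdf_3df(lm)
reject_constant_variance = lm > CHI2_CRIT

print(round(lm, 4), round(p_value, 3), reject_constant_variance)
```

A p-value well above 0.05 agrees with the conclusion that the null hypothesis of constant variance is not rejected.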

Checking for normality of residuals

The histogram above shows MyResiduals, which broadly follow the pattern of a normal distribution (green line).

This also suggests that the skewness and kurtosis are close to those of the normal distribution, which are 0 and 3 respectively. Given the histogram and our values of skewness (-0.19662909) and kurtosis (2.445363), shown in Appendix E, we can treat the residuals as approximately normally distributed.

When we look at our model we also need to pay attention to the correlation among our explanatory variables. High correlation between the explanatory variables can lead to erratic estimates of the coefficients (multicollinearity). The correlations are shown in the correlation table in Appendix D. Analyzing the table, I found that access to electricity and access to improved sanitation facilities are strongly correlated, since the value obtained, 0.8536, is above the rule-of-thumb threshold of 0.80. We can conclude that multicollinearity is indeed an issue.
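A complementary way to gauge the correlation of 0.8536 is the variance inflation factor (VIF), a diagnostic that this paper does not use and that is shown here only as an illustration. The full VIF for electricity requires regressing it on all other regressors, which needs the underlying data; however, the Rsquared of that regression cannot be smaller than the squared pairwise correlation, so the pairwise value gives a lower bound. The function name below is an assumption for illustration.

```python
def vif_lower_bound(pairwise_r):
    """Lower bound on the VIF implied by a single pairwise correlation.

    VIF = 1 / (1 - R2_aux), and R2_aux >= pairwise_r ** 2 because adding
    regressors to the auxiliary regression can never lower its R-squared.
    """
    return 1.0 / (1.0 - pairwise_r ** 2)


# Correlation between access to electricity and improved sanitation (text).
vif = vif_lower_bound(0.8536)
print(round(vif, 2))
```

The variance of the electricity coefficient is therefore inflated by a factor of at least about 3.7 relative to the case of uncorrelated regressors; the true factor could be higher.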

Fortunately, this problem can be addressed in 3 different ways:

1) Gathering more data in order to reduce standard errors

2) Adding more variables

3) If the previous do not work, dropping one of the variables.

1) Regarding the sample size, I tried to include some other countries, but I could not find all the necessary data for them, having already begun the study with 165 countries. The few observations I tried to add did not reduce the correlation but instead made it worse.

2) I then decided to add the variable Co2 emission, which I believed could have strong explanatory power. However, the correlation between access to electricity and access to improved sanitation facilities still appears to be a problem (Appendix D).

We now want to check whether any higher-order powers of the existing variables are missing that would help the model.

The Ramsey RESET test (ovtest), available in Appendix H, performs a regression specification error test (RESET) for omitted variables. We hope to fail to reject the null hypothesis that the model has no omitted variables. The test gives Fstatistic = 6.33, while Fcrit = 2.66 at the 5% level of significance.

Fstatistic > Fcrit, therefore we reject the hypothesis that the model has no omitted variables.

The model is therefore mis-specified.

Following the results of the last two tests (correlation and omitted variables), I decided to keep the new variable “co2” and to drop one of the two correlated variables. Since access to sanitation facilities had more explanatory power and a larger coefficient than access to electricity, I chose to drop the latter.

Let’s now check how the model changed:

Life Expectancy = 𝛽0+ 𝛽1 Co2 + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities.

[Figure 1: scatter of Life Expectancy against Co2. Figure 2: scatter of Life Expectancy against LogCo2.]

Analyzing the new scatter diagram (Figure 1), I noticed that the relationship between life expectancy and Co2 was not linear, so I decided to take the logarithm of Co2 (LogCo2); the resulting scatter was far more linear than the initial one (Figure 2). Following this result, I considered the problem dealt with.
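The effect of the log transformation can be illustrated with made-up numbers. The sketch below uses entirely hypothetical emissions (in kt) spanning several orders of magnitude and a hypothetical outcome built to be exactly linear in the natural log of emissions; none of these values come from the paper's data. It shows that the linear correlation is much weaker against raw emissions than against their logarithm.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical emissions in kilotons, spanning five orders of magnitude.
co2_kt = [50, 500, 5_000, 50_000, 500_000, 5_000_000]

# Hypothetical outcome constructed to be exactly linear in log(Co2).
life_exp = [55 + 2.0 * math.log(x) for x in co2_kt]

r_raw = pearson(co2_kt, life_exp)                         # against raw Co2
r_log = pearson([math.log(x) for x in co2_kt], life_exp)  # against LogCo2

print(round(r_raw, 3), round(r_log, 3))
```

Because emissions are so skewed across countries, a straight line fits the logged variable far better, which is the pattern described for Figures 1 and 2.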

We now want to test our new final model, whose data summary can be found in Appendix F:

Life Expectancy = 𝛽0+ 𝛽1 LogCo2 + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities

New Regression

When I first chose the explanatory variable Co2, I thought it would have a negative effect on life expectancy because deaths from air pollution would increase. However, after reading the joint study by Julia K. Steinberger, J. Timmons Roberts, Glen P. Peters and Giovanni Baiocchi (2012), “The pathways of human development and carbon emission”, I realized that higher Co2 emissions are found in the most socio-economically developed countries and are correlated with human development, and these countries also have high life expectancy.

The results of the new regression are the following:

1) The t-values are still all statistically significant at the 5% level of significance.

2) The F value is still statistically significant at the 5% level of significance.

3) The Rsquared remained essentially the same (0.8048 compared with 0.8052).

4) The standard errors for water and sanitation are now slightly smaller, which indicates more precise estimates.

The skewness, kurtosis and histogram of the residuals compared with the normal distribution still indicate that we are close to a normal distribution (Appendix G).

Let’s now check whether our model suffers from heteroskedasticity.

The Breusch-Pagan test gives a chi-squared statistic of 3.036, which is well below the critical value of 7.815 for 3 degrees of freedom, so we conclude that this new model does not suffer from heteroskedasticity. Again, for further details see Appendix C.

We want to test the correlation between our explanatory variables to see whether there is any risk of obtaining erratic estimates of the coefficients.

Following the correlation test, we do not find any correlation problem among the variables, since the values reported in the table are all below the threshold of 0.80, and we can conclude that our model no longer suffers from multicollinearity.

We can now proceed with our last test, the Ramsey RESET test for omitted variables.

The result of the test in Appendix H gives Fstatistic = 3.49. Since Fcrit at the 1% level of significance equals 3.91, which is higher than the value found, this time we fail to reject the hypothesis of no omitted variables at the 1% level. The change of variable therefore reduced the evidence of mis-specification compared with Model 1 (Fstatistic = 6.33), although at the 5% level (Fcrit = 2.66) the hypothesis would still be rejected.
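Because the two RESET results are judged against different significance levels, it can help to lay the comparisons side by side. The sketch below uses only the statistics and critical values quoted in this paper; it assumes that the 5% critical value of 2.66 reported for Model 1 also applies to Model 2, which has the same number of regressors. The helper name `reset_decision` is illustrative.

```python
def reset_decision(f_stat, f_crit):
    """Decision on the null hypothesis of no omitted variables."""
    return "reject" if f_stat > f_crit else "fail to reject"


F_CRIT_5PCT = 2.66  # quoted in the text for the 5% level
F_CRIT_1PCT = 3.91  # quoted in the text for the 1% level

results = {
    ("Model 1", "5%"): reset_decision(6.33, F_CRIT_5PCT),
    ("Model 2", "5%"): reset_decision(3.49, F_CRIT_5PCT),
    ("Model 2", "1%"): reset_decision(3.49, F_CRIT_1PCT),
}

for (model, level), decision in results.items():
    print(f"{model} at {level}: {decision} H0 of no omitted variables")
```

The conclusion for Model 2 thus depends on the significance level chosen, so it is worth stating the level explicitly when reporting the result.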

Results

The following table shows the coefficients of all my variables, their standard errors (in brackets), the Rsquared and the sample size of my two models.

Model 1: Life Expectancy = 𝛽0 + 𝛽1 Access to Electricity + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities

Model 2: Life Expectancy = 𝛽0 + 𝛽1 LogCo2 + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities

Variable                          Model 1                      Model 2
Access to Electricity             0.0509704 (0.0179838)        Not included in the model
Improved Water Source             0.102448 (0.0333132)         0.1286791 (0.0319922)
Improved Sanitation Facilities    0.1508989 (0.0196812)        0.1722761 (0.016165)
LogCo2                            Not included in the model    0.337433 (0.1223625)
Constant                          47.08574 (2.097714)          44.00451 (2.222511)
Rsquared                          0.8052                       0.8048
Sample size                       165                          165

Comparing my two models, we can see that Model 1 has lower coefficients and higher standard errors for the water and sanitation regressors; the lower standard errors mean that Model 2 estimates these effects more precisely. The Rsquared of the two models is almost identical (0.8052 and 0.8048).
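One way to make the comparison of standard errors concrete is to turn them into 95% confidence intervals, computed as the coefficient plus or minus the critical t value times the standard error. This sketch uses the figures from the results table and the critical value of 1.9748 quoted earlier; it assumes that value also applies to Model 2, which likewise has 165 observations and four parameters. The names used are illustrative.

```python
T_CRIT = 1.9748  # 5%, two-tailed, 161 degrees of freedom (value from the text)


def confidence_interval(coef, se, t_crit=T_CRIT):
    """Two-sided confidence interval: coefficient +/- t_crit * standard error."""
    margin = t_crit * se
    return coef - margin, coef + margin


# (coefficient, standard error) pairs from the results table.
model_1 = {
    "Electricity": (0.0509704, 0.0179838),
    "Water": (0.102448, 0.0333132),
    "Sanitation": (0.1508989, 0.0196812),
}
model_2 = {
    "LogCo2": (0.337433, 0.1223625),
    "Water": (0.1286791, 0.0319922),
    "Sanitation": (0.1722761, 0.016165),
}

intervals = {}
for label, model in (("Model 1", model_1), ("Model 2", model_2)):
    for name, (coef, se) in model.items():
        low, high = confidence_interval(coef, se)
        intervals[(label, name)] = (low, high)
        print(f"{label} {name}: [{low:.4f}, {high:.4f}]")
```

None of the intervals contains zero, in line with the t tests, and the intervals for water and sanitation are narrower in Model 2 than in Model 1.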

Also, looking back at the econometric method, Model 1 did not perform well in the correlation test or in the Ramsey RESET test for omitted variables. In detail, I found a strong correlation between Access to Electricity and Improved Sanitation Facilities and, after several attempts, I was not able to resolve it, either by increasing the sample size or by adding a new variable (Co2). Furthermore, the Ramsey RESET test indicated a mis-specification of the model. Following the results of these two tests, I chose to drop the Electricity variable, since Sanitation had a larger coefficient and higher explanatory power. I then decided to keep the Co2 variable since, after several tests, I saw that the model was better explained with this new variable taken into account.

I now present my final model (2):

Life Expectancy = 𝛽0+ 𝛽1 LogCo2 + 𝛽2 Improved water source + 𝛽3 Improved sanitation facilities

Compared with Model 1, Model 2 performs at least as well in every test in “Econometric Method”, and it improves on the two tests that Model 1 failed: the correlation test and the Ramsey RESET test for mis-specification. Taking this into consideration, together with the fact that our new coefficients are higher and our standard errors lower than in the first model, we conclude that Model 2 is preferable to Model 1.

Conclusion

My initial hypothesis was that my three explanatory variables, Access to Electricity, Improved Water Source and Improved Sanitation Facilities, would have a positive effect (positive coefficients) on Life Expectancy. The results of the first regression state that a one percentage point increase in Electricity leads to a 0.051 increase in years, or around 20 days, while a one percentage point increase in Improved Water Source or Sanitation increases longevity by around 1-2 months. After several problems, though, I decided to continue with a new model that better fits my investigation. My new variables are LogCo2, Improved Water Source and Improved Sanitation Facilities. As a result of this study, we can infer that a one percentage point increase in either Water Source or Sanitation is associated with an increase in Life Expectancy of around 2 months, and that a one-unit increase in LogCo2 is associated with a 0.337433-year (about 4 months) increase in Life Expectancy across the 165 countries included in the study. If LogCo2 is the natural logarithm, this corresponds to roughly 0.0034 years (a little over one day) for a 1% increase in Co2 emissions.
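The conversions from years into days and months in the conclusion can be reproduced as follows. This is a sketch under stated assumptions: a year is taken as 365.25 days and a month as one twelfth of a year, and the effect of a percentage change in Co2 assumes that LogCo2 is the natural logarithm, which the paper does not state explicitly. The helper names are illustrative.

```python
import math

DAYS_PER_YEAR = 365.25  # assumption used for the conversion


def years_to_days(years):
    return years * DAYS_PER_YEAR


def years_to_months(years):
    return years * 12


def level_log_effect(beta, pct_increase):
    """Change in the outcome (years) for a pct_increase % rise in the
    logged regressor, assuming a natural logarithm."""
    return beta * math.log(1 + pct_increase / 100)


# Model 1: one extra percentage point of access to electricity.
electricity_days = years_to_days(0.0509704)

# Model 2: one extra percentage point of water or sanitation access.
water_months = years_to_months(0.1286791)
sanitation_months = years_to_months(0.1722761)

# Model 2: one-unit increase in LogCo2 versus a 1% increase in Co2.
co2_unit_months = years_to_months(0.337433)
co2_1pct_years = level_log_effect(0.337433, 1)

print(round(electricity_days, 1), round(water_months, 1),
      round(sanitation_months, 1), round(co2_unit_months, 1),
      round(years_to_days(co2_1pct_years), 1))
```

Under the natural-log assumption, the 4-month figure applies to a one-unit change in LogCo2 (emissions multiplied by about 2.7), while a 1% rise in emissions corresponds to a change of roughly one day.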

List of Countries

1. Afghanistan
2. Albania
3. Algeria
4. Argentina
5. Armenia
6. Aruba
7. Australia
8. Austria
9. Azerbaijan
10. Bahamas
11. Bahrain
12. Bangladesh
13. Barbados
14. Belarus
15. Belgium
16. Belize
17. Benin
18. Bhutan
19. Bolivia
20. Bosnia & Herz
21. Botswana
22. Brazil
23. Bulgaria
24. Burkina Faso
25. Burundi
26. Cabo Verde
27. Cambodia
28. Canada
29. Chad
30. Chile
31. China
32. Colombia
33. Comoros
34. Congo, Dem. Rep.
35. Congo, Rep.
36. Croatia
37. Cuba
38. Cyprus
39. Czech Republic
40. Denmark
41. Djibouti
42. Dominican Republic
43. Ecuador
45. El Salvador
46. Eritrea
47. Estonia
48. Ethiopia
49. Fiji
50. Finland
51. France
52. French Polynesia
53. Gabon
54. Georgia
55. Germany
56. Ghana
57. Greece
58. Grenada
59. Guatemala
60. Guinea
61. Guinea-Bissau
62. Guyana
63. Haiti
64. Honduras
65. Hungary
66. Iceland
67. India
68. Indonesia
69. Iran
70. Iraq
71. Ireland
72. Israel
73. Italy
74. Jamaica
75. Japan
76. Jordan
77. Kazakhstan
78. Kenya
79. Kiribati
80. Korea, Dem. Rep.
81. Korea, Rep.
82. Kuwait
83. Kyrgyz Republic
84. Lao
85. Latvia
86. Lebanon
87. Liberia
89. Luxembourg
90. Macedonia
91. Malawi
92. Malaysia
93. Maldives
94. Mali
95. Malta
96. Mauritania
97. Mauritius
98. Mexico
99. Micronesia
100. Moldova
101. Mongolia
102. Montenegro
103. Morocco
104. Mozambique
105. Myanmar
106. Namibia
107. Nepal
108. Netherlands
109. New Caledonia
110. Nicaragua
111. Niger
112. Norway
113. Oman
114. Pakistan
115. Panama
116. Paraguay
117. Peru
118. Philippines
119. Poland
120. Portugal
121. Qatar
122. Romania
123. Russian Federation
124. Rwanda
125. Samoa
126. Sao Tome & Principe
127. Saudi Arabia
128. Senegal
129. Serbia
130. Seychelles
131. Sierra Leone
133. Slovak Republic
134. Slovenia
135. Solomon Islands
136. Spain
137. Sri Lanka
138. St. Lucia
139. Sudan
140. Suriname
141. Sweden
142. Switzerland
143. Syrian Arab Republic
144. Tajikistan
145. Thailand
146. Timor-Leste
147. Togo
148. Tonga
149. Trinidad and Tobago
150. Tunisia
151. Turkey
152. Uganda
153. Ukraine
154. United Arab Emirates
155. United Kingdom
156. United States
157. Uruguay
158. Uzbekistan
159. Vanuatu
160. Venezuela
161. Vietnam
162. West Bank & Gaza
163. Yemen, Rep.
164. Zambia
165. Zimbabwe

Appendix A

Name of Variable: Life Expectancy (At Birth)
Source of Data: http://databank.worldbank.org/
Website link to the data: http://data.worldbank.org/indicator/SP.DYN.LE00.IN
Year of the Data: 2012
Definition: Measure for the average years a person is expected to live, at birth, cross-country.

Name of Variable: Access to Electricity
Source of Data: http://databank.worldbank.org/
Website link to the data: http://data.worldbank.org/indicator/EN.ATM.CO2E.KT
Year of the Data: 2012
Definition: % of population of a country with access to electricity

Name of Variable: Improved Water Source
Source of Data: http://databank.worldbank.org/
Website link to the data: http://data.worldbank.org/indicator/SH.H2O.SAFE.ZS
Year of the Data: 2012
Definition: % of population of a country with access to improved water source (Drinkable water)

Name of Variable: Improved Sanitation Facilities
Source of Data: http://databank.worldbank.org/
Website link to the data: http://data.worldbank.org/indicator/SH.STA.ACSN
Year of the Data: 2012
Definition: % of population of a country using improved sanitation facilities

Name of Variable: Co2 Emission
Source of Data: http://databank.worldbank.org/
Website link to the data: http://data.worldbank.org/indicator/EN.ATM.CO2E.KT
Year of the Data: 2011
Definition: Carbon dioxide emissions measured in kilotons (kt).

Appendix B

Scatter diagrams

Scatter lifeexp accesselettr Scatter lifeexp watersource

Scatter lifeexp sanfacilities

Appendix C

Regression MyResidual electricity water sanitation (Breusch-Pagan)

Regression MyResidual2 logCo2 water sanitation (Breusch-Pagan)

Appendix D

Correlation model 1 Correlation model 2

Correlation model 1 (With Co2)

Appendix E

Histogram of my residual and normal distribution Model 2

Summary details of residuals

Appendix F

Summarize data model 2

Appendix G

Summary of details of residuals

Appendix H

Ramsey reset test model 1

Ramsey reset test model 2

Bibliography

AfricaPublicHealth.info, (n.d.). Combined Global and African Ranking - 25 Country Populations with the Least Sustainable Access to Improved / Clean Water Sources. [online] Available at: http://www.who.int/pmnch/media/news/2012/201205_africa_scorecard.pdf [Accessed 31 Jan. 2016]

National Institute on Aging, (2011). Living Longer. [online] Available at: https://www.nia.nih.gov/research/publication/global-health-and-aging/living-longer [Accessed 29 Jan. 2016].

ScienceDaily, (2012). What is the connection between carbon emissions, life expectancy and income? [online] Available at: http://www.sciencedaily.com/releases/2012/01/120126100641.htm [Accessed 1 Feb. 2016].

Who.int, (n.d.). WHO | Life expectancy. [online] Available at: http://www.who.int/gho/mortality_burden_disease/life_tables/situation_trends_text/en/ [Accessed 29 Jan. 2016].