Empirical Essays on Poverty, Inequality, and Social Welfare
by
Brian Daniel McCaig
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
2 CHAPTER 2 THE EVOLUTION OF INCOME INEQUALITY IN VIETNAM, 1993-2006
2.1 INTRODUCTION
2.2 BACKGROUND
2.2.1 LAND AND AGRICULTURAL REFORM
2.2.2 PRIVATE BUSINESS REFORM
2.2.3 SOE AND GOVERNMENT REFORM
2.2.4 TRADE REFORM
2.2.5 INVESTMENT
2.2.6 LABOUR
2.3 LITERATURE REVIEW
2.3.1 GENERAL EVIDENCE ON POVERTY
2.3.2 EVIDENCE ON INEQUALITY, 1993-1998
2.3.2.1 Urban-rural inequality from 1993 to 1998
2.3.2.2 Regional inequality 1993 to 1998
2.3.2.3 Inequality based on household characteristics
2.3.2.4 Some evidence relating to income and employment
2.3.3 INEQUALITY EVIDENCE SINCE 1998
2.4 OUR CONTRIBUTION
2.5 DATA
2.5.1 HOUSEHOLD SURVEY DESIGN
2.5.2 HOUSEHOLD MEMBERSHIP AND MISSING MEMBERS
2.6 ESTIMATION OF HOUSEHOLD INCOME
2.7 MAIN EVIDENCE
2.7.1 IMPACTS OF REFORMS ON EMPLOYMENT
2.7.2 EVOLUTION OF HOUSEHOLD AND INDIVIDUAL ENDOWMENTS
2.7.2.1 Household size
2.7.2.2 Distribution of land
2.7.2.3 Distribution of human capital (education)
2.7.3 INCOMES
2.7.3.1 Mean per capita income
2.7.3.2 Comparison of Consumption and Income Behaviour
2.7.3.3 Composition of mean per capita income
2.8 DISTRIBUTIVE DIMENSIONS OF VIETNAM’S GROWTH
2.8.1 POVERTY TRENDS
2.8.2 INEQUALITY
2.8.2.1 Initial levels in 1993
2.8.2.2 Inequality trends
2.8.2.3 Comparison of consumption and income inequality
2.8.2.4 Role of spatial differences
2.8.2.5 Decompositions by source of income
2.9 ROBUSTNESS CHECKS
2.10 DISCUSSION AND CONCLUSION
2.11 DATA APPENDIX
2.11.1 ESTIMATION OF HOUSEHOLD INCOME
2.11.1.1 Income from crops
2.11.1.2 Income from agricultural sidelines
2.11.1.3 Household business income
2.11.1.4 Wage income
2.11.1.5 Remittances and gifts
2.11.1.6 “Other” residual sources of income
3 CHAPTER 3 EXPORTING OUT OF POVERTY: PROVINCIAL POVERTY IN VIETNAM AND U.S. MARKET ACCESS
3.1 INTRODUCTION
3.2 BACKGROUND
3.3 OVERVIEW OF THE U.S.-VIETNAM BILATERAL TRADE AGREEMENT
3.4 DATA
3.4.1 TARIFF DATA
3.4.2 HOUSEHOLD SURVEYS
3.4.3 EMPLOYMENT DATA
3.5 EMPIRICAL METHODOLOGY
3.5.1 EXOGENEITY OF U.S. TARIFF CUTS
3.5.2 UNDERLYING TRENDS AND CONTEMPORARY SHOCKS
3.6 EMPIRICAL RESULTS
3.6.1 ROBUSTNESS OF RESULTS
3.7 LABOUR MARKET TRANSMISSION MECHANISMS
3.7.1 WAGES
3.7.2 JOB CREATION
3.8 DISCUSSION OF RESULTS
3.9 CONCLUDING REMARKS
3.10 APPENDIX A: MEASUREMENT ERROR
3.11 APPENDIX B: DATA
REFERENCES
List of Tables
Table 1.1 Parameter values of the simulated bivariate normal distributions
Table 1.2 Level and power of testing procedure
Table 1.3 Summary statistics for all individuals 25 years of age or older
Table 1.4 Estimated p-values for tests of stochastic dominance for individuals 25 years of age and older
Table 1.5 Estimated p-values for tests of restricted first-order stochastic dominance for individuals 25 years of age and older
Table 1.6 Summary statistics for all working individuals
Table 1.7 Estimated p-values for tests of stochastic dominance for all working individuals
Table 1.8 Estimated p-values for tests of first-order stochastic dominance for all working
In recent decades development economics has moved beyond its historical focus on
growth, broadening the concept of development to include distributional issues and non-
monetary outcomes. A strong proponent of this shift in focus has been Amartya Sen (see for
example Sen (1982)). This change in focus has been motivated by a number of issues. These
include, but are not limited to, examples of countries that experienced sustained growth in mean
income, but stagnation in other welfare measures such as poverty, malnutrition, infant mortality,
etc., and theoretical findings suggesting that inequality and poverty may hamper overall growth.
The broadening of the concept of economic development requires better tools for analyzing
distributional issues, more detailed high-quality data on changes in the distribution of important
welfare indicators in developing countries, and the analysis of the distributional impacts of
economic reforms.
This thesis contributes to the literature focusing on distributional issues in three ways: by
proposing a statistical testing procedure useful for comparisons of multidimensional social
welfare and poverty, by providing a detailed description of the evolution of income inequality in
Vietnam between 1993 and 2006, and by examining the impacts of a major policy shock on
poverty in Vietnam between 2002 and 2004.
In the first chapter, which is co-authored with Adonis Yatchew, we address the need for
better tools for analyzing changes in the distribution of welfare across populations. As the focus
of development economics has moved beyond simply using mean outcomes, to including
measures sensitive to the distribution of welfare, such as inequality and poverty, it has also
expanded to include the use of multiple measures of welfare simultaneously. The increased
dimensionality and attention to the distribution of welfare introduce numerous challenges for
comparing changes in welfare in a statistical sense. First, how important are gains in one welfare
indicator relative to another indicator? Second, how sensitive are the comparisons to changes in
the chosen index or, in the case of poverty, to the poverty line used? These difficulties make
approaches based on stochastic dominance highly appealing. Not only have these approaches
proven useful for robust rankings of univariate distributions but, as we show in Chapter 1, tests
for stochastic dominance can be extended to multivariate distributions. We propose a statistical
testing procedure that can be applied to tests of multidimensional stochastic dominance in the
context of comparing social welfare and poverty between two populations.
The second and third chapters both focus on Vietnam during its period of transition from
a centralized to a decentralized economy. Ever since Vietnam’s major economic reforms began
in 1986 under Doi Moi, Vietnam has grown exceptionally fast, whether measured by GDP per
capita in the national accounts or income per capita in household surveys. Vietnam’s rapid
growth has been accompanied by a rapid reduction in absolute poverty and a slight rise in
relative inequality, as measured by per capita consumption. Chapter 2, which is co-authored with
Dwayne Benjamin and Loren Brandt, contributes to the literature analyzing the trends in
Vietnam’s distributional outcomes by shifting the focus to income per capita as a measure of
welfare. While many researchers argue that consumption is preferred to income as a measure of
welfare, it is not a sufficient statistic for identifying all dimensions of welfare (hence the appeal
of multivariate approaches such as the one proposed in Chapter 1). Moreover, in an economy
that is growing as rapidly as Vietnam’s, consumption and income may diverge in systematic
ways as savings rates change throughout the distribution over time. Finally, using income allows
for additional sets of inequality decompositions that cannot be performed using consumption
data. It is with these reasons in mind that we present a detailed set of facts related to the
evolution of income inequality within Vietnam between 1993 and 2006. We find that Vietnam
experienced a period of initially falling inequality, between 1993 and 2002, followed by a period
of stable levels of inequality, between 2002 and 2006.
During the 1990s and the first decade of the new century, a number of large policy
shocks have hit the Vietnamese economy. Many of these shocks are likely to have both
efficiency and distributional implications for the population. In this context, the third chapter
complements the descriptive nature of Chapter 2 with an econometric analysis of the distributive
implications of a major trade agreement. Specifically, the goal of the third chapter, the core
chapter of the thesis, is to explore the poverty impacts of the 2001 U.S.-Vietnam Bilateral Trade
Agreement. In a cross-province framework, I find that provinces that were most exposed to the
U.S. tariff cuts experienced the most rapid reductions in poverty. One of the mechanisms at work
appears to be an increase in the relative demand for low-skilled workers in response to the large
increase in low-skilled-labour-intensive exports from Vietnam to the U.S. The increase in
relative demand for low-skilled workers may be one of the reasons for relatively equal growth
between 2002 and 2006 in Vietnam as it helped to promote income growth for those most likely
to be in the lower end of the income distribution.
This thesis aims to contribute to the growing literature concerning the distributional and
multidimensional aspects of economic development. It does so in three distinct, but connected
chapters. Chapter 3 explores the poverty implications of a major trade shock to the Vietnamese
economy. As such, it explores one of the possible mechanisms underlying
Vietnam’s growth with equality, which is documented in a stylized manner in Chapter 2. By
comparison, Chapter 1 is slightly more abstract, but presents a statistical testing procedure useful
for exploring multidimensional welfare analysis. It is hoped that all three chapters can be useful
additions to the literature.
1 Chapter 1 International Welfare Comparisons and
Nonparametric Testing of Multivariate Stochastic Dominance
1.1 Introduction
Cross-country welfare comparisons commonly use statistics such as median or mean per-
capita incomes, average hours worked, the proportion of the population living below the poverty
line and so on. The statistical theory for testing hypotheses about one or more such statistics is
generally available. Individual statistics, however, capture only one characteristic of a
distribution. Often, there is interest in making point-wise comparisons of entire distributions to
each other, for example using measures of stochastic dominance in the context of comparing
social welfare, inequality, and poverty. In the simplest case the distribution of income in country
“a” first-order stochastically dominates country “b” if for any income level “x” the proportion of
the population with income at or below “x” is lower in country “a” than in country “b”.
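The definition above can be checked directly on two samples. The following is a minimal Python sketch, not part of the original analysis: the function names and the income figures are hypothetical, and the check uses the weak inequality between empirical distribution functions. It only describes the samples; it is not a statistical test.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function x -> share of observations at or below x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def first_order_dominates(sample_a, sample_b):
    """True if the empirical CDF of "a" lies at or below that of "b" everywhere.

    Both empirical CDFs are step functions that change only at observed
    values, so checking the pooled observations is sufficient.
    """
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    return all(cdf_a(x) <= cdf_b(x) for x in set(sample_a) | set(sample_b))

# Hypothetical incomes: each rank in "a" is richer than the same rank in "b".
incomes_a = [12, 18, 25, 40, 60]
incomes_b = [10, 15, 22, 35, 50]
dominates_ab = first_order_dominates(incomes_a, incomes_b)
dominates_ba = first_order_dominates(incomes_b, incomes_a)
```

In this illustration "a" dominates "b" but not the reverse, so any index that is increasing in income would rank "a" above "b".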
More generally, one may be interested in simultaneous comparisons between multiple
indicators of welfare. For example, one might want to test whether one country stochastically
dominates another with respect to several variables. Or, one might want to test whether one
country dominates in some dimensions while another dominates in others. For example, while
U.S. per capita GDP is substantially higher than in France, the French work fewer hours per
week.1
The primary advantage of stochastic dominance criteria is their robustness to changes in
the functional form of the social welfare or poverty index. Unlike indices, stochastic dominance
conditions do not require specification of the functional form of the utility or poverty function.
This allows practitioners to draw strong conclusions when stochastic dominance conditions are
met. If stochastic dominance holds, one can make robust inferences over all indices that share a
common set of properties. On the other hand, stochastic dominance conditions do not always
1 Such comparisons have filtered into the media and inform the debate on societal as well as individual choices. See e.g., Krugman (2005).
allow for two populations to be ranked. It is in this sense that orderings provided by stochastic
dominance are partial as compared to the complete rankings provided by indices. However,
when stochastic dominance does not hold over the entire support, it may hold over subsets of the
support that are of particular interest for poverty comparisons.
This paper outlines a class of statistical procedures that permit testing of a broad range of
multi-dimensional stochastic dominance hypotheses and more generally, hypotheses that rely
upon multiple stochastic dominance conditions. We conduct a small Monte Carlo study to
examine the size and power properties of the test procedure. We then apply the procedures to
data on income and leisure hours from Germany, the U.K. and the U.S. For individuals 25 years
of age or older we find that no country first-order stochastically dominates the others in both
dimensions. Furthermore, while the U.S. stochastically dominates Germany and the U.K. with
respect to income, in most periods Germany is stochastically dominant with respect to leisure
hours. In addition, we find evidence of lower bivariate poverty in Germany as compared to both
the U.K. and the U.S. We check the robustness of our main results using alternative population
groups and find similar results. Before proceeding, we provide a brief and selective outline of
related work.
Atkinson and Bourguignon (1982), working in a bivariate context, summarize the
theoretical relationship between comparisons of social welfare and first-order and second-order
stochastic dominance conditions. Specifically, assume a bivariate social welfare function of the
form $W = \int\!\!\int U(z_1, z_2)\, dG(z_1, z_2)$, where $G(z_1, z_2)$ is a cumulative distribution function defined
over the rectangle $(z_1, z_2) \in [0, \bar{z}_1] \times [0, \bar{z}_2]$.2 Atkinson and Bourguignon (1982) show that the
difference in social welfare between two populations can be written as:
$$
\begin{aligned}
W_a - W_b ={}& \int_0^{\bar{z}_1}\!\!\int_0^{\bar{z}_2} U_{12}(z_1, z_2)\,\bigl[G_a(z_1, z_2) - G_b(z_1, z_2)\bigr]\, dz_2\, dz_1 \\
&- \int_0^{\bar{z}_1} U_1(z_1, \bar{z}_2)\,\bigl[G_a(z_1, \bar{z}_2) - G_b(z_1, \bar{z}_2)\bigr]\, dz_1 \\
&- \int_0^{\bar{z}_2} U_2(\bar{z}_1, z_2)\,\bigl[G_a(\bar{z}_1, z_2) - G_b(\bar{z}_1, z_2)\bigr]\, dz_2 .
\end{aligned}
\tag{1.1}
$$
2 It is not necessary that the lower bound equal 0, only that it be finite. However, for most indicators of social welfare 0 is a natural lower bound.
This leads to a set of sufficient, but not necessary, conditions for social welfare in population “a”
to be at least as great as that for population “b”. If $G_a(z_1, z_2) \le G_b(z_1, z_2)$ for all $(z_1, z_2)$ within
the support of the cumulative distribution functions, that is, if $G_a$ first-order stochastically
dominates $G_b$, then social welfare is at least as large in population “a” as in population “b” for
all $U \in \{U : U_1, U_2 \ge 0,\ U_{12} \le 0\}$, where subscripts denote own and cross-partial derivatives.
An alternative set of sufficient conditions exists when the cross-partial derivative of $U$ is
non-negative. Define $K(z_1, z_2) \equiv G(z_1, \bar{z}_2) + G(\bar{z}_1, z_2) - G(z_1, z_2)$, where $\bar{z}_1$ and $\bar{z}_2$ are the upper
bounds of the support. Then the difference in social welfare can be expressed as:
$$
\begin{aligned}
W_a - W_b ={}& -\int_0^{\bar{z}_1}\!\!\int_0^{\bar{z}_2} U_{12}(z_1, z_2)\,\bigl[K_a(z_1, z_2) - K_b(z_1, z_2)\bigr]\, dz_2\, dz_1 \\
&- \int_0^{\bar{z}_1} U_1(z_1, 0)\,\bigl[K_a(z_1, 0) - K_b(z_1, 0)\bigr]\, dz_1 \\
&- \int_0^{\bar{z}_2} U_2(0, z_2)\,\bigl[K_a(0, z_2) - K_b(0, z_2)\bigr]\, dz_2 .
\end{aligned}
\tag{1.2}
$$
Now, if $K_a(z_1, z_2) \le K_b(z_1, z_2)$ for all $(z_1, z_2)$ within the support of the cumulative distribution
functions, that is, if $K_a$ first-order stochastically dominates $K_b$, then social welfare is at least as
large in population “a” as in population “b” for all $U \in \{U : U_1, U_2, U_{12} \ge 0\}$.3 In the former case
of a non-positive cross-partial, one can think of the two indicators of well-being as substitutes.
Hence, a social planner could increase social welfare by decreasing the correlation of indicators
across individuals while holding the marginal distributions constant. In the latter case of a non-
negative cross-partial, indicators are complements and a social planner could increase social
welfare by increasing the correlation of indicators across individuals.
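The role of correlation can be illustrated numerically. The sketch below is a hypothetical Python example, not taken from Atkinson and Bourguignon (1982): it builds two three-person populations with identical marginal distributions but opposite sorting, and evaluates the empirical joint distribution function and the function $K$ (sum of the two marginals minus the joint distribution function) at a few points. The function and variable names are illustrative assumptions.

```python
def joint_cdf(pairs, z1, z2):
    """Share of individuals with first indicator <= z1 and second <= z2."""
    return sum(1 for (u, v) in pairs if u <= z1 and v <= z2) / len(pairs)

def k_function(pairs, z1, z2):
    """K = marginal CDF of indicator 1 + marginal CDF of indicator 2 - joint CDF."""
    inf = float("inf")
    return (joint_cdf(pairs, z1, inf) + joint_cdf(pairs, inf, z2)
            - joint_cdf(pairs, z1, z2))

# Same marginals (values 1, 2, 3 in each dimension), opposite correlation.
positively_sorted = [(1, 1), (2, 2), (3, 3)]
negatively_sorted = [(1, 3), (2, 2), (3, 1)]
grid = [(z1, z2) for z1 in (1, 2, 3) for z2 in (1, 2, 3)]

# With lower correlation the joint CDF is (weakly) lower everywhere ...
g_lower_when_negative = all(
    joint_cdf(negatively_sorted, *z) <= joint_cdf(positively_sorted, *z)
    for z in grid)
# ... while with higher correlation K is (weakly) lower everywhere.
k_lower_when_positive = all(
    k_function(positively_sorted, *z) <= k_function(negatively_sorted, *z)
    for z in grid)
```

In this example the negatively sorted population dominates in terms of $G$ (the substitutes case), and the positively sorted population dominates in terms of $K$ (the complements case), which matches the social planner argument above.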
More recently, Bourguignon and Chakravarty (2002) and Duclos et al. (2004) adapt this
framework to the study of multidimensional poverty. Both papers use a similar motivation to
Atkinson and Bourguignon (1982), but arrive at somewhat different stochastic dominance
conditions. Bourguignon and Chakravarty (2002) integrate from 0 to the poverty cutoff in both
3 Atkinson and Bourguignon (1982) also provide the various derivative conditions required for social welfare comparisons based on second-order stochastic dominance, while Crawford (2005) extends these results to comparisons based on third-order stochastic dominance.
dimensions, whereas Duclos et al. (2004) integrate over the poverty space where the poverty
threshold in one dimension is a function of the level of well-being in the other dimension. For a
poverty space that exists strictly within the interior of the support, the Duclos et al. (2004)
conditions are weaker, as they only require stochastic dominance within this interior space,
whereas the Bourguignon and Chakravarty (2002) conditions also require stochastic dominance
along the marginal distributions, up to the respective poverty thresholds.
These bivariate conditions, along with the univariate counterparts, are part of the
motivation for employing statistical tests of stochastic dominance. Recently, several such tests
have been proposed in the literature. Tests of univariate stochastic dominance available to date
can be broadly divided into two groups: those that use a modified Kolmogorov-Smirnov (KS)
statistic and those that perform simultaneous tests at multiple points. In the first category are the
tests proposed by McFadden (1989), Klecan et al. (1991), Kaur et al. (1994), Maasoumi and
Heshmati (2000), Barrett and Donald (2003), and Linton et al. (2005). A central issue for KS-
type tests of stochastic dominance is the determination of critical values. The second category of
tests includes those suggested by Anderson (1996) and Davidson and Duclos (2000). These tests
suffer from possible inconsistency, but they do not require bootstrapping, simulation, or
subsampling procedures to determine critical values.
Multivariate tests of stochastic dominance may be found in Crawford (2005) and Duclos
et al. (2004). Both procedures involve multiple comparisons on a grid of points. Crawford
extends Anderson (1996) to multiple dimensions while Duclos et al. similarly extend Davidson
and Duclos (2000). For both procedures, at each point of comparison a t-statistic is calculated
and compared to the appropriate critical value obtained from the studentized maximum modulus
(SMM). As stated by Tse and Zhang (2004), the nominal size of the SMM test overstates the
actual size when the tests are not independent.
The present paper is organized as follows. Section 1.2 establishes notation and sets out
the statistical procedures. These are an extension of tests found in Hall and Yatchew (2005).
Section 1.3 describes the results of simulations and Section 1.4 discusses empirical results.
1.2 Notation and Statistical Procedure
The test procedures outlined in this paper are applicable in univariate and multivariate
settings. They require use of the bootstrap and involve integration over the support, thereby
avoiding the multiple test problem described above. In practice, this allows the researcher to test
for stochastic dominance over as dense a grid of points as desired, without affecting the size and
power properties of the test. Furthermore, in contrast to the univariate KS-type test procedures,
our procedure should have more power over deviations from the null hypothesis that are small in
magnitude, but persist over a large portion of the support. However, KS-type test procedures
would be expected to have greater power against alternatives that feature a large deviation from
the null that is not very persistent.
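The contrast between the two types of statistic can be made concrete with two stylized violation profiles. The following Python sketch is only an illustration under assumed, hypothetical profiles. It compares the magnitudes of a supremum-type and an integral-type summary on a grid and does not measure power, which also depends on sampling variability.

```python
import math

def sup_statistic(psi_values):
    """KS-type summary: the largest violation found on the grid."""
    return max(psi_values)

def integrated_statistic(psi_values):
    """Integral-type summary: root mean squared violation over an evenly
    spaced grid on the unit interval."""
    return math.sqrt(sum(p * p for p in psi_values) / len(psi_values))

grid_size = 100
# Hypothetical violation of 0.1 that persists over the whole support.
persistent = [0.1] * grid_size
# Hypothetical violation of 0.3 confined to 4% of the support.
brief = [0.3 if i < 4 else 0.0 for i in range(grid_size)]

sup_persistent, sup_brief = sup_statistic(persistent), sup_statistic(brief)
int_persistent, int_brief = (integrated_statistic(persistent),
                             integrated_statistic(brief))
```

The supremum-type summary is larger for the brief violation, whereas the integral-type summary is larger for the persistent one.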
Let $G_a$ and $G_b$ denote two right-continuous k-dimensional cumulative distribution functions (CDFs). For convenience, assume that the support of the CDFs is $\Lambda$, the unit cube in $\mathbb{R}^k$.4 We are interested in testing hypotheses of the form
$$H_0 : G_a \succeq_s G_b$$
where $\succeq_s$ denotes weak stochastic dominance of order s. Let $D_a^1(\mathbf{z}) = G_a(\mathbf{z})$ and define
$$D_a^s(\mathbf{z}) = \int_{\mathbf{0}}^{\mathbf{z}} D_a^{s-1}(\mathbf{u})\, d\mathbf{u}$$
for integers $s \ge 2$. (An analogous definition applies to $D_b^s(\mathbf{z})$.) Weak stochastic dominance of
order $s$ holds iff $D_a^s(\mathbf{z}) \le D_b^s(\mathbf{z})$ for all $\mathbf{z} \in \Lambda$, while strong stochastic dominance additionally
requires a strict inequality over some region of $\Lambda$, denoted by $\succ_s$. For $\boldsymbol{\lambda} \in \Lambda$, let
$\psi(\boldsymbol{\lambda}) = \max\{D_a^s(\boldsymbol{\lambda}) - D_b^s(\boldsymbol{\lambda}), 0\}$. Then the null hypothesis is true iff $\psi(\boldsymbol{\lambda}) = 0$ for all $\boldsymbol{\lambda} \in \Lambda$.
Define
$$T = \left\{ \int_{\Lambda} \bigl[\psi(\boldsymbol{\lambda})\bigr]^2 d\boldsymbol{\lambda} \right\}^{1/2} . \tag{2.1}$$
The objective is to estimate $T$ and to test whether it is statistically different from zero. Let
$(\mathbf{w}_{a1}, \ldots, \mathbf{w}_{an_a})$ and $(\mathbf{w}_{b1}, \ldots, \mathbf{w}_{bn_b})$ be independently and identically distributed observations from
the two respective populations with corresponding empirical distribution functions $\hat{G}_a$ and $\hat{G}_b$.
Our test statistic, say $\hat{T}$, is obtained by substituting numerical analogues of $D_a^s$ and $D_b^s$, say $\hat{D}_a^s$
and $\hat{D}_b^s$, into $T$.
When examining indicators of well-being one may want to form hypotheses based on
subsets of the indicators. For example, one might want to test that the income distribution of
4 Rescaling of the data, which may be implemented in a variety of ways, becomes important later when we introduce hypotheses involving more than one stochastic dominance condition. In large samples, the method should not affect the conclusions; however, in modestly sized samples one approach or another may possess superior finite sample properties.
country “a” dominates that of country “b”, but the leisure-time distribution of country “b”
dominates that of country “a”. In this spirit, partition the k-dimensional vectors $\mathbf{w}_a, \mathbf{w}_b$ into sub-vectors
of dimension $k'$ and $(k - k')$ respectively, where $0 < k' < k$; i.e., $\mathbf{w}_a = (\mathbf{x}_a, \mathbf{y}_a)$ and
$\mathbf{w}_b = (\mathbf{x}_b, \mathbf{y}_b)$. A more general hypothesis is given by
$$H_0 : G_a^x \succeq_{s_x} G_b^x \ \text{ and } \ G_b^y \succeq_{s_y} G_a^y$$
where we write $G_a^x(\mathbf{x}) = G_a(\mathbf{x}, \mathbf{1})$ and the other distribution functions are defined similarly. We
allow for the order of dominance to vary between the two subsets of indicators as denoted by $s_x$
and $s_y$. In the multivariate case, the above hypothesis asserts stochastic dominance with respect
to marginal distributions (which may themselves be multi-dimensional) without requiring
stochastic dominance everywhere. Let $\Lambda_x$ be the unit cube in $\mathbb{R}^{k'}$ and let $\Lambda_y$ be the unit cube in
$\mathbb{R}^{(k-k')}$. For $\boldsymbol{\lambda}_x \in \Lambda_x$, let $\psi_x(\boldsymbol{\lambda}_x) = \max\{D_a^{x,s_x}(\boldsymbol{\lambda}_x) - D_b^{x,s_x}(\boldsymbol{\lambda}_x), 0\}$ and for $\boldsymbol{\lambda}_y \in \Lambda_y$, let
$\psi_y(\boldsymbol{\lambda}_y) = \max\{D_b^{y,s_y}(\boldsymbol{\lambda}_y) - D_a^{y,s_y}(\boldsymbol{\lambda}_y), 0\}$. Define
$$T = \left\{ \int_{\Lambda_x} \bigl[\psi_x(\boldsymbol{\lambda}_x)\bigr]^2 d\boldsymbol{\lambda}_x \right\}^{1/2} + \left\{ \int_{\Lambda_y} \bigl[\psi_y(\boldsymbol{\lambda}_y)\bigr]^2 d\boldsymbol{\lambda}_y \right\}^{1/2} . \tag{2.2}$$
Defining the support of $\mathbf{x}$ and $\mathbf{y}$ as unit cubes in $\mathbb{R}^{k'}$ and $\mathbb{R}^{(k-k')}$ ensures that the additive terms
in (2.2) are not of radically different orders of magnitude. This becomes important for $s_x, s_y \ge 2$,
as integrating over the CDFs would otherwise introduce the units of the variable into the
integrated values.
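The point about units can be seen in a small univariate example. The Python sketch below is a hypothetical illustration: the incomes, the bounds used for rescaling, and the function names are assumptions, and rescaling by fixed bounds is only one of several possible approaches. It uses the fact that the integral of an empirical distribution function from 0 to $z$ equals the sample mean of $\max(z - w_i, 0)$.

```python
def rescale(values, lower, upper):
    """Map values from [lower, upper] onto the unit interval."""
    return [(v - lower) / (upper - lower) for v in values]

def integrated_cdf(sample, z):
    """Second-order function: integral of the empirical CDF from 0 to z.

    For a step-function CDF this integral equals the mean of max(z - w, 0).
    """
    return sum(max(z - w, 0.0) for w in sample) / len(sample)

# Hypothetical incomes in dollars, with assumed bounds of 0 and 100,000.
incomes = [20000.0, 60000.0]
raw_value = integrated_cdf(incomes, 100000.0)        # measured in dollars
unit_value = integrated_cdf(rescale(incomes, 0.0, 100000.0), 1.0)  # unit-free
```

Here the raw integrated value is 60,000 (in dollars), while the rescaled value is approximately 0.6, comparable in magnitude to a similarly rescaled second indicator such as leisure hours.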
As noted above, Atkinson and Bourguignon (1982) derive second-order stochastic
dominance conditions that are sufficient to identify a positive difference in social welfare in the
bivariate case. One example would be the compound hypothesis:
$$H_0 : G_a^x \succeq_2 G_b^x \ \text{ and } \ G_a \succeq_2 G_b \ \text{ and } \ G_a^y \succeq_2 G_b^y .$$
To properly test this null hypothesis requires the simultaneous testing of all three conditions.5 To
test the above null hypothesis we define $\Lambda_x$ and $\Lambda_y$ as unit intervals in $\mathbb{R}$ and $\Lambda$ as the unit
5 This is not generally a concern for first-order stochastic dominance conditions as the conditions on the marginal distributions are a part of the condition on the bivariate distribution. However, the second-order stochastic
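square in $\mathbb{R}^2$. For $\lambda_x \in \Lambda_x$, let $\psi_x(\lambda_x) = \max\{D_a^{x,2}(\lambda_x) - D_b^{x,2}(\lambda_x), 0\}$; for $\lambda_y \in \Lambda_y$, let
$\psi_y(\lambda_y) = \max\{D_a^{y,2}(\lambda_y) - D_b^{y,2}(\lambda_y), 0\}$; and for $\boldsymbol{\lambda} \in \Lambda$, let $\psi(\boldsymbol{\lambda}) = \max\{D_a^2(\boldsymbol{\lambda}) - D_b^2(\boldsymbol{\lambda}), 0\}$.
The corresponding statistic sums the three terms:
$$T = \left\{ \int_{\Lambda_x} \bigl[\psi_x(\lambda_x)\bigr]^2 d\lambda_x \right\}^{1/2} + \left\{ \int_{\Lambda_y} \bigl[\psi_y(\lambda_y)\bigr]^2 d\lambda_y \right\}^{1/2} + \left\{ \int_{\Lambda} \bigl[\psi(\boldsymbol{\lambda})\bigr]^2 d\boldsymbol{\lambda} \right\}^{1/2} . \tag{2.3}$$
In practice we estimate the value of $T$ from (2.1), (2.2) or (2.3) over a finite number of points.
Throughout the paper we report results based on an evenly spaced, 25 by 25 grid of points. The
empirical counterparts of $D_a^s$ and $D_b^s$ are calculated at each of these points, before being
substituted into the expression for $\psi$ and then $T$.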
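The grid-based estimate can be sketched for the bivariate, first-order case. The Python code below is a minimal illustration, not the authors' implementation (the original simulations were run in R). It assumes the data have already been rescaled to the unit square, places the grid points at $i/25$ for $i = 1, \ldots, 25$, and approximates the integral by the average of $\psi^2$ over the grid; these choices and all names are assumptions.

```python
import math

def empirical_cdf_2d(sample, z1, z2):
    """Share of observations with both coordinates at or below (z1, z2)."""
    return sum(1 for (u, v) in sample if u <= z1 and v <= z2) / len(sample)

def t_statistic(sample_a, sample_b, points_per_axis=25):
    """Grid approximation to T for H0: "a" first-order dominates "b".

    psi is the positive part of G_a - G_b. The unit square has area one,
    so the integral of psi^2 is approximated by its average over the grid.
    """
    grid = [(i + 1) / points_per_axis for i in range(points_per_axis)]
    total = 0.0
    for z1 in grid:
        for z2 in grid:
            psi = max(empirical_cdf_2d(sample_a, z1, z2)
                      - empirical_cdf_2d(sample_b, z1, z2), 0.0)
            total += psi ** 2
    return math.sqrt(total / points_per_axis ** 2)

# Hypothetical rescaled data: every observation in "a" exceeds those in "b".
sample_a = [(0.6, 0.6), (0.8, 0.9)]
sample_b = [(0.2, 0.3), (0.4, 0.5)]
t_ab = t_statistic(sample_a, sample_b)  # null "a dominates b" holds in sample
t_ba = t_statistic(sample_b, sample_a)  # null "b dominates a" is violated
```

A value of zero indicates that the sample shows no violation of the null at any grid point, while a positive value must still be compared with a critical value before the null is rejected.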
The test statistics in (2.1), (2.2) and (2.3) do not have known asymptotic distributions.
However, following Hall and Yatchew (2005), we obtain consistent bootstrap critical values
using the following algorithm. Combine the two datasets into one bootstrap dataset. Draw two
samples of size $n_a$ and $n_b$ for bootstrap samples “a” and “b” respectively. The data generating
mechanisms for the two bootstrap samples will weakly satisfy the null hypothesis since they are
drawn from the same distribution. We use an elevated asterisk, *, to denote a bootstrap estimate.
From the bootstrap samples calculate $\hat{D}_a^{s*}(\mathbf{z})$ and $\hat{D}_b^{s*}(\mathbf{z})$, and insert these into (2.1), (2.2) or (2.3)
to obtain $\hat{T}^*$. Repeating this procedure many times (we use 200 bootstrap iterations throughout
the paper) allows one to bootstrap the distribution of $\hat{T}$ under the respective null hypothesis.
From the bootstrap distribution of $\hat{T}$ we calculate the critical values.
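The algorithm can be sketched as follows for a univariate, first-order test. This Python code is an illustrative sketch rather than the authors' R implementation. It assumes that the bootstrap samples are drawn with replacement from the pooled data, and it reports a p-value (the share of bootstrap statistics at least as large as the observed one) instead of a critical value. The function names, the seed, and the example data are hypothetical.

```python
import math
import random

def t_stat(sample_a, sample_b, grid):
    """Grid approximation to T for H0: "a" first-order dominates "b"."""
    total = 0.0
    for z in grid:
        cdf_a = sum(1 for w in sample_a if w <= z) / len(sample_a)
        cdf_b = sum(1 for w in sample_b if w <= z) / len(sample_b)
        total += max(cdf_a - cdf_b, 0.0) ** 2
    return math.sqrt(total / len(grid))

def bootstrap_p_value(sample_a, sample_b, grid, iterations=200, seed=0):
    """Pooled bootstrap: both resamples come from the combined data, so the
    bootstrap data generating mechanism satisfies the null with equality."""
    rng = random.Random(seed)
    observed = t_stat(sample_a, sample_b, grid)
    pooled = list(sample_a) + list(sample_b)
    exceed = 0
    for _ in range(iterations):
        star_a = rng.choices(pooled, k=len(sample_a))  # with replacement
        star_b = rng.choices(pooled, k=len(sample_b))
        if t_stat(star_a, star_b, grid) >= observed:
            exceed += 1
    return exceed / iterations

grid = [(i + 1) / 25 for i in range(25)]
low = [0.05 * i for i in range(1, 11)]         # hypothetical poorer sample
high = [0.5 + 0.05 * i for i in range(1, 11)]  # hypothetical richer sample
p_true_null = bootstrap_p_value(high, low, grid)   # "high dominates low"
p_false_null = bootstrap_p_value(low, high, grid)  # "low dominates high"
```

When the sample shows no violation the observed statistic is zero and the p-value equals one; when the samples are completely separated in the wrong direction, the pooled resamples almost never reproduce so large a statistic and the p-value is close to zero.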
Our bootstrap procedure imposes what is known in the literature as the “least favourable
case.” This approach is common to many of the stochastic dominance tests that require
calibration of critical values such as McFadden (1989) and Barrett and Donald (2003). Consider
two univariate, empirical distribution functions and the null hypothesis $H_0 : G_a \succeq_1 G_b$. Suppose
dominance conditions require dominance over the integrals of the marginal distribution functions and the double integral of the bivariate distribution function. It is no longer the case that the conditions on the marginal distributions are embedded within the condition on the bivariate distribution.
that $\hat{G}_b$ crosses $\hat{G}_a$ once from below. Then the null hypothesis would appear to be true to the
right of the “crossing” but false to the left of the crossing. Hence, to satisfy the null hypothesis
our bootstrap procedure only needs to alter the relative positions of the functions below their
crossing point. However, we impose a stronger condition of equality of distributions throughout
the support. It is in this sense that we impose the “least favourable case” and it can be expected
that the power of the test procedure would be lower. Among the procedures requiring calibration
of critical values, Linton et al. (2005) avoid this problem by subsampling while Maasoumi and
Heshmati (2000) similarly avoid imposing the “least favourable case” by resampling from the
ordinates.
A second comment on our bootstrapping procedure is warranted for tests of second- or
higher order stochastic dominance. To satisfy the null hypothesis at a higher order, we impose
equality of the distribution functions as opposed to equality of the integrated distribution
functions. This is stronger than is strictly necessary, but also common to many of the stochastic
dominance tests found in the literature. 6,7
1.3 Simulation results
To examine the properties of our testing procedure we conduct simulations using various
data generating mechanisms (DGMs). In each case, the simulated data are generated using a
6 We would like to thank an anonymous referee for bringing this to our attention.
7 One potential heuristic method to deal with both of the problems identified above would be to divide the support into regions where the empirical functions satisfy and do not satisfy the null hypothesis. For $H_0 : G_a \succeq_s G_b$, define the region of the support for which the null hypothesis is empirically false as $\tilde{\Lambda} = \{\boldsymbol{\lambda} \in \Lambda : \hat{D}_a^s(\boldsymbol{\lambda}) > \hat{D}_b^s(\boldsymbol{\lambda})\}$. Next, calculate the bootstrapped version of $\psi(\boldsymbol{\lambda})$ according to $\hat{\psi}^*(\boldsymbol{\lambda}) = \max\{\hat{D}_a^{s*}(\boldsymbol{\lambda}) - \hat{D}_b^{s*}(\boldsymbol{\lambda}), 0\}$ for $\boldsymbol{\lambda} \in \tilde{\Lambda}$ and $\hat{\psi}^*(\boldsymbol{\lambda}) = \max\{\hat{D}_a^s(\boldsymbol{\lambda}) - \hat{D}_b^s(\boldsymbol{\lambda}), 0\}$ for $\boldsymbol{\lambda} \notin \tilde{\Lambda}$. That is, where the null hypothesis is not satisfied use the bootstrapped empirical functions derived from pooling the observations to estimate $\psi$ and use the original empirical functions otherwise. Though the statistical theory of this approach is beyond the scope of the current paper, some univariate simulations suggest dramatic improvement in power against certain alternatives.
bivariate lognormal pair $(X, Y)$ where the underlying joint normal random variables $(x, y)$ have
means $\mu_x, \mu_y$, variances $\sigma_x^2, \sigma_y^2$ and covariance $\sigma_{xy}$.
For each DGM we run 1000 simulations, with sample sizes $n_a = n_b = 50$ and 500. We
conduct tests of the following five hypotheses:
$$
\begin{aligned}
H_0^A &: G_a \succeq_1 G_b \\
H_0^B &: G_a^x \succeq_1 G_b^x \\
H_0^C &: G_a^y \succeq_1 G_b^y \\
H_0^D &: G_a^x \succeq_1 G_b^x \ \text{ and } \ G_b^y \succeq_1 G_a^y \\
H_0^E &: G_b^x \succeq_1 G_a^x \ \text{ and } \ G_a^y \succeq_1 G_b^y .
\end{aligned}
$$
Under the first DGM the two distributions are identical. The parameter values of the underlying
normal distribution are set to $(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy}) = (0.85, 0.85, 0.36, 0.36, 0.2)$ for both
populations. We chose the means and variances to allow for easy comparability with the
univariate simulations of Barrett and Donald (2003). We expect the rate of rejection for all five
null hypotheses to be approximately at the level of the test. The second DGM maintains the same
parameter values for population “b” but those for population “a” change to
$(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy})_a = (0.6, 0.6, 0.64, 0.64, 0.2)$. These parameter values imply that all five null
hypotheses are false since the marginal distributions cross for both variables $X$ and $Y$. The
third DGM uses the original parameter values for population “a” and
$(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy})_b = (0.85, 0.85, 0.36, 0.36, -0.2)$ for population “b.” That is, both populations
have the same parameters except for the covariance parameter. Thus, the marginal distributions
are identical, in which case hypotheses B, C, D and E are true; however, hypothesis A, which
tests bivariate stochastic dominance of “a” over “b”, is false.8 The fourth DGM uses the original
parameter values for population “b” and $(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy})_a = (0.65, 2.1, 0.36, 0.36, 0.2)$ for
population “a.” These parameters imply that hypotheses A, B and D are false while C and E are
true. Table 1.1 summarizes the parameter values of the underlying normal distribution for each
DGM.
8 Though, bivariate stochastic dominance of “b” over “a” is true.
Table 1.1 Parameter values of the simulated bivariate normal distributions
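The four data generating mechanisms described in the text can be sketched in Python as follows. The parameter tuples are those stated in the text, ordered as $(\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \sigma_{xy})$. Everything else is an assumption made for illustration: the original simulations were run in R, and the dictionary layout, the function name, the seed, and the use of a Cholesky factor to induce the covariance are not taken from the authors' code.

```python
import math
import random

BASE = (0.85, 0.85, 0.36, 0.36, 0.2)
DGMS = {
    1: {"a": BASE, "b": BASE},
    2: {"a": (0.6, 0.6, 0.64, 0.64, 0.2), "b": BASE},
    3: {"a": BASE, "b": (0.85, 0.85, 0.36, 0.36, -0.2)},
    4: {"a": (0.65, 2.1, 0.36, 0.36, 0.2), "b": BASE},
}

def draw_lognormal_pairs(params, n, rng):
    """Draw n pairs (X, Y) = (exp(x), exp(y)) with (x, y) bivariate normal."""
    mu_x, mu_y, var_x, var_y, cov_xy = params
    sd_x = math.sqrt(var_x)
    # Lower-triangular Cholesky factor of the 2x2 covariance matrix.
    l21 = cov_xy / sd_x
    l22 = math.sqrt(var_y - l21 ** 2)
    pairs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sd_x * e1
        y = mu_y + l21 * e1 + l22 * e2
        pairs.append((math.exp(x), math.exp(y)))
    return pairs

rng = random.Random(12345)  # arbitrary seed for reproducibility
sample_a = draw_lognormal_pairs(DGMS[4]["a"], 50, rng)
sample_b = draw_lognormal_pairs(DGMS[4]["b"], 50, rng)
```

Each simulation replication would draw a fresh pair of samples in this way, for each sample size, before applying the test.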
Table 1.2 shows the results of the simulations described above.9 We find that our test
procedure performs well for all four DGMs. The power of the test procedure improves
substantially as the sample size increases from 50 to 500 observations. Our results for hypotheses
B and C, which involve only the marginal distributions, are very similar to those of Hall and
Yatchew (2005). Where hypothesis B or C is weakly true, as under DGMs 1 and 3, the null
hypothesis is rejected at approximately the test level. Furthermore, when hypothesis B or C is
false, as under DGM 2, the test procedure has substantial power even with a sample size of 50.
Finally, the test procedure correctly fails to reject hypothesis C when it is strongly true, as is the
case with DGM 4.
Our results concerning the combined marginal hypotheses, D and E, show similar
patterns. When hypothesis D or E is weakly true, such as with DGMs 1 and 3, the rejection rate
is approximately equal to the test level. Furthermore, when hypothesis D or E is false, there is
substantial power even with a sample size of 50. Finally, when hypothesis E is strongly true, as
under DGM 4, the rejection rate is close to zero.
We now turn to a discussion of tests of hypothesis A, which asserts bivariate stochastic
dominance. When this hypothesis is weakly true, as is the case for DGM 1, the rejection rates
remain close to the nominal significance levels. Under DGM 2 where univariate stochastic
dominance fails at both margins, it is perhaps not surprising that the bivariate test has substantial
power. In this instance, one could have rejected bivariate stochastic dominance by performing
univariate tests.
9 All simulations were run using R, which is freely available from http://www.r-project.org. The simulation code is available at http://www.chass.utoronto.ca/~bmccaig.
DGM 3 is more interesting. In this case, univariate stochastic dominance holds (weakly)
at both margins but bivariate dominance does not. The bivariate test has substantial power even
for a sample size of 50.
For DGM 4, bivariate dominance does not hold and indeed univariate dominance fails on
one of the margins. In this case, bivariate dominance is rejected with greater power by
performing univariate tests. Under DGM 4, bivariate dominance actually holds for a substantial
portion of the support, which would appear to underlie the weaker power of the bivariate test.
1.4 Empirical results
The empirical analysis in this section is motivated by the literature on the differences in
time spent working between continental European countries (in particular Germany and France)
and the U.S. This literature is largely focussed on trying to understand the determinants of the
differences in average hours worked (see Alesina, Glaeser and Sacerdote (2005), Prescott (2004),
and Schettkat (2003)). Thus far, it has not been concerned with trying to robustly compare the
social welfare and poverty consequences of the associated differences in income and non-labour
market time across the countries.
We compare the joint distributions of income and non-labour market time in Germany,
the United Kingdom, and the United States using data from the Cross-National Equivalent File
(CNEF). 10,11 The German data within the CNEF dataset originate from the German Socio-
Economic Panel Study (GSOEP), the U.K. data come from the British Household Panel Survey
(BHPS), and the U.S. data come from the Panel Study of Income Dynamics (PSID). The years of
comparison are 1983, 1990 and 2000 for Germany and the U.S. but only 1990 and 2000 for
comparisons involving the U.K. (Consistent earlier data for the U.K. are not available in the
CNEF.) We define leisure as the difference between total hours in a year and the reported
number of annual hours spent in the formal labour market.
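The definition of leisure can be written out in a few lines of Python. This is a minimal sketch: the constant of 8,760 hours (365 days of 24 hours) and the function name are assumptions, since the text does not state the figure used for total hours in a year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; assumed, the text does not state the figure


def annual_leisure(hours_worked):
    """Leisure = total hours in a year minus annual formal labour market hours."""
    if not 0 <= hours_worked <= HOURS_PER_YEAR:
        raise ValueError("hours_worked must lie between 0 and HOURS_PER_YEAR")
    return HOURS_PER_YEAR - hours_worked


full_time = annual_leisure(1800)  # a hypothetical full-time worker
non_worker = annual_leisure(0)    # non-workers sit at the upper limit of leisure
```

Under this definition every individual with zero reported hours has the same leisure value, so the leisure distribution has a mass point at its upper limit.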
We focus on the welfare of individuals aged 25 years or older, using post-government
income and leisure time as our indicators of well-being. Table 1.3 provides summary statistics
for individuals 25 years of age or older. Not surprisingly, average income is highest in the U.S.
for all three years of comparison and the variation in incomes is also greatest in the U.S.
Germany displays the highest average annual hours of leisure time. The average number of
leisure hours increases in Germany and the U.K. over time, but decreases by over 200 hours in
the U.S. between 1983 and 2000.
10 For data prior to 1991 Germany refers to the former territory of West Germany. From 1991 onward Germany refers to the unified territories of East and West Germany.
11 Please refer to http://www.human.cornell.edu/pam/gsoep/equivfil.cfm for further information.
Table 1.3 Summary statistics for all individuals 25 years of age or older
                                        1983     1990     2000
Germany
  Mean of income                        8533    12916    17133
  Mean of leisure time                  7875     7904     7966
  St. dev. of income                    4927     7402     9500
  St. dev. of leisure time               953      870      906
  Correlation                         -0.278   -0.285   -0.274
  Percentage working                    52.2     54.9     48.4
  No. of observations                   9549     7522    18692
United Kingdom
  Mean of income                        n.a.    13945    21165
  Mean of non-labour market time        n.a.     7575     7647
  St. dev. of income                    n.a.     7524    13811
  St. dev. of non-labour market time    n.a.     1073     1063
  Correlation                           n.a.   -0.231   -0.159
  Percentage working                    n.a.     64.7     60.1
  No. of observations                   n.a.     6469     7224
United States
  Mean of income                       11446    17046    26852
  Mean of non-labour market time        7507     7415     7279
  St. dev. of income                    7985    13197    27521
  St. dev. of non-labour market time    1053     1079     1073
  Correlation                         -0.193   -0.230   -0.120
  Percentage working                    70.1     73.4     75.1
  No. of observations                  10394    11563    11444
Note: All income values are reported in current year U.S. dollars and reported in per adult equivalent units. “n.a.” denotes not available due to lack of consistent data.
Figure 1.1 and Figure 1.2 present the empirical marginal distributions of income and
leisure time, respectively, for individuals 25 years of age and older in Germany and the U.S. in
2000. The U.S. income distribution appears to be first-order stochastically dominant over most of
the income support, although the empirical income distributions appear to cross at the lower end
of the income support. The integral test allows us to check whether this crossing is statistically
significant. As for the leisure distribution, the German distribution appears to first-order
stochastically dominate that of the U.S. The distributions are discontinuous at the upper limit of
leisure time, reflected by the sharp jumps in the plots, due to non-working individuals.
Figure 1.1 Empirical income distributions for individuals 25 years of age and older in Germany and the U.S., 2000
[Plot not reproduced. Series: Germany 2000 and U.S. 2000. Horizontal axis: annual post-government income per adult equivalent in 2000 USD, 0 to 100,000.]
Table 1.4 displays our results for tests of first- and second-order stochastic dominance of
the income, leisure, and joint distributions for each country-pair in the years 1983, 1990, and
2000. The table reports the estimated p-value of each indicated hypothesis. Some consistent
patterns emerge from the results. First, the German leisure distribution is first-order
stochastically dominant with respect to the U.K. and U.S. leisure distributions while the U.K.
leisure distribution is first-order stochastically dominant with respect to the U.S. Second, the
U.S. income distribution is first-order stochastically dominant except in comparison to the
U.K. in 2000, while the U.K. income distribution first-order stochastically dominates the German
distribution. Finally, the integral test definitively rejects first-order stochastic dominance of the
bivariate surfaces for each comparison except for Germany and the U.K. in 1990. In this
instance (see Figure 1.3), the German empirical bivariate distribution lies everywhere below that
of the U.K. except along the upper limit of leisure time. The p-value of 0.110 for the null
hypothesis of German stochastic dominance is indicative of the low power of the integral test
under the alternative hypothesis when the violation of the null hypothesis is not very persistent, a
point raised earlier in our discussion of our fourth simulation. This claim is supported by the
strong rejection of German stochastic dominance at first-order when the integral test is applied
strictly to the income distribution.
Figure 1.2 Empirical leisure time distributions for individuals 25 years of age and older in Germany and the U.S., 2000
[Plot not reproduced. Series: Germany 2000 and U.S. 2000. Horizontal axis: annual leisure hours, 5,000 to 8,000.]
Overall, the results of tests for first-order stochastic dominance suggest that bivariate
social welfare cannot be robustly ranked for individuals 25 years of age and older at any order of
stochastic dominance. The sufficient conditions for ranking bivariate distributions require the
direction of stochastic dominance to be the same for both marginal distributions and the overall
bivariate distribution. Since first-order stochastic dominance implies higher-order stochastic
dominance, the direction of stochastic dominance over the income and leisure distributions will
conflict for all orders of stochastic dominance, given our results above. Thus, we will be unable
to establish sufficient conditions for ranking social welfare between the populations.
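The conflict between the marginal rankings can be illustrated with a descriptive check on empirical joint distributions. The sketch below is not the integral test used for the results in Table 1.4 and involves no statistical inference; it only checks whether one empirical joint CDF lies nowhere above another. The function names and the toy data (income in thousands of dollars, leisure in thousands of hours) are hypothetical.

```python
def ecdf2(sample, x, y):
    """Empirical joint CDF: share of observations with X <= x and Y <= y."""
    return sum(1 for xi, yi in sample if xi <= x and yi <= y) / len(sample)


def lies_below(sample_a, sample_b):
    """True if the empirical joint CDF of a is nowhere above that of b.

    The CDFs are compared on the grid formed by all observed x and y values,
    which are the only points where the step functions can change.
    """
    xs = sorted({x for x, _ in sample_a + sample_b})
    ys = sorted({y for _, y in sample_a + sample_b})
    return all(ecdf2(sample_a, x, y) <= ecdf2(sample_b, x, y)
               for x in xs for y in ys)


# Hypothetical populations: "rich" has higher income but less leisure.
rich = [(30, 6.9), (40, 7.0), (50, 7.1)]
leisured = [(10, 7.9), (20, 8.0), (30, 8.1)]
# A population that is better off than "leisured" in both dimensions.
better = [(x + 5, y + 0.5) for x, y in leisured]

rich_over_leisured = lies_below(rich, leisured)    # False
leisured_over_rich = lies_below(leisured, rich)    # False
better_over_leisured = lies_below(better, leisured)  # True
```

In the first two comparisons neither empirical CDF lies below the other, mirroring the situation in which one population dominates in income and the other in leisure.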
Table 1.4 Estimated p-values for tests of stochastic dominance for individuals 25 years of age and older
                                  First-order               Second-order
Null hypothesis                   1983    1990    2000      1983    1990    2000
Germany and the U.K.
  income_Ger ⪰_s income_UK        n.a.    0.000   0.000     n.a.    0.000   0.000
  income_UK ⪰_s income_Ger        n.a.    0.325   0.935     n.a.    0.545   0.690
  leisure_Ger ⪰_s leisure_UK      n.a.    1.000   1.000     n.a.    1.000   1.000
  leisure_UK ⪰_s leisure_Ger      n.a.    0.000   0.000     n.a.    0.000   0.000
  Ger ⪰_s UK                      n.a.    0.110   0.000     n.a.    0.955   0.995
  UK ⪰_s Ger                      n.a.    0.000   0.000     n.a.    0.000   0.000
Germany and the U.S.
  income_Ger ⪰_s income_US        0.000   0.000   0.000     0.000   0.000   0.000
  income_US ⪰_s income_Ger        0.325   0.155   0.470     0.380   0.375   0.560
  leisure_Ger ⪰_s leisure_US      1.000   1.000   1.000     1.000   1.000   1.000
  leisure_US ⪰_s leisure_Ger      0.000   0.000   0.000     0.000   0.000   0.000
  Ger ⪰_s US                      0.000   0.000   0.000     1.000   1.000   1.000
  US ⪰_s Ger                      0.000   0.000   0.000     0.000   0.000   0.000
The U.K. and the U.S.
  income_UK ⪰_s income_US         n.a.    0.000   0.000     n.a.    0.000   0.000
  income_US ⪰_s income_UK         n.a.    0.350   0.010     n.a.    0.455   0.440
  leisure_UK ⪰_s leisure_US       n.a.    1.000   1.000     n.a.    0.680   1.000
  leisure_US ⪰_s leisure_UK       n.a.    0.000   0.000     n.a.    0.000   0.000
  UK ⪰_s US                       n.a.    0.000   0.000     n.a.    0.355   1.000
  US ⪰_s UK                       n.a.    0.000   0.000     n.a.    0.055   0.000
Bivariate poverty dominance (see Duclos et al., 2004) provides an alternative way to
interpret the stochastic dominance results presented in Table 1.4. In particular, consider all
bivariate poverty indices of the form $P = \iint_{\Omega} \pi(x, y)\, dG(x, y)$, where $\Omega$ is the poverty region
defined by a poverty frontier and $\pi(x, y)$ is an individual’s contribution to poverty, given well-being
indicators $x$ and $y$ (income and leisure in our case).12 The “poverty focus axiom” induces
$\pi(x, y) \geq 0$ if $(x, y) \in \Omega$ and $\pi(x, y) = 0$ otherwise.
Figure 1.3 Contour plots of the empirical CDFs of income and leisure for individuals 25 years of age and older in Germany and the U.K., 1990
[Contour plot not reproduced. Series: Germany 1990 and U.K. 1990, with contours at 0.25, 0.5, and 0.75. Horizontal axis: annual post-government income per adult equivalent in 1990 USD; vertical axis: annual leisure hours.]
Duclos et al. (2004) show the equivalence between first-order stochastic dominance and
poverty rankings for all poverty indices that are continuous along the poverty frontier, regardless
of how the poverty frontier is defined and with $\pi_x, \pi_y \leq 0$, $\pi_{xy} \geq 0$. Moreover, if we strengthen
the conditions on $\pi(x, y)$ to include $\pi_{xx}, \pi_{yy} \geq 0$ (i.e., the poverty index obeys the transfer
principle in both dimensions), $\pi_{xxy}, \pi_{yyx} \leq 0$ (i.e., the transfer principle is stronger in one
dimension the lower the level of the other indicator), and $\pi_{xxyy} \geq 0$ then second-order stochastic
dominance of the bivariate distributions over the poverty space implies robust rankings for all
poverty indices that feature these properties. As an example, this class of poverty indices
includes the two-dimensional poverty gap measure, an extension of the FGT index (Foster et al.,
1984) to two dimensions. Thus, based on the test results for bivariate second-order stochastic
dominance, we conclude that bivariate poverty is robustly lower in Germany than in either the
U.K. or the U.S. in the sense described above and is also lower in the U.K. as compared to the
U.S. for individuals 25 years of age and older.
12 It is common to define bivariate poverty by either a union or intersection approach. In the former, an individual is considered to be poor if either indicator of well-being falls below its respective poverty line, whereas in the intersection approach an individual is considered poor only if both indicators fall below the respective poverty lines. An example of the latter would be the working poor who have little leisure and low income. More generally, the poverty cutoff in one dimension can be a function of the level of well-being in the second dimension, as in Duclos et al. (2004).
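As an illustration of one member of this class, the following Python sketch computes an intersection headcount and a product-form two-dimensional poverty gap for a small sample. The product form, the use of an intersection poverty region with fixed poverty lines, the function name, and all numerical values are assumptions made for the example; the dominance results above hold for any poverty frontier and do not depend on these choices.

```python
def bivariate_poverty(sample, z_income, z_leisure, alpha_x=0, alpha_y=0):
    """Mean of (z_x - x)^alpha_x * (z_y - y)^alpha_y over the intersection region.

    With alpha_x = alpha_y = 0 this is the intersection headcount; with
    alpha_x = alpha_y = 1 it is a product-form two-dimensional poverty gap.
    Individuals outside the poverty region contribute zero (focus axiom).
    """
    total = 0.0
    for x, y in sample:
        if x < z_income and y < z_leisure:
            total += (z_income - x) ** alpha_x * (z_leisure - y) ** alpha_y
    return total / len(sample)


# Hypothetical (income, leisure hours) pairs and poverty lines.
sample = [(8000, 6500), (12000, 6800), (9000, 8760), (30000, 6000)]
headcount = bivariate_poverty(sample, 10000, 7000)
gap = bivariate_poverty(sample, 10000, 7000, alpha_x=1, alpha_y=1)
```

Only the first individual falls below both poverty lines, so the headcount is one quarter and the gap measure is driven entirely by that individual's shortfalls in income and leisure.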
The above poverty orderings are quite general, as they are for all poverty frontiers that
one could care to define within the bivariate support. Stronger poverty orderings, which place
fewer restrictions on $\pi(x, y)$, are possible if the poverty frontier is restricted to lie within a sub-
region of the support. The obvious drawback is that poverty frontiers that extend beyond this
sub-region are excluded from the analysis. Table 1.5 displays our results of restricted first-order
stochastic dominance tests.
Table 1.5 Estimated p-values for tests of restricted first-order stochastic dominance for individuals 25 years of age and older
We restrict the domain to a lower-left quadrant based on an income upper limit of
$25,000, $32,000, and $40,000 (these are U.S. dollars) for the years 1983, 1990, and 2000,
respectively, and a time-invariant leisure upper limit of 7,000 hours annually. These regions
capture a large portion of the populations: 14 to 20 percent of the German observations, 32 to
44 percent of the U.K. observations, and 41 to 44 percent of the U.S. observations
fall within these regions. Our results indicate that within these restricted supports the German
bivariate distribution first-order stochastically dominates the U.K. and the U.S. bivariate
distributions, while the U.K. joint distribution first-order stochastically dominates the U.S. in
2000.
We check the robustness of our main results to differences in employment rates by
restricting the data to only those individuals who reported positive hours of work. Table 1.6
displays summary statistics for all
working individuals. In this group, average income and leisure tend to show the same pattern as
for individuals 25 years of age or older.
Table 1.6 Summary statistics for all working individuals
                                        1983     1990     2000
Germany
  Mean of income                        9471    14488    19135
  Mean of leisure time                  7095     7219     7132
  St. dev. of income                    4908     8109    10197
  St. dev. of leisure time               605      534      559
  Correlation                         -0.162   -0.161   -0.224
  No. of observations                   6557     5090    10150
United Kingdom
  Mean of income                        n.a.    14615    21840
  Mean of non-labour market time        n.a.     6970     6946
  St. dev. of income                    n.a.     7794    14584
  St. dev. of non-labour market time    n.a.      793      742
  Correlation                           n.a.   -0.213   -0.175
  No. of observations                   n.a.     5122     5536
United States
  Mean of income                       11709    17921    27609
  Mean of non-labour market time        7083     6968     6898
  St. dev. of income                    7752    13281    24451
  St. dev. of non-labour market time     826      820      808
  Correlation                         -0.202   -0.190   -0.110
  No. of observations                   9382    10077    10292
Note: All income values are reported in current year U.S. dollars and reported in per adult equivalent units. “n.a.” denotes not available due to lack of consistent data.
Our results for tests of first- and second-order stochastic dominance among all working
individuals are presented in Table 1.7. We reject first-order stochastic dominance of the bivariate
distributions for all comparisons except the U.S. relative to the U.K. in 1990. This result is
puzzling, however, when one considers the results comparing the two marginal distributions. The
integral test rejects the null hypotheses of weak first-order stochastic dominance of the U.K.
income distribution relative to the U.S. and the U.S. leisure distribution relative to the U.K.
Hence, the null of bivariate stochastic dominance should be rejected in both directions. This
inconsistency is similar to that noted above when comparing the bivariate distributions of
Germany and the U.K. in 1990 for all individuals 25 years of age or older. Other results suggest
that the German leisure time distribution second-order stochastically dominates both the U.K.
and U.S. distributions, the U.S. income distribution first-order stochastically dominates those of
Germany and the U.K., and the U.K. is an intermediate case.
Table 1.7 Estimated p-values for tests of stochastic dominance for all working individuals
An additional check on our primary results is offered by looking only at single-person
households, where the individuals are 25 years of age or older (see Table 1.9 for summary
statistics). For this population subset we need not be concerned with choice of an equivalence
scale for income, nor with our implicit assumption that there are no economies associated with
leisure time for a multi-person household. We present results for tests of first- and second-order
stochastic dominance over the entire support in Table 1.10. We find that the U.S. income
distribution is strongly first-order stochastically dominant with respect to Germany and second-
order dominant with respect to the U.K. Similarly, Germany’s leisure distribution is first-order
stochastically dominant with respect to the U.S. and second-order dominant with respect to the
U.K. Results of tests for bivariate second-order stochastic dominance also follow a similar
pattern to those of our primary population group. Germany is stochastically dominant with
respect to both the U.K. and the U.S., although only weakly in 2000 with respect to the U.K., and
the U.K. is stochastically dominant relative to the U.S.
Table 1.9 Summary statistics for all singles 25 years of age or older
                                        1983     1990     2000
Germany
  Mean of income                        8998    13914    17460
  Mean of non-labour market time        8121     8072     8084
  St. dev. of income                    5937     8777    11219
  St. dev. of non-labour market time     886      867      898
  Correlation                         -0.444   -0.462   -0.392
  Percentage working                    37.1     41.8     39.6
  No. of observations                   1166      918     2712
United Kingdom
  Mean of income                        n.a.    14500    23008
  Mean of non-labour market time        n.a.     8074     8082
  St. dev. of income                    n.a.     7408    18713
  St. dev. of non-labour market time    n.a.     1015     1015
  Correlation                           n.a.   -0.433   -0.211
  Percentage working                    n.a.     35.5     34.3
  No. of observations                   n.a.      930     1082
United States
  Mean of income                       11940    16543    26973
  Mean of non-labour market time        7674     7567     7389
  St. dev. of income                    7918    13881    25931
  St. dev. of non-labour market time    1051     1124     1105
  Correlation                         -0.435   -0.415   -0.270
  Percentage working                    61.3     60.2     69.1
  No. of observations                   1313     1653     1432
Note: All income values are reported in current year U.S. dollars. “n.a.” denotes not available due to lack of consistent data.
Our results for tests of restricted first-order stochastic dominance, which are provided in
Table 1.11, display patterns that are similar to those obtained for all individuals 25 years of age
or older. For singles 25 years of age or older, Germany is first-order stochastically dominant over
the restricted bivariate surface, while the U.K. is first-order stochastically dominant relative to
the U.S.
Table 1.10 Estimated p-values for tests of stochastic dominance for all singles 25 years of age and older
                                  First-order               Second-order
Null hypothesis                   1983    1990    2000      1983    1990    2000
Germany and the U.K.
  income_Ger ⪰_s income_UK        n.a.    0.000   0.000     n.a.    0.025   0.000
  income_UK ⪰_s income_Ger        n.a.    0.735   0.825     n.a.    1.000   0.680
  leisure_Ger ⪰_s leisure_UK      n.a.    0.040   0.015     n.a.    0.745   1.000
  leisure_UK ⪰_s leisure_Ger      n.a.    0.020   0.005     n.a.    0.020   0.015
  Ger ⪰_s UK                      n.a.    0.010   0.000     n.a.    0.555   0.145
  UK ⪰_s Ger                      n.a.    0.030   0.005     n.a.    0.040   0.130
Germany and the U.S.
  income_Ger ⪰_s income_US        0.000   0.000   0.000     0.000   0.000   0.000
  income_US ⪰_s income_Ger        0.830   0.170   0.800     0.640   0.375   0.605
  leisure_Ger ⪰_s leisure_US      1.000   1.000   1.000     1.000   1.000   1.000
  leisure_US ⪰_s leisure_Ger      0.000   0.000   0.000     0.000   0.000   0.000
  Ger ⪰_s US                      0.045   0.205   0.055     1.000   1.000   1.000
  US ⪰_s Ger                      0.000   0.000   0.000     0.000   0.000   0.000
The U.K. and the U.S.
  income_UK ⪰_s income_US         n.a.    0.000   0.000     n.a.    0.005   0.000
  income_US ⪰_s income_UK         n.a.    0.005   0.035     n.a.    0.235   0.410
  leisure_UK ⪰_s leisure_US       n.a.    1.000   1.000     n.a.    1.000   1.000
  leisure_US ⪰_s leisure_UK       n.a.    0.000   0.000     n.a.    0.000   0.000
  UK ⪰_s US                       n.a.    0.275   0.340     n.a.    1.000   1.000
  US ⪰_s UK                       n.a.    0.000   0.000     n.a.    0.000   0.000
Table 1.11 Estimated p-values for tests of restricted first-order stochastic dominance for singles 25 years of age and older
premiums for low-skilled workers, (2) faster movement into wage and salaried jobs for low-
skilled workers, and (3) more rapid job growth in formal enterprises.
2.3.3 Inequality evidence since 1998
Following 1998, there is much less evidence in the literature concerning the evolution of
inequality. VASS (2007) reports an increase in expenditure inequality between 1998 and 2002,
but a stable level of expenditure inequality between 2002 and 2004 for all of Vietnam. Further,
they report that within urban areas expenditure inequality increased between 1998 and 2002,
before falling again by 2004. By comparison, expenditure inequality within rural areas slowly
increased between 1998 and 2004, such that inequality within rural areas was approaching
inequality levels in urban areas. Hence, between 1998 and 2004, only 39 percent of the increase
in expenditure inequality was due to the increase in the urban-rural gap while 61 percent was due
to changes in inequality within urban and rural areas. Again, this is based on Theil's index of
inequality.
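The split into between-group and within-group components can be sketched in Python as follows. The sketch assumes the Theil T index (the text does not say which Theil measure was used), decomposes the level of inequality in a single year rather than its change over time, and uses hypothetical expenditure values and function names. Applying the same decomposition to two survey years and differencing the components would give a decomposition of the change.

```python
import math


def theil_t(values):
    """Theil T index: mean of (y / mu) * ln(y / mu); requires positive values."""
    mu = sum(values) / len(values)
    return sum((y / mu) * math.log(y / mu) for y in values) / len(values)


def theil_decomposition(groups):
    """Split Theil T into within-group and between-group components.

    `groups` maps a group label (e.g. "urban", "rural") to a list of
    per capita expenditures. Returns (within, between).
    """
    pooled = [y for values in groups.values() for y in values]
    total = sum(pooled)
    mu = total / len(pooled)
    within = 0.0
    between = 0.0
    for values in groups.values():
        share = sum(values) / total       # group's share of total expenditure
        mu_g = sum(values) / len(values)  # group mean
        within += share * theil_t(values)
        between += share * math.log(mu_g / mu)
    return within, between


# Hypothetical per capita expenditures.
groups = {"urban": [400.0, 600.0, 1000.0], "rural": [100.0, 200.0, 300.0, 400.0]}
within, between = theil_decomposition(groups)
overall = theil_t([y for values in groups.values() for y in values])
between_share = between / overall
```

The two components sum exactly to the overall index, which is what allows a statement of the form "a given percent of inequality is due to the urban-rural gap".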
McCaig (2009) also examines the trends in poverty reduction between 2002 and 2004 in
his analysis of the impact of the USBTA. He finds that the national poverty rate fell by 9.4
percent between 2002 and 2004. In addition, the average decrease in provincial poverty was 31.1
percent in the same time period. Between 2002 and 2004, provinces that were more exposed to
the USBTA experienced faster decreases in poverty. He finds that an increase of one standard
deviation in provincial exposure (approximately equivalent to an increase in exposure from the
25th percentile, the province of Ha Tinh, to the 75th percentile, the province of Tien Giang) leads
to a reduction in the poverty headcount ratio of approximately 11 to 14 percent, but this effect
diminishes the further the province is from a major seaport.
2.4 Our Contribution
In contrast to most of the existing literature on inequality within Vietnam, we focus on
per capita income instead of per capita consumption. We do this for a number of reasons. First,
income itself is an important determinant of well-being and hence deserves analysis. Second, the
existing consumption/expenditure estimates generated by the GSO and the World Bank have
some deficiencies. For example, as explained in Glewwe (2005), the imputed value of housing
services is estimated as a constant share of other non-food expenditures. In urban areas housing
is valued at 21.4 percent of other non-food expenditures and in rural areas the corresponding
share is 11.8 percent. Hence, the comparable consumption data series may be severely
undervaluing housing if the value of housing has grown more rapidly than other non-food
expenditures. This seems a likely outcome. Third, using income provides us the basis for
additional decomposition exercises to help understand the impact of Vietnam’s changing
economic structure on inequality. Fourth, in an economy that has been growing as rapidly as
Vietnam’s, consumption may lag behind income as households take time to adjust to their higher
permanent income.
With these reasons in mind, we construct estimates of household income that are as
consistent as possible across the five household surveys and that are based on market prices. We
are not aware of similar exercises that have been done for Vietnam. As we allude to throughout
the paper, changes in the questionnaire and changes in the survey framework make this a
challenging task.
2.5 Data
The data used for our analysis come from five nationally representative household
surveys conducted by the General Statistics Office (GSO) of Vietnam: the 1992/93 and 1997/98
Vietnam Living Standards Surveys (VLSS) and the 2002, 2004 and 2006 Vietnam Household
Living Standards Surveys (VHLSS).14 Each survey collected detailed information from
households on income generating activities, expenditures, employment, education, health, land
holdings, and remittances.
Table 2.1 provides summary information relating to these surveys. A total of 4,800 and
6,000 households were interviewed for the 1993 VLSS and the 1998 VLSS, respectively. Beginning with the
2002 VHLSS, the number of households interviewed was greatly increased; however, not all
households were asked the full set of modules in the questionnaires. In 2002, 2004 and 2006, a
total of 74,347, 45,945, and 45,945 households were surveyed, of which 29,530, 9,188 and 9,189,
respectively, were asked the full questionnaire.
14 For simplicity, from hereon we refer to the 1992/93 VLSS survey as the 1993 VLSS and the 1997/98 VLSS as the 1998 VLSS since the majority of households were interviewed in 1993 and 1998 respectively.
Table 2.1 Summary information on the sample design of each household survey
Survey year                 1993     1998     2002     2004     2006
Number of households       4,800    5,999   29,530    9,188    9,189
Number of communes           150      194    2,901    3,063    3,063
Number of strata               2       10      122      128      128
We restrict our analysis to households asked the full questionnaire in the three VHLSS
for two reasons. First, McCaig (2008a, 2008b) documents large differences in mean income
between households that were asked the entire questionnaire and those that were not asked the
expenditures module in both the 2002 and 2004 VHLSS. In all other respects, e.g. household
size, demographic composition, landholding, economic activity, etc., these households are
identical.15 Second, some questions related to land rental income and payments were only part of
an extended land module in the 2004 VHLSS, which was asked of the smaller 9,188-household
sample. Thus, in an effort to be as consistent as possible in our definition of income across the
surveys, we exclude the households from the 2002, 2004 and 2006 VHLSS that were not asked
the expenditure module.
2.5.1 Household survey design
In 1993, 80 percent of the households interviewed were living in rural areas, with the
remaining 20 percent drawn from urban areas. By 2006, the share of urban households
interviewed had grown slightly to 25 percent. These numbers are based on the contemporaneous
definition of urban and rural within each survey. Note, however, that the shares of urban and
rural households in the surveys are not necessarily the same as the shares in the overall
population as the five surveys used stratified sampling procedures. The GSO estimates the urban
population to be 23.6 percent of the total population as of April 1, 1999 according to the 1999
population census.
15 There are several potential explanations for this. First, it is possible that households asked about expenditures recall income differently than households that are only asked about incomes. Second, enumerators may use information on consumption as a check on reported income information. And third, because of the extended length of interviews involving both income and expenditure, more able enumerators may have been assigned to these households.
The 1993 VLSS was self-weighted with each household in the survey having an equal
probability of being selected.16 A total of 150 communes throughout the country were selected
with probability proportional to their size, with 20 percent in urban areas and 80 percent in rural
areas. In the next stage, two clusters in urban areas or two villages in rural areas were randomly
drawn from each commune with probability proportional to their size. Finally, 20 households
were randomly selected from each of the 300 clusters or villages. The 1993 sample includes
households from rural areas in 51 provinces and from urban areas in 21 provinces out of 53
provinces at the time. There are two provinces from which no households were sampled.
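A short derivation may help explain why a design of this kind can be self-weighting. It is a sketch under simplifying assumptions that are not taken from the survey documentation: sizes are measured in households, selection at the first two stages is exactly proportional to size, and a fixed number $k$ of households is drawn per cluster or village. Let $a$ communes be drawn from a stratum containing $M$ households, let $b$ clusters or villages be drawn per commune, and let $M_c$ and $M_{cv}$ denote the sizes of commune $c$ and of cluster or village $v$ within it. The selection probability of household $h$ is then

```latex
\pi_h
  = \underbrace{\frac{a\,M_c}{M}}_{\text{commune}}
    \times \underbrace{\frac{b\,M_{cv}}{M_c}}_{\text{cluster or village}}
    \times \underbrace{\frac{k}{M_{cv}}}_{\text{household}}
  = \frac{a\,b\,k}{M}.
```

The size terms cancel, so every household in the stratum has the same selection probability. The probability is also equal across urban and rural areas provided the number of communes drawn in each is proportional to its size.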
The 1998 VLSS had a much more complicated survey framework made up of three
groups of households: households that were reinterviewed from the 1993 VLSS, households
replacing those from the 1993 VLSS that could not be located, and households that were added
to the sample to increase the sample size.17 Also, in contrast to the 1993 VLSS, the 1998 VLSS
was stratified not just by urban-rural, but also by three urban and seven rural areas, for a total of
10 strata. The three urban strata are (1) Ha Noi and Ho Chi Minh City, (2) other cities, and (3)
the remaining urban population in towns. The seven rural areas are the Northern Mountains, the
Red River Delta, the North Central Coast, the South Central Coast, the Central Highlands, the
Southeast, and the Mekong River Delta. The finer level of stratification meant that the survey
included households in 59 of Vietnam’s 61 provinces.18 Households were surveyed in rural areas
of 58 provinces and in urban areas of 29 provinces.
Starting with the 2002 VHLSS the sampling framework changed in two important
ways.19 First, the stratification was now done for urban and rural areas in each province,
bringing the total number of strata to 122. The GSO wanted to be able to produce reliable
16 The discussion of the sampling framework for the 1993 VLSS is taken from World Bank (2000), “Viet Nam Living Standards Survey (VNLSS), 1992-93: Basic Information.”
17 The discussion of the sampling framework for the 1998 VLSS is taken from World Bank (2001), “Vietnam Living Standards Survey (VLSS), 1997-98: Basic Information.”
18 The number of provinces within Vietnam increased from 1993 to 2006. In 1993 the number of provinces was 53. Between 1993 and 1998 eight provinces were split into two new provinces, bringing the total number of provinces to 61. Between the 2002 and 2004 surveys three more provinces split in two, bringing the total number of provinces to 64. To be consistent across the surveys we code households and provinces according to the 61-province definitions.
19 The discussion of the sampling framework for the 2002 and 2004 VHLSSs is taken from GSO (year unknown), “Vietnam Household Living Standards Surveys (VHLSS), 2002 and 2004: Basic Information.”
estimates of poverty at the provincial level, which necessitated the stratification by province.
Second, the sampling framework was based on the master list from the 1999 Population and
Housing Census. Communes within each stratum were selected with a probability proportional to
the square root of their population, followed by the selection of one ward in urban areas or one
village in rural areas per commune, again with the selection probability being proportional to the
square root of the population. Finally, households were randomly selected within each ward or
village. One further noteworthy difference is the large increase in the number of communes
covered by the 2002 VHLSS relative to the two earlier VLSSs. In 2002 a total of 2,901
communes are included in the sample as compared to only 150 in the 1993 VLSS and 194 in the
1998 VLSS.
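The effect of selecting with probability proportional to the square root of population, rather than to population itself, can be seen in a small Python sketch. The commune populations and the function name are hypothetical, and the probabilities shown are for a single draw; the actual selection procedure used by the GSO is not reproduced here.

```python
import math


def selection_probabilities(populations, power=0.5):
    """Single-draw selection probabilities proportional to population ** power.

    power=0.5 gives probabilities proportional to the square root of
    population; power=1.0 gives ordinary probability proportional to size.
    """
    measures = [p ** power for p in populations]
    total = sum(measures)
    return [m / total for m in measures]


communes = [2500, 10000, 40000]  # hypothetical commune populations
sqrt_probs = selection_probabilities(communes)             # ratios 1 : 2 : 4
size_probs = selection_probabilities(communes, power=1.0)  # ratios 1 : 4 : 16
```

Relative to selection proportional to size, the square-root rule gives smaller communes a higher chance of selection, so households in large communes are under-represented in the raw sample and sampling weights are needed to restore representativeness.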
The sampling framework for the 2004 and 2006 VHLSSs is essentially identical to that of
the 2002 VHLSS with a few minor exceptions. First, the number of households selected per ward
or commune was decreased from an average of ten to three. Second, the number of communes
selected increased slightly to 3,063. Third, since three provinces split in two between 2002 and
2004, raising the number of provinces to 64, the total number of strata increased to 128.
2.5.2 Household membership and missing members
For the 1993 VLSS household members were defined to include “all the people who
normally live and eat their meals together in this dwelling.” Those who were absent more than
six of the last twelve months were excluded, except for the head of the household and infants less
than six months old. An individual who was away for more than six of the past twelve months
was considered a household member only if they were a student living away from home
but still supported by the household, or a new member who had recently joined the household and
would be living there permanently in the future. There were a few minor changes in the
definition of household member between the two VLSSs, and then between the VLSSs and the
VHLSSs, but these affected only a very small percentage of individuals and households.
Despite the high degree of comparability of the definition of household membership, a
comparison of gender-age cohorts across the household surveys with the projections based on the
1999 Population and Housing Census suggests some “missing” individuals within the surveys.
Essentially, the problem is that the household surveys seem to be under-enumerating young
individuals and young couples with children relative to the census-based projections. Figures 2.1
and 2.2 demonstrate the issue. Figure 2.1 compares the age structure for males and females in the
1993 VLSS with an age distribution projection based on the 3 percent sample of the census,
while Figure 2.2 does the same for 2006. Each figure is adjusted by the appropriate sampling
weight. A positive value for the difference in population between the two sources indicates that
the census numbers suggest a larger cohort than the corresponding household survey. A negative
value suggests the opposite. Two key insights emerge. First, the household surveys appear to
under-enumerate very young children and adults in their 20s and 30s, which is offset by a
disproportionately larger share of teenagers and older adults. Second, the patterns in the 1993
and 2006 comparisons are very similar, but the magnitude of the differences appears to be
growing over time. In 1993, one-half20 of the sum of the absolute values of the differences was
equal to 3.5 percent of the population, but by 2006 this had increased to 9 percent.21
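The summary measure quoted above can be sketched as follows. The cohort counts are hypothetical and the function name is ours. The sketch assumes, as the footnote states, that the two populations have the same total, so that halving the summed absolute gaps counts each misplaced individual once.

```python
def mismatch_share(census, survey):
    """Half the sum of absolute cohort gaps, as a share of total population.

    Both arguments map a cohort label to a (weighted) population count
    and are assumed to have equal totals.
    """
    total = sum(census.values())
    if abs(total - sum(survey.values())) > 1e-6 * total:
        raise ValueError("the two populations must have the same total")
    gap = sum(abs(census[c] - survey[c]) for c in census)
    return 0.5 * gap / total

# Hypothetical weighted cohort counts (not taken from the surveys).
census = {"0-4": 100.0, "5-9": 90.0, "20-24": 110.0}
survey = {"0-4": 90.0, "5-9": 100.0, "20-24": 110.0}
share = mismatch_share(census, survey)
print(round(100 * share, 2), "percent of the population")
```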
Figure 2.1 Difference between the projected 1993 population using the 1999 census and the 1993 VLSS
[Figure: census projections minus weighted 1993 VLSS population, plotted by demographic cohort (labels 1 to 21) for two series, Gap_Males and Gap_Females.]
Note: The population has been grouped into five-year age groups. Thus, the label 1 on the horizontal axis corresponds to individuals aged 0 to 4, the label 2 corresponds to individuals aged 5 to 9, etc.
20 Because the two populations are the same size, an overestimate in the size of one group is exactly offset by an underestimate elsewhere. Thus, we divide by 2.
21 This is partially confirmed using information from the 2006 VHLSS to identify those individuals in the 2004-2006 household panel that drop out. Those individuals leaving because of work or household division represent 3.6 percent of the population in 2004 of households in the 2004-2006 panel.
Figure 2.2 Difference between the projected 2006 population using the 1999 census and the 2006 VHLSS
[Figure: census projections minus weighted 2006 VHLSS population, plotted by demographic cohort (labels 1 to 21) for two series, Gap_Males and Gap_Females.]
Note: The population has been grouped into five-year age groups. Thus, the label 1 on the horizontal axis corresponds to individuals aged 0 to 4, the label 2 corresponds to individuals aged 5 to 9, etc.
The most likely reason for this is migration to urban areas, which is not being picked up
in the urban samples of the household surveys. The combination of “missing” young children
and males and females in their 20s and 30s suggests that some of this is due to entire families
migrating as opposed to single individuals. The biases that these missing individuals will
introduce into our estimates of mean income and inequality are hard to assess. On the one hand,
the missing adults are likely better educated than older cohorts and thus their per capita incomes
may be higher than the individuals observed in the household surveys. However, part of their
income may be picked up by the surveys in terms of gifts and remittances. Trying to decipher the
impact on inequality is even more problematic. We do not have a solution to this problem, but
raise the issue because it is important for evaluating the comparability of the surveys across time
and the representativeness of the surveys relative to the overall population. This problem would
impact all studies using the same five household surveys, using either income or consumption
estimates. To our knowledge this issue is overlooked in the relevant literature.
2.6 Estimation of household income
We aggregate annual household income into six major categories: income from crops,
income from agricultural sidelines, household business income, wage income, gifts and
remittances, and finally “other” residual sources of income. Each income component is gross of
taxes and fees. Income from crops is net income (gross revenue22 minus current expenditures)
from rice; other cereal, vegetable, and annual crops; industrial crops; fruit crops; and crop by-
products such as straw, leaves, etc. Agricultural sidelines include livestock, aquaculture, other
animal products, and agricultural services. Household business income is net income from non-
agriculture, non-forestry, and non-aquaculture businesses run by the household and includes the
processing of agricultural, forestry, and aquacultural products. Wage income includes salary or
wage payments plus additional payments such as holiday contributions, social insurance
payments, etc. for all jobs worked by the individual during the past 12 months. Gifts and
remittances include payments from both domestic and overseas sources. Finally, “other” residual
sources of income include items such as government transfers and earned interest.
We convert all nominal incomes into nationally representative January 2006 prices using
three sets of deflators. First, since the households within each survey are interviewed during
different months we use the monthly deflators included in the datasets to convert the reported
values to January prices of the respective survey year. Second, to reflect differences in the cost of
living across regions we employ regional deflators. And third, to link January prices of each
respective survey year to January 2006, we utilize GSO monthly CPI figures. The respective CPI
inflators are 2.028, 1.393, 1.279, and 1.193 for 1993, 1998, 2002, and 2004.
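The three-step price adjustment can be sketched as below. The CPI inflators are those quoted above. The value of 1.0 for 2006, the function name, and the assumption that each deflator enters as a simple divisor are ours, as are the monthly deflator and the nominal income used in the example.

```python
# CPI inflators linking January of each survey year to January 2006.
# The entry for 2006 is an assumption implied by the choice of base period.
CPI_INFLATOR = {1993: 2.028, 1998: 1.393, 2002: 1.279, 2004: 1.193, 2006: 1.0}

def to_jan_2006_prices(nominal, year, monthly_deflator, regional_deflator):
    """Convert a nominal value to national January 2006 prices.

    Assumes both deflators are indices relative to 1.0 (January of the
    survey year and the national average price level, respectively).
    """
    january_value = nominal / monthly_deflator         # step 1: within-year timing
    national_value = january_value / regional_deflator  # step 2: regional cost of living
    return national_value * CPI_INFLATOR[year]          # step 3: link to January 2006

# Hypothetical household: 1,000 (thousand VND) reported in the 1993 VLSS,
# interviewed in a month with deflator 1.02, in a region with deflator 1.232.
real_income = to_jan_2006_prices(1_000.0, 1993, 1.02, 1.232)
print(round(real_income, 1))
```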
For 2002 and 2004, we chose not to use the regional deflators provided by the GSO
with the household datasets. Table 2.2 helps show why, especially for the urban deflators. In
1993, the deflator for the urban Southeast, which includes Ho Chi Minh City, is 1.23, implying
that prices were, on average, 23 percent higher than the national average. However, in 2002 and
2004, the regional deflator was only 1.04, before increasing to 1.23 again in 2006. This extreme
movement seems highly improbable, as do prices in the urban Southeast being only 4 percent
higher than the national average. Thus, instead of using the deflators supplied with the 2002 and
2004 VHLSS datasets, we impute the regional deflators for these years based on the regional
deflators for 1998 and 2006. These are shown in the bottom panel of Table 2.2.
22 We calculate the value of harvested output at market prices following Benjamin, Brandt, and Giles (2005). See the appendix for more details.
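The imputation formula is not spelled out in the text. As a minimal sketch, assuming linear interpolation in calendar time between the 1998 and 2006 deflators (our assumption, with a hypothetical function name), the urban South East values come out within about 0.001 of those shown in the bottom panel of Table 2.2 (1.183 and 1.208); the small gap may reflect rounding of the published inputs or a different interpolation rule.

```python
def interpolate_deflator(d_1998, d_2006, year):
    """Linearly interpolate a regional deflator between 1998 and 2006."""
    weight = (year - 1998) / (2006 - 1998)
    return d_1998 + weight * (d_2006 - d_1998)

# Urban South East deflators reported in Table 2.2 for 1998 and 2006.
for year in (2002, 2004):
    print(year, round(interpolate_deflator(1.134, 1.234, year), 3))
```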
To assist the reader in interpreting the income levels reported below, note that in 2005 the
market exchange rate was 15 900 Vietnamese dong per US dollar. In purchasing power parity
terms, the International Comparison Program estimated the value of a US dollar in 2005 at 4 700
Vietnamese dong (World Bank, 2008).
Table 2.2 Regional deflators included in the household surveys

Regional deflators included in the dataset
Region                 1993 VLSS   1998 VLSS   2002 VHLSS   2004 VHLSS   2006 VHLSS
URBAN
Red River Delta        1.066       1.056       0.990        1.017        1.084
Northern Uplands       1.082       0.993       -            -            -
North East             -           -           0.984        1.008        0.963
North West             -           -           1.059        1.061        1.020
North Central Coast    1.011       1.025       0.988        1.004        0.996
South Central Coast    1.055       1.053       1.011        1.013        1.073
Central Highlands      -           -           1.071        1.059        1.036
South East             1.232       1.134       1.044        1.043        1.234
Mekong River Delta     1.093       1.013       1.027        1.018        1.096
RURAL
Red River Delta        0.895       0.916       0.949        0.948        1.007
Northern Uplands       0.969       1.018       -            -            -
North East             -           -           0.977        0.974        0.908
North West             -           -           1.041        1.030        0.989
North Central Coast    0.981       0.938       0.967        0.974        0.862
South Central Coast    0.976       0.974       0.996        0.996        0.976
Central Highlands      1.064       1.060       1.068        1.043        0.931
South East             1.115       0.965       1.037        1.033        1.061
Mekong River Delta     1.020       1.027       1.023        1.015        0.958

Regional deflators used in the analysis
Region                 1993 VLSS   1998 VLSS   2002 VHLSS   2004 VHLSS   2006 VHLSS
URBAN
Red River Delta        1.066       1.056       1.070        1.077        1.084
Northern Uplands       1.082       0.993       -            -            -
North East             -           -           0.978        0.970        0.963
North West             -           -           1.007        1.013        1.020
North Central Coast    1.011       1.025       1.011        1.003        0.996
South Central Coast    1.055       1.053       1.063        1.068        1.073
Central Highlands      -           -           1.036        1.036        1.036
South East             1.232       1.134       1.183        1.208        1.234
Mekong River Delta     1.093       1.013       1.054        1.075        1.096
RURAL
Red River Delta        0.895       0.916       0.961        0.984        1.007
Northern Uplands       0.969       1.018       -            -            -
North East             -           -           0.981        0.944        0.908
North West             -           -           1.024        1.006        0.989
North Central Coast    0.981       0.938       0.899        0.880        0.862
South Central Coast    0.976       0.974       0.975        0.976        0.976
Central Highlands      1.064       1.060       0.994        0.962        0.931
South East             1.115       0.965       1.012        1.036        1.061
Mekong River Delta     1.020       1.027       0.992        0.975        0.958
2.7 Main evidence
As background to our analysis of the evolution of incomes using the five household
surveys, we begin by examining two related dimensions of household behaviour: the structure of
employment in Section 2.7.1; and household endowments in Section 2.7.2, including household
size, distribution of land and distribution of human capital (education). We outline the initial
statistics and trends across the relevant time period. Following this, in Section 2.7.3 we examine
changes in incomes. This includes a discussion of mean per capita income, a comparison
between consumption and income behaviour, and a decomposition of total income into its
various sources. This information will be used to draw some preliminary connections to our
results on the evolution of inequality.
2.7.1 Impacts of reforms on employment
The economic reforms from the Doi Moi period highlighted in Section 2.2 can be seen in
all facets of Vietnam’s economy. In this section, we provide a brief description of how the
composition of the labour market has changed between 1993 and 2006. We do not explicitly link
the labour market changes to any particular policy changes. Instead, our goal is to illustrate the
profound impacts that the reforms have had on the structure of the economy and also to provide a
picture of the initial conditions of our study.
In 1993, Vietnam’s labour force was overwhelmingly engaged in the primary sector,
which includes agriculture, forestry and aquaculture.23 Table 2.3 provides a breakdown on the
basis of the primary, secondary and tertiary sectors for an individual’s main job. In 1993, 73.7
percent of all labour was classified as working in the primary sector, followed by 15.6 percent in
the tertiary sector, and 10.6 percent in the secondary sector. Note the significantly higher
percentage in the primary sector in the north than in the south (81.2 versus 65.4). The share of
primary sector employment falls steadily over time, and by 2006 the percentage employed had fallen to
53.0, a reduction of 20.7 percentage points over a thirteen-year period. Labour leaving
agriculture was absorbed by rapidly growing employment in the tertiary sector, and to a slightly
lesser extent, the secondary sector. Also, by 2006, the gap in the role of primary sector
employment between the north and south had narrowed considerably.
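Sector shares of this kind are weighted tabulations of individual records. The sketch below uses hypothetical worker records and sampling weights, and the function name is ours. It shows how weighted percentage shares are formed and how a change between two surveys is expressed in percentage points, using the 1993 and 2006 primary sector shares reported in Table 2.3.

```python
from collections import defaultdict

def weighted_shares(records):
    """Percentage share of each sector, given (sector, sampling_weight) pairs."""
    totals = defaultdict(float)
    for sector, weight in records:
        totals[sector] += weight
    grand_total = sum(totals.values())
    return {sector: 100.0 * w / grand_total for sector, w in totals.items()}

# Hypothetical workers: main-job sector and sampling weight (not survey data).
workers = [
    ("primary", 300.0),
    ("primary", 200.0),
    ("secondary", 250.0),
    ("tertiary", 250.0),
]
shares = weighted_shares(workers)
print(shares)

# Change in the primary sector share between 1993 and 2006 (Table 2.3),
# measured in percentage points rather than in percent.
change_primary = 53.0 - 73.7
print(round(change_primary, 1))
```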
Two other dimensions of the structure of employment are noteworthy. First, a significant
percentage of those working in Vietnam are self-employed as shown in Table 2.4. In 1993, the
percentage was 84.1 percent. A good portion of this reflects the fact that a high percentage of
those working in the dominant primary sector are working on their own farms, but in 1993 more
than half of those working in either the secondary or tertiary sector were also self-employed,
typically in small family businesses. By 2006, the percentage of the labour force that was self-
employed fell to 66.9 percent. Self-employment in the primary and tertiary sectors remained
especially important, but in the secondary sector, the percentage working for other households or
firms rose to three out of every four workers.
23 We define the labor force to include all individuals that report working during the past 12 months, regardless of age or duration of work.
Table 2.3 Percentage of workers by major industrial sector
                  1993   1998   2002   2004   2006
All of Vietnam
  Primary         73.7   69.6   60.2   56.0   53.0
  Secondary       10.6   11.0   16.0   17.7   19.3
  Tertiary        15.6   19.0   23.8   26.3   27.8
North Vietnam
  Primary         81.2   76.9   64.6   60.9   61.3
  Secondary        8.6    9.1   15.4   17.6   16.8
  Tertiary        10.2   14.0   20.0   21.5   21.8
South Vietnam
  Primary         65.4   60.9   56.7   50.9   51.7
  Secondary       13.0   14.3   16.5   17.9   18.2
  Tertiary        21.7   24.8   27.8   31.3   30.1
Rural Vietnam
  Primary         84.5   82.4   71.3   68.0   65.3
  Secondary        6.8    7.5   13.1   15.0   16.6
  Tertiary         8.8   10.1   15.6   17.0   18.1
Urban Vietnam
  Primary         24.3   17.9   19.4   17.0   15.7
  Secondary       28.5   27.4   26.4   26.5   27.4
  Tertiary        47.2   54.8   54.2   56.5   57.0
Note: This is based on the individual’s main job during the previous 12 months. All figures have been weighted to account for non-equal sampling probabilities. All workers are included regardless of age or duration of work.
Table 2.4 Shares of workers that are self-employed versus working for others
                       1993   2006
Self-employed          84.1   66.9
  In primary           69.2   47.7
  In secondary          4.9    4.9
  In tertiary          10.0   14.4
Working for others     15.9   33.1
  In primary            4.5    5.3
  In secondary          5.8   14.4
  In tertiary           5.6   13.4
Second, a significant percentage of those working for wages were in the state sector,
which was second only to the private sector (defined here to include working for other
households) as a source of wage employment in 1993 (see Table 2.5). As a result, the
distribution of earnings, and thus inequality, were affected by wage behaviour in the state sector.
In 1993, 38.6 percent of wage earners were in the state sector, with an even higher percentage, 54.6
percent, in the north, compared to 30.7 percent in the south. The role of the state sector is even
more pronounced if we focus solely on the urban sector.24 After 1993, the role of state sector
employment falls as the percentage of the labour force working for wages rises from 15.9 in
1993 to 33.1 percent in 2006. But even as late as 2006, nearly one out of every three individuals
that were not working for themselves was employed by the state. In the urban sector, it was 43.6
percent, with the percentage nearly sixty percent in the north, and a third in the south. By
comparison, the percentage of those working for wages that were employed by foreign firms was
only 6 percent, which amounts to only 2 percent of the entire labour force.
Table 2.5 Share of wage earners working for the state, collectives, the private sector, and the foreign sector
24 The estimates for 1993 already reflect significant layoffs in the state sector in the late 1980s, implying an even larger role for the state sector a few years earlier.
As we can see, there have been significant changes in the structure of employment in
Vietnam between 1993 and 2006. This information proves interesting when we decompose
inequality by source of income in Section 2.8.
2.7.2 Evolution of household and individual endowments
Both the level and distribution of per capita incomes in Vietnam are influenced by
endowments, as well as the returns to these endowments. These relationships are often
formalized through decomposition exercises of inequality measures. Our immediate objective
here is more modest, and largely descriptive. We focus on two key endowments: land and human
capital, supplemented with a brief discussion of household size. In 1993, more than 70 percent of
the labour force was in the primary sector (see Table 2.3) and income from cropping accounted
for slightly less than half of total rural household income (see Table 2.11 below). Hence, the
distribution of land is important for income and inequality, especially early in the reform process.
In the transition from a planned to a more market-based economy, we also expect human capital
to figure more prominently as remuneration in the labour market increasingly reflects individual
productivity. This has been the case in China (Zhang et al., 2005).
2.7.2.1 Household size
In Table 2.6, we report information on household size for each of the five household
surveys. In 1993, average household size was 4.97. Average household size was significantly
larger in the south than in the north (5.41 versus 4.59); however, in both regions there were no
differences between urban and rural households. Over the next 13 years, average household size
falls to 4.26, a decline of nearly fifteen percent. We observe an even larger reduction in the
south, and by 2006 differences in average household size between the north and south have
largely disappeared.
Reductions in household size reflect a multitude of factors including changes in fertility
behaviour, a shift in household types, e.g. from extended to nuclear, and changing living
arrangements. These factors impact dependency ratios, typically measured as the ratio of
dependents in the household to those working, and thus household income and inequality. These
are potentially informative, but will not be examined here.
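For reference, the dependency ratio as described above can be written as a one-line calculation. The household below is hypothetical and the function name is ours; how "working" is coded in the surveys is not specified here, so the sketch simply takes a working indicator as given.

```python
def dependency_ratio(is_working):
    """Ratio of non-working to working household members.

    `is_working` holds one boolean per household member. Returns None
    when nobody in the household works, since the ratio is undefined.
    """
    workers = sum(1 for w in is_working if w)
    dependents = len(is_working) - workers
    if workers == 0:
        return None
    return dependents / workers

# Hypothetical five-person household with two working members.
ratio = dependency_ratio([True, True, False, False, False])
print(ratio)
```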
Table 2.6 Mean household size
               1993   1998   2002   2004   2006
All Vietnam    4.97   4.70   4.43   4.34   4.20
Rural          4.97   4.80   4.49   4.39   4.25
  North        4.67   4.58   4.38   4.28   4.16
  South        5.38   5.10   4.63   4.51   4.35
Urban          4.94   4.38   4.25   4.23   4.08
  North        4.15   3.98   4.01   3.97   3.83
  South        5.46   4.66   4.41   4.38   4.24
2.7.2.2 Distribution of land
As previously mentioned, access to land will be important in a country with such a high
share of income derived from cropping. In this section, we focus on access to land of rural
households. There are marked differences across regions in terms of land access and inequality,
which can be attributed to alternative historical and institutional factors at work (Brandt et al.,
2006). These differences are often glossed over in the literature, which tends to paint a single
picture of the evolution of land access, singling out as the watershed events the
decollectivization of agriculture in the late 1980s and the 1993 Land Law. For major parts of
Vietnam, collectivized agriculture (and thus, decollectivization) was never part of the story.
Agricultural land in Vietnam includes several major categories of land including: annual,
perennial, water surface area, and forestry land. Also usually included are gardens and ponds
adjacent to households’ residential land that is not included in the four primary categories, but
which is used for agriculture-related purposes. In Table 2.7, we report summary data on the
inequality of household agricultural land holdings for select years.25 We focus on landholdings
for which households report having long-term use rights.26 In the regions in the north, a highly
egalitarian redistribution of land in the late 1980s contributed to low levels of landlessness, and
low overall inequality of landholdings. In the south, especially in the densely populated
Southeast and Mekong River Delta, we observe significantly higher degrees of landlessness and
inequality. In 1993, nearly one out of every six households in these two regions is landless. The
contrast with the Red River Delta is especially large.
25 Differences in 2002 in how land is reported make the data less than perfectly comparable.
26 We exclude land that households may have rented in. Land rental remains relatively minor in rural Vietnam, and inequality of landholdings based on cultivated holdings, i.e. land with long-term use rights, land rented out + land rented in, is very similar to that based on land with long-term use rights.
Table 2.7 Landlessness and land inequality in rural Vietnam
                         % Landlessness            Gini
                         1993    1998    2004      1993   1998   2004
North Vietnam
  Red River Delta        2.81    1.13    6.04      0.28   0.30   0.38
  North East             1.74    5.45    4.52      0.42   0.54   0.59
  North West             0.00    4.98    1.47      0.38   0.30   0.52
  North Central Coast    4.17    8.37    8.54      0.41   0.59   0.49
South
  South Central Coast   10.62    6.35   13.86      0.34   0.43   0.69
  Central Highlands      6.25   13.46    4.15      0.52   0.56   0.43
  South East            17.50   21.97   38.83      0.54   0.62   0.75
  Mekong River Delta    15.90   21.32   26.22      0.51   0.56   0.62
All rural Vietnam        7.20   10.30   14.38      0.49   0.57   0.64
Over time, we observe an increase both in the degree of landlessness and inequality of
landholding. The increase is especially sharp in the Southeast and the Mekong River Delta,
where by 2004 landlessness rises to 38.8 and 26.2 percent of all rural households, respectively. In
the north, the percentage of households that are landless rises as well, but still remains
considerably below that in the south, with the exception of the Central Highlands. More
noticeable in the north are the increases in land inequality, which rises fairly monotonically over
time.
All else being equal, we expect rising inequality in land access to be associated with
rising income inequality. Of course, this will be offset by the declining role of land and
agriculture in the economy, and the growth in off-farm opportunities, both of which are
important parts of the story.
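The two statistics reported in Table 2.7 can be illustrated with a small sketch. The holdings below are hypothetical, the function names are ours, and the calculation is unweighted; whether landless households enter the Gini calculation in the table, and how sampling weights are applied, are not stated in the text, so including zeros here is an assumption.

```python
def gini(values):
    """Unweighted Gini coefficient of non-negative values (zeros allowed)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one strictly positive value")
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where x_i are sorted in ascending order and i runs from 1 to n.
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n

def landless_share(values):
    """Percentage of households with zero landholdings."""
    return 100.0 * sum(1 for v in values if v == 0) / len(values)

# Hypothetical landholdings for five rural households (any area unit).
holdings = [0.0, 0.0, 1.0, 1.0, 2.0]
print(round(gini(holdings), 3), landless_share(holdings))
```

Because landless households contribute zeros, a rise in landlessness mechanically raises a Gini computed this way even if the distribution among landholders is unchanged.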
2.7.2.3 Distribution of human capital (education)
Productivity of individuals is linked to their human capital, which includes both health
and education. In a socialist economy, the link between earnings and productivity is often
dampened as a result of more egalitarian institutions. These institutions tend to break down with
the emergence of a market-based economy and we expect factors like education to play a larger
role in income-determination. Two factors are potentially at work. First, on the margin,
individuals will be able to capture a larger portion of their marginal product. Second, the reform
process may also put a premium on education in the labour market, and the returns to education
will likely rise. A discussion of the evolution of educational attainment in Vietnam will provide
some tentative insights into the accompanying evolution in income inequality.
In Tables 2.8 and 2.9, we report for each survey, and for all individuals in the labour
force, mean educational attainment and the Gini coefficient for educational attainment. In 1993,
mean educational attainment was 6.15 years, which rises fairly smoothly to 7.62 years by 2006.
Especially noteworthy are the initial differences between the north and south, and urban and
rural. On average, individuals in the north had nearly 1.5 more years of education. In both
regions, educational attainment of individuals living in rural areas lags behind that of urban
residents.
Table 2.8 Mean levels of education of all workers
                  1993   1998   2002   2004   2006
All of Vietnam    6.15   6.59   7.22   7.59   7.62
Urban Vietnam     7.97   8.53   9.18   9.44   9.25
Rural Vietnam     5.75   6.10   6.69   7.01   7.08
North Vietnam     6.75   7.21   7.96   8.29   8.32
South Vietnam     5.33   5.85   6.46   6.85   6.91
Note: Workers include all individuals that reported working during the last 12 months regardless of age or duration of work.
Table 2.9 Gini coefficient estimates of educational inequality for all workers
                  1993    1998    2002    2004    2006
All of Vietnam    0.344   0.309   0.289   0.279   0.267
Urban Vietnam     0.275   0.246   0.239   0.235   0.210
Rural Vietnam     0.353   0.316   0.291   0.282   0.277
North Vietnam     0.297   0.262   0.240   0.228   0.216
South Vietnam     0.389   0.356   0.328   0.323   0.309
Note: Workers include all individuals that reported working during the last 12 months regardless of age or duration of work.
Higher levels of educational attainment in the north in 1993 were accompanied by much
lower levels of inequality. For example, the Gini coefficient for the educational inequality for all
workers was 0.297 in the north compared to 0.389 in the south. Over time, however, and in the
context of a secular rise in educational attainment, we observe a significant reduction in
inequality. This is reflected in a reduction in urban-rural and north-south differences, as well as
in the Gini coefficients reported in Table 2.9 for all of Vietnam and separately for urban and
rural and north and south. Between 1993 and 2006, the aggregate Gini falls from 0.344 to 0.267,
a reduction of more than twenty percent. At every level, we observe a reduction in inequality in
educational attainment.27 More equal access to education over time is likely contributing to this
fall through lower within-cohort educational inequality for younger cohorts.
National income accounts suggest that between fifty and sixty percent of GDP represents
the returns to labour. Hence, all else equal, we might expect falling inequality in educational
achievement in the labour force to be associated with reductions in income inequality.
In the remainder of the paper we do not explicitly link the changes in educational
attainment inequality to changes in income inequality. Instead, the purpose of the above
discussion was to highlight how the distribution of an important determinant of individual
remuneration, education, has evolved over time as changes in individual endowments will
influence income.
2.7.3 Incomes
We begin our discussion of income by examining the trends in mean per capita income
for all of Vietnam, rural and urban Vietnam, and the rural and urban areas disaggregated by
North and South. Next, we compare our estimates of mean per capita income with estimates of
mean per capita consumption using the GSO/World Bank consumption estimates. Subsequently,
we focus on the individual components of income (crops, wages, etc.) to elucidate which sources
of income have been important contributors to the rapid income growth experienced in Vietnam
between 1993 and 2006. Furthermore, since participation in and earnings from income activities
are often correlated with a household’s position in the income distribution (e.g., poorer
households are more likely to derive income from crops), examining which income sources grew
the most will start to provide us with a sense of the distributional consequences.
27 Rising enrolment rates for primary and secondary education suggest that this trend may continue. See World Bank (2004), p. 15.
2.7.3.1 Mean per capita income
We begin our analysis of income by examining the behaviour of our estimates of mean
real per capita income. For reference, we also include the GSO/World Bank’s estimate of
consumption.28 In Table 2.10, we report means for all five years and growth rates for select
periods for all of Vietnam, rural Vietnam, urban Vietnam, and then rural and urban separately for
both north and south. Per capita incomes in 1993 (in 2006 VND) were 2.69 million VND. Urban
incomes were more than two times larger than rural, while incomes in the south were also
slightly higher than in the north. The largest difference between the two regions was with respect
to urban incomes, which were 20 percent higher in the south.
Over the entire period, per capita incomes grew at an impressive rate of 8.4 percent per annum,
which implies a near tripling of per capita incomes over the 13-year period. In 2006,
average per capita incomes were 7.99 million VND. Income growth was especially rapid
between 1993 and 1998, slowed sharply between 1998 and 2002 (presumably this was partially
due to the Asian Financial Crisis), and then rose to 8.9 percent between 2002 and 2006, only
modestly below the rate of growth achieved between 1993 and 1998.29
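The annual growth rate can be checked from the two endpoint means. The text does not say which convention is used; the sketch below computes both an average log growth rate and a compound growth rate (the function names are ours). With the reported means of 2.69 and 7.99 million VND, the log version is the one that comes out close to the 8.4 percent quoted above.

```python
import math

def log_growth_rate(start, end, years):
    """Average annual log (continuously compounded) growth rate."""
    return math.log(end / start) / years

def compound_growth_rate(start, end, years):
    """Average annual compound growth rate."""
    return (end / start) ** (1.0 / years) - 1.0

# Mean real per capita income in million VND, 1993 and 2006 (from the text).
g_log = log_growth_rate(2.69, 7.99, 13)
g_compound = compound_growth_rate(2.69, 7.99, 13)
print(round(100 * g_log, 2), round(100 * g_compound, 2))
```

The ratio of 2006 to 1993 mean income is 7.99 / 2.69, or about 2.97.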
Overall, growth was relatively balanced between the urban and rural sectors. In sharp
contrast to China’s experience during a similar period in its development, rural per capita income
growth outpaced that in urban Vietnam, at 8.4 percent versus 7.2 percent, which contributed to a
slight decline in the urban-rural income ratio. However, the aggregate figures for Vietnam
conceal differences between the two regions. In the north, urban growth was slightly faster than
rural growth (8.7 versus 7.8 percent), but in the south, income growth in the urban sector lagged
significantly behind its rural counterpart (6.5 versus 9.1 percent). As a result, the urban-rural
income ratio in the north drifted upwards slowly, while that in the south fell sharply from 2.11 to
1.51. Note also the more rapid growth in the rural south than in the north (9.1 versus 7.8
percent), and the exact opposite with respect to urban incomes (6.5 versus 8.7 percent).
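The link between differential growth and the urban-rural income ratio can be verified with a short calculation. The sketch assumes the reported growth rates are average annual log growth rates over 13 years (our assumption; the function name is also ours). Under that assumption, the 1993 ratio of 2.11 in the south and growth of 6.5 percent (urban) versus 9.1 percent (rural) imply a 2006 ratio close to the reported 1.51.

```python
import math

def projected_ratio(initial_ratio, g_numerator, g_denominator, years):
    """Ratio after `years`, when numerator and denominator grow at the
    given average annual log growth rates."""
    return initial_ratio * math.exp((g_numerator - g_denominator) * years)

# Southern urban-rural income ratio: 2.11 in 1993, urban growth 6.5 percent,
# rural growth 9.1 percent, over the 13 years to 2006.
south_ratio_2006 = projected_ratio(2.11, 0.065, 0.091, 13)
print(round(south_ratio_2006, 2))
```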
28 In future work, we intend to construct our own estimates of consumption, with an eye to reconciling the behaviour of income and consumption. 29 Estimates of nominal GDP growth from the World Development Indicators suggest a similar pattern of high growth, followed by moderate growth, and then high growth again. Between 1993 and 1998 nominal GDP grew at a per annum rate of 8.3 percent, followed by 6.4 percent between 1998 and 2002, and 7.9 percent between 2002 and 2006.
Table 2.10 Mean real per capita income and consumption (000 VND)
Note: All values have been spatially and temporally deflated to January 2006 prices.
2.7.3.2 Comparison of Consumption and Income Behaviour
In comparison to income, consumption growth has been slower at 5.80 percent per annum
as compared to 8.37 percent per annum. This could be due to a number of factors. First, if the
value of housing services has been rising more rapidly than the value of other non-food
consumption, then we may be under-estimating the true value of consumption, especially in the
later surveys. If this is true, then the measurement error affects essentially all papers that use the
GSO/World Bank consumption series contained within the datasets, as we are not aware of any
authors that have attempted to deal with this problem. Second, the estimates of mean per capita
income and consumption in 1993 suggest that household savings were negative, subject, of course, to
measurement error in both income and consumption. Thus, it seems likely that either
(1) per capita income in 1993 is under-estimated, (2) per capita consumption in 1993 is
over-estimated, or (3) both sources of error exist. Third, our estimates of household income do not
include the service value of household durables and housing, whereas the consumption estimates
do.
We can get a rough estimate of the impact of excluding the service value of durables and
housing by looking at the estimates presented by Benjamin and Brandt (2004). They construct
their own estimates of household and per capita consumption, which suggest that the per capita
savings rate in 1993 was 15.3 percent in urban areas and 7.8 percent in rural areas when income
and consumption estimates both include the imputed service value of durables. Excluding the
difficult-to-measure value of durables, the savings rates are 19.1 percent and 8.7 percent in urban
and rural areas, respectively. Unlike Benjamin and Brandt (2004) our income estimates do not
include the value of services whereas the consumption estimates do. Adding their value of
services for urban and rural areas onto our estimates of income would lead to a positive savings
rate nationally and in particular in urban areas.30
Despite these measurement issues it is clear that per capita consumption grew very
rapidly between 1993 and 2006, both in urban and rural areas. The estimates also show the
marked slowdown in growth between 1998 and 2002, followed by recovery. Over the entire time
period, per capita consumption in urban and rural areas grew at nearly identical rates, 5.44 versus
5.33 percent per annum. However, consumption growth was more rapid in rural areas between
later surveys and was more rapid in urban areas between earlier surveys. Within rural areas, per
capita consumption initially grew more quickly in the north, but between 2002 and 2006 this was
reversed and consumption grew more quickly in the south.
30 The estimated values of services from household durables presented by Benjamin and Brandt (2004) for 1993 are for a slightly different sample of households as they focus on only the panel households between the 1993 and 1998 VLSS.
2.7.3.3 Composition of mean per capita income
As mentioned earlier, one of the advantages of studying income, relative to consumption,
as an indicator of welfare is its ability to be disaggregated into various subcomponents, such as
cropping income, agricultural sidelines income, etc. This allows us to discover which income
generating activities have been important sources of growth, which income activities households
participate in, and the distributive implications of each source of income. In this subsection we
address the first two exercises. We return to discuss the distributive implications of each income
activity later in the paper.
The key income categories are: cropping income, agriculture sidelines, family business
income, wages, remittances and gifts, and “other” income. We organize our discussion into rural
and urban, and also discuss north and south differences within these categories. These
breakdowns are important not only because the regions are so different from each other, but also
because of the insights that can be derived when examining trends in inequality decomposed in
the same way. Tables 2.11 and 2.12 do this separately for rural and urban Vietnam, with
additional breakdowns then provided for the north and south.
These tables are complemented by Table 2.13, which provides information on the
percentage of households reporting non-zero income by source. It is useful to remember here that
mean per capita income by activity is influenced by two margins: the share of households that
participate in the activity and the amount of income generated by each participating household.
This detailed picture will also be useful when we examine trends in inequality later on.
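The two margins can be illustrated with a minimal sketch. It assumes household-level incomes from a single activity, with zeros for non-participants; the function name and the numbers are hypothetical and are not survey data. Mean income from the activity across all households equals the participation share multiplied by the mean among participating households.

```python
def two_margins(incomes):
    """Split mean income from one activity into its two margins.

    incomes: income from the activity for every household (0 if the
    household does not participate). Hypothetical helper for illustration.
    Returns (overall_mean, participation_share, mean_among_participants).
    """
    n = len(incomes)
    participants = [x for x in incomes if x != 0]
    share = len(participants) / n
    mean_part = sum(participants) / len(participants) if participants else 0.0
    return sum(incomes) / n, share, mean_part


# Hypothetical wage income for five households (illustrative values only).
wages = [0.0, 0.0, 300.0, 500.0, 700.0]
overall, share, mean_part = two_margins(wages)
```

Growth in `overall` can therefore come from a rising share of participating households, from rising income per participating household, or from both.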
Early in the reforms, the rural sector was primarily agricultural, with more than half of all
income coming directly from cropping and agricultural sidelines.31 All but a relatively small
percentage of households earned income from agriculture, with the percentage of households
involved in farming higher in the north than in the south. Hence, strong growth in agriculture is
likely to benefit a very large portion of the rural population. Family-run businesses were next in
importance; they involved slightly less than half of all households, and were the source of 22
percent of all income. Wages contributed 12 percent, but this may underestimate slightly the role
of income from off-farm wage opportunities, some of which is indirectly captured in remittances
from household members that are not included in the GSO’s definition of household
membership. Overall, differences between the north and south in the structure of incomes were
relatively modest, with income from family businesses and wages more important in the south
than in the north. These differences in the composition of incomes between the two regions
largely disappear by 2006.
Over time, off-farm wage opportunities, followed by remittances and gifts, are the two
largest contributors to growth of income for rural households, with a growing percentage of
households reporting income from both sources. Combined, income from wages and remittances
and gifts grew nearly 15 percent per annum, and by 2006 was the source of 36 percent of all rural
income. Growth in incomes in Vietnam’s cropping sector is the slowest of all components of
household income and averaged 4.6 percent over the entire period, but for many farming
households this was partially offset by more robust growth in agriculture sidelines such as animal
husbandry and aquaculture. Overall, however, the primary sector declined in importance and by
2006 was the source of 40 percent of rural household income, compared to 55 percent in 1993.
Lastly, income from family-run businesses grew steadily over the entire period and averaged just
below 7 percent. Nonetheless, its share of total rural income declined slightly as did the
percentage of households involved in family businesses. These broad shifts in the composition
of rural incomes are in line with the changes we discussed earlier in the structure of rural
employment.
31 This estimate of primary sector income excludes wage income from working in the primary sector, or land rental payments, which would increase slightly primary sector activity.
Trends in cropping income merit more detailed examination. First, the growth rate
calculated over the entire period conceals a sharp drop in income from the cropping sector that
occurred between 1998 and 2002, which is largely a product of forces exogenous to Vietnam.
Over this four-year period, growth in cropping incomes fell to negative 1.8 percent per annum
compared to positive growth of 8.4 percent between 1993 and 1998, a swing of more than 10
percentage points. Indeed, a significant portion of the reduction in rural income growth over 1998-2002
compared to 1993-1998 (see Table 2.10) is a product of this behaviour.32 Much of the reduction in
cropping income can be attributed to the adverse effect on revenues of the decline in agricultural
prices in the wake of the Asian Financial Crisis. Perennial prices were especially hard hit. The
reduction in real agricultural output growth between the two periods was actually marginal.33
Prices and cropping sector income soon recovered and between 2002 and 2006 the latter
increased at a very reasonable 6.4 percent.
Second, rates of growth in cropping income were higher in the south than in the north.
The differences are especially pronounced in the early years of the reforms, as the south was able
to take better advantage of the domestic and international market liberalization for rice and other
agricultural commodities (Benjamin and Brandt, 2004). The south, and in particular, regions
such as the Central Highlands and the Southeast that moved heavily into perennials, were
especially hard hit by the collapse of world agriculture markets in the late 1990s, but cropping
income growth soon recovered, and averaged 7.3 percent between 2002 and 2006. Over the
entire period, income from the cropping sector grew twice as fast in the rural south as in the
north. This gap may help to explain why economic growth did not have a more favourable
impact on inequality levels in the north.
32 Between 1998 and 2002, there is also a reduction in the percentage of households reporting cropping income from 93.4 to 84.3 percent, with the reduction larger in the south. We cannot rule out that some of the behaviour of cropping income is tied to this decline. This further highlights concern over comparability between the two VLSS surveys and the three VHLSS surveys, which we discuss in more detail in the appendix.
33 Between 1993-98 and 1998-2002, real annual growth in crop output in Vietnam fell from 7.4 percent to 6.2 percent. By comparison, the aggregate crop price index for Vietnam fell at an annual rate of 1.8 percent between 1998 and 2002 after rising 11.9 percent between 1993 and 1998. See Brandt et al. (2008).
2.7.3.3.2 Urban
In the early 1990s, the structure of urban incomes in the north and south looked very
similar. In both, there was an important informal component. Self-employment and self-
employed income from small, family-run businesses was the largest source of urban household
income, involving nearly 70 percent of all households, and generating more than half of total
household income. These family-run businesses, which operated on average 8 months out of the
year, and drew primarily on the labour of household members, were most prominent in tertiary
sector activity followed by the secondary sector. Wage income, on the other hand, was reported
by nearly the same percentage of households in north and south, but was the source of a much
smaller share of income, 21.9 percent.34 In the north, wage income was heavily tied to
employment in the state sector (68.5 percent of those working for wages), but even in the south
41.7 percent of all of those employed and working for wages in their main job reported working
for the state. Income from remittances, 13.6 percent of total income, and “other” income, 9.9
percent, made up most of the rest of total income. In both regions, upwards of a third of all urban
households were marginally involved in agriculture, with income from cropping and sidelines
representing only 4.2 percent of urban incomes in 1993.
Over time, wage income is the most rapidly growing component of urban incomes,
averaging 11.6 percent growth per annum over the entire 13 year period. Partially contributing to
the rapid growth is wage reform in the state sector, which increased wages by 38 percent in 2003
(World Bank, 2004). As a result of this rapid growth, wage earnings represented 40.4 percent of
all urban income by 2002, or nearly double the level in 1993. This rapid growth comes largely at
the expense of the share of income from family-run businesses, which grows less than 4 percent
per annum, and falls to 32.6 percent of household income.
Cumulatively, the urban north fares better than the south. In 1993, urban incomes in the
south were 20 percent higher. This gap was maintained between 1993 and 1998 as growth in
the two regions was nearly identical (8.2 versus 8.6 percent growth). Between 1998 and 2002,
however, growth in household incomes in the south slipped to 4.9 percent per annum compared
34 The much higher percentage of income coming from family-run businesses reflects the fact that upwards of 2 members per household were working in these businesses, and that income from this source represents the returns to both labour and capital.
to 8.2 percent in the north. Several factors appear to be underlying this difference. First, growth
in incomes from family-run businesses and remittances both dropped off significantly in the
south, possibly a consequence of the fallout from the Asian Financial Crisis. Second, the north
enjoyed much more rapid wage growth, which may be partially linked to the wage reform in the
state sector, and the much larger role of state employment in the north. Between 2002 and 2006,
growth in incomes from family-run businesses and remittances in the south recovered, but over
this four-year period, growth in every source of income in the south lagged behind those in the
north. There is an especially sharp contrast in the behaviour of “other” income, which grew twice
as fast in the north between 2002 and 2006.35 Overall, per capita incomes in the south grew 5.3
percent compared with 9.7 percent growth in the north. As a result, by 2006, per capita incomes
in the south trailed those in the north by 7 percent.
With this detailed information on the changes in employment, household endowments,
mean income, and the composition of income by source, the background to the picture of income and
inequality in Vietnam becomes clearer. We next move on to our analysis of the behaviour of
inequality between 1993 and 2006.
2.8 Distributive dimensions of Vietnam’s growth
In this section, we examine two important indicators of how the benefits of Vietnam’s
impressive growth have been distributed amongst the population. As previously outlined,
although other authors have reported on these changes, we add to the discussion by extending the
time period beyond the scope of most previous studies, and also by using income data. This new
approach to the issues at hand helps to broaden the knowledge base and the available information
on this crucial time in Vietnam’s development. We first look at poverty trends, then discuss the
behaviour of inequality.36 Our primary purpose in reporting headcount ratios is to capture how
35 In future work, we plan to disaggregate “other” income into two components: government transfers, and non-government transfers such as interest from savings.
36 These estimates are subject to the criticism that some of the poor, and possibly a rising number in absolute as well as percentage terms, are missing from the household surveys. Prominent among this group would be migrants, whom we discussed earlier. Depending on their incomes relative to those households surveyed, their exclusion could bias upwards or downwards our estimates of rural poverty, urban poverty, as well as overall poverty, and likewise, similar measures of inequality. Nonetheless, this critique holds for all other papers that have used the same household surveys to examine distributional concerns, such as poverty and inequality, in Vietnam.
rapidly economic growth was pulling households up from the bottom. It is also important to
remember that there is an element of arbitrariness in any poverty line.
2.8.1 Poverty trends
In Table 2.14 we report estimates of poverty headcount ratios using real per capita
incomes, along with standard errors of our estimates. The poverty line we use is 2.478 million
VND in January 2006 prices, and has been adjusted to reflect differences in the cost of living
between regions, as well as between the rural and urban sectors.37 In 1993, nearly two-thirds of
all households were living below this poverty line. Consistent with the higher average incomes
among urban households reported in Table 2.10, we also observe a significantly smaller
percentage of households identified as poor in the cities compared to the countryside (40.8
versus 70.1 percent).
Table 2.14 Estimates of the poverty headcount ratio using real per capita income
                             1993    1998    2002    2004    2006
All of Vietnam
  Poverty headcount ratio    0.643   0.380   0.194   0.108   0.078
  Standard error             0.017   0.017   0.004   0.004   0.003
  Number of observations: 9,089 9,086
Rural Vietnam
  Poverty headcount ratio    0.701   0.435   0.237   0.137   0.101
  Standard error             0.016   0.021   0.005   0.005   0.005
  Number of observations: 6,863 6,806
Urban Vietnam
  Poverty headcount ratio    0.408   0.187   0.050   0.026   0.016
  Standard error             0.049   0.021   0.005   0.004   0.003
  Number of observations: 2,226 2,280
Note: The poverty line used is 2 478.11 thousand dong in January 2006 prices.
Between 1993 and 1998, the rapid growth in per capita incomes in the country was
accompanied by a sharp fall in the headcount ratio. Recall that in absolute terms, per capita
incomes in Vietnam more than doubled over this five-year period. The headcount ratio fell by
37 This poverty line is based on inflating the GSO/World Bank recommended poverty line used in 2004 by the CPI inflation factor we use to convert from 2004 January prices to 2006 January prices.
26.3 percentage points between 1993 and 1998, and by an additional 18.6 percentage points over
the next four years despite declining economic growth between 1998 and 2002. By 2002, less
than a quarter of the rural population was identified as “poor,” with only 5 percent of all urban
individuals so classified. After 2002, poverty levels continued to fall, albeit at declining rates.
By 2006, only 1.6 percent of the urban population was still classified as poor, with 10 percent of
the rural population reported to have per capita income levels below the poverty line. Overall,
the poverty headcount ratio was 7.8 percent.
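As a rough illustration of how a headcount ratio is computed, the sketch below uses made-up incomes and an unweighted count. The standard error assumes simple random sampling, which ignores the clustering, stratification and sampling weights of the actual surveys, so it is not the estimator behind Table 2.14. The function names are hypothetical. The final lines reproduce the percentage point changes quoted above from the national figures in Table 2.14.

```python
import math


def headcount_ratio(incomes, poverty_line):
    """Share of observations with income strictly below the poverty line."""
    poor = sum(1 for y in incomes if y < poverty_line)
    return poor / len(incomes)


def srs_standard_error(p, n):
    """Standard error of a proportion under simple random sampling.

    This is a simplifying assumption: it ignores survey design effects.
    """
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical per capita incomes in thousand VND (not survey data).
incomes = [1500.0, 2000.0, 2400.0, 2600.0, 3000.0, 4500.0, 6000.0, 9000.0]
line = 2478.11  # poverty line from the note to Table 2.14, thousand VND
h = headcount_ratio(incomes, line)
se = srs_standard_error(h, len(incomes))

# National headcount ratios as reported in Table 2.14.
national = {1993: 0.643, 1998: 0.380, 2002: 0.194, 2004: 0.108, 2006: 0.078}
drop_93_98 = round((national[1993] - national[1998]) * 100, 1)  # percentage points
drop_98_02 = round((national[1998] - national[2002]) * 100, 1)  # percentage points
```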
In short, the rapid reduction in the headcount ratios speaks to the success of the growth in
Vietnam in raising the incomes of those households in the lower tail of the income distribution.
We now examine the changes in income inequality that accompanied changes in poverty.
2.8.2 Inequality
We begin our discussion of inequality first by reporting initial 1993 levels using Gini
coefficients, and then moving on to discuss trends between 1993 and 2006. We supplement our
discussion based on Gini coefficient estimates with Lorenz curves. We also use information
gained in Section 2.7 to give further insight into conditions within the relevant regions which
may explain our statistics. Also included in the discussion of inequality is a comparison of
income and consumption data, the role of spatial differences and decompositions by source of
income.
2.8.2.1 Initial levels in 1993
Table 2.15 reports Gini coefficients for the same breakdown of regions and sectors as our
estimates of mean incomes and growth rates. Directly below our estimates we also report 95
percent confidence intervals, which are a good reminder of the lack of precision with which we
estimate Ginis, and the care that must be taken in making inferences about changes in inequality
over time. Early in the reforms, Vietnam had a relatively high level of inequality, with a Gini
coefficient of 0.45. Urban inequality was markedly higher, with a Gini of 0.48 compared to 0.40
for the rural sector. In both sectors, but especially in the rural sector, inequality was also
appreciably higher in the south.
Table 2.15 Estimated Gini coefficients based on real per capita income and consumption
Note: The 95% confidence interval is provided below the estimated Gini. These are bootstrapped, bias-corrected estimated confidence intervals.
This high initial inequality seems unusual for a “socialist” economy early in its transition.
China provides an obvious basis for comparison. In 1985, or six to seven years into China’s
reforms, the Gini coefficient for per capita household incomes was 0.29, and only 0.17 in the
urban sector. In the rural sector, it was 0.27 (Ravallion and Chen, 2007).38 There are a number
of potential explanations for Vietnam’s high initial inequality related to institutional features of
the urban and rural economies. One factor we can probably rule out is urban-rural income
differences, which were actually lower in Vietnam.
We begin by examining the urban sector, with its relatively high level of inequality as
reflected by a Gini coefficient of 0.48. First, a very high percentage of the urban labour force
(60.4 percent) was self-employed, largely in family-run businesses, in which the distribution of
human capital, access to credit and other key inputs, and so on could be highly skewed. Recall
that income from family-run businesses was the source of half of urban incomes in 1993.39
Second, the role of the state sector as a source of employment was never as large as in other
socialist economies. In general, we expect the wage distribution to be more compressed in the
state sector than in private firms. In 1993, only 39.6 percent of the urban workforce was engaged
in wage employment, of which 50.5 percent were employed by the state for their primary job.
This implies that only 20 percent of the entire urban workforce was employed by the state.40 By
comparison, the share of the urban labour force in China working for the state in 1978 was over
75 percent, with the rest of the labour force primarily working in collectively-run enterprises
owned by local municipalities. Even as late as 1988, the percentage working for the state in the
urban sector had only fallen to 65 percent, with less than 5 percent either self-employed or
working in the private sector. Third, the level of state subsidies for food, housing, medical, and
so on, which might have been equalizing, was also relatively low.41
In the case of the rural sector, with a Gini of 0.40, inequality is greater in the south than
in the north. Higher inequality with respect to both land and human capital likely underpins the
difference between the rural south and the rural north. Both are a product of historical legacy. As
38 The household data that are the basis for these estimates likely underestimate differences in household income in China. However, these biases are not sufficient to overturn the basic comparison. Potentially offsetting this is the larger size, and possibly greater regional heterogeneity, of China.
39 The disequalizing role of family business income is confirmed by the Shorrocks decompositions that are reported below.
40 The larger role of state sector employment in the north compared to the south may help explain the slightly lower level of urban inequality in the north.
41 In urban China, these subsidies amounted to nearly a quarter of urban incomes as late as the early 1990s. See Benjamin et al. (2008).
we noted in the context of discussion of Table 2.7, lower land inequality in the north in the early
1990s reflected the relatively egalitarian redistribution of land that occurred as part of the process
of decollectivization in the late 1980s. Landlessness was minimal. Most of agriculture in the
south avoided collectivization after reunification in 1975, and landholding patterns reflect a
complex mixture of demographic, economic, and social factors at work for decades or more.
With half of all income in the rural sector in 1993 originating in farming, and off-farm labour
markets incomplete at this time, access to land played a particularly important role in the
income-determination process.
Differences in land endowments were likely reinforced by a much more unequal access
to education in the south compared to the north that is reflected in our inequality measures of
educational attainment of the labour force (see Table 2.9). In the south, household landholdings
and educational attainment are highly positively correlated. In agriculture, as well as in family-
run businesses, human capital played an important role in a household’s ability to deal with an
increasingly complex, and rapidly changing local economy.
2.8.2.2 Inequality trends
We observe a significant reduction in inequality between 1993 and 2006 from a Gini
coefficient of 0.45 to 0.38 for all of Vietnam. All of this occurs between 1993 and 2002 with a
significant reduction in inequality in the north as well as the south, in urban as well as rural, with
one exception: the rural north, in which the level of inequality in 2006 is within measurement
error of its level in 1993. After 2002, the levels of inequality, as expressed by the Gini
coefficient, remain more or less the same in all regions.
The overall experience for Vietnam is in sharp contrast to China, where inequality fell
slightly very early in the reform, a product of the more rapid growth in the rural sector compared
to urban, but then rose sharply in the aggregate, as well as in urban and rural areas separately
(Benjamin et al., 2008; Ravallion and Chen, 2007).42 By 2006, the Gini for per capita household
42 Several caveats are in order in making this comparison. First, inequality in China in the late 1970s starts from a much lower base than it does in Vietnam. Second, China’s population is nearly twenty times that of Vietnam, with enormous heterogeneity. Most of China’s provinces are the size of Vietnam. Third, official Chinese estimates invariably underestimate the true level of inequality, and the increase.
incomes in China was probably between 0.45 and 0.50, which is higher than contemporaneous
estimates for Vietnam.
The Lorenz curve provides a non-parametric representation of the income distribution,
and has several advantages relative to the Gini coefficient for analysing changes in inequality.
First, it allows us to examine changes in inequality throughout the entire distribution
rather than on the basis of a single summary statistic. Second, inequality orderings
based on Lorenz curves are robust to the choice of inequality index. This means that for a large set of
inequality indices, e.g., the Gini coefficient, Atkinson index, variance of the log, and so on, each
will produce the same inequality ranking so long as the Lorenz curves can be ranked. The
primary drawback to using Lorenz curves is that they do not always provide a complete ranking.
When Lorenz curves cross, we cannot unambiguously say whether inequality has increased or
decreased, whereas inequality indices would still rank the two distributions, albeit perhaps
differently depending on how they are constructed.
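The relationship between Lorenz curves and the Gini coefficient can be sketched as follows. This is a minimal illustration with made-up, unweighted incomes; the estimates reported in this chapter come from survey data and would normally use sampling weights, which are omitted here. The function names are hypothetical. The last pair of distributions shows the incomplete-ranking problem: their Lorenz curves cross, so neither dominates the other.

```python
def lorenz_points(incomes):
    """Cumulative income shares at population shares 1/n, 2/n, ..., 1."""
    ordered = sorted(incomes)
    total = sum(ordered)
    shares, running = [], 0.0
    for y in ordered:
        running += y
        shares.append(running / total)
    return shares


def gini(incomes):
    """Unweighted Gini coefficient from the sorted-income formula."""
    ordered = sorted(incomes)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum((i + 1) * y for i, y in enumerate(ordered))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def lorenz_dominates(a, b, eps=1e-12):
    """True if the Lorenz curve of a is never below that of b and is
    strictly above it somewhere. Equal sample sizes are assumed so the
    curves can be compared point by point."""
    la, lb = lorenz_points(a), lorenz_points(b)
    if len(la) != len(lb):
        raise ValueError("this sketch compares samples of equal size only")
    never_below = all(x >= y - eps for x, y in zip(la, lb))
    somewhere_above = any(x > y + eps for x, y in zip(la, lb))
    return never_below and somewhere_above


# Hypothetical distributions (illustrative values only).
more_equal = [2.0, 3.0, 5.0]
less_equal = [1.0, 2.0, 7.0]
ranked = lorenz_dominates(more_equal, less_equal)

# Two distributions whose Lorenz curves cross: no unambiguous ranking.
c, d = [1.0, 4.0, 5.0], [2.0, 2.0, 6.0]
crossing = not lorenz_dominates(c, d) and not lorenz_dominates(d, c)
```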
Figures 2.3 and 2.4 show the Lorenz curves for rural Vietnam for each survey year.
Figure 2.3 shows the Lorenz curve for the entire income distribution, while Figure 2.4 is a
scaled-up version of the Lorenz curve for the bottom 40 percent of the income distribution. The
patterns demonstrated by the Lorenz curves confirm the pattern observed over time for the Gini
coefficients: income inequality has been decreasing over time, as represented by an inward shift
of the Lorenz curve over time (the more equal the distribution of income the closer the Lorenz
curve is to the 45 degree line), but after 2002 shows no further change. Furthermore, there are no
evident crossings of the Lorenz curves. Hence, based on visual inspection, we are able to confirm
that the previous results reported using Gini coefficients are likely robust to a large number of
possible inequality indices. In other words, the reported drop in inequality is not an artefact of
how the Gini coefficient summarizes the income distribution.
Figure 2.3 Lorenz curves for rural income per capita, 1993-2006
[Figure: Lorenz curve (vertical axis, 0 to 1) plotted against cumulative population share (horizontal axis, 0 to 1), with one curve for each of 1993, 1998, 2002, 2004 and 2006.]
Figure 2.4 Lorenz curves for rural income per capita, 1993-2006, for the bottom of the income distribution
[Figure: Lorenz curve (vertical axis, 0 to .2) plotted against cumulative population share (horizontal axis, 0 to .4), with one curve for each of 1993, 1998, 2002, 2004 and 2006.]
Figures 2.5 and 2.6 do the same thing for urban Vietnam. Again, the pattern is similar to
that demonstrated by the Gini coefficients: urban income inequality fell between 1993 and 2002
and then remained relatively constant between 2002 and 2006. Furthermore, similar to the
shifting of the Lorenz curves for rural Vietnam, there are no obvious crossings of the Lorenz
curves, confirming the robustness of the pattern implied using the Gini coefficients.
Figure 2.5 Lorenz curves for urban income per capita, 1993-2006
[Figure: Lorenz curve (vertical axis, 0 to 1) plotted against cumulative population share (horizontal axis, 0 to 1), with one curve for each of 1993, 1998, 2002, 2004 and 2006.]
Figure 2.6 Lorenz curves for urban income per capita, 1993-2006, for the bottom of the income distribution
[Figure: Lorenz curve (vertical axis, 0 to .2) plotted against cumulative population share (horizontal axis, 0 to .4), with one curve for each of 1993, 1998, 2002, 2004 and 2006.]
2.8.2.3 Comparison of consumption and income inequality
As most of the existing literature uses consumption data to report inequality in Vietnam,
we also create consumption estimates and present a comparison of Gini coefficients based on our
consumption estimates with our income estimates in Table 2.15. Our estimates of inequality
using per capita consumption show trends similar to much of the existing literature. Per capita
consumption inequality is found to have increased in Vietnam between 1993 and 2004, but then
decreased slightly in 2006. This hides important differences between rural and urban areas. In
rural Vietnam, consumption-based inequality was relatively stable between 1993 and 2002,
before increasing moderately between 2002 and 2006. In urban Vietnam consumption inequality
increased between 1993 and 2002 and then fell below the 1993 level by 2006. Note, however,
that given the smaller sample size in urban areas, none of the changes in consumption inequality
are statistically significant.
As the focus of our paper is on income, not consumption, this is the last time we will
discuss consumption. As noted earlier, we have concerns about the construction of the
consumption estimates, in particular the estimation of the imputed value of services derived from
housing. However, we leave this for future work.
2.8.2.4 Role of spatial differences
Changes in disparities between locales are often viewed as an important contributor to
changes in overall inequality. This usually includes differences between regions, as well as
urban-rural differences. We have already noted the existence of differences between the north
and south of Vietnam as reflected in a comparison of mean per capita incomes. In this section we
take this analysis further by documenting the contribution to overall inequality made by
disparities between regions, as well as by differences in incomes between the urban and rural
populations.
The Gini coefficient is not easily decomposed in this manner and thus we need an
alternative measure of inequality. We follow the approach taken by Benjamin, Brandt and Giles
(2005), decomposing the variance of log income, which serves as our inequality index. This entails estimating the
following equation:
$$\ln y_i = \gamma' D_L + u_i$$
where $y_i$ is the per capita income of individual $i$, $D_L$ is a vector of dummy variables indicating the location of individual $i$ (e.g. north-south,
region (8 in total), or province), and $u_i$ is the residual. Controlling for the role of location differences in mean incomes,
we also look at the contribution of urban–rural income differences to overall inequality: how
much do mean differences of income between urban and rural households contribute to overall
inequality? We do this by including in the spatial decomposition regression: location dummies,
an urban indicator, and interactions between the location dummies and the urban indicator. The
interactions allow the urban–rural gap to vary across locations. To help in the interpretation,
recall from Table 2.10 that in 1993 the urban-rural income ratio was approximately 2. The R-
squared from these regressions indicates the proportion of inequality that is explained by the
location dummy variables, plus urban-rural differences. The remaining variation is the within-
location proportion of inequality.
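The logic of this decomposition can be sketched with a small example. It relies on a standard property of least squares: when the regressors are a full set of group dummies, the fitted values are the group means of log income, so the R-squared equals between-group variance divided by total variance. The data, the function name and the cell labels below are hypothetical, and the calculation is unweighted, unlike estimates from the survey data.

```python
import math
from collections import defaultdict


def between_group_share(incomes, groups):
    """R-squared from regressing log income on a full set of group dummies.

    Computed directly as between-group sum of squares over total sum of
    squares of log income, which is what that regression would return.
    """
    logs = [math.log(y) for y in incomes]
    grand_mean = sum(logs) / len(logs)
    by_group = defaultdict(list)
    for value, g in zip(logs, groups):
        by_group[g].append(value)
    total_ss = sum((v - grand_mean) ** 2 for v in logs)
    between_ss = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in by_group.values()
    )
    return between_ss / total_ss


# Hypothetical incomes and locations (illustrative only, not survey data).
incomes = [100.0, 200.0, 400.0, 800.0, 150.0, 300.0]
north_south = ["north", "north", "south", "south", "north", "south"]
urban = ["rural", "rural", "rural", "urban", "urban", "urban"]

r2_location = between_group_share(incomes, north_south)

# Location dummies, an urban indicator and their interactions span the same
# space as one dummy per location-by-urban cell.
cells = [f"{loc}-{u}" for loc, u in zip(north_south, urban)]
r2_location_urban = between_group_share(incomes, cells)
```

Because the location-by-urban cells are a refinement of the location groups, `r2_location_urban` can never be smaller than `r2_location`; the gap between them is the additional share attributed to urban-rural differences.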
Table 2.16 reports the results of the spatial decompositions for all of Vietnam and for
rural and urban separately. Care must be taken in our interpretation of these numbers because of
the changes that occurred in the sampling structure of the survey in 1998 and again in 2002 (see
the discussion in Section 2.5). At a minimum, the 2002, 2004 and 2006 estimates are highly
comparable, subject possibly to the caveat that the number of observations per locale is smaller
in 2004 and 2006. This can lead to a slightly higher R-squared and convey the impression that
the role of differences between locales in overall inequality is rising, when in fact it is
falling.
Table 2.16 Contribution of location to income inequality and number of households per location
                                1993    1998    2002    2004    2006
Contribution to variance
All Vietnam
  Contribution of North/South   0.000   0.024   0.038   0.022   0.019
  Contribution of region        0.041   0.073   0.128   0.093   0.078
  Contribution of province      0.167   0.198   0.230   0.183   0.161
Rural Vietnam
  Contribution of North/South   0.004   0.014   0.035   0.013   0.024
  Contribution of region        0.043   0.039   0.113   0.060   0.073
  Contribution of province      0.123   0.143   0.170   0.112   0.131
Urban Vietnam
  Contribution of North/South   0.003   0.001   0.001   0.000   0.008
  Contribution of region        0.084   0.050   0.092   0.078   0.057
  Contribution of province      0.214   0.229   0.211   0.176   0.147
All Vietnam
  Urban/Rural                   0.061   0.088   0.149   0.141   0.127
  North/South                   0.000   0.024   0.038   0.022   0.019
  North/South + Urban/Rural     0.065   0.108   0.172   0.149   0.145
  Region                        0.041   0.073   0.128   0.093   0.078
  Region + Urban/Rural          0.111   0.127   0.241   0.197   0.187
  Province                      0.167   0.198   0.230   0.183   0.161
  Province + Urban/Rural        0.198   0.241   0.303   0.255   0.246
With these limitations in mind, several important findings emerge. First, location plays a
relatively small role in overall inequality. North-south differences contribute no more than a few
percent. Regional and provincial effects are more pronounced; however, provincial differences
still account for less than a quarter. Far more important are differences among households
within provinces. Second, urban-rural differences, which were around a factor of two in the early
1990s, also explain a relatively small percentage of overall inequality. Combined, locale plus
urban-rural explains about 10 percentage points more of the inequality than locale does alone.
Overall, the results from the decomposition point to the significant contribution of
differences within urban areas and the countryside to inequality in Vietnam over this period.
They also suggest that it is here, namely, within the rural and urban sectors, that we need to look
for clues as to the reasons for the behaviour of inequality. To this we now turn with our
discussion of inequality decomposed by source of income.
2.8.2.5 Decompositions by source of income
In this section we present information on the composition of income by income quartiles
to provide some insight into which households might be likely to gain from changes in a
particular source of income. We then use Shorrocks decompositions to examine what share of
inequality can be attributed to different income sources. This will help to shed light on the
changes in inequality within urban and rural regions.
Why did income inequality fall in both urban and rural Vietnam and then level off?
Endowments, as well as the institutions that map household endowments into family income are
important here. Earlier, we discussed the behaviour of two key endowments: the distribution of
land, which figures prominently in the rural economy, and human capital, which matters to both
the urban and rural sectors. Recall that land inequality was rising over this period, while
differences amongst the labour force in educational attainment, our measure of human capital,
were falling in nearly every dimension, while average educational attainment was rising.
We do not have income data disaggregated on the basis of the returns to individual
factors, e.g. land, human capital, physical capital, nor do we know much about the rapidly
evolving market and non-market institutions mapping endowments into incomes. However, we
do have detailed data by sources of income, which we can put to use. Our limited objective here
is to sketch some of the correlates of inequality, particularly those related to the composition of
household income. The key tools in our analysis are descriptive statistics of the structure of
income across the income distribution, and Shorrocks (1982, 1983) decompositions.
2.8.2.5.1 Composition of income by income quartiles
We begin by examining the composition of income by income quartiles for 1993 and
2006. We carry out this exercise separately for urban and rural areas in Figures 2.7 and 2.8.43
The aim of this exercise is to provide a sense of which households, those at the bottom, in the
middle or at the top of the distribution, are likely to gain the most from growth in a particular
source of income on the basis of the initial structure of incomes. For example, if income from
family-run businesses is highly concentrated amongst high income households, we might expect
rapid growth in income from this source to benefit this group most, and thus, all else equal, for
inequality to rise. The reverse would also be true. Of course, the structure of income may
converge across income groups over time, in which case growth in any particular source of
income may have more modest effects on inequality.
Figure 2.7 Composition of income by source for rural households by income quartile in 1993 and 2006
[Figure 2.7 image not reproduced. Panel title: "Rural composition of real income by source", by income quartile (1 to 4) and year (1993, 2006). Vertical axis: percent, 0 to 100. Sources shown: Crop, Sidelines, Business, Wages, Remittances, Other.]
43 Tables 11 and 12 provide the complementary information on growth rates by source of income for urban and rural
households, respectively.
The upper panel of Figure 2.7 reveals marked differences in the composition of income
across income quartiles in rural Vietnam early in the reform process. Incomes are rather
narrowly concentrated in a few activities at the low end of the income distribution, but the
structure of incomes becomes much more diversified as incomes rise. For households in the
bottom quartile in 1993, cropping income accounts for over 70 percent of total household
income, with wages making up 20 percent. Across the quartiles, the share of income earned from
crops falls steadily, and for households in the highest income quartile, crop income accounts for
only about 35 percent of household income. Some of this is offset by an increase in the role of
farm sidelines, especially by households in the middle quartiles. Far more important at the upper
end of the distribution is income from family-run businesses, which was the source of more than
a third of income for the richest quartile. The role of wage income, measured as a share of total
income, is very similar through the first three income quartiles, but then drops slightly in the
highest income group.
Figure 2.8 Composition of income by source for urban households by income quartile in 1993 and 2006
[Figure 2.8 image not reproduced. Panel title: "Urban composition of real income by source", by income quartile (1 to 4) and year (1993, 2006). Vertical axis: percent, 0 to 100. Sources shown: Crop, Sidelines, Business, Wages, Remittances, Other.]
This basic pattern is still present in 2006, but there are several changes to note. First,
cropping income is a much lower share of household income across all quartiles. Offsetting this
is a pronounced growth in the share of wages throughout the income distribution, including the
richest quartile. We observe a similar, albeit much smaller, increase in the role of remittances
across all quartiles, which is also correlated with the rapidly growing role of off-farm
opportunities in the rural economy. Second, the role of family-run businesses declines, and
differences across the quartiles in the share of income from this source narrow.
Third, for lower income households farm sidelines have also become more important as a share
of income.
Figure 2.8 displays the same information for urban Vietnam. Especially prominent is the
unequal role that household business income plays in urban areas. In 1993, close to 60 percent of
total income in the upper quartile was derived from household business income compared to
about 20 percent in the bottom quartile. However, by 2006 the role of income from family
businesses becomes much less important at the same time that the share of income from this
source becomes much more evenly distributed across the quartiles. A second important feature of
Figure 2.8 is the importance of wage income in 1993 for urban households in the bottom quartile.
Wage income represented over 40 percent of total income for these households, in contrast to 10
percent of total income for upper quartile households. By 2006, the difference across quartiles in
the share of income derived from wages has narrowed dramatically, but upper quartile
households still earn a lower share of income from wages than the other quartiles. Some of these
differences in the role of wage income are offset in 1993 by the more important role of
remittance income for households in the upper quartile. These differences decline
over time, but remittance income remains disequalizing in the sense that upper quartile
households benefit more from this source of income than households in other quartiles.
Perhaps the most noteworthy feature of both Figures 2.7 and 2.8 is how the structure of
income across the quartiles has become much more similar over time. In rural areas, poorer
households have been able to move into the same income generating activities as richer
households. Poorer households earn a larger share of their income from wages and agricultural
sidelines in 2006 than in 1993, making them look much more like rich households from 1993. In
contrast, the convergence in urban areas is due more to changes in the top quartile between 1993
and 2006, converging more closely with the poorer quartiles. The upper quartile of urban
households earns a much smaller fraction of total income from household businesses and a
larger fraction from wages. This helps to explain the patterns in inequality observed over time.
2.8.2.5.2 Shorrocks Decompositions
A big advantage of using income over consumption data is the ability to disaggregate
income by activity and examine the relationship of overall inequality to inequality by income
source. At a basic level, this is what Figures 2.7 and 2.8 show. We now formalize this
decomposition using Shorrocks decompositions. The Shorrocks decompositions tell us the
proportion of total inequality that can be attributed to a particular income source. The
decompositions are primarily a descriptive tool and are not meant to illustrate the impact of
an increase in the inequality of a particular source of income without further specifying the nature
of the change in inequality of that income source.
With this limitation in mind, we begin the discussion with a simple outline of the
procedure. Consider a decomposition of household i’s income according to K income generating
activities:
\[ y_i = \sum_{k=1}^{K} y_{ik}. \]
Mean income can be written as the sum of mean income from each income source. Then a 1
percent increase in income from source k will lead to a Wk percent increase in total income,
where Wk is the share of income from source k. The decomposition of inequality is designed in a
similar manner. We wish to estimate Sk, the proportion of inequality attributable to the inequality
of income source k:
\[ I(Y) = \sum_{k=1}^{K} S_k \, I(Y_k) \]
where I(Y) is the inequality index for total income Y, and I(Yk) is the inequality index for income
source k. Shorrocks demonstrated that for any additively decomposable inequality index, Sk can
be estimated by:
\[ \hat{S}_k = \frac{\operatorname{cov}(y_{ik}, y_i)}{\operatorname{var}(y_i)}. \]
Thus, Sk captures the degree to which income source k is correlated with total income. If an
income source is earned primarily by the rich, then Sk will be relatively large. If it is earned
primarily by the poor then Sk will be relatively small.
We next turn to how one should interpret estimates of Sk. Unlike the share of income
earned from a particular source, Wk, Sk can be negative.44 Thus, one interesting benchmark is 0.
If Sk is negative then the income source is disproportionately earned by the poor and a marginal
increase in that income source, maintaining the same correlation with total income, would
decrease overall inequality. In practice, very few sources of income will have a negative value of
Sk. A second helpful benchmark is the share of income earned from that activity, Wk. If Sk>Wk
then income source k contributes more to inequality than it does to mean income, which we
define as a disproportionate effect on inequality.
As a matter of computation Sk can be computed using the following regression:
\[ y_{ik} = \alpha_k + \beta_k y_i + u_{ik}. \]
This regression formulation, in which the OLS estimate of βk is the estimate of Sk, makes clearer
the interpretation of Sk. We are only estimating the
correlation of a particular source of income with total income. Moreover, the regression
formulation illustrates the important role that measurement error will have on our estimates.
Overestimates of income from a particular source of income will cause us to overestimate the
“true” value of βk. This will lead to a corresponding underestimate of the true value of βk for
other sources of income. We want to highlight the implication for our analysis. Our
estimates of Sk are only correlations of measured total income and measured income by source.
They are not consistent estimates of the correlation between true total income and true income
by source. This limitation, which plagues all such decompositions, should be kept in mind when
interpreting the estimates.
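The equivalence between the covariance formula and the regression slope can be checked numerically. The following is a minimal Python sketch on simulated data, not the code used for the estimates in this chapter. The function names `shorrocks_shares` and `ols_slope`, the three income sources, and their distributions are illustrative assumptions, and survey weights are ignored.

```python
import numpy as np

def shorrocks_shares(sources):
    """Return (W, S) for an (n_households, K) array of income by source.

    W[k] is the share of mean income from source k.
    S[k] is cov(y_ik, y_i) / var(y_i), the share of inequality attributed to source k.
    """
    sources = np.asarray(sources, dtype=float)
    total = sources.sum(axis=1)                      # y_i = sum over k of y_ik
    W = sources.mean(axis=0) / total.mean()
    centered_total = total - total.mean()
    centered_sources = sources - sources.mean(axis=0)
    cov = (centered_sources * centered_total[:, None]).mean(axis=0)
    S = cov / total.var()                            # same divisor (n) in cov and var
    return W, S

def ols_slope(y_k, y):
    """Slope from the regression y_ik = alpha_k + beta_k * y_i + u_ik."""
    X = np.column_stack([np.ones_like(y), y])
    coef, *_ = np.linalg.lstsq(X, y_k, rcond=None)
    return coef[1]

# Simulated incomes (assumption): three sources with different dispersion.
rng = np.random.default_rng(0)
n = 1000
incomes = np.column_stack([
    rng.gamma(2.0, 1.0, n),      # hypothetical "crop" income
    rng.lognormal(0.0, 1.0, n),  # hypothetical "business" income
    rng.gamma(1.5, 1.0, n),      # hypothetical "wage" income
])

W, S = shorrocks_shares(incomes)
total_income = incomes.sum(axis=1)
betas = np.array([ols_slope(incomes[:, k], total_income) for k in range(incomes.shape[1])])
print("W:", W.round(3), "S:", S.round(3), "beta:", betas.round(3))
```

Because the sources sum to total income, the estimated Sk sum to one. An overestimate for one source is therefore necessarily matched by underestimates for the others, which is the measurement-error point made above.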
In Table 2.17 we present the results separately for rural and urban areas, disaggregated by
north and south. We begin our discussion of the Shorrocks decompositions for rural areas. For
1993, a key observation is that crop income is relatively equalizing, that is, it contributes less to
inequality than its share of total income (20.0 percent versus 44.3 percent). This lines up with the
intuition provided by Figure 2.5, which showed that the share of income from cropping is strongly
negatively correlated with household incomes. This same pattern applies to agricultural sidelines.
Hence, primary sector income as a whole is relatively equalizing, contributing 55.3 percent to
total income, but only 28.2 percent of inequality. Wage incomes are also relatively equalizing,
44 Of course if one source of income is, on average, negative, then Wk can also be negative but this is very unlikely in practice.
but in 1993 are a relatively modest portion of total incomes. By comparison, household business
income is extremely disequalizing.45 Again, this is consistent with Figure 2.5, and the much
larger role it portrays for family-business income amongst the top quartile of households. It alone
accounts for almost half of overall rural inequality. Remittances are also disequalizing,
accounting for 10.3 percent of overall inequality despite representing only 5.2 percent of overall
income.
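The two benchmarks for Sk (zero, and the income share Wk) can be applied mechanically to the rural 1993 figures from Table 2.17. The Python sketch below does this. The values are copied from the table, while the function name `classify` and the label wording are illustrative assumptions.

```python
# Rural Vietnam, 1993: (income share W_k, inequality share S_k) from Table 2.17.
rural_1993 = {
    "Crop": (0.443, 0.20016),
    "Sidelines": (0.10968, 0.08247),
    "Family Business": (0.21755, 0.49264),
    "Wages": (0.11881, 0.05395),
    "Remittances": (0.0524, 0.10344),
    "Other Income": (0.05855, 0.06733),
}

def classify(W_k, S_k):
    """Label a source using the two benchmarks: S_k < 0 and S_k > W_k."""
    if S_k < 0:
        return "negative contribution to inequality"
    if S_k > W_k:
        return "relatively disequalizing"
    return "relatively equalizing"

labels = {name: classify(W_k, S_k) for name, (W_k, S_k) in rural_1993.items()}

# Primary sector = crops + sidelines.
primary_W = rural_1993["Crop"][0] + rural_1993["Sidelines"][0]
primary_S = rural_1993["Crop"][1] + rural_1993["Sidelines"][1]

for name, label in labels.items():
    print(f"{name}: {label}")
print(f"Primary sector: W = {primary_W:.3f}, S = {primary_S:.3f}")
```

Under these benchmarks crops, sidelines and wages fall in the relatively equalizing group, while family business, remittances and other income fall in the relatively disequalizing group.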
As we have seen in other dimensions, there are differences between the rural north and
south in terms of those activities that are most responsible for the inequality. In particular, crop
income, while still relatively equalizing, accounts for a much larger share of rural inequality in
the south than in the north, 25.1 versus 13.9 percent. This is likely due to higher landlessness and
greater land inequality in the south, which prevented a greater share of households from
participating in the rapid expansion in cropping income. In contrast, the share of inequality
accounted for by remittances and “other” income in the north is noticeably larger than in the
south.
The results in Table 2.17 for 1993, combined with information reported in Table 2.11 on
growth rates by source of income, are helpful in explaining the behaviour of rural inequality. In
particular, throughout the 1990s, the acceleration in growth in agricultural incomes that
accompanied the reform of the farm sector, and rapidly expanding off-farm employment
opportunities played important roles in reducing rural inequality. As we described earlier, the
decline in inequality in rural Vietnam is coming almost exclusively from the reduction amongst
rural households in the south. In the north, inequality remained the same, albeit at a much lower
level than in the south. In the north, growth in cropping income was too slow to offset the
disequalizing influence of the growth in family businesses and, to a much lesser extent,
remittances. Wage growth in the north worked to the benefit of lower income households, but
the relatively low share of wages in incomes in 1993 (7.6 percent) limited any impact it could exert on
inequality. In contrast, in the south, rapid growth in cropping income and wage growth
45 The disequalizing impact of household business income may be slightly overstated. Household businesses generate income
both for the household and for hired employees. This second stream of income is included under wage income, which tends to be
more evenly distributed than household business income. Hence, a larger view of the impact of household businesses on overall
inequality may lead to a lower impact than that suggested by looking at the profits (which also include the returns to the
household's own labour) accruing to the household.
disproportionately benefitted the lowest income groups. Growth in wages was, in all likelihood,
especially important for the rural landless.
In Table 2.17 we also report the results for the Shorrocks’ decomposition for 2006. These
patterns reflect the changes in rates of growth across income activities, as well as changes in the role
of these activities across households in the income distribution. Broadly speaking, the general
patterns are similar to those in 1993. Cropping income and wages still remain relatively
equalizing, and family businesses the exact opposite. However, in the case of cropping income
and family business income, their impacts have been dampened slightly as their shares of income have
fallen, and the distribution of income from these activities has become less concentrated across
the income distribution. The disequalizing role of remittances, on the other hand, has increased,
while sideline farm activities have also become disequalizing. There is also one noteworthy
difference between the north and south: crop income has a much lower correlation with overall
income in the north, 0.036, compared to 0.28 in the south. Hence, crop income has become more
evenly distributed in the north and less evenly distributed in the south.
Urban inequality in 1993 is largely shaped by the distribution of family-business income,
which was the source of 69.2 percent of overall inequality as compared to representing only 50.3
percent of total income. The other relatively disequalizing sources of income in 1993 are
remittances and gifts, 17.7 percent of overall inequality in comparison to 13.6 percent of overall
income, and also “other” income, 12.2 percent of overall inequality versus 9.9 percent of overall
income. In contrast, wage income contributes very little to overall inequality in comparison to its
share of income, 1.6 percent versus 21.9 percent, and in this regard is highly equalizing. Finally,
farming income, both crops and sidelines, makes a negligible contribution to overall inequality, as
we expect given their very small contribution to incomes.
These relationships provide clues to the factors underlying the falling inequality we
observe in urban Vietnam through 2002. Rapid growth in wage income, which was especially
important for lower income households, helped to narrow income differentials in the north as
well as the south. This was complemented by relatively slow growth in family business incomes
between 1993 and 2002 (see Table 2.14), which initially were highly concentrated among the
richest households.
Table 2.17 Shorrocks decomposition
Share = share of total income (Wk); OLS = estimated share of inequality (Sk).

                  Rural                                   Urban
                  1993                2006                1993                2006
                  Share     OLS       Share     OLS       Share     OLS       Share     OLS
Rural / Urban
Crop              0.443     0.20016   0.26973   0.18312   0.03238   -0.0086   0.03181   -0.0062
Sidelines         0.10968   0.08247   0.13253   0.15056   0.0104    0.00218   0.0301    0.05607
Family Business   0.21755   0.49264   0.1781    0.28588   0.50275   0.6918    0.32488   0.4742
Wages             0.11881   0.05395   0.25548   0.12566   0.2195    0.01582   0.38979   0.1755
Remittances       0.0524    0.10344   0.10525   0.18404   0.13563   0.17683   0.10725   0.19613
Other Income      0.05855   0.06733   0.0589    0.07073   0.09935   0.12197   0.11616   0.10429
Rural north / Urban north
Crop              0.46662   0.13911   0.25301   0.03597   0.06387   -0.009    0.02057   -0.0108
Sidelines         0.1405    0.10031   0.13623   0.13568   0.03044   0.00606   0.03448   0.07925
Family Business   0.17491   0.45135   0.17743   0.32498   0.45933   0.82584   0.2677    0.24718
Wages             0.07563   0.05074   0.24005   0.14201   0.19857   0.00587   0.41717   0.21804
Remittances       0.05402   0.15582   0.10873   0.23549   0.10896   0.09244   0.09348   0.36627
Other Income      0.08832   0.10266   0.08455   0.12586   0.13883   0.0788    0.16661   0.10006
Rural south / Urban south
Crop              0.41561   0.2509    0.28547   0.27991   0.01976   -0.0069   0.03893   -0.0033
Sidelines         0.07393   0.0702    0.12906   0.16281   0.00238   0.00229   0.02733   0.04433
Family Business   0.26701   0.52356   0.17873   0.26151   0.52014   0.65534   0.36113   0.59135
Wages             0.16891   0.05329   0.26999   0.10517   0.22788   0.01523   0.37244   0.15192
Remittances       0.05052   0.06118   0.10197   0.15114   0.14631   0.19802   0.11598   0.11259
Other Income      0.02402   0.04088   0.03478   0.03946   0.08353   0.13607   0.08419   0.10309
After 2002, inequality fails to decline any further in either the north or the south. Several
factors appear to be potentially important. First, wage growth, especially in the south, slowed
appreciably after 2002, while wages became much less equalizing in both regions (see Table
2.12). Slower growth in wage earnings among households in the lower income quartiles is
consistent with both of these observations. Second, in the north, remittance incomes also became
highly disequalizing, and by 2006 were the source of a third of the inequality. This may also be
tied to important differences amongst households throughout the quartiles in terms of their
ability to access off-farm labour markets. And third, after 2002 growth in family-business
income, the most disequalizing source of urban incomes, recovers. Most of this influence is
coming through its effect on inequality in the south. In the north, family business incomes are
much less important, and much more equally distributed, than they were in 1993.
In summary, the main contributors to measured inequality are household businesses, in
both rural and urban areas, followed by remittances. Both sources of income contribute more to
overall inequality than to overall income. In the rural sector, rapid growth in the more equally
distributed income from farming and wages helped to offset the influence of family business
income on overall inequality. As a result, income growth in the lower end of the income
distribution outstripped that at the top, and inequality fell. Wages performed a similar role in the
urban areas, and played a prominent role in reducing urban inequality up through 2002. After
2002, the equalizing role of wages became much less pronounced, while other sources of income
such as remittances began to exert a more pronounced influence on the distribution of incomes.
Growth no longer disproportionately benefited lower income households, and thus, the level of
inequality in 2006 remained very similar to that in 2002.
2.9 Robustness checks
There are reasons to be concerned about the comparability of the surveys between 1998
and 2002. Changes to the sampling framework between 1998 and 2002 are associated with
unusual movements in the share of households participating in various income generating
activities. There are a number of possible explanations for these unusual trends. First, the
summary statistics generated from the household surveys could be indicative of true rapid
changes in the structure of income between 1998 and 2002. Second, the expansion of the number
of communes in which households were surveyed could have dramatically altered the
composition of communes included in the surveys, despite the random nature in which
communes were chosen. Third, changes in the actual questionnaire could have influenced how
households reported various activities. Unfortunately, given the data available we are only able
to test the plausibility of the second reason listed above.
There are 70 communes that appear in both the 1998 VLSS and 2002 VHLSS. We can
compare various summary statistics for these “overlapping” communes with those for “non-
overlapping communes” (i.e., communes that appear in the 2002 VHLSS but not in the 1998
VLSS) to test if the two sets of communes represent different subpopulations. We begin by
comparing the means of household size, total per capita income, and per capita income by
activity between households in the two sets of communes in the 2002 VHLSS. Table 2.18 reports
the mean for each variable within the non-overlapping commune sample, the difference in means
between the two sets of communes, the percentage difference in means, and the t-statistics for
the null hypothesis of equal means. We see mixed evidence as to whether or not the two samples
represent the same population. For rural households we observe no statistically significant
differences in means except for per capita income from remittances and gifts and from other
sources. Similarly, for urban households only two of the variables examined display statistically
significant differences in means between the two samples. Nonetheless, some of the
percentage differences in means are particularly large.
In Table 2.19 we repeat the same analysis, except based on the share of households that
report receiving income from each activity. For both rural and urban households we find no
statistically significant differences in the share of households reporting positive income by
activity between the two sets of communes. Hence, most of the differences observed in Table
2.18 are coming from differences in earnings per activity per active household.
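The three statistics reported in Tables 2.18 and 2.19 (difference, percentage difference relative to the non-overlapping sample, and t-statistic) can be illustrated with a short Python sketch. This is a simplified version under stated assumptions: it uses an unweighted Welch t-statistic, whereas the estimates in the tables may account for survey weights and clustering, and the function name `compare_means` and the toy data are hypothetical. As a check on the definition, 58.9 / 4764.2 is about 1.24 percent, the figure reported for rural total per capita income.

```python
import numpy as np

def compare_means(overlap, non_overlap):
    """Return (difference, percentage difference, Welch t-statistic).

    The difference is overlapping minus non-overlapping, and the percentage
    difference is expressed relative to the non-overlapping mean.
    """
    overlap = np.asarray(overlap, dtype=float)
    non_overlap = np.asarray(non_overlap, dtype=float)
    diff = overlap.mean() - non_overlap.mean()
    pct = 100.0 * diff / non_overlap.mean()
    # Standard error without assuming equal variances in the two samples.
    se = np.sqrt(overlap.var(ddof=1) / overlap.size
                 + non_overlap.var(ddof=1) / non_overlap.size)
    return diff, pct, diff / se

# Toy household sizes (assumption), not survey data.
overlap = [6, 5, 7, 6]
non_overlap = [5, 5, 5, 4, 6]
diff, pct, t_stat = compare_means(overlap, non_overlap)
print(f"difference = {diff:.2f}, percent = {pct:.1f}, t = {t_stat:.2f}")
```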
Table 2.18 Comparison of mean income in the 2002 VHLSS between households in communes that also appear in the 1998 VLSS and households in communes that only appear in the 2002 VHLSS
Columns: (1) Mean for non-overlapping commune sample; (2) Difference in means between the overlapping and non-overlapping communes; (3) Percentage difference in means between the overlapping and non-overlapping communes; (4) t-statistic of null hypothesis of equal means.

                                                (1)       (2)       (3)       (4)
Rural Vietnam
Household size                                  5.17      0.20      3.95      1.41
Total per capita income                         4764.2    58.9      1.24      0.14
Per capita income from agriculture              1411.5    288.9     20.47     1.35
Per capita income from ag. sidelines            632.6     -88.2     -13.95    -1.13
Per capita income from household business       917.5     118.1     12.88     0.31
Per capita income from wages                    1126.7    -34.0     -3.02     -0.19
Per capita income from remittances and gifts    444.9     -153.8    -34.57    -2.59
Per capita income from other sources            231.0     -72.1     -31.20    -2.54
Urban Vietnam
Household size                                  4.94      -0.10     -2.12     -0.43
Total per capita income                         9209.8    -935.3    -10.16    -0.98
Per capita income from agriculture              317.5     -179.4    -56.50    -2.50
Per capita income from ag. sidelines            284.9     238.3     83.64     0.81
Per capita income from household business       3217.9    -1099.4   -34.16    -3.42
Per capita income from wages                    3589.9    -50.7     -1.41     -0.07
Per capita income from remittances and gifts    1010.5    178.7     17.69     0.44
Per capita income from other sources            789.2     -22.8     -2.89     -0.12
Note: “Non-overlapping commune sample” refers to households from communes that appear in the 2002 VHLSS but not in the 1998 VLSS. The difference in means should be interpreted relative to the mean for the non-overlapping commune sample. Thus, 0.20 means that the mean household size is 0.20 persons greater in the overlapping sample of households than in the non-overlapping sample.
Table 2.19 Comparison of the households reporting non-zero income in the 2002 VHLSS between households in communes that also appear in the 1998 VLSS and households in communes that only appear in the 2002 VHLSS
Columns: (1) Share for non-overlapping commune sample; (2) Difference in shares between the overlapping and non-overlapping communes; (3) Percentage difference in shares between the overlapping and non-overlapping communes; (4) t-statistic of null hypothesis of equal means.

                                                (1)       (2)       (3)       (4)
Rural Vietnam
Per capita income from agriculture              0.844     0.030     3.59      0.83
Per capita income from ag. sidelines            0.813     -0.017    -2.03     -0.37
Per capita income from household business       0.354     -0.015    -4.24     -0.30
Per capita income from wages                    0.559     -0.027    -4.92     -0.58
Per capita income from remittances and gifts    0.803     0.035     4.37      1.00
Per capita income from other sources            0.437     -0.019    -4.39     -0.35
Urban Vietnam
Per capita income from agriculture              0.226     -0.089    -39.22    -1.64
Per capita income from ag. sidelines            0.243     -0.072    -29.51    -1.31
Per capita income from household business       0.549     0.031     5.72      0.60
Per capita income from wages                    0.712     -0.001    -0.20     -0.03
Per capita income from remittances and gifts    0.787     0.028     3.53      0.51
Per capita income from other sources            0.449     0.070     15.66     1.06
Note: “Non-overlapping commune sample” refers to households from communes that appear in the 2002 VHLSS but not in the 1998 VLSS. The difference in shares should be interpreted relative to the share for the non-overlapping commune sample. Thus, 0.030 means that the share of rural households reporting income from agriculture is 0.030 greater in the overlapping sample of households than in the non-overlapping sample.
As a final check we compare inequality estimates between households in the two sets of
communes in Table 2.20. Similar to the results in Table 2.18, we find mixed evidence on
whether or not the two sets of communes are equally representative. The Gini coefficients for
overall Vietnam are very similar, differing by only 0.011, and they are not statistically different.
However, the similarity of the overall Gini coefficients hides important differences within rural
and urban areas. In rural areas, the Gini coefficient is 0.025 points higher in the overlap set of
communes. Again though, this is not a statistically significant difference. However, the
difference in the Gini coefficient is even larger in urban areas. It is 0.071 points higher in the
non-overlapping set of communes and it is a statistically significant difference. It is perhaps not
surprising that we observe a larger difference in urban areas than in rural areas, given the change
in the sampling framework described previously. Recall that the 1998 VLSS was stratified into
ten strata, three of which were urban areas. By comparison the 2002 VHLSS was stratified into
122 strata, 61 of which were urban areas. The difference in sampling strategies meant that urban
households were only interviewed in 29 out of 61 provinces in the 1998 VLSS as compared to all
61 provinces in the 2002 VHLSS. Hence, despite the random selection of communes within
strata, it is likely that the 2002 sampling framework produced a more nationally representative
selection of households, particularly in urban areas.
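For readers who wish to reproduce comparisons of this kind, the Python sketch below computes a Gini coefficient with a percentile bootstrap confidence interval and checks whether two intervals overlap. It rests on assumptions: the method used for the intervals in Table 2.20 is not restated here, the bootstrap ignores survey weights and the clustered design, and the names `gini`, `bootstrap_ci` and `intervals_overlap` are hypothetical. Disjoint intervals are only a conservative indication of a statistically significant difference.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of a one-dimensional array of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def bootstrap_ci(incomes, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the Gini (simple random resampling)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(incomes, dtype=float)
    draws = [gini(rng.choice(x, size=x.size, replace=True)) for _ in range(n_boot)]
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return lower, upper

def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

# Simulated incomes (assumption), to show the two functions working together.
sample = np.random.default_rng(2).lognormal(0.0, 0.7, size=500)
lower, upper = bootstrap_ci(sample)
print(f"Gini = {gini(sample):.3f}, interval = ({lower:.3f}, {upper:.3f})")

# Intervals reported in Table 2.20 (1998-and-2002 communes vs. 2002-only communes).
urban_disjoint = not intervals_overlap((0.291, 0.368), (0.376, 0.415))
rural_disjoint = not intervals_overlap((0.339, 0.486), (0.355, 0.373))
print("urban intervals disjoint:", urban_disjoint)
print("rural intervals disjoint:", rural_disjoint)
```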
Table 2.20 Gini coefficients from the 2002 VHLSS for households from "overlapping" and "non-overlapping" communes
                                               Confidence interval
Sample                       Gini coefficient  Lower bound  Upper bound
Urban & Rural
  Communes in 1998 and 2002  0.392             0.358        0.435
  Communes in 2002 only      0.403             0.395        0.416
Rural only
  Communes in 1998 and 2002  0.388             0.339        0.486
  Communes in 2002 only      0.363             0.355        0.373
Urban only
  Communes in 1998 and 2002  0.322             0.291        0.368
  Communes in 2002 only      0.393             0.376        0.415
Note, however, that the difference in urban inequality between the two samples of
communes reported in Table 2.20 does not explain away the fall in urban inequality shown in
Table 2.15. Recall that our estimates point to a fall in urban inequality between 1998 and 2002,
from 0.444 to 0.370 as measured by the Gini coefficient. Within the 2002 sample, however, urban
inequality is lower in the overlapping communes than in the non-overlapping ones, so the sampling
difference works against, rather than accounts for, the measured fall. Thus, if we
restricted our 2002 sample to only those communes that also appeared in the 1998 VLSS, then the
fall in urban inequality would have been even greater.
2.10 Discussion and conclusion
In this paper we have tried to present a consistent set of facts about the evolution of
Vietnam’s income distribution between 1993 and 2006. Although we have not directly linked
these changes to specific reform policies associated with Doi Moi we have nonetheless presented
a large amount of information. We believe the most important results from this paper are the
following points.
1. Income growth has been very rapid between 1993 and 2006. Over the 13-year period
between 1993 and 2006, per capita incomes in Vietnam experienced growth averaging 8.4
percent. Growth was especially rapid between 1993 and 1998, but then fell between 1998 and
2002, before recovering between 2002 and 2006. This pattern was slightly more pronounced in
the south than in the north, reflecting the region’s greater exposure to international market forces,
including the impact of the Asian Financial Crisis and ensuing downturn in global economic
activity.
2. Rural income growth was especially rapid. Overall, rural growth outstripped urban per
capita income growth, but the opposite was true in the north. Especially important to the growth
in rural incomes was the increase in wage opportunities as reflected in growth in wages and
remittances.
3. Poverty fell very quickly. Poverty levels in Vietnam were very high in 1993, as nearly
two-thirds of the population had incomes below the poverty line. Poverty was considerably
higher in the countryside compared to the cities, with 70.1 percent classified as poor in rural
areas compared to 40.8 percent in the cities. The rapid growth in incomes was accompanied by a
rapid reduction in the poverty headcount ratio that slows over time. By 2006, only 7.8 percent of
the entire population, 1.6 percent of the urban population, and 10.1 percent of the rural
population is classified as “poor”. These estimates are subject to the potential criticism that some
of the poor, and possibly a rising number in absolute terms, are missing from the household
surveys.
4. Income inequality is high initially and falls through 2002. Inequality in Vietnam early
in the reform process is very high by international standards. It is higher in the south than in the
north, and in urban areas compared to rural areas. Much of this can be attributed to institutional
factors and other differences between the two regions. Inequality declines appreciably, with this
largely occurring between 1993 and 2002. There is no further reduction after 2002. We observe a
marked reduction in urban inequality, rural inequality, as well as a narrowing in the urban-rural
gap, the latter driven by behaviour in the south.
5. Rapid wage growth and robust growth in cropping income drove down rural
inequality. In rural areas, growth of wage income, which is especially important to poorer
households, is central to the reduction in income inequality. Robust growth of incomes in
agriculture from both cropping and sidelines is also important at the beginning and end of the
period we examine. Liberalization of farm markets domestically and internationally, and the
ability of households to take advantage of these opportunities, are playing an important role in the
rapid growth in agriculture, especially in the south. Although land was much more equally
distributed in the north than in the south, significantly slower growth in the farm sector prevented
this from having an even larger impact on inequality. This helps explain the narrowing in rural
inequality between the north and south.
6. Wage income growth also helped to reduce inequality in urban areas. Rapid growth in
wages, which were the most equalizing source of income in urban areas, is largely responsible
for the reduction in urban inequality. Slower growth in incomes from family run businesses,
which were more unequal, also contributed. This shift partially reflects the changing nature of
employment relationships in urban Vietnam, and the fall in self-employment, and rise in wage
employment.
We also briefly examined two key household and individual endowments – land and
human capital (education). We find that educational attainment is becoming more evenly
distributed amongst workers. We do not formally investigate the impact that this has had on the
distribution of income, but we conjecture, and leave for future research, that growing equality in
educational attainment is contributing to the relatively equalizing role of wage earnings in the
distribution of overall income.
A final outcome from this paper is a greater understanding of some of the data issues
involved with creating consistent estimates of income and distributional measures such as
inequality or poverty. We have identified three main issues that influence the comparability of
various summary statistics across the five household surveys that are not generally addressed in
the literature. The first concern, while mundane, is nonetheless important. The regional price
deflators included with the 2002 and 2004 VHLSS household datasets are improbable and likely
deeply flawed. Since these deflators are used by all authors creating national and comparable
regional estimates of poverty and inequality, all such work will be sensitive to these regional
deflators. We create our own regional deflators for 2002 and 2004. The second concern is the
substantial change to the sampling framework between the 1998 VLSS and the 2002 VHLSS.
Many authors focus only on the 1993 to 1998 period or from 2002 onwards, but there are a
handful of papers that cover the entire time period. Unlike these papers we have examined the
sensitivity of our results to the change in the sampling framework and find mixed evidence as to
whether or not the change in the sampling framework introduced a change in how representative
the sample is of the true population. Finally, the household surveys are missing people. Relative
to population projections based on the 1999 census, the household surveys are not adequately
capturing young adults and young children. This is most likely due to household migration and
indeed we find supporting evidence using the household panels between 2002, 2004 and 2006.
The “missing” people problem is becoming more severe over time and calls into question how
indicative summary statistics are of the true underlying population. Unfortunately, this is not a
problem that is easily addressed.
2.11 Data Appendix
2.11.1 Estimation of Household Income
This section describes our methodology for estimating household income in a consistent
manner across the five household surveys. We organize the section according to the six major
income activities discussed in the body of the paper: crops, agricultural sidelines, household
businesses, wages, remittances and gifts, and “other” sources. For each income activity we
describe which sections of the household questionnaires contain the relevant questions, for both
income and, where relevant, expenditures.
2.11.1.1 Income from crops
We estimate revenue from crops according to the following steps. First, for each crop we
calculate the unit value of sold output by household. There are many households that produce a
particular crop yet do not report selling any of the output. For these households we cannot
construct a unit value of sold output. Second, we remove any unit values that are extreme outliers
as a means to prevent recording errors from skewing the results. Specifically, we remove any
unit values that are more than three standard deviations away from the median unit value, where
both the median and standard deviation are calculated across all unit values within the sample for
the particular crop. Third, for any households that reported harvesting a particular crop but do
not have an associated unit value (either because they did not report selling any of the output or
because the unit value of sold output was an extreme outlier) we calculate a unit value as the
median unit value within the commune, province, region, or the entire sample. We use the most
local median unit value that is available. Fourth, we estimate the market value of crop revenue as
the reported quantity harvested times the unit value. For some crops, information was not asked
about the quantity and value of sold output. This makes it impossible to calculate unit values and
we thus use the self-reported value of the harvest as an estimate of revenue from that particular
crop.
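The four steps above can be sketched in code for a single crop. This is a minimal illustration under stated assumptions, not the routine actually used for the estimates: the record fields (`hh`, `commune`, `province`, `region`, `harvest_qty`, `sold_qty`, `sold_value`) and the function names are hypothetical, commune identifiers are assumed to be unique across provinces, and the population standard deviation is used because the text does not specify which estimator was applied.

```python
from collections import defaultdict
from statistics import median, pstdev

LEVELS = ("commune", "province", "region")  # most local level first


def trimmed_unit_values(records):
    """Steps 1 and 2: unit value of sold output by household, outliers removed."""
    raw = {
        r["hh"]: r["sold_value"] / r["sold_qty"]
        for r in records
        if (r.get("sold_qty") or 0) > 0  # households with no sales have no unit value
    }
    if not raw:
        return {}
    centre = median(raw.values())
    spread = pstdev(raw.values())  # assumption: population s.d. over the crop sample
    # Keep unit values within three standard deviations of the median.
    return {hh: uv for hh, uv in raw.items() if abs(uv - centre) <= 3 * spread}


def crop_revenue(records):
    """Steps 3 and 4: impute missing unit values, then value the harvest."""
    kept = trimmed_unit_values(records)
    location = {r["hh"]: r for r in records}

    # Median retained unit value at each geographic level and for the whole sample.
    pooled = defaultdict(list)
    for hh, uv in kept.items():
        for level in LEVELS:
            pooled[(level, location[hh][level])].append(uv)
        pooled[("sample", None)].append(uv)
    medians = {key: median(values) for key, values in pooled.items()}

    revenue = {}
    for r in records:
        uv = kept.get(r["hh"])
        if uv is None:
            # Use the most local median that is available.
            fallbacks = [(level, r[level]) for level in LEVELS] + [("sample", None)]
            uv = next((medians[k] for k in fallbacks if k in medians), None)
        revenue[r["hh"]] = None if uv is None else r["harvest_qty"] * uv
    return revenue
```

A household that harvested a crop but sold none of it is valued at the median unit value of its commune when any neighbour reported a usable sale, and at the province, region, or sample median otherwise.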
For crop by-products there is no information on the quantity or value of sold output.
Thus, we use the self-reported value of the by-product item as an estimate of revenue.
The following subsections list the specific sections of each household questionnaire that
we use for calculating revenue and expenditures for crops, as well as any deviations from the
above procedure.
2.11.1.1.1 1993 VLSS
Crop revenue is the sum of revenue from all crops listed in Parts B1 through B6 of
Section 9 of the household questionnaire as well as crop by-products listed in Part C of Section
9. This includes paddy, other food crops, annual industrial crops, perennial industrial crops, fruit
crops, agro-forestry crops, and crop by-products.
In Section 9, Part B1, respondents could report the quantity and value of sold output in
terms of rice or paddy. We convert from rice to paddy by dividing the rice quantity by 0.66 for
observations that report the quantity of sold or bartered output in terms of rice, not paddy, since
the quantity harvested is reported in terms of kilograms of paddy.
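The conversion can be illustrated with a short sketch. The function name and the `unit` labels are hypothetical; only the 0.66 factor comes from the text.

```python
RICE_PER_PADDY = 0.66  # kilograms of rice per kilogram of paddy, as stated in the text


def to_paddy_kg(quantity_kg, unit):
    """Express a sold or bartered quantity in kilograms of paddy."""
    if unit == "paddy":
        return quantity_kg
    if unit == "rice":
        # Rice quantities are divided by 0.66 to obtain the paddy equivalent.
        return quantity_kg / RICE_PER_PADDY
    raise ValueError(f"unknown unit: {unit!r}")
```

Under this rule, 66 kilograms of rice are treated as roughly 100 kilograms of paddy, so that sold quantities are comparable with the harvest, which is recorded in paddy.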
For agro-forestry crops in Section 9 Part B6 there is no information collected on the
quantity harvested or sold. Hence, the calculation of unit values is impossible. We therefore use
the reported value of the harvest as an estimate of the revenue from each agro-forestry item.
Crop expense is the sum of annual expenditures on seed, chemical fertilizers, organic
fertilizers, insecticide or herbicide, transportation, and other expenses (storage costs; payments in
cash and in kind for services received from the cooperative or government, such as land preparation
for planting, irrigation, plant protection, and protection of land; renting animals; renting equipment
or machinery; maintenance and repairs of agricultural implements and machinery; gasoline,
electricity, and other fuels; other services; and labour costs for ploughing, planting and tending, and
harvesting). For each expenditure item, we use the self-reported bought and bartered value
during the past 12 months (see Section 9 Part D of the household questionnaire).
2.11.1.1.2 1998 VLSS
Crop revenue is the sum of revenue from all crops listed in Parts B1 through B6 and Part
C of Section 9 of the household questionnaire. This includes the following crop items: rice, other
Source: U.S. International Trade Commission. Note: Imports are general imports, and exports are FAS exports.
By 2004, the General Statistics Office (GSO) of Vietnam estimates exports to the U.S. accounted
for 20.2 percent of Vietnam’s total exports or about 13 percent of GDP.46 By comparison, in
2000, exports to the U.S. represented only 5.1 percent of total exports or 2.8 percent of GDP.
Hence, the growth in exports to the U.S. represents a quick and substantial shock to Vietnam’s
economy. At a more disaggregated level, exports soared in the 2-digit SITC category of
articles of apparel and clothing accessories. This commodity category showed an annual growth
of 276.5 percent from 2001 to 2004. Table 3.2 presents information on the value, growth, and
share of exports for Vietnam’s top seven commodity exports to the U.S. according to 2004 value.
With the exception of petroleum products, Vietnam’s top seven exports to the U.S. are all
commodities that are conventionally classified as being low-skilled labour intensive. As low-
skilled workers are more likely to be poor, this suggests the potential for the increase in exports
to have positive impacts on alleviating poverty in Vietnam through increased demand for low-
skilled labour.
Following the entry into force of the BTA, the incidence of poverty in Vietnam declined
dramatically. Between 2002 and 2004 the national poverty rate fell from 28.9 to 19.5
percent.47 While there is clearly a coincident trend in poverty alleviation and U.S. market access,
it remains an empirical question whether there is a causal connection running from the cut in
U.S. tariffs to the fall in poverty.
46 According to the GSO, exports of goods and services in 2004 were 65.74 percent of GDP.
47 There is some concern over the magnitude of the decline, in particular that the national poverty rate in 2002 may be overestimated (see Glewwe (2005)). I address this issue rigorously in Appendix A.
Table 3.2 Main commodity exports from Vietnam to the U.S.
SITC   SITC Description                                2004 Value      Annual Growth       Share of exports
Code                                                   (million USD)   2001 to 2004 (%)    to U.S. in 2004 (%)
84     Articles of apparel and clothing accessories    2571            276.5               48.7
3      Fish                                            568             5.9                 10.8
85     Footwear                                        475             53.2                9.0
82     Furniture                                       386             206.4               7.3
33     Petroleum                                       349             24.0                6.6
5      Vegetables and fruit                            184             54.2                3.5
7      Coffee and tea                                  144             17.3                2.7
Source: U.S. International Trade Commission
The paper measures the immediate short-run impacts of U.S. tariff cuts on provincial
poverty in Vietnam. Following Topalova (2007), I construct provincial measures of exposure to
the U.S. tariff cuts by weighting the tariff cuts by the pre-existing share of employment by
industry within each province. I find that provinces that were more heavily exposed to the tariff
cuts (i.e., had a greater share of workers in industries with large tariff cuts) experienced more
rapid decreases in poverty. The impact on provincial poverty rates between 2002 and 2004 is
large. An increase of one standard deviation in provincial exposure leads to a reduction in the
incidence of poverty by approximately 14 percent, although the effect diminishes the further the
province is from a major seaport. The results are robust to alternative measures of poverty,
alternative poverty lines, plausible measurement error in provincial poverty rates, and differential
provincial poverty trends induced by variation in observable initial conditions. Regarding
transmission mechanisms, I provide evidence that provincial wage premiums increased, low-skilled
workers moved into wage and salaried jobs more quickly, and employment in formal
enterprises grew more rapidly in more exposed provinces.
The paper proceeds by providing an overview of the literature on trade and poverty and a
theoretical discussion of the impact of changes in foreign market access when sub-national units
vary in their initial industrial structure. Next, the BTA is discussed in detail, followed by an
overview of the data and empirical methodology used in the paper. Subsequently, regression
results are reported and discussed, before concluding remarks are presented.
3.2 Background
The trade and poverty literature provides little direct empirical evidence about the ex post
economic impact of changes in trade policy on the poor (see reviews by Winters et al. (2004) and
Goldberg and Pavcnik (2004)). Nonetheless, the associated literature is very large and generally
falls into one of two literature strands. The first strand relies on the relationship between growth
and openness to trade combined with the relationship between growth and poverty alleviation.48
The second strand relies on indirect evidence of the impact of changes in trade policy on poverty.
This often takes the form of evidence linking labour market correlates of poverty, such as
unemployment, employment in the informal sector, and unfavorable changes in wages for
unskilled workers, with trade liberalization, often focusing only on urban and/or manufacturing
workers.49
Very recently, however, empirical evidence on trade liberalization and poverty has
emerged. These studies fall into two categories of methodologies: the first examines what did
happen, and the second predicts what could happen. Topalova (2007) studies India’s unilateral
trade liberalization over the late 1980s and early 1990s and the subsequent variation in regional
impacts. She finds that rural Indian districts that were more exposed to the import tariff
reductions experienced slower declines in poverty than districts that were less exposed. Porto
(2003), Porto (2006), and Nicita (2004) predict the impact of changes in trade policy on
households. These papers use ex post estimates of the impact of tariff changes on prices and
predict the subsequent impact on household income or expenditures as suggested by initial
household production and consumption patterns. This study follows a methodology similar to
Topalova (2007) to examine what occurred after the implementation of the BTA in Vietnam.
Most of the studies on trade and poverty use national trade reforms, such as own country
tariff reductions or quota removals, as their source of variation in trade policy.
48 See Hallak and Levinsohn (2004) for a recent review of the trade and growth literature. Kraay (2006) provides evidence across a panel of developing countries that suggests that most of the long-run variation in changes in poverty can be explained by growth of average incomes. Besley and Burgess (2003) provide evidence of the elasticity of poverty with respect to income per capita.
49 For recent empirical evidence of the impact of trade on labour markets in developing countries see Attanasio, Goldberg and Pavcnik (2004), Goldberg and Pavcnik (2003), Pavcnik, Blom, Goldberg, and Schady (2004), Galiani and Sanguinetti (2003), and Goldberg and Pavcnik (2005), among others.
Few papers look at the converse question – can countries use new trade opportunities as a
mechanism for poverty reduction? Porto (2003) estimates the impact of possible domestic and
international trade reform for Argentina. He predicts that the elimination of agricultural subsidies
and trade barriers on agricultural manufactures and industrial manufactures in industrialized
countries would cause poverty to decline in Argentina.
Hence, this paper makes two main contributions to the literature. The first contribution is
the ex post analysis of trade impacts on poverty across all geographic and economic sectors. This
is in contrast to many other papers that focus solely on urban or rural areas or only on
manufacturing or agricultural activities. Second, the paper makes use of a large trade shock
induced by a trading partner as opposed to domestic trade liberalization. This provides two
benefits relative to the existing literature. First, it provides evidence on an important question:
can developing countries benefit from improved market access to a large foreign market? Second,
for establishing a causal relationship, the exogeneity of the foreign tariff cuts is more plausible
than the exogeneity of domestic tariff cuts.
The empirical section of this paper directly focuses on the impact on poverty of new
export opportunities induced by increased market access. The framework addresses whether all
provinces in Vietnam derived similar benefits from the decreases in U.S. tariffs. Should one
expect variation in impacts at the sub-national level? Traditional theories of international trade
do not address this question. As such, I provide a brief adaptation of the Ricardo-Viner model,
also known as the Specific Factors model, to illustrate why one might expect differences in the
impact across provinces.50 The Specific Factors model seems most appropriate as it focuses on
short-run impacts and the empirical section concentrates on the first two years immediately
following the implementation of the BTA.
In this model, labour is assumed to be completely mobile across industries, whereas
capital is immobile in the short run. As a simple example, consider a two-province country that
moves from international autarky to international free trade. For the current discussion, I abstract
away from internal trade between the two provinces and I further assume that the country takes
world prices as given. Let $X_i^p = f_i(L_i^p, K_i^p)$ denote the production of good $i = 1, 2$ in
province $p = A, B$, where it is assumed that each province uses the same technology to produce
good $i$.
50 See Feenstra (2004) for a discussion of the Ricardo-Viner model of international trade.
Assume that prior to international trade, inter-province labour mobility has equalized the
wage rate, $w^A = w^B = w$. From the first-order condition with respect to labour demand, this
implies that the labour-capital ratio within an industry must be equal across provinces.51
Consider what happens in the short-run when the country opens up to trade. Suppose that this
increases the relative price, p, of good 1, where the price of good 2 has been normalized to one.
The percentage wage change can be expressed as:
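$$
\begin{aligned}
\frac{dw}{w} &= \frac{f_{2LL}(L_2, K_2)}{f_{2LL}(L_2, K_2) + p\, f_{1LL}(L_1, K_1)}\,\frac{dp}{p} \\
&= \frac{\frac{1}{K_2}\, f_{2LL}\!\left(\frac{L_2}{K_2}, 1\right)}{\frac{1}{K_2}\, f_{2LL}\!\left(\frac{L_2}{K_2}, 1\right) + p\,\frac{1}{K_1}\, f_{1LL}\!\left(\frac{L_1}{K_1}, 1\right)}\,\frac{dp}{p} \\
&= \frac{f_{2LL}\!\left(\frac{L_2}{K_2}, 1\right)}{f_{2LL}\!\left(\frac{L_2}{K_2}, 1\right) + p\,\frac{K_2}{K_1}\, f_{1LL}\!\left(\frac{L_1}{K_1}, 1\right)}\,\frac{dp}{p}
\end{aligned}
$$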
where I have suppressed the province superscripts. The second line comes from the assumption
of constant returns to scale in the production functions (i.e., they are homogeneous of degree
one). This implies the second partial derivatives are homogeneous of degree negative one
(Varian, 1992). Since the ratio of labour to capital is constant across provinces within an
industry, the percentage change in wages will differ across provinces according to the difference
in capital stock ratios, assuming that labour is imperfectly mobile across provinces. Thus, the
province with the higher share of its capital invested in good 1, the rising price industry, would
expect a greater percentage change in the nominal wage rate. This simple model helps to explain
why some provinces might be expected to benefit more than others in the immediate short-run
following entry into force of the BTA.
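A small numerical illustration may help fix ideas. It evaluates the final expression for the percentage wage change, (dw/w)/(dp/p) = f2LL / (f2LL + p (K2/K1) f1LL), with both second derivatives evaluated at (L/K, 1). The Cobb-Douglas technology, the parameter values, and the function names are assumptions made purely for illustration; the model in the text does not impose a functional form.

```python
def f_LL(labour_capital_ratio, alpha):
    """Second derivative with respect to labour of L**alpha * K**(1 - alpha),
    evaluated at (L/K, 1). Cobb-Douglas is an illustrative assumption."""
    return alpha * (alpha - 1.0) * labour_capital_ratio ** (alpha - 2.0)


def wage_elasticity(p, l1, l2, K1, K2, alpha1=0.5, alpha2=0.5):
    """Percentage wage change per one percent rise in the relative price of good 1.

    l1 and l2 are the labour-capital ratios in industries 1 and 2, which are
    common to both provinces; K1 and K2 are the province's own capital stocks.
    """
    f1 = f_LL(l1, alpha1)
    f2 = f_LL(l2, alpha2)
    return f2 / (f2 + p * (K2 / K1) * f1)


# Two provinces with identical labour-capital ratios within each industry but
# different allocations of capital across the two industries.
province_a = wage_elasticity(p=1.0, l1=1.0, l2=1.0, K1=3.0, K2=1.0)
province_b = wage_elasticity(p=1.0, l1=1.0, l2=1.0, K1=1.0, K2=3.0)
```

With these numbers the province holding three quarters of its capital in industry 1 has a wage elasticity of 0.75, against 0.25 for the province holding one quarter, which is the ordering described in the text.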
51 This is a result of $f_{iL}$ being homogeneous of degree 0 from assuming constant returns to scale in $f_i$.
3.3 Overview of the U.S.-Vietnam Bilateral Trade Agreement
The BTA was signed on 13 July 2000 and came into force on 10 December 2001.52 The
commitments made by the United States and Vietnam are similar to those required by the World
Trade Organization (WTO). As such, the principal change for the U.S. was to grant Vietnam
Normal Trade Relations (NTR) or Most Favored Nation (MFN) access to the U.S. market
immediately upon entry into force of the BTA. The tariff cuts were largest in manufacturing
where the average ad valorem equivalent tariff dropped from 31.5 to 3.3 percent. The average ad
valorem tariff also fell substantially within agriculture, hunting and forestry as it was cut from
10.6 to 3.2 percent. In contrast, the tariff cuts within both fishing and mining were much smaller.
More detail on the U.S. tariff cuts is provided in Section 3.4.
In contrast, the scope of the commitments made by Vietnam is much larger. The bulk of
Vietnam’s commitments are scheduled for implementation within three to four years after entry
into force, but some commitments are not required until up to ten years after. The majority of
Vietnam’s commitments lie in the realm of legal and regulatory change as Vietnam had already
applied MFN tariffs to U.S. products before the BTA. These commitments include accordance of
national treatment to U.S. companies and nationals, customs system and procedures reform,
liberalizing and streamlining trading rights, liberalizing trade in services, and liberalizing and
safeguarding foreign investment, among others. As for trade policy commitments, the BTA
requires Vietnam to cut tariffs on approximately 250 tariff lines out of more than 6,000, typically
by 25 to 50 percent, mostly in agriculture. The overall impact of these cuts on industry level
tariffs has been very small. Industry level Vietnamese tariffs have been very stable over the
period of 1999 to 2004. Furthermore, the BTA has an extensive list of quantitative import
restrictions that must be eliminated, typically four to six years after entry into force. Almost all
of these were eliminated well ahead of schedule as part of an IMF/World Bank Agreement. By
the beginning of 2003, all import quotas except for those on sugar and petroleum products had
been lifted. Quotas on sugar and petroleum products are required to be removed ten and
seven years, respectively, after entry into force of the BTA.
52 This section draws heavily on the STAR-Vietnam report “An Assessment of the Economic Impact of the United States – Vietnam Bilateral Trade Agreement.”
3.4 Data
This section describes the three principal sources of data used in the subsequent analysis:
tariff data from the U.S. International Trade Commission, poverty estimates derived from the
2002 and 2004 Vietnam Household Living Standards Surveys (VHLSS), and employment data
from the 1999 Population and Housing Census in Vietnam. I describe each of them in turn.
3.4.1 Tariff Data
I use 2001 U.S. tariffs from the U.S. International Trade Commission’s online Tariff
Information Center. Prior to the BTA, Vietnam was subject to tariffs according to Column 2 of
the U.S. tariff schedule. Upon entry into force of the BTA, Vietnam became subject to MFN
tariff rates. For both tariff schedules I compute the ad valorem equivalent of any specific tariffs.
Details of the procedure can be found in the data appendix. I then match the tariff lines to
industries by the concordance provided by the World Bank via the World Integrated Trade
Solution database to construct industry-level tariffs according to 3-digit ISIC nomenclature.
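For readers unfamiliar with ad valorem equivalents, the following sketch shows the textbook conversion, in which a specific duty per unit is divided by the unit value of imports and added to any ad valorem component. It is an illustration only: the exact procedure used here is the one described in the data appendix, and the function and argument names below are hypothetical.

```python
def ad_valorem_equivalent(ad_valorem_rate, specific_duty, import_value, import_quantity):
    """Ad valorem equivalent of a tariff line, expressed as a fraction of value.

    ad_valorem_rate: ad valorem component (0.10 means 10 percent)
    specific_duty:   duty per physical unit, in the same currency as import_value
    import_value, import_quantity: used to form the unit value of imports
    """
    unit_value = import_value / import_quantity
    return ad_valorem_rate + specific_duty / unit_value
```

For example, a line with a 10 percent ad valorem rate plus a specific duty of 0.50 per unit on goods with a unit value of 5 has an ad valorem equivalent of 20 percent.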
There are 76 3-digit ISIC industries that experienced tariff cuts spread across agriculture
and forestry, fishing, mining, manufacturing, and other industries. Table 3.3 provides some
summary statistics on the tariff cuts by major sectors.
Table 3.3 Summary of U.S. tariffs applied to imports from Vietnam
                                   Number of     Mean pre-BTA        Mean post-BTA    Mean         Standard deviation
Industry                           industries    tariff (Column 2)   tariff (MFN)     tariff cut   of tariff cut
Agriculture, hunting & forestry    3             0.085               0.016            0.069        0.010
Fishing                            1             0.013               0.002            0.011
Mining                             9             0.027               0.001            0.026        0.045
Manufacturing                      57            0.330               0.034            0.296        0.148
Other                              6             0.080               0.002            0.077        0.111
Source: U.S. International Trade Commission
Note: The tariffs reported are weighted average tariffs. For each commodity-line tariff, its weight is the share of imports within the sector based on 2001 U.S. imports.
As mentioned above, the average tariff cut was highest in manufacturing. There is large variation
in the tariff cuts, both across and within major sectors. The variation within sectors is highest
within manufacturing, where the standard deviation of the cut in tariffs is 0.148 (14.8 percentage
points). The variation is shown in more detail in Figure 3.1, which shows the tariff cut in
percentage points by industry. The empirical analysis below is done using 3-digit industry tariffs,
but to make the figure easier to read, the tariffs have been aggregated to the 2-digit industry
level. Industries 1 and 2 fall within agriculture, hunting & forestry; industry 5 is fishing;
industries 10 through 14 are mining; industries 15 through 36 are manufacturing; and industries
40 through 93 are other industries.
Figure 3.1 U.S. tariff cuts by 2-digit ISIC industry
Clearly the largest tariff cuts were in industry 18 (manufacture of wearing apparel;
dressing and dyeing of fur), industry 16 (manufacture of tobacco products), and industry 17
(manufacture of textiles), all within manufacturing. One of the smallest tariff cuts was also
within manufacturing, industry 23 (manufacture of coke, refined petroleum products, and nuclear
fuel). One thing that is clear from the figure is the variation in tariff cuts across industries, which
is important for the identification strategy outlined below.
3.4.2 Household surveys
The principal poverty measure used in the empirical analysis is the poverty headcount
ratio. It measures the share of the population that falls below the poverty line. As with most
studies of poverty in developing countries, this paper focuses on absolute deprivation. Thus, the
poverty line used does not change over time as living standards improve or decline; instead, it is
meant to represent the same absolute level of welfare adjusted for price changes.
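A minimal sketch of the headcount calculation follows. It assumes that each household record carries a sampling weight, a household size, and real per capita expenditure, and that individuals are counted by multiplying the household weight by household size; these field names and the weighting convention are my assumptions, and the weights actually used are described in the data appendix.

```python
def poverty_headcount_ratio(households, poverty_line):
    """Share of the (weighted) population living in households whose per capita
    expenditure falls below the poverty line."""
    poor = 0.0
    total = 0.0
    for h in households:
        people = h["weight"] * h["size"]  # individuals represented by this household
        total += people
        if h["pc_expenditure"] < poverty_line:
            poor += people
    return poor / total
```

Applying the function province by province, with expenditure and the poverty line expressed in the same prices, gives the provincial poverty rates used below.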
The 2002 and 2004 Vietnam Household Living Standards Surveys (VHLSS) are
representative at the provincial level and provide information on household expenditures,
occupation, employment, and various other household and individual characteristics.
Expenditure information is available for approximately 30,000 households in the 2002 VHLSS
and 9,000 households in the 2004 VHLSS. The 2002 VHLSS was conducted between January
2002 and December 2002. In contrast, the 2004 VHLSS interviewed households only from May
2004 through November 2004, with the majority of households being interviewed in June and
September. For both surveys the recall period for expenditures and employment is the past
twelve months. To construct estimates of provincial poverty, I use the official “general poverty
line”, which includes an estimate of the cost of a basket of food items required to consume 2100
calories per day and essential non-food items such as clothing and housing.53 The general
poverty line is 1,917 thousand VND in 2002 and 2,077 thousand VND in 2004. Glewwe (2005)
has reviewed the consistency of the expenditure data and concludes that they are broadly
consistent across the 2002 and 2004 VHLSS. Details of the expenditure variables and sample
weights used can be found in the data appendix.
There is a substantial variation in provincial poverty rates as well as the proportional drop
in poverty between 2002 and 2004. The latter is the primary dependent variable of the current
study. Table 3.4 provides summary statistics on the levels of poverty, the rate of poverty
reduction, patterns of employment, measures of education, and other provincial data used in the
analysis. The 2002 levels of poverty range from a high of 77 percent in Lai Chau to a low of 2
percent in Ho Chi Minh City. For the current study, it is not the level of poverty, but rather its
53 See World Bank (1999).
rate of decline that is most interesting. Here too there is considerable variation, as shown in
Figure 3.2.
Table 3.4 Summary statistics
Variable                                                 Mean       Std. Dev.   Min      Max
Poverty Headcount Ratio 2002                             0.322      0.182       0.020    0.766
Poverty Headcount Ratio 2004                             0.229      0.157       0.000    0.689
Proportional Drop in Poverty, 2002 to 2004               0.311      0.221       -0.210   1.000
Share of workers in:
  Agriculture, 1999                                      0.704      0.175       0.070    0.896
  Aquaculture, 1999                                      0.024      0.035       0.000    0.187
  Mining, 1999                                           0.006      0.016       0.000    0.122
  Manufacturing, 1999                                    0.072      0.065       0.010    0.362
Number of formal enterprise jobs, 2000                   52468      114683      2860     788922
Share of workers working for a wage or salary, 2002      0.287      0.124       0.087    0.602
Share of population with at most:
  Primary education, 1999                                0.750      0.089       0.444    0.910
  Lower secondary education, 1999                        0.139      0.047       0.043    0.270
  Upper secondary education, 1999                        0.083      0.038       0.030    0.216
Urban share of the population, 1999                      0.193      0.150       0.046    0.838
Distance to nearest major seaport (km)                   214.295    142.814     0.000    615.000
Regional dummies:
  Red River Delta region                                 0.180      0.388       0        1
  North East region                                      0.180      0.388       0        1
  North West region                                      0.049      0.218       0        1
  North Central Coast region                             0.098      0.300       0        1
  South Central Coast region                             0.098      0.300       0        1
  Central Highlands region                               0.066      0.250       0        1
  South East region                                      0.131      0.340       0        1
  Mekong River Delta region                              0.197      0.401       0        1
Two provinces experienced measured increases in the incidence of poverty, Khanh Hoa
and Bac Lieu, while Ho Chi Minh City eliminated all remaining poverty between 2002 and 2004.
The proportional drop in poverty between 2002 and 2004 is negatively correlated with the
incidence of poverty in 2002. This suggests that existing trends in economic performance may be
an important factor for explaining the decrease in poverty. In the empirical section I attempt to
address this concern by controlling for differences in initial provincial characteristics.
Figure 3.2 Histogram of the proportional drop in provincial poverty rates, between 2002 and 2004
[Histogram not reproduced. Horizontal axis: proportional drop in poverty, 2002 to 2004; vertical axis: number of provinces.]
3.4.3 Employment data
For constructing the measure of provincial exposure to U.S. tariff cuts, I use employment
data from the 3 percent sample of the 1999 Population and Housing Census. In general, it reports
industry of employment at the 3-digit ISIC level, but for some individuals it is only reported at
the 2-digit level.54 I restrict the sample to individuals 13 years of age and older, as individuals
below age 13 were not asked about their employment status.
Finally, between 2002 and 2004 three Vietnamese provinces were split. To be consistent,
I recode household observations from the 2004 VHLSS into the original 61 provinces, as in the
1999 census and the 2002 VHLSS.
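The recoding amounts to mapping each new province back to the province it was carved from. The sketch below illustrates the idea; the text does not name the three provinces, so the mapping shown reflects my understanding of the splits that took effect around this time and should be checked against the survey documentation. The function name is hypothetical.

```python
# Assumed mapping from new provinces to the original 61-province definition.
# These pairs are not stated in the text and are given for illustration only.
SPLIT_TO_ORIGINAL = {
    "Dien Bien": "Lai Chau",
    "Dak Nong": "Dak Lak",
    "Hau Giang": "Can Tho",
}


def original_province(province_2004):
    """Return the pre-split province for a 2004 VHLSS household observation."""
    return SPLIT_TO_ORIGINAL.get(province_2004, province_2004)
```

Provinces that were not split are returned unchanged, so the function can be applied to every 2004 observation before poverty rates are aggregated.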
54 To be exact, the industry codes used in the census do not match exactly with the ISIC nomenclature. There are a small number of industries for which the 3-digit industry assigned to the described industry does not match the ISIC code. I recode these observations according to ISIC nomenclature. This is the same for the 2002 and 2004 VHLSS. See the data appendix for further details.
3.5 Empirical Methodology
Following Topalova (2007), I exploit provincial variation in exposure to the trade
agreement based on the structure of employment prior to the trade agreement. I construct
provincial measures of the drop in U.S. tariffs as follows:
$$\text{TariffDrop}_p = \sum_i \omega_{ip}\,\Delta\tau_i \qquad (5)$$
where $p$ indexes provinces, $\omega_{ip}$ is the share of workers in province $p$ in industry $i$ (i.e.,
$\sum_i \omega_{ip} = 1$), and $\Delta\tau_i$ is the tariff drop in industry $i$. The employment and tariff data cover over
seventy industries across agriculture, aquaculture, mining, and manufacturing. To establish the
robustness of the relationship between poverty reduction and exposure, I employ the following
regression model:
$$y_p = \alpha + \beta\,\text{TariffDrop}_p + X_p\delta + \varepsilon_p \qquad (6)$$
where $y_p$ is the proportional drop in the poverty headcount ratio in province $p$ and $X_p$ is a vector
of control variables intended to help control for underlying trends in poverty reduction that could
be correlated with provincial exposure to the U.S. tariff cuts. In most specifications $X_p$ includes
the natural logarithm of the poverty headcount ratio in 2002 to control for convergence in
poverty rates and regional dummy variables to control for unobserved trends in poverty that vary
by region. In other specifications, controls for other trade influences are added as are initial
provincial characteristics such as employment patterns.
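The exposure measure in equation (5) can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the estimates: workers are represented as unweighted (province, industry) pairs, industries that do not appear in the tariff data are assigned a tariff cut of zero, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def tariff_drop_by_province(workers, tariff_cut):
    """Employment-weighted average tariff cut for each province.

    workers:    iterable of (province, industry) pairs, one per worker
    tariff_cut: dict mapping industry to its tariff cut
    """
    counts = defaultdict(Counter)
    for province, industry in workers:
        counts[province][industry] += 1

    exposure = {}
    for province, by_industry in counts.items():
        n_workers = sum(by_industry.values())
        # Employment shares sum to one within a province; industries without
        # a recorded cut contribute zero (an assumption of this sketch).
        exposure[province] = sum(
            (n / n_workers) * tariff_cut.get(industry, 0.0)
            for industry, n in by_industry.items()
        )
    return exposure
```

Because the weights are pre-agreement employment shares, two provinces facing identical national tariff cuts receive different exposure values whenever their industrial structures differ, which is the variation the regression relies on.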
I use the proportional drop in poverty (which is approximately equal to the difference in
the natural logarithm of poverty) as the dependent variable. I have chosen this form for the
dependent variable to be consistent with other key papers in the literature, such as Besley and
Burgess (2003) and McMillan, Zwane, and Ashraf (2007), which both use the natural logarithm
of poverty as their dependent variable.
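The approximation mentioned in parentheses can be made explicit. Writing $P_{p,t}$ for the poverty headcount ratio in province $p$ in year $t$ (this notation is assumed here for illustration), the proportional drop and the log difference are related as follows:

```latex
y_p = \frac{P_{p,2002} - P_{p,2004}}{P_{p,2002}},
\qquad
\ln P_{p,2002} - \ln P_{p,2004} = -\ln(1 - y_p) = y_p + \frac{y_p^2}{2} + \frac{y_p^3}{3} + \cdots
```

For a 10 percent drop in poverty ($y_p = 0.10$) the log difference is roughly 0.105, so the two measures are close when changes are small and diverge as the drop becomes large.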
It is important to understand the source of variation being used to identify β in equation
(6). The regression measures the partial correlation between the proportional drop in poverty and
exposure to U.S. tariff cuts. This implies that the framework cannot identify the average impact
of increased U.S. market access on poverty across provinces, since this will be part of the estimated
constant term. Hence, the total impact of the trade agreement, which comprises the relative
impact, as measured by TariffDrop, and the average impact, cannot be determined. In the
discussion section I add additional assumptions that allow for an admittedly rough estimate of
the overall impact.
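The identification point can be illustrated with a minimal sketch: if every province receives the same additional reduction in poverty, a bivariate least squares fit absorbs it entirely in the intercept and leaves the slope on exposure unchanged. The data and the helper `ols_line` are invented for illustration, and the sketch omits the control variables used in the actual regressions.

```python
def ols_line(x, y):
    """Bivariate OLS: return (intercept, slope) for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Invented provincial data: exposure and proportional drop in poverty.
exposure = [0.01, 0.02, 0.03, 0.04, 0.05]
drop = [0.12, 0.20, 0.35, 0.38, 0.55]

a0, b0 = ols_line(exposure, drop)

# Add a hypothetical common effect of 0.10 to every province.
common = 0.10
a1, b1 = ols_line(exposure, [d + common for d in drop])

# The slope is unchanged; the intercept rises by the common effect.
slope_change = b1 - b0
intercept_change = a1 - a0
```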
A second point to address is the weighting of national tariffs at the provincial level to
create a measure of provincial exposure to the tariff cuts. I use the industry of employment to
aggregate exposure at the industry level into a provincial measure of exposure. This implicitly
assumes that two workers in the same industry, one in the export-oriented manufacturing centre
of Ho Chi Minh City and the other working in predominantly rural Son La, for instance, will
experience the impact of tariff cuts on clothing and apparel goods the same way. Ideally, one
would like to know whether the individual is involved in the production of goods destined for the
domestic or international market, but this information is not available in the census data. One
way to address this point is by considering how far away a province is from one of Vietnam’s
three major seaports, which are located in the provinces of Hai Phong, Da Nang, and Ho Chi
Minh City. Thus, I include the distance to the nearest major seaport as well as its interaction with
TariffDrop. This allows the net impact on a worker within an industry to vary geographically
within Vietnam.
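One way to write the extended specification is sketched below. The symbols $\lambda$, $\theta$, and $\text{Dist}_p$ are notation assumed here for illustration; they do not appear in the text above.

```latex
y_p = \alpha + \beta\,\text{TariffDrop}_p + \lambda\,\text{Dist}_p
      + \theta\,(\text{TariffDrop}_p \times \text{Dist}_p) + X_p\delta + \varepsilon_p,
\qquad
\frac{\partial y_p}{\partial\,\text{TariffDrop}_p} = \beta + \theta\,\text{Dist}_p
```

Under this form, a negative $\theta$ means that the association between exposure and poverty reduction weakens as a province lies farther from a major seaport.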
Third, weighting national tariffs by industry of employment is not the only plausible
aggregation method. One could measure a province’s exposure by weighting tariffs with the
value of production within an industry by province or the value of exports and imports within an
industry by province. Unfortunately, national account estimates at the provincial level in
Vietnam are unreliable and thus I cannot check the robustness of my results to these alternative
aggregation procedures.
The timing of the tariff cuts and the choice of study period used for identifying the impact
of the tariff cuts are important. I use the 2002 VHLSS as my baseline from which to measure
changes in poverty. This raises two concerns. First, some of the households were surveyed close
to the end of 2002. Hence, their expenditure and employment data are reported for a period
that is almost entirely after the entry into force of the BTA. Second, to the extent that firms and
individuals changed behavior in anticipation of the BTA, some of the impacts
were being felt prior to the date of implementation. Both observations suggest that by focusing
on the period of 2002 to 2004 I may be underestimating the impact that the BTA has had as of
2004 on provincial poverty. Unfortunately, due to lack of data, this problem is hard to avoid as
the 1998 Vietnam Living Standards Survey (VLSS), unlike the 2002 and 2004 VHLSS, was not
designed to be representative at the provincial level. Hence, the results should be interpreted as
the impact that the BTA had on the two-year period from 2002 to 2004 and not as the cumulative
impact up to 2004.
3.5.1 Exogeneity of U.S. Tariff Cuts
Since the trade agreement is bilateral, this raises concerns about endogenous protection
and endogenous market access through political lobbying by U.S. and Vietnamese industries. In
general, one would expect that U.S. industries would lobby for smaller cuts in the U.S. tariffs
protecting their industry and that Vietnamese industries would lobby for greater cuts in U.S.
tariffs. This concern, however, is unlikely to influence the U.S. tariff cuts in this particular
agreement. The U.S. tariff cuts were presented as an all-or-nothing package whereby exports
from Vietnam into the U.S. would immediately be covered by MFN tariff rates instead of
Column 2 tariff rates. The movement from one pre-existing tariff schedule to a second
pre-existing tariff schedule implies that neither U.S. nor Vietnamese industries had an
opportunity to influence the tariff cuts faced by their industry. This argument relies on the
assumption that both the Column 2 and MFN tariff schedules are exogenous to Vietnam, which I
turn to now.
The Column 2 tariff rates are arguably exogenous to Vietnam for a number of reasons.
First, the countries subject to Column 2 rates are all former or current communist countries,
suggesting that political concerns larger than industry lobbying dominate this category of the
U.S. tariff schedule. Table 3.5 shows the list of countries subject to Column 2 tariff rates from
1996 to 2005. At the time of the U.S.-Vietnam Bilateral Trade Agreement, the only remaining
countries were Afghanistan, Cuba, Laos, and North Korea. Second, imports into the U.S. under
Column 2 constitute a very small fraction of overall U.S. imports. Between 1996 and 2006, the
share of total U.S. imports originating in countries subject to Column 2 rates ranged between
0.00 and 0.09 percent. This implies that the returns to U.S. industries lobbying for protection are
very low within the Column 2 section of the U.S. tariff schedule. Third, as suggested by the
previous point, both prior and subsequent to the BTA, there has been little change in the
prevailing Column 2 rates.
Table 3.5 Countries subject to Column 2 U.S. tariffs, 1996-2005
where $\omega_{ip}$ is the share of workers in province $p$ in industry $i$, and $\text{Imports}_{i,t}$ is the value of U.S.
imports from all countries in industry $i$ in year $t$ = 1999, 2004. Hence, provinces with a greater
share of workers in industries that experienced larger increases in U.S. import demand will be
55 I have also run regressions controlling for initial education levels, government spending, government transfers, FDI stocks, and measures of the provincial business environment. None of these qualitatively influence the presented results.
more exposed to this structural change. Table 3.8 displays regression results when $\text{ImpChanges}_p$
is included as a control variable. I do not include the 2002 employment share variables as they
are not jointly significant in the last regression reported in Table 3.7. The coefficient estimate on
TariffDrop is still statistically significant at the 1 percent level.
Changes in Vietnam’s trade policies, aside from the BTA, may also be a source of
omitted variable bias. I explore this possibility by constructing a measure of provincial exposure
to changes in Vietnam’s import tariffs between 1999 and 2004. This measure is constructed
analogously to the measure for changes in U.S. tariffs. Results are shown in column (2) of Table 3.8. Similar to
Topalova’s (2007) results for Indian districts, I find that Vietnamese provinces that were more
exposed to Vietnam’s tariff cuts experienced slower reductions in poverty, although the estimate
is not statistically significant.
One final trade policy change that warrants attention is Vietnam’s tariff commitments
under the BTA. These are almost exclusively concentrated in crops and food processing. As of
2004, Vietnam had not cut these tariff lines. In addition, the tariff cuts are small in magnitude
compared to those made by the U.S. However, firms and farmers may be changing their
production patterns in anticipation of the impending tariff cuts. Column (3) shows regression
results when provincial exposure to future Vietnamese tariff cuts, as prescribed by the BTA, is
included. This exposure does not have a statistically significant impact, nor does it substantially
change the coefficient estimate of exposure to U.S. tariffs. Finally, column (4) of Table 3.8
presents regression results when all three trade influences are included. The results are similar to
those presented in the previous columns.
In Appendix A, I discuss the possible impacts of measurement error in the initial level of
poverty in 2002. That analysis indicates that the above results are not driven by plausible measurement
error. Furthermore, I check the robustness of my results to the poverty line used and alternative
measures of poverty. These results are also reported in Appendix A in Table 3.13. I consider a 25
percent increase in the poverty line, as well as the normalized poverty gap and the normalized
poverty severity at the original poverty line.56 The results are consistent with the primary results
presented above.
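The alternative poverty measures can be sketched with the Foster, Greer, and Thorbecke (1984) family. The function below is a minimal illustration under stated assumptions: it follows the standard convention of averaging over the whole population with the non-poor contributing zero, it ignores sampling weights, and the expenditure figures are invented.

```python
def fgt(expenditures, poverty_line, alpha):
    """Foster-Greer-Thorbecke poverty measure of order alpha.

    alpha = 0: headcount ratio
    alpha = 1: normalized poverty gap
    alpha = 2: normalized poverty severity
    Averages over the whole population; the non-poor contribute zero.
    """
    total = 0.0
    for x in expenditures:
        if x < poverty_line:
            total += ((poverty_line - x) / poverty_line) ** alpha
    return total / len(expenditures)

# Invented per capita expenditures and poverty line.
expenditures = [50.0, 80.0, 120.0, 200.0]
line = 100.0

headcount = fgt(expenditures, line, 0)
gap = fgt(expenditures, line, 1)
severity = fgt(expenditures, line, 2)

# Robustness check in the spirit of the text: raise the line by 25 percent.
headcount_high_line = fgt(expenditures, 1.25 * line, 0)
```

Raising the poverty line reclassifies the household at 120 as poor in this toy example, which is the kind of sensitivity the robustness check is meant to probe.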
56 The normalized poverty gap is the average difference between actual expenditures and the poverty line for all poor individuals, expressed as a fraction of the poverty line, while the normalized poverty severity is the average squared difference, expressed as a fraction of the poverty line.
Table 3.8 Ordinary Least Squares regression results of the impact of provincial exposure (TariffDrop) on poverty between 2002 and 2004 controlling for other trade influences
Dependent variable: Proportional drop in poverty, 2002 to 2004

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| TariffDrop (US) | 9.832 | 10.366 | 10.294 | 10.232 |
| | (6.00)** | (7.05)** | (7.16)** | (5.93)** |
| Distance to nearest major seaport | -0.001 | -0.001 | -0.001 | -0.001 |
| | (3.45)** | (3.40)** | (3.55)** | (3.08)** |
| TariffDrop x Distance to nearest major seaport | -0.041 | -0.043 | -0.043 | -0.042 |
| | (2.20)* | (2.26)* | (2.34)* | (2.59)* |
| ImpChanges | 1.117 | | | 2.155 |
| | (0.64) | | | (1.10) |
| TariffDrop (VN 99-04) | | -3.830 | | -22.065 |
| | | (0.27) | | (0.58) |
| TariffDrop (VN BTA) | | | 2.804 | -11.267 |
| | | | (0.18) | (0.26) |
| ln(Poverty 2002) | 0.132 | 0.093 | 0.099 | 0.097 |
| | (2.19)* | (1.09) | (1.18) | (1.10) |
| Observations | 61 | 61 | 61 | 61 |
| R2 | 0.50 | 0.50 | 0.50 | 0.51 |
| Direct impact of a 1 SD increase in TariffDrop | 0.135 | 0.142 | 0.141 | 0.140 |

Indirect impact evaluated at the average distance from port: -0.126, -0.123

Robust t statistics in parentheses. * significant at 5%; ** significant at 1%. Includes regional dummy variables as additional regressors.
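To help read the interaction terms in Table 3.8, the following sketch evaluates the column (1) point estimates. It assumes that the partial effect of TariffDrop is the coefficient on TariffDrop plus the interaction coefficient times distance, and that the reported direct impact equals the coefficient times one standard deviation; the implied standard deviation is therefore an inference, not a reported figure. Distance is in whatever units the underlying data use, which this section does not state.

```python
# Point estimates from column (1) of Table 3.8.
BETA_TARIFF = 9.832        # coefficient on TariffDrop (US)
THETA_INTERACT = -0.041    # coefficient on TariffDrop x Distance
DIRECT_1SD = 0.135         # reported direct impact of a 1 SD increase

def marginal_effect(distance):
    """Partial effect of TariffDrop on the poverty drop at a given distance."""
    return BETA_TARIFF + THETA_INTERACT * distance

# Standard deviation of TariffDrop implied by the table, assuming
# direct impact = coefficient x SD (an inference, not a reported figure).
implied_sd = DIRECT_1SD / BETA_TARIFF

# Distance at which the estimated partial effect reaches zero.
breakeven_distance = -BETA_TARIFF / THETA_INTERACT
```

Because the coefficients are rounded to three decimals in the table, these derived quantities are approximate.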
3.7 Labour market transmission mechanisms
This section aims to confirm and to explain the above results. Given the extent of the
poverty reductions, intuitively, one would expect to find changes in the labour market that are
consistent with this pattern. If contradictory results were found, then this would lead one to be
suspicious of the previous results. Furthermore, these same labour market channels help to
explain how the tariff cuts led to reductions in poverty.
3.7.1 Wages
One channel from tariff cuts to household welfare is the wage labour market. In the 2004
VHLSS, among individuals aged 15 to 64, 82 percent of individuals reported working in the past
12 months. Of these workers, 31 percent reported working for a wage in the past twelve months
for their most time-consuming job. In the 2002 VHLSS, 83 percent of individuals between the
ages of 15 and 64 reported working in the past 12 months, while 29 percent of these workers
reported working for wages for their most time-consuming job.57
I examine how the drop in U.S. tariffs influenced provincial wage premiums.58 The
provincial wage premium is the variation in individual wages that cannot be explained by
individual characteristics, such as age, gender, or industry affiliation, but can be explained by the
province of the worker. In essence, it is a conditional average wage by province. If labour is
imperfectly mobile across provinces, one would expect to find a relationship between changes in
provincial wage premiums and exposure to the tariff cuts. According to the 1999 census, only
approximately 3 percent of individuals moved across provinces between 1994 and 1999,
suggesting that labour is imperfectly mobile across provinces, at least prior to the BTA.
The empirical analysis follows a two-stage procedure. In the first stage, the log of real hourly
wages for worker $i$ at time $t$, $\ln(w_{ijpt})$, is regressed on a vector of individual characteristics
($\mathbf{H}_{it}$), a vector of industry dummies ($\mathbf{I}_{it}$), and a vector of provincial dummies ($\mathbf{P}_{it}$):
$$\ln(w_{ijpt}) = \alpha + \mathbf{H}_{it}'\boldsymbol{\beta}_t + \mathbf{I}_{it}'\mathbf{wp}_{jt} + \mathbf{P}_{it}'\mathbf{wp}_{pt} + \varepsilon_{ijpt}.$$
The vector of individual characteristics includes a dummy for the individual’s gender, a
quadratic in age, dummies for the highest level of completed education, dummies for sector of
ownership, and the number of months, days per month, and hours per day spent working. The
coefficient of the provincial dummy represents the variation in wages that cannot be explained
by individual characteristics or industry affiliation, but can be explained by province of
residence. Following Krueger and Summers (1988), I normalize the sum of the employment-
weighted provincial wage premiums to zero and I express the provincial wage premiums as
57 For both surveys, these are simple averages, unadjusted for sampling weights.
58 See for example Attanasio, Goldberg, and Pavcnik (2004) who report results on industry wage premiums.
deviations from zero. In the second stage, the change in the provincial wage premium is
regressed on the drop in tariffs by province and the provincial wage premium in 2002:
$$\Delta wp_p = \alpha + \beta\,\text{TariffDrop}_p + \gamma\,wp_{p,2002} + u_p.$$
Since the dependent variable is an estimate, I use weighted least squares. The weights are the
inverse of the variance from the first stage regression, corrected according to Haisken-DeNew
and Schmidt (1997). The results are reported in Table 3.9 for all wage earners and then
subsamples of workers based on education and by the level of skill according to occupation. For
all wage earners the drop in tariffs is positively associated with provincial wage premiums, but
this result is not statistically significant. However, dividing the sample according to education
reveals a more nuanced pattern. For workers with at most a primary education the impact of
TariffDrop on the change in the provincial wage premium is both positive and statistically
significant. A one-standard-deviation increase in exposure to the U.S. tariff cuts is associated
with a 1.9 percent increase in the provincial wage premium for primary educated workers. The
results are positive but statistically insignificant for workers with a lower secondary education and for those with an
upper secondary education. Note also that the estimated impact of TariffDrop declines as the
level of education increases. This is consistent with the large increase in exports of low-skilled
labour-intensive goods creating a positive labour demand shock for unskilled workers. A similar
picture emerges when the sample of workers is divided according to whether their job is
considered a skilled or an unskilled occupation. For unskilled workers, the estimated effect of TariffDrop
on the provincial wage premium is positive and statistically significant. A one-standard-deviation
increase in TariffDrop is associated with a 1.8 percent increase in the provincial wage premium
for unskilled workers. By comparison, the impact is estimated to be negative for skilled workers,
although it is not statistically significant.
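The normalization and the second stage can be sketched as follows. This is a simplified illustration: it omits the 2002 wage premium control and the Haisken-DeNew and Schmidt (1997) variance correction, and the helper names and all numbers are invented.

```python
def normalize_premiums(premiums, employment):
    """Re-centre premiums so that their employment-weighted sum is zero."""
    total = sum(employment)
    weighted_mean = sum(p * e for p, e in zip(premiums, employment)) / total
    return [p - weighted_mean for p in premiums]

def wls_line(x, y, variances):
    """Weighted least squares of y on a constant and x.

    Each observation is weighted by the inverse of its variance, so
    noisier first-stage estimates receive less weight.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Invented premiums and employment counts for three provinces.
premiums = normalize_premiums([0.10, -0.05, 0.02], [100, 300, 600])
weighted_sum = sum(p * e for p, e in zip(premiums, [100, 300, 600]))

# Invented second-stage data lying exactly on a line with slope 2.
tariff_drop = [0.01, 0.02, 0.03, 0.04]
premium_change = [0.01 + 2.0 * t for t in tariff_drop]
first_stage_var = [0.001, 0.002, 0.0005, 0.004]
intercept, slope = wls_line(tariff_drop, premium_change, first_stage_var)
```

Because the toy data lie exactly on a line, the weights do not affect the fitted slope here; with noisy estimates they would shift the fit toward the more precisely estimated provinces.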
Table 3.9 Provincial wage premiums and provincial exposure
Dependent variable: Change in provincial wage premium, 2002 to 2004 Education Occupation
Finally, I assign individuals based on the province of official residence on the night of the census
using provvn and weight individuals using wtper.
U.S. Tariffs: The 2001 U.S. tariff data are from the U.S. International Trade Commission’s
(USITC) website. I convert specific tariffs to ad valorem equivalents by estimating the unit value
of imports within each 8-digit HTS tariff line using total annual imports from all countries. I
calculate the unit value of imports by dividing the customs value of total imports by the total
quantity (measured in the first unit of quantity) for each 8-digit HTS tariff line that features a specific tariff component.
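A minimal sketch of the conversion, under the assumption that the ad valorem equivalent of a specific duty is the duty per unit divided by the unit value, with any ad valorem component of a compound tariff added on. The function names, the `ad_valorem_rate` parameter, and the numbers are hypothetical.

```python
def unit_value(customs_value, quantity):
    """Unit value of imports: customs value divided by total quantity."""
    return customs_value / quantity

def ad_valorem_equivalent(specific_rate, customs_value, quantity,
                          ad_valorem_rate=0.0):
    """Ad valorem equivalent of a tariff with a specific component.

    specific_rate: duty per physical unit (same currency as customs_value).
    ad_valorem_rate: any ad valorem part of a compound tariff (assumed
                     additive here; this treatment is an assumption).
    """
    return ad_valorem_rate + specific_rate / unit_value(customs_value, quantity)

# Invented tariff line: a duty of 0.50 per kg on imports worth 2,000,000
# with a total quantity of 500,000 kg (unit value of 4.0 per kg).
ave_specific = ad_valorem_equivalent(0.50, 2_000_000.0, 500_000.0)
ave_compound = ad_valorem_equivalent(0.50, 2_000_000.0, 500_000.0,
                                     ad_valorem_rate=0.05)
```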
Concordance from HS to ISIC: The U.S. tariff data is reported according to the 8-digit
Harmonized Tariff Schedule (HTS) of the United States. I match the 8-digit HTS codes to 6-digit
Harmonized System (HS) codes by dropping the last two digits of the code. I convert the 6-digit
HS codes to 3-digit ISIC codes with the concordance supplied by the World Bank. These
concordances are also available as part of the WITS software program. I calculate a weighted
average of the ad valorem equivalent of all tariff lines within an industry using U.S. imports in
each tariff line as the weights.
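The concordance and aggregation steps can be sketched as below. The tariff lines, import values, and the tiny concordance dictionary are made-up placeholders standing in for the USITC data and the World Bank concordance; the function names are likewise hypothetical.

```python
from collections import defaultdict

def hts8_to_hs6(hts8):
    """Drop the last two digits of an 8-digit HTS code.

    Codes are kept as strings so that leading zeros are preserved.
    """
    return hts8[:6]

def industry_tariffs(lines, hs6_to_isic3):
    """Import-weighted mean tariff by 3-digit ISIC industry.

    lines: iterable of (hts8_code, ad_valorem_tariff, import_value).
    hs6_to_isic3: dict mapping 6-digit HS code -> 3-digit ISIC code
                  (placeholder for the real concordance file).
    """
    weighted = defaultdict(float)
    imports = defaultdict(float)
    for hts8, tariff, value in lines:
        isic = hs6_to_isic3[hts8_to_hs6(hts8)]
        weighted[isic] += tariff * value
        imports[isic] += value
    return {isic: weighted[isic] / imports[isic] for isic in weighted}

# Placeholder data, invented for illustration only.
lines = [
    ("61091000", 0.20, 300.0),
    ("61099010", 0.40, 100.0),
    ("03042000", 0.10, 50.0),
]
concordance = {"610910": "181", "610990": "181", "030420": "151"}
tariffs = industry_tariffs(lines, concordance)
```

Weighting by imports means that tariff lines with little trade contribute little to the industry average; an industry with zero recorded imports would need separate handling.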
Hourly wages: For the 2004 VHLSS, nominal hourly wages are estimated by dividing the wage
and salary received during the past 12 months for the most time-consuming job (variable
m4ac10a from file m4a.dta) by an estimate of annual hours. Annual hours are estimated by
multiplying the number of months (m4ac6) by the number of days per month (m4ac7) and by
the number of hours per day (m4ac8). I convert the nominal hourly wage series to national
average January 2004 prices by regionally and temporally deflating using the series rcpi and
mcpi available in hhexpe04.dta.
For the 2002 VHLSS, the wage and hours data come from the file muc3.dta. I take
annual wages from m3c1a and construct annual hours from months (m3c9), days per month
(m3c10) and hours per day (m3c11). As for the 2004 wages, I convert the nominal hourly wage
series to national average January 2002 prices by regionally (rcpi) and temporally (mcpi)
deflating using deflators in the file hhexpe02.dta.
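The wage construction described above can be sketched in a few lines. The sketch assumes that the regional and monthly deflators are index ratios equal to one at the base and that they are applied by dividing the nominal wage by their product; that treatment, the function name, and the numbers are assumptions for illustration.

```python
def real_hourly_wage(annual_wage, months, days_per_month, hours_per_day,
                     rcpi, mcpi):
    """Real hourly wage from annual earnings and reported working time.

    annual_wage: wage and salary received over the past 12 months.
    months, days_per_month, hours_per_day: reported time worked.
    rcpi, mcpi: regional and monthly price deflators (assumed to be
                ratios relative to the base region and month).
    """
    annual_hours = months * days_per_month * hours_per_day
    nominal_hourly = annual_wage / annual_hours
    return nominal_hourly / (rcpi * mcpi)

# Invented worker: 12,000 earned over 12 months, 25 days per month,
# 8 hours per day, in a region and month with slightly higher prices.
wage = real_hourly_wage(12_000.0, 12, 25, 8, rcpi=1.05, mcpi=1.02)
```

Workers reporting zero months, days, or hours would produce a division by zero and would need to be excluded or handled before applying such a function.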
References
Adelman, Irma and Cynthia Morris. 1973. Economic Growth and Social Equity in Developing Countries. Stanford, California: Stanford University Press.
Alesina A., E. Glaeser and B. Sacerdote. 2005. Work and leisure in the U.S. and Europe: Why so different? Harvard Institute of Economic Research, Discussion Paper 2068.
Anderson G. 1996. Nonparametric tests of stochastic dominance in income distributions. Econometrica 64: 1183-1193.
Athukorala, Prema-chandra. 2006. Trade Policy Reforms and the Structure of Protection in Vietnam. The World Economy 29: 161-187.
Atkinson A.B. and F. Bourguignon. 1982. The comparison of multi-dimensioned distributions of economic status. Review of Economic Studies 49: 183-201.
Attanasio, Orazio, Pinelopi Goldberg, and Nina Pavcnik. 2004. Trade reforms and wage inequality in Colombia. Journal of Development Economics 74: 331-366.
Bach H-U. and S. Koch. 2003. Working time and the volume of work in Germany: The IAB concept. IAB Labour Market Research Topics, No. 53.
Barrett G.F. and S.G. Donald. 2003. Consistent tests for stochastic dominance. Econometrica 71: 71-104.
Benjamin, Dwayne and Loren Brandt. 2004. Agriculture and income distribution in rural Vietnam under economic reforms: A tale of two regions, in Economic growth, poverty and household welfare in Vietnam, edited by Paul Glewwe, Nisha Agrawal, and David Dollar. Washington, D.C.: The World Bank, 133-86.
Benjamin, Dwayne, Loren Brandt and John Giles. 2005. The evolution of income inequality in rural China. Economic Development and Cultural Change 53:769-824.
Benjamin, Dwayne, Loren Brandt, John Giles and Sangui Wang. 2008. Income inequality during China’s economic transition, in China’s Great Economic Transformation, edited by Loren Brandt and Thomas Rawski. Cambridge: Cambridge University Press.
Besley, Timothy and Robin Burgess. 2003. Halving global poverty. Journal of Economic Perspectives 17: 3-22.
Bourguignon F. and S.R. Chakravarty. 2002. Multi-dimensional poverty orderings. DELTA Working Paper 2002-22.
Brambilla, Irene, Guido Porto, and Alessandro Tarozzi. 2008. Adjusting to trade policy: Evidence from U.S. antidumping duties on Vietnamese catfish. NBER Working Paper No. w14495.
Brandt, Loren, Le Dang Trung, Huong Giang, Trang Cong Thang, Pham Giang Linh, Nguyen Bui Linh, and Luu Van Vinh. 2006. Land access, land markets and their distributive implications in rural Vietnam. Draft.
Crawford I. 2005. A nonparametric test of stochastic dominance in multivariate distributions. University of Surrey, Discussion Papers in Economics, DP 12/05.
Davidson R. and J-Y. Duclos. 2000. Statistical inference for stochastic dominance and for the measurement of poverty and inequality. Econometrica 68: 1435-1464.
Duclos J-Y., D. Sahn and S.D. Younger. 2004. Robust multidimensional poverty comparisons. CIRPEE Working Paper No. 03-04.
Edmonds, Eric and Nina Pavcnik. 2006. Trade liberalization and the allocation of labor between households and markets in a poor country. Journal of International Economics 69: 272-95.
Feenstra, Robert. 2004. Advanced International Trade. Princeton, New Jersey: Princeton University Press.
Fei, John C. H., Gustavo Ranis and Shirley W. Y. Kuo. 1979. Growth with equity: The Taiwan case. World Bank, Oxford University Press.
Foster, James, Joel Greer, and Erik Thorbecke. 1984. A Class of Decomposable Poverty Measures. Econometrica 52: 761-766.
Galiani, Sebastian and Pablo Sanguinetti. 2003. The impact of trade liberalization on wage inequality: evidence from Argentina. Journal of Development Economics 72: 497-513.
Gallup, John Luke. 2004. The wage labor market and inequality in Vietnam, in Economic growth, poverty and household welfare in Vietnam, edited by Paul Glewwe, Nisha Agrawal, and David Dollar. Washington, D.C.: The World Bank, 53-93.
Glewwe, Paul. 2004. An overview of economic growth and household welfare in Vietnam in the 1990s, in Economic growth, poverty and household welfare in Vietnam, edited by Paul Glewwe, Nisha Agrawal, and David Dollar. Washington, D.C.: The World Bank, 1-26.
Glewwe, Paul. 2005. Mission Report for Trip to Vietnam October 17-25, 2005. Mimeo.
Glewwe, Paul, Michele Gragnolati, and Hassan Zaman. 2002. Who gained from Vietnam’s boom in the 1990s? Economic Development and Cultural Change 50: 773-92.
Goldberg, Pinelopi and Nina Pavcnik. 2003. The response of the informal sector to trade liberalization. Journal of Development Economics 72: 463-496.
Goldberg, Pinelopi and Nina Pavcnik. 2004. Trade, Inequality, and Poverty: What Do We Know? Evidence from Recent Trade Liberalization Episodes in Developing Countries. Brookings Trade Forum: 223-269.
Goldberg, Pinelopi and Nina Pavcnik. 2005. The effects of the Colombian trade liberalization on urban poverty, in Globalization and poverty, edited by A. Harrison. Chicago: University of Chicago Press.
GSO. Year unknown. Vietnam Household Living Standards Surveys (VHLSS), 2002 and 2004: Basic Information.
Haisken-DeNew, John P. and Christoph M. Schmidt. 1997. Interindustry and interregion differentials: Mechanics and interpretations. Review of Economics and Statistics 79: 516-521.
Hall P. and A. Yatchew. 2005. Unified approach to testing functional hypotheses in semiparametric contexts. Journal of Econometrics 127: 225-252.
Hallak, Juan Carlos and James Levinsohn. 2004. Trade Policy as Development Policy? Evaluating the Globalization and Growth Debate. Mimeo.
Kaur A., B.L.S. Prakasa Rao and H. Singh. 1994. Testing for second-order stochastic dominance of 2 distributions. Econometric Theory 10: 849-866.
Klecan L., R. McFadden and D. McFadden. 1991. A robust test for stochastic dominance. Mimeo, MIT.
Kraay, Aart. 2006. When is growth pro-poor? Evidence from a panel of countries. Journal of Development Economics 80: 198-227.
Krueger, Alan B. and Lawrence H. Summers. 1988. Efficiency wages and the inter-industry wage structure. Econometrica 56: 259-293.
Krugman, Paul. 2005. French family values. The New York Times, July 29.
Linton O., E. Maasoumi and Y-J. Whang. 2005. Consistent testing for stochastic dominance under general sampling schemes. Review of Economic Studies 72: 735-765.
Litchfield, Julie and Patricia Justino. 2004. Welfare in Vietnam during the 1990s: Poverty, inequality and poverty dynamics. Journal of the Asia Pacific Economy 9: 145-69.
Maasoumi E. and A. Heshmati. 2000. Stochastic dominance among Swedish income distributions. Econometric Reviews 19: 287-320.
McCaig, Brian. 2008a. Comparison of income between the ‘short’ and ‘long’ samples of the 2002 VHLSS. Unpublished research note.
McCaig, Brian. 2008b. Comparison of income between the ‘short’ and ‘long’ samples of the 2004 VHLSS. Unpublished research note.
McCaig, Brian. 2009. Exporting out of poverty: Provincial poverty in Vietnam and U.S. market access. ANU Working Papers in Economics & Econometrics No. 502.
McFadden, D. 1989. Testing for stochastic dominance, in Studies in the Economics of Uncertainty: Part II, edited by T. Fomby and T.K. Seo. Springer-Verlag.
McMillan, M., A.P. Zwane and N. Ashraf. 2007. My policies or yours: Does OECD support for agriculture increase poverty in developing countries?, in Globalization and Poverty, edited by A. Harrison. Chicago: The University of Chicago Press.
Minot, Nicholas, and Bob Baulch. 2004. The Spatial Distribution of Poverty in Vietnam and the Potential for Targeting, in Economic Growth, Poverty, and Household Welfare in Vietnam, edited by Paul Glewwe, Nisha Agrawal, and David Dollar. Washington, D.C.: World Bank.
Molini, Vasco and Guanghua Wan. 2008. Discovering sources of inequality in transition economies: A case study of rural Vietnam. Economic Change and Restructuring 41: 75-96.
Nicita, Alessandro. 2004. Who benefited from trade liberalization in Mexico? Measuring the effects on household welfare. World Bank Policy Research Working Paper 3265.
Nguyen, Binh T., James W. Albrecht, Susan B. Vroman, and M. Daniel Westbrook. 2007. A quantile regression decomposition of urban-rural inequality in Vietnam. Journal of Development Economics 83: 466-90.
Pavcnik, Nina, Andreas Blom, Pinelopi Goldberg and Norbert Schady. 2004. Trade liberalization and industry wage structure: Evidence from Brazil. World Bank Economic Review 18: 319-344.
Porto, Guido. 2003. Trade reforms, market access, and poverty in Argentina. World Bank Policy Research Working Paper No. 3135.
Porto, Guido. 2006. Using survey data to assess the distributional effects of trade policy. Journal of International Economics 70: 140-160.
Prescott E.C. 2004. Why do Americans work so much more than Europeans? Federal Reserve Bank of Minneapolis Quarterly Review 28: 2-13.
Ravallion, Martin and Shaohua Chen. 2007. China’s (uneven) progress against poverty. Journal of Development Economics 82: 1-42.
Rodrik, Dani. Gene Grossman and Victor Norman. 1995. Getting interventions right: How South Korea and Taiwan grew rich. Economic Policy 10: 55-107.
Romalis, John. 2003. Would rich country trade preferences help poor countries grow? Evidence from the Generalized System of Preferences. Draft.
Schettkat R. 2003. Differences in U.S.-German time-allocation: Why do Americans work longer hours than Germans? IZA discussion paper no. 697.
Sen, Amartya. 1982. Choice, Welfare and Measurement. Cambridge, Mass.: MIT Press.
Shorrocks, Anthony F. 1982. Inequality decomposition by factor components. Econometrica 50: 193-211.
Shorrocks, Anthony F. 1983. The impact of income components on the distribution of family incomes. The Quarterly Journal of Economics 98: 311-326.
STAR-Vietnam. 2003. An Assessment of the Economic Impact of the United States – Vietnam Bilateral Trade Agreement. Hanoi, Vietnam: The National Political Publishing House.
Topalova, Petia. 2007. Trade Liberalization, Poverty and Inequality: Evidence from Indian Districts, in Globalization and Poverty, edited by A. Harrison. Chicago: The University of Chicago Press.
Trefler, Daniel. 1993. Trade liberalization and the theory of endogenous protection: An econometric study of U.S. import policy. Journal of Political Economy 101: 138-160.
Tse Y.K. and X. Zhang. 2004. A Monte Carlo investigation of some tests of stochastic dominance. Journal of Statistical Computation and Simulation 74: 361-378.
van de Walle, Dominique and Dorothyjean Cratty. 2004. Is the emerging non-farm market economy the route out of poverty in Vietnam? Economics of Transition 12: 237-274.
van de Walle, Dominique and Dileni Gunewardena. 2001. Sources of ethnic inequality in Viet Nam. Journal of Development Economics 65: 177-207.
Varian, H.R. 1992. Microeconomic Analysis, 3rd edition. New York: W.W. Norton and Company.
Vietnamese Academy of Social Sciences. 2007. Vietnam poverty update report 2006: Poverty and poverty reduction in Vietnam 1993-2004. Hanoi: The National Political Publisher.
Vijverberg, Wim P. M. and Jonathon Haughton. 2004. Household enterprises in Vietnam: Survival, growth, and living standards, in Economic Growth, Poverty and Household Welfare in Vietnam, edited by Paul Glewwe, Nisha Agrawal, and David Dollar. Washington, D.C.: The World Bank, 95-132.
Winters, A.L., N. McCulloch and A. McKay. 2004. Trade Liberalization and Poverty: The Evidence So Far. Journal of Economic Literature 42: 72-115.
World Bank. 1999. Vietnam Development Report 2000: Attacking Poverty. Hanoi: World Bank.
World Bank. 2000. Viet Nam Living Standards Survey (VNLSS), 1992-93: Basic Information.
World Bank. 2001. Vietnam Living Standards Survey (VLSS), 1997-98: Basic Information.
World Bank. 2003. Vietnam Development Report 2004: Poverty. Hanoi: World Bank.
World Bank. 2004. Vietnam Development Report 2005: Governance. Hanoi: World Bank.
Yatchew, A. 2003. Semiparametric Regression for the Applied Econometrician. Cambridge: Cambridge University Press.
Zhang, Junsen, Yaohui Zhao, Albert Park, and Xiaoqing Song. 2005. Economic returns to schooling in urban China, 1988 to 2001. Journal of Comparative Economics 33: 730-752.