Estimation of GE inequality indices from complex survey data
svygei varname [if exp] [in range] [, alpha(#) subpop(varname) level(#)]
svygei typed without arguments redisplays the last estimates; the level() option may be specified when doing so.
The survey design variables must be set beforehand using svyset; see help svyset.
Warning: Use of if or in restrictions will not produce correct variance estimates for subpopulations in many cases. To compare estimates for subpopulations, use the subpop() option.
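As a minimal sketch of the difference, suppose the survey design has already been declared with svyset and the data contain a 0/1 indicator named urban (the variable name and its coding are hypothetical, chosen only for illustration):

```stata
. * Discouraged: an if restriction may give incorrect variance estimates
. svygei income if urban == 1
. * Preferred: keep the full sample and identify the subpopulation with subpop()
. svygei income, subpop(urban)
```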
Description
svygei provides estimates of finite-population Generalized Entropy (GE) inequality indices, together with their associated variance estimates. The GE class of inequality indices is characterized by a sensitivity parameter, a. The program calculates GE(a) for a = -1 and a = 2, for the limiting cases a -> 0 (the mean logarithmic deviation, MLD) and a -> 1 (the Theil index), and for one additional value of a (3 by default, unless set otherwise using the alpha() option). GE(2) is half the square of the coefficient of variation.
The indices differ in their sensitivity to differences in different parts of the distribution of varname. The more positive a is, the more sensitive GE(a) is to differences at the top of the distribution of varname; the more negative a is, the more sensitive GE(a) is to differences at the bottom of the distribution.
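For reference, the GE class can be written out as below. This uses the standard textbook form for a population of N units with values y_i and mean mu; the notation is an assumption made here for illustration, not taken from the program. The estimators computed by svygei use weighted sample analogues of these quantities; see Biewen and Jenkins (2003) for the exact expressions.

```latex
% Population form of the GE class; y_i > 0, \mu = (1/N) \sum_i y_i
\[
GE(a) = \frac{1}{a(a-1)} \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{y_i}{\mu} \right)^{a} - 1 \right],
\qquad a \neq 0,\ a \neq 1
\]
% Limiting case a -> 0: mean logarithmic deviation (MLD)
\[
GE(0) = \frac{1}{N} \sum_{i=1}^{N} \log \frac{\mu}{y_i}
\]
% Limiting case a -> 1: Theil index
\[
GE(1) = \frac{1}{N} \sum_{i=1}^{N} \frac{y_i}{\mu} \log \frac{y_i}{\mu}
\]
% a = 2: half the squared coefficient of variation, with
% \sigma^2 = (1/N) \sum_i (y_i - \mu)^2
\[
GE(2) = \frac{1}{2} \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{y_i}{\mu} \right)^{2} - 1 \right]
      = \frac{1}{2} \frac{\sigma^{2}}{\mu^{2}}
\]
```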
Sampling variances are calculated using a method proposed by Woodruff (1971). The derivations assume that the sample under consideration is sufficiently large that a Taylor series approximation to the index holds. For full details of the derivation of the sampling variances, see Biewen and Jenkins (2003).
The program may also be used to calculate sampling variances in the case where there are i.i.d. observations: see Biewen and Jenkins (2003).
A companion program, svyatk, provides estimates of Atkinson inequality indices, using the same methods.
Options
alpha(#) sets the value of a used for the additional GE index; the default is alpha(3).
subpop(varname) specifies that estimates be computed for the single subpopulation defined by the observations for which varname!=0. Typically, varname=1 defines the subpopulation and varname=0 indicates observations not belonging to the subpopulation. For observations whose subpopulation status is uncertain, varname should be set to missing.
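The options might be combined as in the following sketch. The variable sex, its coding (2 = female), the indicator name female, and the values 0.5 and 90 are all illustrative assumptions; the survey design is assumed to have been declared with svyset already.

```stata
. * Build a 0/1 subpopulation indicator; it is missing where sex is missing
. generate byte female = (sex == 2) if !missing(sex)
. * Request GE(0.5) as the additional index, for the female subpopulation only
. svygei income, alpha(0.5) subpop(female)
. * Redisplay the last estimates with the level set to 90
. svygei, level(90)
```

Leaving the indicator missing where sex is missing follows the guidance above for observations whose subpopulation status is uncertain.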
Examples
. * (1) Income inequality among individuals using household survey data with obs = individual
. * weight = individual sample weight
. use income_ind, clear
. svyset [pweight = xewght], psu(psu_id) strata(strata_id)
. svygei income
. * (2) Income inequality among individuals using household survey data with obs = individual
. * weight = individual sample weight; survey PSU and strata not provided; household ID known
. use income_ind, clear
. svyset [pweight = xewght], psu(hh_id)
. svygei income
. * (3) Income inequality among individuals using survey data with obs = household;
. * all persons in same household have same income; survey PSU and strata not provided
. * weight = household weight x household size
. use income_hh, clear
. svyset [pweight = xhh_wt]
. svygei income
Authors
Martin Biewen, University of Frankfurt, Germany <biewen@wiwi.uni-frankfurt.de>
Stephen P. Jenkins, ISER, University of Essex, U.K. <stephenj@essex.ac.uk>
References
Biewen, M. and S.P. Jenkins. 2003. Estimation of Generalized Entropy and Atkinson indices from complex survey data. Working Paper 2003-11, Institute for Social and Economic Research, University of Essex. http://www.iser.essex.ac.uk/pubs/workpaps/pdf/2003-11.pdf. Submitted to Oxford Bulletin of Economics and Statistics.
Woodruff, R.S. 1971. A simple method for approximating the variance of a complicated estimate. Journal of the American Statistical Association 66: 411-414.
Also see
Manual: [U] 30 Overview of survey estimation, [Su-Z] svy
On-line: help for svy and, if installed, svyatk, geivars, ineqdeco.