-------------------------------------------------------------------------------
help for svylorenz                          Stephen P. Jenkins (September 2006)
-------------------------------------------------------------------------------

Quantile group shares, cumulative shares (Lorenz ordinates), generalized Lorenz ordinates, and Gini coefficient

svylorenz varname [if exp] [in range] [, ngp(#) qgp(newvarname) subpop(varname) pvar(newvarname) lvar(newvarname) selvar(newvarname) glvar(newvarname) seglvar(newvarname) level(#)]

Data must be svyset before using this command; see svyset.

Description

svylorenz computes distribution-free variance estimates for quantile group shares of total varname, cumulative quantile group shares (cumulation is in ascending order of varname), generalized Lorenz ordinates, and the Gini coefficient. The Lorenz curve, L(p), is the graph of cumulative quantile group shares at each cumulative population share p = F(varname). The generalized Lorenz curve, GL(p), is the Lorenz curve scaled up at each p by mean varname. The Gini coefficient, G, is twice the area between the Lorenz curve and the line of perfect equality, and ranges between 0 and 1. Higher values indicate greater inequality. Note that the Gini coefficient is calculated using all valid observations; it is not derived by approximation from the income shares.

Beach and Davidson (1983) provide formulae for variance estimation of shares, cumulative shares and generalized Lorenz ordinates, but for unweighted data with no complex survey design features. Beach and Kaliski (1986) extend these results to the case with sample weights that are fixed and non-stochastic. Kovacevic and Binder (1997), using the estimating equations approach, provide formulae for variance estimation of cumulative shares allowing for probability weights and for complex survey design more generally. They also provide formulae for variance estimation of G. All these linearization methods rely on asymptotic approximations, and their small-sample properties are not well known.
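The definitions of L(p), GL(p) and G given above can be illustrated with a small numerical sketch. The Python function below is not part of svylorenz and computes point estimates only (no variance estimates). The function name lorenz_summary, the rule used to assign observations to quantile groups, and the trapezoid formula for G are assumptions made for illustration; svylorenz may treat ties and group boundaries differently.

```python
import numpy as np

def lorenz_summary(x, w=None, ngp=10):
    """Illustrative point estimates: group shares, cumulative shares,
    generalized Lorenz ordinates, and the Gini coefficient."""
    x = np.asarray(x, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)

    # Cumulation is in ascending order of the variable.
    order = np.argsort(x, kind="stable")
    x, w = x[order], w[order]

    total = np.sum(w * x)
    mean = total / np.sum(w)
    p = np.cumsum(w) / np.sum(w)      # cumulative population share F(x)
    L = np.cumsum(w * x) / total      # Lorenz ordinate at each observation

    # Assumed grouping rule: observation i is in group j
    # when (j-1)/ngp < p_i <= j/ngp.
    cuts = np.arange(1, ngp + 1) / ngp
    group = np.minimum(np.searchsorted(cuts, p - 1e-12), ngp - 1)

    shares = np.bincount(group, weights=w * x, minlength=ngp) / total
    cum_shares = np.cumsum(shares)    # L(p_j), j = 1, ..., ngp
    gl = mean * cum_shares            # GL(p_j) = mean * L(p_j)

    # G uses all observations: one minus twice the area under the
    # empirical Lorenz curve (trapezoid rule), not the group shares.
    p0 = np.concatenate(([0.0], p))
    L0 = np.concatenate(([0.0], L))
    gini = 1.0 - np.sum(np.diff(p0) * (L0[1:] + L0[:-1]))

    return {"shares": shares, "cum_shares": cum_shares,
            "gl": gl, "gini": gini, "mean": mean}
```

For example, lorenz_summary([1, 2, 3, 4], ngp=2) gives group shares 0.3 and 0.7, generalized Lorenz ordinates 0.75 and 2.5, and a Gini coefficient of 0.25.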

svylorenz derives variance estimates using the methods of Kovacevic and Binder (1997) for cumulative shares and G, and derives estimates for quantile group shares from those for cumulative shares using a result of Beach and Kaliski (1986). Variance estimates for generalized Lorenz ordinates are derived by an application of the estimating equations approach of Binder and Kovacevic (1995) and Kovacevic and Binder (1997). For an alternative derivation, see Zheng (2002).

The point estimates computed by svylorenz are the same as the estimates that can be calculated using sumdist, ineqdeco and ineqdec0. By default, however, svylorenz uses observations with non-negative values of varname, ineqdeco uses observations with strictly positive values of varname, and ineqdec0 and sumdist use observations with negative, zero, or positive values of varname.
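The different default estimation samples described above can be made concrete with a short sketch. The Python helper below is hypothetical (the name estimation_sample and the rule labels are made up for illustration) and simply encodes the three sign restrictions; the exclusion of missing values under every rule is an assumption.

```python
import numpy as np

def estimation_sample(x, rule):
    """Boolean mask of observations kept under each command's default rule."""
    x = np.asarray(x, dtype=float)
    valid = ~np.isnan(x)                    # assumption: missing values always dropped
    with np.errstate(invalid="ignore"):     # silence NaN comparison warnings
        if rule == "svylorenz":             # non-negative values
            return valid & (x >= 0)
        if rule == "ineqdeco":              # strictly positive values
            return valid & (x > 0)
        if rule in ("ineqdec0", "sumdist"): # negative, zero, or positive values
            return valid
    raise ValueError(f"unknown rule: {rule}")

income = np.array([-5.0, 0.0, 10.0, np.nan])
kept = {r: estimation_sample(income, r) for r in ("svylorenz", "ineqdeco", "ineqdec0")}
```

With these four illustrative values, the svylorenz rule keeps 0 and 10, the ineqdeco rule keeps only 10, and the ineqdec0 rule keeps -5, 0 and 10, so point estimates from the three commands agree only when they are run on the same set of observations.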

Options

ngp(#) specifies the number of quantile groups, and must be an integer between 1 and 100. The default is 10.

qgp(newvarname) creates a new variable in the current data set that identifies the quantile group membership of each observation.

subpop(varname) specifies that estimates be computed for the single subpopulation defined by the observations for which varname != 0. Typically, varname = 1 defines the subpopulation and varname = 0 indicates observations not belonging to the subpopulation. For observations whose subpopulation status is uncertain, varname should be set to missing.

level(#) specifies the confidence level, as a percentage, for confidence intervals. The default is level(95) or as set by set level; see [U] 20.6 Specifying the width of confidence intervals.

The following options may be used to graph Lorenz curves and generalized Lorenz curves (see also glcurve, which is a more general program for this task):

pvar(newvarname) creates a new variable in the current data set containing the values of p_j = F(x_j) for each quantile group j = 1, ..., J, plus 0.

lvar(newvarname) creates a new variable in the current data set containing the cumulative shares L(p_j), plus L(0) = 0.

selvar(newvarname) creates a new variable in the current data set containing the estimated standard errors of the cumulative shares.

glvar(newvarname) creates a new variable in the current data set containing the generalized Lorenz ordinates GL(p_j), plus GL(0) = 0.

seglvar(newvarname) creates a new variable in the current data set containing the estimated standard errors of the generalized Lorenz ordinates.

Examples

. svyset psu_name [pw = wgt], strata(strata_name)

. svylorenz income

. svylorenz cYbhcg, ngp(20) pvar(p) lvar(l) selvar(sel)

. twoway (connect p p) (connect l p, sort)

Further examples are provided in the downloadable materials accompanying the presentation by Jenkins (2006).

Saved Results

Scalars:

    e(gini)        G
    e(se_gini)     asymptotic SE of G
    e(mean)        mean of varname
    e(se_mean)     asymptotic SE of the mean
    e(total)       total of varname
    e(ngps)        number of quantile groups
    e(qj)          quantile j, where j = 1, ..., ngps
    e(shj)         share of varname held by each quantile group j
    e(se_shj)      asymptotic SE of each group j's share of varname
    e(cushj)       cumulative share of varname held by each quantile group j
    e(se_cushj)    asymptotic SE of each group j's cumulative share of varname
    e(glj)         generalized Lorenz ordinate of varname for each quantile group j
    e(se_glj)      asymptotic SE of each group j's generalized Lorenz ordinate of varname

Matrices:

    e(quantiles)   1 x (ngps-1) vector of quantiles
    e(shares)      1 x (ngps) vector of quantile group shares
    e(V_cush)      (ngps-1) x (ngps-1) variance-covariance matrix of cumulative shares
    e(V_gl)        (ngps) x (ngps) variance-covariance matrix of generalized Lorenz ordinates
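Because each quantile group share is a difference of adjacent cumulative shares, standard errors for the shares can in principle be recovered from a variance-covariance matrix such as e(V_cush). The Python sketch below applies the generic rule Var(A c) = A V A' for a linear transformation; it is an illustration under that assumption, not a statement of how svylorenz computes e(se_shj), and the function name and the matrix values are hypothetical.

```python
import numpy as np

def share_se_from_cumulative(V_cush):
    """SEs of the ngps group shares from the (ngps-1) x (ngps-1)
    variance-covariance matrix of the cumulative shares."""
    V = np.asarray(V_cush, dtype=float)
    m = V.shape[0]                      # m = ngps - 1
    # Shares: s_1 = L_1, s_j = L_j - L_{j-1}, s_ngps = 1 - L_{ngps-1}
    A = np.zeros((m + 1, m))
    idx = np.arange(m)
    A[idx, idx] = 1.0                   # + L_j in row j
    A[idx + 1, idx] = -1.0              # - L_j in row j+1 (last row: 1 - L_m)
    V_sh = A @ V @ A.T                  # variance of a linear transformation
    return np.sqrt(np.diag(V_sh))

# Made-up 2 x 2 matrix for ngps = 3, for illustration only
V_example = np.array([[0.04, 0.01],
                      [0.01, 0.09]])
se_shares = share_se_from_cumulative(V_example)
```

In this made-up case the first and last shares inherit the variances of the first and last cumulative shares (0.04 and 0.09), while the middle share has variance 0.04 + 0.09 - 2(0.01) = 0.11.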

Acknowledgements

Philippe Van Kerm provided helpful comments on early drafts of this program.

References

Beach, C.M. and R. Davidson. 1983. Distribution-free statistical inference with Lorenz curves and income shares. Review of Economic Studies 50: 723-735.

Beach, C.M. and S.F. Kaliski. 1986. Lorenz curve inference with sample weights: an application to the distribution of unemployment experience. Applied Statistics 35(1): 38-45.

Binder, D.A. and M.S. Kovacevic. 1995. Estimating some measures of income inequality from survey data: an application of the estimating equations approach. Survey Methodology 21: 137-145.

Jenkins, S.P. 2006. Estimation and interpretation of measures of inequality, poverty, and social welfare using Stata. Presentation at North American Stata Users' Group Meetings 2006, Boston MA. http://econpapers.repec.org/paper/bocasug06/16.htm.

Kovacevic, M.S. and D.A. Binder. 1997. Variance estimation for measures of income inequality and polarization. Journal of Official Statistics 13(1): 41-58. Full text downloadable from http://www.jos.nu/Articles/abstract.asp?article=13141.

Zheng, B. 2002. Testing Lorenz curves with non-simple random samples. Econometrica 70: 1235-1243.

Author

Stephen P. Jenkins, Institute for Social and Economic Research, University of Essex. Email: stephenj@essex.ac.uk

Also see

svy, svyset, xtile