```-------------------------------------------------------------------------------
help for glcurve
(STB-48: sg107; STB-49: sg107_1; SJ1-1: gr0001; SJ4-4: gr0001_1;
SJ6-4: gr0001_2; SJ7-2: gr0001_3; ??)
-------------------------------------------------------------------------------

Derivation of generalised Lorenz curve ordinates with unit record data

glcurve varname [weight] [if exp] [in range] [, pvar(newvarname)
glvar(newvarname) sortvar(varname) by(varname) split nograph
replace lorenz atip(string) rtip(string) plot(plot)
graph_options ]

aweights and fweights are allowed; see help weights.

Description

Given a variable varname, call it x with c.d.f. F(x), glcurve draws its
Generalised Lorenz curve and/or generates two new variables containing
the Generalised Lorenz ordinates for x, i.e.  GL(p) at each p = F(x).
For a population ordered in ascending order of x, a graph of GL(p)
against p plots the cumulative total of x divided by population size
against cumulative population share, so that GL(1) = mean(x). glcurve can also be
used to derive many other related concepts such as Lorenz curves,
concentration curves and 'Three Is of Poverty' (TIP) curves, with
appropriate definition of varname, order of cumulation (set with the
sortvar option), and normalisation (e.g. by the mean of varname).
Alternatively glcurve with the lorenz, atip or rtip option can be used
directly to draw the related Lorenz, concentration and TIP curves.

Comparisons of pairs of distributions (and dominance checks) can be
undertaken by using the by() option, with or without split.  They can
also be made manually by 'stacking' the data (see help stack).

The graphs drawn by glcurve are relatively basic. For graphs with full
user control over formatting and labelling, users are recommended to use
glcurve to generate the ordinates of the graph required using the
pvar(newvarname) and glvar(newvarname) options, and then to draw the
graph using graph twoway.

Options

pvar(pvarname) generates the variable pvarname containing the x
coordinates of the created curve.

glvar(glvarname) generates the variable glvarname containing the y
coordinates of the created curve.

sortvar(sname) specifies the sort variable.  By default, the data are
sorted (and cumulated) in ascending order of varname. If the sortvar
option is specified, sorting and cumulation is in ascending order of
variable sname. Within tied values of sname, data are sorted in
ascending order of varname.

by(groupvar) specifies that the coordinates are to be computed separately
for each subgroup defined by groupvar. groupvar must be an integer
variable.

split specifies that a series of new variables are created containing the
coordinates for each subgroup specified by by(groupvar). split cannot
be used without by().  If split is specified, then the string
glname in glvar(glname) is used as a prefix to create new variables
glname_X1, glname_X2,... (where X1, X2, ... are the values taken by
groupvar).

nograph suppresses the automatic display of a basic graph of the
created variables. nograph is assumed if by() is specified without
split.

replace allows the variables specified in glvar(glvarname) and
pvar(pvarname) to be overwritten if they already exist. Otherwise
glvarname and pvarname must be new variable names.

lorenz requests that the ordinates of the Lorenz curve are computed
instead of generalised Lorenz ordinates. The Lorenz ordinates of
variable x, L(p), are GL(p)/mean(x).

rtip(povline) and atip(povline) request that the ordinates of TIP curves
are computed instead of generalised Lorenz ordinates.  povline
specifies the value of the poverty line, z: it can be either a numeric
value taken as the poverty line for all observations or a variable
name containing the value of the poverty line for each observation.
atip() draws 'absolute' TIP curves (by cumulating max(z-x,0)) and
rtip() draws 'relative' TIP curves (by cumulating max(1-(x/z),0)).

plot(plot) provides a way to add other plots to the generated graph; see
plot option.

graph_options are standard twoway scatter options.  Note that
modifications to the legend labels should be made with the
legend(order(...)) option instead of legend(label(...)) (see help
legend_option).

Examples

See also the worked examples in the presentation by Jenkins (2006).

. * Generalised Lorenz curve ordinates; plot using -graph twoway-

. glcurve x, gl(gl1) p(p1) nograph

. twoway line gl1 p1

. * Lorenz curve ordinates; plot using -glcurve-

. glcurve x, lorenz plot(function equality = x)

. * Lorenz curve ordinates; plot using -glcurve-; with weights and an if condition

. glcurve x [fw=wgt] if x > 0, gl(gl2) p(p2) lorenz

. * Generalised Lorenz curve ordinates and graphs, by state

. glcurve x, gl(gl2) p(p2) replace sort(y) by(state) split

. * TIP curve ordinates with graph

. glcurve x, gl(gl3) p(p3) atip(10000)

. glcurve x, gl(gl3) p(p3) atip(plinevar)
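
. * TIP curve ordinates built by hand: a sketch only, not glcurve's own code.
. * Assumes a common poverty line of 10000; gap, tipgl and tipp are
. * hypothetical names. Poverty gaps max(z-x,0) are cumulated in ascending
. * order of x, so the result should be comparable to atip(10000).

. generate double gap = max(10000 - x, 0) if !missing(x)

. glcurve gap, sortvar(x) gl(tipgl) p(tipp) nograph

. twoway line tipgl tipp, sort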

. * Lorenz curve ordinates; plot using -graph twoway-

. glcurve x, gl(gl) p(p) lorenz nograph

. twoway line gl p , sort || line p p , ///
xlabel(0(.1)1) ylabel(0(.1)1)      ///
xline(0(.2)1) yline(0(.2)1)        ///
title("Lorenz curve") subtitle("Example with custom formatting")    ///
legend(label(1 "Lorenz curve") label(2 "Line of perfect equality")) ///
plotregion(margin(zero)) aspectratio(1) scheme(economist)
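
. * Lorenz ordinates derived by hand from generalised Lorenz ordinates,
. * using L(p) = GL(p)/mean(x). A sketch for unweighted data only;
. * glx, px and lx are hypothetical names.

. glcurve x, gl(glx) p(px) nograph

. quietly summarize x

. generate double lx = glx / r(mean)

. twoway line lx px, sort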

Notes

glcurve is designed to be used with individual-level, unit-record data.
Although glcurve can also be applied mechanically to grouped (`banded')
income data using fweights, be aware that the resulting curve is a
potentially poor estimate, because within-income-band inequality is not
taken into account.  On the estimation of Lorenz curves and inequality
indices with grouped data, see e.g. Gastwirth and Glauberman (1976) or
Cowell and Mehta (1982).

One must also be careful in using the ordinates returned from the option
pvar for subsequent computation of the Gini or Concentration coefficient
using the 'convenient covariance' formulae described by e.g. Lerman and
Yitzhaki (1984, 1989) or Jenkins (1988). The ordinates returned in pvar
are the curve ordinates (and are equal to estimates obtained from cumul)
and these are not necessarily the fractional ranks required in the
covariance formula. The difference is generally negligible with
continuous unit-record data, but is larger if there are many ties in the
ranking variable (as is the case, e.g., for the concentration coefficient
based on an ordinal categorical variable, or when dealing with grouped
data).
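
As an illustration only, fractional ranks for unweighted data could be
built directly rather than taken from pvar. The sketch below assumes no
weights, uses the hypothetical names rnk, frank, cov_xf and gini_x, and
relies on the egen function rank() assigning average ranks to tied
values. correlate reports a covariance with divisor N-1, so it is
rescaled by (N-1)/N before applying Gini = 2 cov(x, F)/mean(x).

. egen double rnk = rank(x)

. quietly count if !missing(x)

. generate double frank = (rnk - 0.5) / r(N)

. quietly correlate x frank, covariance

. scalar cov_xf = r(cov_12) * (r(N) - 1) / r(N)

. quietly summarize x

. scalar gini_x = 2 * cov_xf / r(mean)

. display "Gini = " gini_x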

Acknowledgements

Nicholas J. Cox helped with updating the code for our program from Stata
7 (glcurve7) to Stata 8. David Demery, Owen O'Donnell, and Shehzad Ali
contributed to 'sort stable' estimation for concentration curves.

Authors

Philippe Van Kerm, CEPS/INSTEAD, Differdange, G.-D. Luxembourg
philippe.vankerm@ceps.lu

Stephen P. Jenkins, ISER, University of Essex
stephenj@essex.ac.uk

References

Cowell, F.A. 1995. Measuring Inequality (second edition).  Hemel
Hempstead: Prentice Hall/Harvester Wheatsheaf.

Cowell, F.A. and Mehta, F. 1982.  The Estimation and Interpolation of
Inequality Measures.  Review of Economic Studies 49(2): 273-290.

Gastwirth, J.L. and Glauberman, M. 1976.  The Interpolation of the Lorenz
Curve and Gini Index from Grouped Data.  Econometrica 44(3): 479-483.

Jenkins, S.P. 1988.  Calculating income distribution indices from
microdata.  National Tax Journal 41: 139-142.

Jenkins, S.P. 2006.  Estimation and interpretation of measures of
inequality, poverty, and social welfare using Stata. Presentation at
North American Stata Users' Group Meetings 2006, Boston MA.
http://econpapers.repec.org/paper/bocasug06/16.htm.

Jenkins, S.P. and Lambert, P.J. 1997.  Three 'I's of poverty curves, with
an analysis of UK poverty trends.  Oxford Economic Papers 49:
317-327.

Lambert, P.J. 2001.  The Distribution and Redistribution of Income (third
edition). Manchester: Manchester University Press.

Lerman, R.I. and Yitzhaki, S. 1984.  A note on the calculation and
interpretation of the Gini index.  Economics Letters 15(3-4):
363-368.

Lerman, R.I. and Yitzhaki, S. 1989.  Improving the Accuracy of Estimates
of Gini Coefficients.  Journal of Econometrics 42(1): 43-47.

Shorrocks, A.F. 1983. Ranking income distributions.  Economica 50(197): 3-17.

Also see

Manual:  [R] lorenz
STB:  STB-48 sg107, STB-49 sg107.1, SJ 1(1) gr0001
On-line:  help for sumdist, svylorenz (if installed)

```