-------------------------------------------------------------------------------
help for glcurve
 (STB-48: sg107; STB-49: sg107_1; SJ1-1: gr0001; SJ4-4: gr0001_1;
  SJ6-4: gr0001_2; SJ7-2: gr0001_3; ??)
-------------------------------------------------------------------------------

Derivation of generalised Lorenz curve ordinates with unit record data

        glcurve varname [weight] [if exp] [in range] [, pvar(newvarname)
                 glvar(newvarname) sortvar(varname) by(varname) split nograph
                 replace lorenz atip(string) rtip(string) plot(plot)
                 graph_options ]

    aweights and fweights are allowed; see help weights.


Description

    Given a variable varname (call it x, with c.d.f. F(x)), glcurve draws its
    Generalised Lorenz curve and/or generates two new variables containing
    the Generalised Lorenz ordinates for x, i.e.  GL(p) at each p = F(x).
    For a population ordered in ascending order of x, a graph of GL(p)
    against p plots the cumulative total of x divided by population size
    against cumulative population share, so that GL(1) = mean(x). glcurve
    can also be used to derive many related concepts, such as Lorenz curves,
    concentration curves and 'Three Is of Poverty' (TIP) curves, with an
    appropriate definition of varname, order of cumulation (set with the
    sortvar option), and normalisation (e.g. by the mean of varname).
    Alternatively glcurve with the lorenz, atip or rtip option can be used
    directly to draw the related Lorenz, concentration and TIP curves.
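
    The definition above can be reproduced by hand on a small made-up
    dataset. This is only a sketch for unweighted data without ties; it is
    not necessarily how glcurve computes the ordinates internally, and the
    variable names x, p_hand and gl_hand are illustrative.

    . clear

    . set obs 5

    . generate double x = _n

    . sort x

    . * cumulative population share

    . generate double p_hand = _n/_N

    . * cumulative total of x divided by population size

    . generate double gl_hand = sum(x)/_N

    . * the last ordinate, GL(1), equals mean(x), which is 3 here

    . list x p_hand gl_hand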

    Comparisons of pairs of distributions (and dominance checks) can be
    undertaken by using the by() option, with or without split.  They can
    also be made manually by 'stacking' the data (see help stack).
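
    As a sketch of such a comparison, the following uses the auto dataset
    shipped with Stata (chosen purely for illustration) and relies on the
    naming rule described under the split option: because foreign takes the
    values 0 and 1, the group ordinates are expected in gl_0 and gl_1. The
    names gl and p are illustrative.

    . sysuse auto, clear

    . glcurve price, by(foreign) split glvar(gl) pvar(p) nograph

    . * overlay the two generalised Lorenz curves

    . twoway line gl_0 gl_1 p, sort

    If one curve lies nowhere below the other, the corresponding
    distribution generalised Lorenz dominates the other one.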

    The graphs drawn by glcurve are relatively basic. For graphs with full
    user control over formatting and labelling, users are recommended to use
    glcurve to generate the ordinates of the graph required using the
    pvar(newvarname) and glvar(newvarname) options, and then to draw the
    graph using graph twoway.


Options

    pvar(pvarname) generates the variable pvarname containing the x
        coordinates of the created curve.

    glvar(glvarname) generates the variable glvarname containing the y
        coordinates of the created curve.

    sortvar(sname) specifies the sort variable.  By default, the data are
        sorted (and cumulated) in ascending order of varname. If the sortvar
        option is specified, sorting and cumulation are in ascending order of
        variable sname. Within tied values of sname, data are sorted in
        ascending order of varname.

    by(groupvar) specifies that the coordinates are to be computed separately
        for each subgroup defined by groupvar. groupvar must be an integer
        variable.

    split specifies that a series of new variables are created containing the
        coordinates for each subgroup specified by by(groupvar). split
        cannot be used without by().  If split is specified, then the string
        glname in glvar(glname) is used as a prefix to create new variables
        glname_X1, glname_X2,... (where X1, X2, ... are the values taken by
        groupvar).

    nograph avoids the automatic display of a crude graph made out of the
        created variables. nograph is assumed if by() is specified without
        split.

    replace allows the variables specified in glvar(glvarname) and
        pvar(pvarname) to be overwritten if they already exist. Otherwise
        glvarname and pvarname must be new variable names.

    lorenz requests that the ordinates of the Lorenz curve be computed
        instead of generalised Lorenz ordinates. The Lorenz ordinates of
        variable x, L(p), are GL(p)/mean(x).

    rtip(povline) and atip(povline) request that the ordinates of TIP curves
        be computed instead of generalised Lorenz ordinates.  povline
        specifies the value of the poverty line: it can be either a numeric
        value taken as the poverty line for all observations or a variable
        name containing the value of the poverty line for each observation.
        atip() draws 'absolute' TIP curves (by cumulating max(z-x,0)) and
        rtip() draws 'relative' TIP curves (by cumulating max(1-(x/z),0)).

    plot(plot) provides a way to add other plots to the generated graph; see 
        plot option.

    graph_options are standard twoway scatter options.  Note that
        modifications to the legend labels should be made with the
        legend(order(...)) option instead of legend(label(...)) (see help
        legend_option).
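
    To make the atip() and rtip() definitions concrete, here is a sketch
    that builds absolute TIP ordinates by hand for made-up data and a
    hypothetical poverty line of 3. It assumes unweighted data and cumulates
    the gaps from the poorest observation upwards; it is not necessarily the
    exact computation glcurve performs. All variable names are illustrative.

    . clear

    . set obs 5

    . generate double x = _n

    . * absolute poverty gap max(z-x,0) with z = 3; zero for the non-poor

    . generate double gap = max(3 - x, 0)

    . * poorest first, so that the largest gaps are cumulated first

    . sort x

    . generate double p_hand = _n/_N

    . generate double tip_hand = sum(gap)/_N

    . list x gap p_hand tip_hand

    For a relative TIP curve, the gap would instead be defined as
    max(1 - x/3, 0).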

Examples

    Many glcurve examples are provided in the downloadable materials provided
    by Jenkins (2006).

    . * Generalised Lorenz curve ordinates; plot using -graph twoway-

    . glcurve x, gl(gl1) p(p1) nograph

    . twoway line gl1 p1

    . * Lorenz curve ordinates; plot using -glcurve-

    . glcurve x, lorenz plot(function equality = x)

    . * Lorenz curve ordinates; plot using -glcurve-; weights and if qualifier

    . glcurve x [fw=wgt] if x > 0, gl(gl2) p(p2) lorenz

    . * Generalised Lorenz curve ordinates and graphs, by state

    . glcurve x, gl(gl2) p(p2) replace sort(y) by(state) split

    . * TIP curve ordinates with graph

    . glcurve x, gl(gl3) p(p3) atip(10000)

    . glcurve x, gl(gl3) p(p3) atip(plinevar)

    . * Lorenz curve ordinates; plot using -graph twoway-

    . glcurve x, gl(gl) p(p) lorenz nograph

    . twoway line gl p , sort || line p p , ///
        xlabel(0(.1)1) ylabel(0(.1)1)      ///
        xline(0(.2)1) yline(0(.2)1)        ///
        title("Lorenz curve") subtitle("Example with custom formatting")    ///
        legend(label(1 "Lorenz curve") label(2 "Line of perfect equality")) ///
        plotregion(margin(zero)) aspectratio(1) scheme(economist)
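
    As a further sketch, the relation L(p) = GL(p)/mean(x) stated under the
    lorenz option can be checked by hand. The auto dataset and the variable
    names gl_a, p_a, l_hand, l_opt, p_opt and diff are used purely for
    illustration, and the data are assumed to be unweighted.

    . * Lorenz ordinates by normalising generalised Lorenz ordinates

    . sysuse auto, clear

    . glcurve price, glvar(gl_a) pvar(p_a) nograph

    . summarize price, meanonly

    . generate double l_hand = gl_a/r(mean)

    . * the same ordinates requested directly with the lorenz option

    . glcurve price, glvar(l_opt) pvar(p_opt) lorenz nograph

    . * the differences should be zero up to rounding error

    . generate double diff = abs(l_hand - l_opt)

    . summarize diff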


Notes

    glcurve is designed to be used with individual-level, unit-record data.
    Although glcurve can also be applied mechanically to grouped ('banded')
    income data using fweights, be aware that the resulting curve is a
    potentially poor estimate, because within-income-band inequality is not
    taken into account.  On the estimation of Lorenz curves and inequality
    indices with grouped data, see e.g. Gastwirth and Glauberman (1976) or
    Cowell and Mehta (1982).

    One must also be careful in using the ordinates returned from the option
    pvar for subsequent computation of the Gini or Concentration coefficient
    using the 'convenient covariance' formulae described by e.g. Lerman and
    Yitzhaki (1984, 1989) or Jenkins (1988). The ordinates returned in pvar
    are the curve ordinates (and are equal to estimates obtained from cumul)
    and these are not necessarily the fractional ranks required in the
    covariance formula. The difference is generally negligible with
    continuous unit-record data, but is larger if there are many ties in the
    ranking variable (as is the case, e.g., for the concentration coefficient
    based on an ordinal categorical variable, or when dealing with grouped
    data).
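
    The distinction can be illustrated with a sketch that computes the Gini
    coefficient from fractional ranks rather than from pvar ordinates. It
    assumes unweighted data and uses one common definition of the fractional
    rank, (rank - 0.5)/N, with tied values sharing their average rank; the
    data and the names rnk, F, xF, mu and gini are made up for illustration.

    . clear

    . set obs 4

    . generate double x = _n

    . * average ranks, so that tied values share the same rank

    . egen double rnk = rank(x)

    . * fractional rank; its mean is 0.5 by construction

    . generate double F = (rnk - 0.5)/_N

    . generate double xF = x*F

    . summarize x, meanonly

    . scalar mu = r(mean)

    . summarize xF, meanonly

    . * Gini = 2*cov(x,F)/mean(x), with the covariance using divisor N

    . scalar gini = 2*(r(mean) - 0.5*scalar(mu))/scalar(mu)

    . * displays .25 for these data

    . display scalar(gini)

    By contrast, the curve ordinates for these data are _n/_N, which differ
    from F by 0.5/_N for every observation when there are no ties.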


Acknowledgements

    Nicholas J. Cox helped with updating the code for our program from Stata
    7 (glcurve7) to Stata 8. David Demery, Owen O'Donnell and Shehzad Ali
    made useful bug reports. Comments by Zhuo (Adam) Chen led to the
    introduction of 'sort stable' estimation for concentration curves.


Authors

    Philippe Van Kerm, CEPS/INSTEAD, Differdange, G.-D. Luxembourg
    philippe.vankerm@ceps.lu

    Stephen P. Jenkins, ISER, University of Essex
    stephenj@essex.ac.uk


References

    Cowell, F.A. 1995. Measuring Inequality (second edition).  Hemel
        Hempstead: Prentice-Hall/Harvester-Wheatsheaf.

    Cowell, F.A. and Mehta, F. 1982.  The Estimation and Interpolation of
        Inequality Measures.  Review of Economic Studies 49(2): 273-290.

    Gastwirth, J.L. and Glauberman, M. 1976.  The Interpolation of the Lorenz
        Curve and Gini Index from Grouped Data.  Econometrica 44(3): 479-483.

    Jenkins, S.P. 1988.  Calculating income distribution indices from
        microdata.  National Tax Journal 41: 139-142.

    Jenkins, S.P. 2006.  Estimation and interpretation of measures of
        inequality, poverty, and social welfare using Stata. Presentation at
        North American Stata Users' Group Meetings 2006, Boston MA.  
        http://econpapers.repec.org/paper/bocasug06/16.htm.

    Jenkins, S.P. and Lambert, P.J. 1997.  Three 'I's of poverty curves, with
        an analysis of UK poverty trends.  Oxford Economic Papers 49:
        317-327.

    Lambert, P.J. 2001.  The Distribution and Redistribution of Income (third
        edition). Manchester: Manchester University Press.

    Lerman, R.I. and Yitzhaki, S. 1984.  A note on the calculation and
        interpretation of the Gini index.  Economics Letters 15(3-4):
        363-368.

    Lerman, R.I. and Yitzhaki, S. 1989.  Improving the Accuracy of Estimates
        of Gini Coefficients.  Journal of Econometrics 42(1): 43-47.

    Shorrocks, A.F. 1983. Ranking income distributions.  Economica 50(197):
        3-17.


Also see

    Manual:  [R] lorenz
    STB:  STB-48 sg107, STB-49 sg107.1, SJ 1(1) gr0001
    On-line:  help for sumdist, svylorenz (if installed)