help fastgini
-------------------------------------------------------------------------------

Title

fastgini -- Fast algorithm for calculation of the Gini coefficient and its jackknife standard errors

Syntax

fastgini varname [if] [in] [weight] [, bin(#) jk Level(#) nocheck]

pweights and fweights are allowed; see weight.

Description

fastgini calculates the Gini coefficient for either unit-level or aggregated data. Optionally, it returns jackknife estimates of the standard error. fastgini uses a fast, optimized algorithm that can be especially useful when calculating the Gini coefficient and its standard errors for large samples. The command implements algorithms for both exact and approximate calculation of the Gini coefficient.

Options

bin(#) sets the number of bins. Specifying this option can dramatically reduce computation time when working with large datasets (1M+ observations). When bin(#) is specified, fastgini uses an approximation algorithm to calculate the Gini coefficient. Specifying a sufficient number of bins yields an approximation of the Gini coefficient at any desired level of precision. For example, on a dataset of 1,000,000 observations, bin(100000) will in most cases reproduce the exact Gini coefficient to machine precision. This calculation requires significantly less computer time than exact estimation of the Gini coefficient on the whole sample.

jk estimates the jackknife (leave-one-out) standard error of the Gini coefficient. The efficient method used requires only two passes through the data: one to compute the Gini coefficient itself and another to compute the standard errors.

level(#) sets the confidence level for the reported jackknife confidence intervals; the default is level(95).

nocheck skips the value check. By default, non-positive values of varname are excluded from the Gini calculation. Specifying nocheck skips this check and also ignores any [if] and [in] conditions. The option can be useful for speeding up execution when fastgini is used within loops.
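To make concrete what the jk option estimates, here is a minimal Python sketch of a leave-one-out jackknife standard error for the unweighted Gini coefficient. It is a brute-force illustration that recomputes the Gini N times, not the two-pass method fastgini uses. The function names are hypothetical, weights are ignored, and centering the replicates on their mean is an assumption, since this help file does not say how fastgini centers them.

```python
import math


def gini(values):
    """Unweighted Gini coefficient via a running cumulative sum."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    cum = 0.0    # running sum of the sorted values up to the current one
    numer = 0.0
    for x in xs:
        cum += x
        numer += cum - x / 2.0
    return 1.0 - 2.0 * numer / (total * n)


def jackknife_gini_se(values):
    """Brute-force leave-one-out jackknife (assumption: not fastgini's method)."""
    n = len(values)
    full = gini(values)
    # One replicate per observation: drop it and recompute the Gini.
    reps = [gini(values[:i] + values[i + 1:]) for i in range(n)]
    mean_rep = sum(reps) / n
    # Standard jackknife variance, centered on the mean of the replicates.
    var = (n - 1) / n * sum((g - mean_rep) ** 2 for g in reps)
    return full, math.sqrt(var)
```

Calling jackknife_gini_se(list_of_incomes) returns the full-sample Gini and its jackknife standard error. The cost grows roughly quadratically with the sample size, which is why a two-pass method matters for large datasets.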

Saved Results

fastgini saves in r():

r(gini) calculated Gini coefficient;

If the jk option is specified:

r(se) jackknife estimate for the standard error of the Gini;

r(mse) jackknife estimate for the mean standard error of the Gini;

r(gini_jk) jackknife estimate for the Gini.

Remarks

fastgini uses the formula

$$
G = 1 - 2\,\frac{\sum_{i=1}^{N} W_i \left( \sum_{j=1}^{i} W_j X_j - \frac{W_i X_i}{2} \right)}{\left( \sum_{i=1}^{N} W_i X_i \right) \left( \sum_{i=1}^{N} W_i \right)}
$$

where observations are sorted in ascending order of X.
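The formula can be evaluated in one pass over the sorted data by keeping a running total of W_j*X_j. The following Python sketch illustrates this; the function name weighted_gini is hypothetical and the code is not taken from the fastgini source.

```python
def weighted_gini(values, weights=None):
    """Gini coefficient from the cumulative-sum formula in the Remarks."""
    if weights is None:
        weights = [1.0] * len(values)
    # Sort observations in ascending order of X, carrying the weights along.
    pairs = sorted(zip(values, weights))
    total_w = sum(w for _, w in pairs)
    total_wx = sum(w * x for x, w in pairs)
    cum_wx = 0.0   # running SUM_{j<=i} W_j * X_j
    numer = 0.0    # outer sum in the numerator
    for x, w in pairs:
        cum_wx += w * x
        numer += w * (cum_wx - w * x / 2.0)
    return 1.0 - 2.0 * numer / (total_wx * total_w)
```

For instance, weighted_gini([1, 3]) and weighted_gini([1, 3], [2, 2]) give the same result, because doubling every weight is equivalent to duplicating every observation.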

If bin(M) is specified, the data are aggregated into M equal-width bins, i.e.

$$
\tilde{X}_i = X_{\min} + i \cdot \mathrm{binsize}, \qquad \mathrm{binsize} = \frac{X_{\max} - X_{\min}}{M}
$$

$$
\tilde{W}_i = \sum_{j:\ \tilde{X}_{i-1} \le X_j < \tilde{X}_i} W_j, \qquad i = 1, \dots, M
$$

and the Gini coefficient is then calculated from the aggregated data.
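The aggregation step can be sketched in Python as follows. This is an illustration of the formulas above under stated assumptions, not the fastgini implementation: the function name binned_gini is hypothetical, each bin is represented by its upper edge as in the definition of the aggregated X, and an observation equal to X_max is placed in the last bin (the strict inequality above would otherwise leave it unassigned).

```python
def binned_gini(values, weights, m):
    """Approximate Gini after aggregating the data into m equal-width bins."""
    x_min, x_max = min(values), max(values)
    binsize = (x_max - x_min) / m
    if binsize == 0:
        return 0.0  # all values are equal, so there is no inequality
    # Accumulate the total weight that falls into each bin.
    bin_w = [0.0] * m
    for x, w in zip(values, weights):
        # Assumption: X_max is clamped into the last bin.
        k = min(int((x - x_min) / binsize), m - 1)
        bin_w[k] += w
    # Bin k (0-based) is represented by its upper edge x_min + (k + 1) * binsize.
    edges = [x_min + (k + 1) * binsize for k in range(m)]
    total_w = sum(bin_w)
    total_wx = sum(w * x for w, x in zip(bin_w, edges))
    cum_wx = 0.0
    numer = 0.0
    # Bins are already in ascending order, so no sort is needed.
    for w, x in zip(bin_w, edges):
        cum_wx += w * x
        numer += w * (cum_wx - w * x / 2.0)
    return 1.0 - 2.0 * numer / (total_wx * total_w)
```

Because the bins are ordered by construction, the sort over all observations is replaced by a single pass to fill the bins plus a pass over m bins, which is where the time saving on large samples comes from.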

Examples

. fastgini pc_exp

. fastgini income [w=weight], jk

. fastgini income [w=weight], bin(10000)

Author

Zurab Sajaia, DECRG-PO SDG, The World Bank, zsajaia@worldbank.org

References

Karagiannis, E. and M. Kovačević (2000), "A Method to Calculate the Jackknife Variance Estimator for the Gini Coefficient", Oxford Bulletin of Economics and Statistics, Vol. 62, Issue 1, 119-122.

Also see

Online: jackknife

Links to user-written programs: inequal7, egen_inequal, mm_gini(), ineqerr, ineqdeco, ineqdec0