L-moments and derived statistics
lmoments8 [varlist] [if exp] [in range] [, allobs detail matname(matrix_name) tabdisp_options variablenames se[(matrix_list_options)] ]
lmoments8 varname [if exp] [in range] [, allobs by(varlist) detail generate(specification) matname(matrix_name) tabdisp_options se[(matrix_list_options)] ]
by ... : may also be used with lmoments8: see help on by.
Description
lmoments8 calculates L-moments and derived statistics for a varlist. Any string variables in varlist are ignored. Specifically, the first four L-moments and the derived statistics t, t_3 and t_4 are calculated for each variable in varlist. lmoments8 requires Stata 8. At the time of writing, the latest version of this program is lmoments, which requires Stata 9.
Remarks
Denote by X(j:n) the jth smallest observation in a sample of size n from a variable X, and by E the expectation operator.
The first four L-moments are defined by
E (X(1:1)),
1/2 E (X(2:2) - X(1:2)),
1/3 E (X(3:3) - 2 X(2:3) + X(1:3)) and
1/4 E (X(4:4) - 3 X(3:4) + 3 X(2:4) - X(1:4)).
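As a worked check on these definitions, the following sketch evaluates them exactly for a variable that is uniform on (0, 1), for which E (X(j:n)) = j/(n + 1). It is written in Python purely for illustration and is not part of lmoments8; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_uniform_order_stat(j, n):
    """E(X(j:n)) for X uniform on (0, 1), which equals j / (n + 1)."""
    return Fraction(j, n + 1)


def uniform_l_moment(r):
    """r-th L-moment of uniform(0, 1), straight from the definition:
    (1/r) * sum over k of (-1)^k * C(r-1, k) * E(X(r-k:r))."""
    total = sum(
        (-1) ** k * comb(r - 1, k) * expected_uniform_order_stat(r - k, r)
        for k in range(r)
    )
    return total / r


# First four L-moments of uniform(0, 1), as exact fractions.
lambdas = [uniform_l_moment(r) for r in range(1, 5)]
print(lambdas)
```

The results are 1/2, 1/6, 0 and 0. The third and fourth values are zero because the distribution is symmetric and has no excess tail weight relative to the uniform reference, which matches the known values of the first four L-moments of that distribution.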
The L-moments are estimated from a sample x_1, ..., x_n via the following weighted averages of the ordered values, otherwise known as sample probability-weighted moments:
b_0 = average of x(j:n),
b_1 = average of [(j - 1)/(n - 1)] x(j:n),
b_2 = average of [(j - 1)/(n - 1)] [(j - 2)/(n - 2)] x(j:n) and
b_3 = average of [(j - 1)/(n - 1)] [(j - 2)/(n - 2)] [(j - 3)/(n - 3)] x(j:n).
The estimators are
l_1 = b_0,
l_2 = 2 b_1 - b_0,
l_3 = 6 b_2 - 6 b_1 + b_0 and
l_4 = 20 b_3 - 30 b_2 + 12 b_1 - b_0,
whence
t = l_2 / l_1 (cf. coefficient of variation),
t_3 = l_3 / l_2 (cf. skewness) and
t_4 = l_4 / l_2 (cf. kurtosis).
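The calculation described in these remarks can be sketched as follows. This is a minimal Python illustration of the formulas for b_0 to b_3, l_1 to l_4 and the ratios, not the code used by lmoments8; the function name, the dictionary keys and the use of NaN for undefined ratios are assumptions made for the example.

```python
def sample_l_moments(values):
    """Return l_1..l_4 and t, t_3, t_4 for one sample of at least 4 values."""
    x = sorted(values)                    # x[j-1] is the order statistic x(j:n)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")

    # Probability-weighted moments b_0..b_3 as weighted averages of x(j:n).
    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x, start=1):
        w = 1.0
        for r in range(4):
            if r > 0:
                w *= (j - r) / (n - r)    # extra factor (j - r)/(n - r) for b_r
            b[r] += w * xj
    b = [total / n for total in b]

    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]

    nan = float("nan")                    # assumption: NaN marks undefined ratios
    return {
        "n": n, "l_1": l1, "l_2": l2, "l_3": l3, "l_4": l4,
        "t": l2 / l1 if l1 != 0 else nan,
        "t_3": l3 / l2 if l2 != 0 else nan,
        "t_4": l4 / l2 if l2 != 0 else nan,
    }


result = sample_l_moments([4, 1, 3, 2])
print(result)
```

For the sample 1, 2, 3, 4 this gives l_1 = 2.5 and l_2 = 5/6, with l_3 and l_4 equal to zero up to rounding error, so that t is 1/3 and t_3 and t_4 are both zero, as expected for a symmetric, evenly spaced sample.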
Options
allobs specifies use of the maximum possible number of observations for each variable. The default is to use only those observations for which all variables in varlist are not missing.
by() specifies one or more variables defining distinct groups for which L-moments should be calculated. by() is allowed only with a single varname. The choice between by: and by() depends partly on the kind of output display required. The display with by: is clearly structured by groups, while that with by() is more compact. To show L-moments for several variables and several groups with a single call to lmoments8, the display with by: is essential.
detail specifies a full display of results. By default, n, l_1, l_2, t_3 and t_4 are shown. detail adds l_3, l_4 and t.
generate() specifies one or more new variables to hold calculated results. generate() is allowed only with a single varname. This option is most useful when you want to save L-moments calculated for several groups for further analysis. Note that generate() is not allowed with the by: prefix: use the by() option instead. Values for the new variables will necessarily be identical for all observations in each group: typically it will be useful to select just one observation for each group, say by using egen, tag().
The specification consists of one or more space-separated elements newvar=statistic, where newvar is a new variable name and statistic is one of n, l_1, l_2, l_3, l_4, t, t_3 or t_4. Omission of the underscore, as in l1, l2, l3, l4, t3 and t4, is also allowed.
matname() specifies the name of a matrix in which to save the results of (the last set of) calculations. There will be 8 columns. The columns will contain n, l_1, l_2, l_3, l_4, t, t_3 and t_4. The matrix will contain missing values if n < 4 or l_1 == 0 or l_2 == 0.
tabdisp_options are options of tabdisp. The default display has format(%9.3f).
variablenames specifies that the variable names of varlist should be used in display. The default is to use variable labels to indicate a set of variables.
se specifies that the variance matrix of sample L-moments and the standard error vector of sample L-moments and derived ratios be displayed. The variance matrix of sample L-moments is estimated using the exact unbiased distribution-free estimator of Elamir and Seheult (2004). Note that negative estimates of each variance are possible, especially with very small samples. The standard errors of sample L-moments are the square roots of the diagonal elements of that matrix. The standard errors of t, t_3 and t_4 are obtained from the variances of ratios l_2/l_1, l_3/l_2, l_4/l_2 using Taylor-series-based approximations: for a ratio U/V,
var(U/V) = {var(U)/E(U)^2 + var(V)/E(V)^2 - 2 cov(U,V)/(E(U) E(V))} {E(U)/E(V)}^2.
This information is reported for the last-named variable or group only. However, by: may be used to obtain listings of standard errors for each of several groups. se may be specified with options of matrix list to tune the display of the matrices. This option can be rather slow for large sample sizes.
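The Taylor-series approximation used for the standard errors of t, t_3 and t_4 can be sketched as below. This is an illustrative Python version of the formula for var(U/V) only, with sample values standing in for the expectations; it does not reproduce the Elamir and Seheult estimator of the variance matrix, and the function names and the choice to return NaN for a negative variance estimate are assumptions, not documented behaviour of lmoments8.

```python
import math


def ratio_variance(mean_u, mean_v, var_u, var_v, cov_uv):
    """Approximate var(U/V) by the first-order Taylor-series formula."""
    relative = (
        var_u / mean_u ** 2
        + var_v / mean_v ** 2
        - 2 * cov_uv / (mean_u * mean_v)
    )
    return relative * (mean_u / mean_v) ** 2


def ratio_se(mean_u, mean_v, var_u, var_v, cov_uv):
    """Standard error of U/V; NaN if the variance estimate is negative
    (an assumption made for this sketch)."""
    v = ratio_variance(mean_u, mean_v, var_u, var_v, cov_uv)
    return math.sqrt(v) if v >= 0 else float("nan")


# Hypothetical inputs: U plays the role of l_2 and V the role of l_1.
print(ratio_se(mean_u=2.0, mean_v=4.0, var_u=0.04, var_v=0.16, cov_uv=0.02))
```

For a ratio such as t = l_2/l_1, the inputs would be the sample L-moments together with the corresponding variances and covariance taken from the estimated variance matrix.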
Examples
. lmoments8 price-foreign
. bysort rep78: lmoments8 mpg
. lmoments8 mpg, by(rep78) generate(mean=l1 L2=l2 L_CV=t)
. lmoments8 c, se(format(%9.4f))
Saved results
(for last-named variable or group only)
r(N)      n
r(l_1)    l_1
r(l_2)    l_2
r(l_3)    l_3
r(l_4)    l_4
r(t)      t
r(t_3)    t_3
r(t_4)    t_4
r(b_1)    b_1
r(b_2)    b_2
r(b_3)    b_3

if se specified only:

r(V)      variance matrix of l_1 ... l_4
r(SE)     standard error vector of l_1 ... l_4 t t_3 t_4
Acknowledgements
lmoments8 includes code from Patrick Royston's lshape program. Allan Seheult kindly provided and discussed reprints of his joint work.
Author
Nicholas J. Cox, Durham University, U.K. n.j.cox@durham.ac.uk
References
The L-moments page
Elamir, E.A.H. and A.H. Seheult. 2004. Exact variance structure of sample L-moments. Journal of Statistical Planning and Inference 124: 337-359.
Hosking, J.R.M. 1990. L-moments: Analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society Series B 52: 105-124.
Hosking, J.R.M. 1998. L-moments. In Kotz, S., C.B. Read and D.L. Banks (eds) Encyclopedia of Statistical Sciences Update Volume 2. New York: Wiley, 357-362.
Hosking, J.R.M. and J.R. Wallis. 1997. Regional frequency analysis: an approach based on L-moments. Cambridge University Press.
Royston, P. 1992. Which measures of skewness and kurtosis are best? Statistics in Medicine 11: 333-343.