help mata mm_quantile()
-------------------------------------------------------------------------------

Title

mm_quantile() -- Empirical quantile function

Syntax

real matrix mm_quantile(X [, w, P, altdef])

real rowvector mm_median(X [, w])

real rowvector mm_iqrange(X [, w, altdef])

where

X: real matrix containing data (rows are observations, columns variables)

w: real colvector containing weights

P: real matrix containing probabilities (default is P = (0, .25, .5, .75, 1)')

altdef: real scalar; altdef!=0 causes an interpolation formula to be used (default is altdef = 0)

Description

mm_quantile() applies to P the inverse empirical cumulative distribution function of X (the so-called quantile function). That is, mm_quantile() returns the quantiles of X corresponding to the probabilities provided by P. For example,

mm_quantile(x, 1, 0.25)

returns the first quartile (i.e. the 0.25-quantile) of x.

Note that mm_quantile() works column by column. If P has one column and X has several columns, then the quantiles corresponding to P are computed for each column of X. If X has one column and P has several columns, then the quantile function of X is applied to each column of P. If X and P both have several columns, then the number of columns is required to be the same and quantiles are computed column by column.
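The column rules above can be illustrated with a small sketch. The data are made up for the purpose of the example, and moremata is assumed to be installed; only the dimensions of the results are inspected.

```mata
mata:
// Made-up data: 5 observations (rows) on 2 variables (columns)
X = (1, 10 \ 2, 20 \ 3, 30 \ 4, 40 \ 5, 50)

// P has one column: the same probabilities are used for each column of X
P1 = (.25 \ .5 \ .75)
Q1 = mm_quantile(X, 1, P1)
rows(Q1), cols(Q1)          // r x k, here 3 and 2

// X has one column: its quantile function is applied to each column of P
P2 = (.1, .25 \ .5, .5 \ .9, .75)
Q2 = mm_quantile(X[,1], 1, P2)
rows(Q2), cols(Q2)          // r x k, here 3 and 2

// X and P both have two columns: column j of P is used with column j of X
Q3 = mm_quantile(X, 1, P2)
rows(Q3), cols(Q3)          // r x k, here 3 and 2
end
```

Passing a P with, say, three columns together with the two-column X would violate the rule that the number of columns must match.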

mm_median() and mm_iqrange() are wrappers for mm_quantile() and return, respectively, the median (the 0.5-quantile) and the interquartile range (IQR = 0.75-quantile - 0.25-quantile).
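As a quick check of the wrapper relationship, the following sketch compares the wrappers with direct calls to mm_quantile(). The data vector is made up, and moremata is assumed to be installed.

```mata
mata:
x = (2 \ 7 \ 1 \ 9 \ 4 \ 6)                 // made-up data
q = mm_quantile(x, 1, (.25 \ .5 \ .75))     // quartiles, unweighted

// median next to the 0.5-quantile
mm_median(x), q[2]

// IQR next to the 0.75-quantile minus the 0.25-quantile
mm_iqrange(x), q[3] - q[1]
end
```

Each displayed pair should contain two identical values.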

w specifies weights associated with the observations (rows) in X. Omit w or specify w = 1 to obtain unweighted results. Note that none of the arguments of the above functions may contain missing values.

Specifying altdef!=0 in mm_quantile() or mm_iqrange() causes an interpolation formula to be used for calculating the quantiles (see Remarks below).

Remarks

Example:

    : x = invnormal(uniform(10000,1))

    : mm_quantile(x, 1, (0.25 \ 0.5 \ 0.75))
                      1
        +----------------+
      1 |  -.6673752219  |
      2 |  -.0021958246  |
      3 |   .6880046299  |
        +----------------+

    : mm_median(x, 1), mm_iqrange(x, 1)
                      1              2
        +-------------------------------+
      1 |  -.0021958246    1.355379852  |
        +-------------------------------+

By default, mm_quantile() applies a discontinuous quantile function that averages adjacent values where the empirical distribution function is flat (Definition 2 in Hyndman and Fan 1996). The same method is used by summarize and is the default method in _pctile. If altdef!=0 is specified, a piecewise linear continuous function according to Definition 6 in Hyndman and Fan (1996) is applied instead. This method is also used by centile and by _pctile with the altdef option.
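To make the two definitions concrete, here is a minimal sketch that computes them by hand for unweighted data and a single probability p in [0,1]. The functions hf2() and hf6() are hypothetical helpers written for this illustration; they are not part of moremata and are not taken from its source code.

```mata
mata:
// Hypothetical helper (not part of moremata):
// Hyndman-Fan Definition 2, unweighted, one probability
real scalar hf2(real colvector x, real scalar p)
{
    real colvector s
    real scalar    n, g, j

    s = sort(x, 1)                      // order statistics
    n = rows(s)
    g = n*p
    j = floor(g)
    if (j < 1)  return(s[1])            // p < 1/n: first order statistic
    if (j >= n) return(s[n])            // p = 1: last order statistic
    if (g > j)  return(s[j+1])          // n*p not an integer: next value
    return((s[j] + s[j+1])/2)           // n*p an integer: average
}

// Hypothetical helper (not part of moremata):
// Hyndman-Fan Definition 6, unweighted, one probability
real scalar hf6(real colvector x, real scalar p)
{
    real colvector s
    real scalar    n, g, j

    s = sort(x, 1)
    n = rows(s)
    g = (n + 1)*p                       // position on the scale 1, ..., n
    j = floor(g)
    if (j < 1)  return(s[1])            // below the first order statistic
    if (j >= n) return(s[n])            // above the last order statistic
    return(s[j] + (g - j)*(s[j+1] - s[j]))   // linear interpolation
}

x = (4 \ 1 \ 3 \ 2)                     // made-up data
hf2(x, .25), hf6(x, .25)                // 1.5 and 1.25
hf2(x, .5),  hf6(x, .5)                 // 2.5 and 2.5
end
```

With moremata installed, these values can be compared with mm_quantile(x, 1, p) and mm_quantile(x, 1, p, 1). Note that the test for an integer n*p is sensitive to floating-point rounding for some values of p.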

Unlike centile and _pctile, mm_quantile() allows using weights with the interpolation method (altdef!=0). This functionality, however, is only intended to be used with (integer) frequency weights.
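One way to see what the frequency-weight interpretation means is to compare weighted results with unweighted results on manually expanded data. The sketch below uses made-up values and assumes moremata is installed; it is a check the reader can run, not a statement about the outcome.

```mata
mata:
x = (1 \ 2 \ 3)                 // made-up values
w = (2 \ 1 \ 3)                 // integer frequency weights

// the same data with each value repeated w times
xx = (1 \ 1 \ 2 \ 3 \ 3 \ 3)

P = (.25 \ .5 \ .75)

// weighted interpolation method next to the expanded-data equivalent
mm_quantile(x, w, P, 1), mm_quantile(xx, 1, P, 1)
end
```

If the weights are treated as frequency weights, the two displayed columns would be expected to agree.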

Conformability

mm_quantile(X, w, P, altdef):
         X:  n x 1 or n x k
         w:  n x 1 or 1 x 1
         P:  r x 1 or r x k
    altdef:  1 x 1
    result:  r x 1 or r x k

mm_median(X, w):
         X:  n x k
         w:  n x 1 or 1 x 1
    result:  1 x k

mm_iqrange(X, w, altdef):
         X:  n x k
         w:  n x 1 or 1 x 1
    altdef:  1 x 1
    result:  1 x k

Diagnostics

mm_quantile(): p < 0 is treated as p = 0, p > 1 as p = 1.

Weights should not be negative.

Results may be misleading if altdef!=0 is used with non-integer weights.
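The treatment of probabilities outside [0,1] noted above can be illustrated as follows. The data are made up, and moremata is assumed to be installed.

```mata
mata:
x = (3 \ 1 \ 2)                         // made-up data

// p < 0 is treated as p = 0 and p > 1 as p = 1, so rows 1 and 2
// should be equal, as should rows 3 and 4
q = mm_quantile(x, 1, (-.5 \ 0 \ 1 \ 1.5))
q

// for comparison: smallest and largest value of x
min(x), max(x)
end
```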

Source code

mm_quantile.mata, mm_median.mata, mm_iqrange.mata

References

Hyndman, R. J., and Y. Fan (1996). Sample Quantiles in Statistical Packages. The American Statistician 50(4):361-365.

Author

Ben Jann, ETH Zurich, jann@soz.gess.ethz.ch

Also see

Online: help for mm_ecdf(), summarize, pctile, centile, invcdf (if installed), moremata