help jnsn and help jnsni                                 Version 1.2 2007-01-17
-------------------------------------------------------------------------------

Title

jnsn and jnsni -- Fit Johnson's system of transformations by moment matching

Syntax

jnsn varname [if] [in] [weight] [, jnsn-specific option] [common options]

jnsni , jnsni-specific options [common options]

    options                Description
    -------------------------------------------------------------------------
    jnsn-specific Option
      generate(newvar)     create variable named newvar that is the fitted
                             normal transformation of varname

    jnsni-specific Options
      mean(#)              mean of the variable to be transformed; this is
                             required
      sd(#)                standard deviation of the variable to be
                             transformed; this is required
      skewness(#)          coefficient of skewness of the variable to be
                             transformed
      kurtosis(#)          coefficient of kurtosis of the variable to be
                             transformed

    Common Options
      tolerance(#)         tolerance differentiating distributions
      sbiterate(#)         iteration criterion for fitting bounded
                             distributions
      moiterate(#)         outer-loop iteration limit for estimating moments
                             (used in fitting bounded distributions)
      motolerance(#)       outer-loop convergence criterion for estimating
                             moments (used in fitting bounded distributions)
      miiterate(#)         inner-loop iteration limit for estimating moments
                             (used in fitting bounded distributions)
      mitolerance(#)       inner-loop convergence criterion for estimating
                             moments (used in fitting bounded distributions)
    -------------------------------------------------------------------------
    by may be used with jnsn; see by.
    aweights and fweights are allowed with jnsn; see weight.

Description

jnsn fits the Johnson system of distributions (Johnson, 1949) for transforming varname to a standard normal deviate. jnsni is the immediate form of the command; it allows the user to specify the mean, standard deviation, and coefficients of skewness and kurtosis of a hypothetical variable or of one in a dataset that is not at hand. jnsn and jnsni implement Algorithm AS 99 (Hill, Hill and Holder, 1976), which fits the parameters of the Johnson system functions by moment matching.

Options

        +------+
    ----+ Main +-------------------------------------------------------------

generate creates newvar containing values of the normal deviate as defined by the fitted transformation coefficients.

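mean and sd specify the mean and standard deviation of the variable to be transformed. Both are required and are used in all cases.

skewness and kurtosis specify the coefficients of skewness and kurtosis, defined as those that would be returned by summarize with the detail option. If they are not supplied, they default to zero and three, respectively.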

tolerance tolerance criterion used in discriminating distributions; defaults to 0.01.

sbiterate maximum number of iterations in fitting bounded distributions; defaults to 50.

moiterate maximum number of outer-loop iterations in higher moment estimation used for fitting bounded distributions; defaults to 50.

motolerance criterion for convergence of the outer-loop iterations in higher moment estimation used for fitting bounded distributions; defaults to 0.00001.

miiterate maximum number of inner-loop iterations in higher moment estimation used for fitting bounded distributions; defaults to 50.

mitolerance criterion for convergence of the inner-loop iterations in higher moment estimation used for fitting bounded distributions; defaults to 0.00000001.
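
The common options are passed in the same way to jnsn and jnsni. The sketch below raises the three iteration limits for a skewness-kurtosis pair that, by the rule given in Remarks, lies between the boundary line and the log-normal line and so would be expected to be fit as SB. The moment values are hypothetical and chosen only for illustration; no claim is made that larger limits improve the fit.

```stata
* Hypothetical moments: skewness 0.5, kurtosis 2.5 (between the two lines)
jnsni, mean(0) sd(1) skewness(0.5) kurtosis(2.5) ///
    sbiterate(100) moiterate(100) miiterate(100)

* Inspect the selected type, the fitted parameters and any fault
return list
```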

Remarks

Johnson's system comprises several functions intended to transform a variable into a standard normal deviate. The choice of which function to use for a given variable is made on the basis of the distributional characteristics of that variable. Each of the functions has up to four parameters, which are usually named gamma, delta, xi and lambda. Four of the functions are given below. In each case, z is distributed N(0,1), and y is defined as

y = (x - xi) / lambda

where x is the variable to be transformed.

SN (Normal distribution)

z = y

SL (log-normal distribution)

z = gamma + delta * ln(y)

SU (Unbounded distribution)

z = gamma + delta * asinh(y)

SB (Bounded distribution)

z = gamma + delta * ln(y / (1 - y))

The Johnson SN transformation is the trivial case where the variable to be transformed is already normally distributed. Here, xi and lambda are the only two parameters involved, and represent the mean and standard deviation. The Johnson SL case is transformation of a variable that is bounded on one side. For this distribution, xi is the bound, and lambda is either one (positive skew) or negative one (negative skew). The Johnson SU case is transformation of an unbounded distribution. A simplified version of this transformation is known as the Inverse Hyperbolic Sine (IHS) transformation. The Johnson SB case is for variables whose distribution is bounded on both ends. Because y must lie strictly between zero and one, xi lies just beneath the minimum of x's distribution, and lambda is such that xi + lambda lies just above the maximum of x's distribution. Parameters for this case are the most difficult to fit. In all cases in which it is involved, the parameter space for delta is strictly positive.
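
As an illustration of the SU formula, the following do-file sketch draws standard normal deviates, maps them to x by inverting the SU transformation, and then applies the forward transformation to recover z. The parameter values are hypothetical and are not output from jnsn; the sketch assumes a Stata version that provides rnormal() and asinh().

```stata
clear
set obs 1000
set seed 12345

* Hypothetical SU parameters, chosen only for illustration
local gamma  = -0.5
local delta  =  1.5
local xi     = 10
local lambda =  2

generate double z = rnormal()

* Inverse SU: x = xi + lambda * sinh((z - gamma) / delta)
generate double x = `xi' + `lambda' * sinh((z - `gamma') / `delta')

* Forward SU: y = (x - xi) / lambda, z = gamma + delta * asinh(y)
generate double y = (x - `xi') / `lambda'
generate double z_back = `gamma' + `delta' * asinh(y)

* The round trip should reproduce z up to rounding error
assert reldif(z, z_back) < 1e-8
```

The same pattern applies to SB, where the inverse is x = xi + lambda * invlogit((z - gamma) / delta).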

Selection of the transformation function is made on the basis of the location of the variable's coefficients of skewness and kurtosis on the plane defined by their joint parameter space. Two lines are defined in this plane for the purpose of selecting a transformation function. One is the "log-normal" line emanating from (0,3). (See the references for the parametric equations that define the log-normal line.) The other is the boundary line, defined as coefficient of kurtosis = squared coefficient of skewness + 1. A skewness-kurtosis pair lying within tolerance of (0,3) will be fit as SN. A pair within tolerance of the log-normal line will be fit as SL. A pair lying above the log-normal line (that is, with greater kurtosis than the line gives for the same skewness) will be fit as SU. A pair within tolerance of the boundary line is fit as ST, which is not listed above. A pair lying between the two lines is fit as SB. ST is not a Johnson transformation, but rather a special case created by Hill, Hill and Holder (T stands for "two-ordinate").
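
The region in which a variable falls can be examined by hand. The do-file sketch below is not jnsn's internal code and ignores the tolerance bands; it assumes the standard log-normal moment relations, in which squared skewness equals (w - 1)*(w + 2)^2 and kurtosis equals w^4 + 2*w^3 + 3*w^2 - 3 for a parameter w >= 1. The macro names are illustrative.

```stata
sysuse auto, clear
quietly summarize mpg, detail
local b1 = r(skewness)^2      // squared coefficient of skewness
local b2 = r(kurtosis)        // coefficient of kurtosis

* Kurtosis on the boundary line at this skewness
local bound = `b1' + 1

* Kurtosis on the log-normal line at this skewness:
* solve (w - 1)*(w + 2)^2 = b1 for w in closed form, then evaluate
local x = 0.5*`b1' + 1
local y = sqrt(`b1' + 0.25*`b1'^2)
local u = (`x' + `y')^(1/3)
local w = `u' + 1/`u' - 1
local lnline = `w'^4 + 2*`w'^3 + 3*`w'^2 - 3

display "sample kurtosis:             " %9.4f `b2'
display "boundary line at this skew:  " %9.4f `bound'
display "log-normal line at this skew:" %9.4f `lnline'

if `b2' > `lnline' {
    display "above the log-normal line (SU region)"
}
else if `b2' > `bound' {
    display "between the two lines (SB region)"
}
else {
    display "on the boundary line (ST region)"
}
```

A pair that lands close to either line may be assigned to SL, ST or SN by jnsn, depending on tolerance().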

Once the transformation function has been chosen, the function's parameters are fitted. Fitting for SN, SL and ST is noniterative and unproblematic. Fitting for SU is iterative, but convergence is usually unproblematic. Fitting of the SB case is more difficult. Suboptimal fit and failure to fit are not unusual. Although jnsn does not screen for adequacy of fit, it does detect failures to fit and then resorts to an alternative transformation (often SU or SL) chosen on the basis of the best match of sample skewness and kurtosis to those for a variable subject to the fall-back transformation.

Johnson transformations involve only the first four moments of the distribution of the variable to be transformed. Their transformation of a variable to a standard normal deviate is thus approximate. If interest lies in the extremes or tails of the distribution, the approximation might not be adequate.
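
Because jnsn does not screen for adequacy of fit, it can be worth inspecting the transformed variable directly. One possible check, sketched here with official Stata commands and the auto dataset, is to compare the moments and quantiles of the generated variable with those of a standard normal. The variable name z_mpg is arbitrary.

```stata
sysuse auto, clear
jnsn mpg, generate(z_mpg)

* With a good fit: mean near 0, sd near 1, skewness near 0, kurtosis near 3
summarize z_mpg, detail

* Quantile-normal plot; departures in the tails show up at the ends
qnorm z_mpg

* Skewness and kurtosis test for normality
sktest z_mpg
```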

Notes

jnsn and jnsni return the transformation function selected in the return macro r(johnson_type). They return the fitted parameter estimates in return scalars r(gamma), r(delta), r(xi) and r(lambda), and convergence exceptions in the return macro r(fault). Coefficients for parameters that are not used in a transformation, such as gamma and delta in type SN, are set to default values, typically zero for gamma and one for delta. For ST cases, xi and lambda are set to the ordinates on the skewness-kurtosis plane, delta is set to the proportion of values at lambda, and gamma is set to zero.
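
The returned results can be used to apply a transformation by hand. The do-file sketch below saves them in local macros and then evaluates the SU formula from Remarks. It assumes that SU was the type selected for this variable, which should be confirmed from the displayed type; the strings stored in r(johnson_type) and r(fault) are not documented here, so the sketch only displays them.

```stata
sysuse auto, clear
jnsn mpg, generate(z_fit)

* Save the results before another r-class command overwrites them
local type   "`r(johnson_type)'"
local fault  "`r(fault)'"
local gamma  = r(gamma)
local delta  = r(delta)
local xi     = r(xi)
local lambda = r(lambda)
display "type: `type'   fault: `fault'"

* SU formula by hand; meaningful only if SU was the type selected
generate double y = (mpg - `xi') / `lambda'
generate double z_hand = `gamma' + `delta' * asinh(y)

* If SU was selected, the two variables are expected to agree closely
summarize z_fit z_hand
```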

Default values for tolerances and iteration maximums are those in the FORTRAN-66 source code accompanying the article by Hill, Hill and Holder (1976).

jnsn calls summarize to obtain the mean, standard deviation, and skewness and kurtosis coefficients. It then feeds them to jnsni.
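
The same steps can be carried out manually, which is a convenient way to see what jnsni receives. This sketch assumes unweighted data and no if or in restriction; under those assumptions the two calls would be expected to give the same fit.

```stata
sysuse auto, clear

* Fit directly from the data
jnsn mpg
return list

* Fit from the four summary statistics
quietly summarize mpg, detail
jnsni, mean(`r(mean)') sd(`r(sd)') ///
    skewness(`r(skewness)') kurtosis(`r(kurtosis)')
return list
```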

References

I. D. Hill, R. Hill and R. L. Holder, Fitting Johnson curves by moments. Applied Statistics 25:180–89, 1976.

N. L. Johnson, Systems of frequency curves generated by methods of translation. Biometrika 36:149–76, 1949.

Examples

. jnsn mpg

. jnsni , mean(0.5) sd(0)

Author

Joseph Coveney jcoveney@bigplanet.com

Also see

Manual: [R] summarize, [R] boxcox, [R] lnskew0, [R] ladder