```help xriml                                                      Patrick Royston
-------------------------------------------------------------------------------

Reference Interval Estimation by Maximum Likelihood

Syntax

xriml yvar [xvar] [if] [in] [weight], dist(distribution_code)
[major_options minor_options]

options                         Description
-------------------------------------------------------------------------
major
  centile(numlist)              defines the required centiles of
                                  yvar|xvar
  fp(terms)                     specifies the fractional polynomial
                                  power(s) in xvar for the M, S, G and D
                                  regression models
minor
  cens(censvar)                 defines a censoring variable
  covars(covar_list)            includes variables as predictors in the
                                  regression models for the M, S, G and D
                                  curves
  cv                            parametrizes the S-curve to be a
                                  coefficient of variation
  init(terms)                   specifies initial values for the G and D
                                  curves
  ltolerance(#)                 is a convergence criterion for the
                                  iterative fitting process
  nodetail                      suppresses display of details of the
                                  iterative fitting algorithm
  nograph                       suppresses the plot of the age-specific
                                  reference interval
  nooutofsample                 restricts prediction of M, S, G and D
                                  curves and Z-scores to the estimation
                                  sample
  offset(varname)               adds varname to the M-curve of the model
  plot(plot)                    provides a way to add other plots to the
                                  generated graph; see help plot option
  scatter(scatter_options)      are options of scatter
  se                            produces standard errors of the M, S, G
                                  and D curves, and reference intervals
  line_options                  options allowed with line
-------------------------------------------------------------------------

where distribution_code is one of n|en|men|pn|mpn|sl.

Description

xriml calculates cross-sectional reference intervals for yvar, which is
assumed to follow one of six possible distributions.  The parameters are
estimated by maximum likelihood.

If xvar is specified, reference intervals for yvar conditional on xvar
are estimated. Typically, xvar is age. The parameters of the distribution
are modelled as functions of xvar using fractional polynomials (see
fracpoly).

xriml without variables or options displays the results of the most
recent estimation.

NOTE: the default prediction behaviour of xriml changed at version 6.0.0.
The all option has been replaced with a nooutofsample option; please see
the description of the latter under Minor options below.

Options

    +---------------+
----+ Major options +----------------------------------------------------

distribution(distribution_code) is NOT optional.  Valid
distribution_codes are Normal (n), exponential-Normal (en),
modulus-exponential-Normal (men), power-Normal (or Box-Cox) (pn),
modulus power-Normal (mpn) and shifted (or three-parameter) lognormal
(sl).

centile(numlist) defines the required centiles of yvar|xvar.  Default
numlist is 3 97 (i.e. a 94% reference interval).
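
For example, assuming the foot and gawks variables from the Examples
section, the following sketch requests the 5th, 50th and 95th centiles,
that is, a 90% reference interval together with the median. The choice of
fp() terms and distribution is purely illustrative.

. xriml foot gawks, fp(m:2 2) dist(pn) centile(5 50 95)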

fp([m:term] [, s:term] [, g:term] [, d:term]) specifies the fractional
polynomial power(s) in xvar for the M, S, G and (for the
four-parameter distributions only) D regression models.

term has the form [powers] # [# ...]|fix #.  The word powers is
optional.  The powers should be separated by spaces, for example
fp(m:powers 0 1, s:powers 2), or equivalently fp(m:0 1, s:2). If
neither powers nor fix is specified for a curve, that curve is assumed
to be a constant (_cons) estimated from the data.

fix # implies that the corresponding curve is NOT to be estimated
from the data, but is to be fixed at #. fix is valid only with g: and
d:.

Default: constants for each curve (M, S, G; D if applicable).
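
As a sketch of fix, the D curve of a men model can be held at 1 while M, S
and G are estimated. Because delta = 1 gives the parent en distribution
(see Remarks), this fit would be expected to agree closely with the
corresponding dist(en) model. The variable y is assumed to be log(foot),
as in the Examples section.

. xriml y gawks, fp(m:-2 -2, s:1, d:fix 1) dist(men)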

    +---------------+
----+ Minor options +----------------------------------------------------

cens(censvar) defines censvar as the censoring variable for data in which
some observations are left- (censvar = -1) or right- (censvar = 1)
censored.  Uncensored observations have censvar = 0.
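
A minimal sketch of building censvar, assuming hypothetical variables y
and x and a hypothetical lower detection limit of 0.5 at which
measurements of y are left-censored (none of these come from the
foothemi data):

. generate byte c = 0

. replace c = -1 if y <= 0.5

. xriml y x, fp(m:1) dist(en) cens(c)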

covars([m:mcovars] [, s:scovars] [, g:gcovars] [, d:dcovars]) includes
mcovars (scovars, gcovars, dcovars) variables as predictors in the
regression model for the M (S, G, D if applicable) curves.
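
For instance, assuming a hypothetical 0/1 indicator variable sex in the
dataset, the following sketch lets both the M and S curves depend on sex
in addition to the fractional polynomial terms in gawks:

. xriml foot gawks, fp(m:2 2, s:2) dist(pn) covars(m:sex, s:sex)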

cv parametrizes the S-curve to be a coefficient of variation (CV,
standard deviation divided by median), rather than a standard
deviation.

init([g:#] [, d:#]) specifies initial values for the G (g:) and (where
applicable) D (d:) parameter curves. Defaults are shown below.

Distribution     Default # for G    Default # for D
---------------------------------------------------
n                     N/A                N/A
en                    0.01               N/A
men                  -0.2                 1
pn                    1                  N/A
mpn                   1                   1
sl                    0                  N/A
---------------------------------------------------
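
As an illustration, different starting values might be tried for a men
model if the fit does not converge from the defaults in the table above.
The values 0 and 0.8 are arbitrary choices, not recommendations, and y is
assumed to be log(foot) as in the Examples section.

. xriml y gawks, fp(m:-2 -2, s:1) dist(men) init(g:0, d:0.8)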

ltolerance(#) is a convergence criterion for the iterative fitting
process.  For convergence, the difference between the final two
values of the log likelihood must be less than #.  Default # is
0.001.
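
A tighter criterion can be requested as in this sketch, which also
suppresses the iteration details and the graph (the value 0.00001 is an
arbitrary illustration):

. xriml foot gawks, fp(m:2 2) dist(pn) ltolerance(0.00001) nodetail nograph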

nodetail suppresses display of the steps of the iterative fitting
algorithm and of the estimated regression coefficients and confidence
intervals.

nograph suppresses the default plot of yvar against xvar with fitted
median and reference limits.

nooutofsample restricts prediction of M, S, G and D curves, standard
errors (when specified) and Z-scores to the estimation sample. The
default is to predict in-sample and out-of-sample for all available
observations of xvar and yvar.

offset(varname) specifies an offset: varname is added to the M-curve of
the model.

plot(plot) provides a way to add other plots to the generated graph; see
help plot option.

saving(filename [, replace]) saves the graph to a file (see nograph).

scatter(scatter_options) are options of scatter.  These should be
specified to control the rendering of the original data points.

se produces standard errors of the M, S, G (and if applicable, D) curves.
Standard errors of the estimated reference limits are also
calculated.  Warning: This option is computationally intensive when
determining SEs of centiles, and may take considerable time on a slow
computer and/or with a large dataset.

line_options are any of the options allowed with line.  These should be
specified to control the rendering of the smoothed lines or the
overall graph.
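
A sketch combining the graph options, assuming the standard graph twoway
suboptions msymbol() and lwidth(); hollow markers are used for the data
and thicker lines for the fitted curves:

. xriml foot gawks, fp(m:2 2) dist(pn) scatter(msymbol(oh)) lwidth(thick)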

Remarks

All the models fitted by xriml are defined by transformations of the
original data towards a Normal distribution (the `identity
transformation' in the case of the Normal model). The shape parameter(s)
of the resulting distributions may either be estimated from the data or
fixed by the user.

Estimation is by maximum likelihood and is iterative. For the
three-parameter models, the fit should converge within about 4-8
iterations. For the four-parameter models, about 5-15 iterations are
needed in most cases.

The pn and mpn models may be used only with data which are positive in
value.  The restriction does not apply to any of the other models.

Each of the en, pn and sl distributions has three parameters known as M (mu,
the median), S (sigma, the scale factor) and G (gamma, generic name for
the shape parameter).  M is modelled as a fractional polynomial (FP)
function of xvar.  S and G may also be modelled as FP functions of xvar,
or may be treated as constants to be estimated from the data.

The mpn (modulus power-Normal) and men (modulus exponential-Normal)
distributions are governed by four parameters, M, S, G and D.  There are
two shape parameters, G (gamma) and D (delta).  Delta = 1 gives the
`parent' pn and en (power-Normal and exponential-Normal) distributions
respectively.  If delta < 1 the distribution has longer tails than the
corresponding `parent' distribution, and vice versa for delta > 1.  The
distributions with gamma = 1 for the mpn and gamma = 0 for the men are
symmetric.

The en (men) and pn (mpn) models are essentially identical in that if Y
has a pn (mpn) distribution, then log Y has an en (men) distribution.
However, the parameter values from the two models will differ, since in
the first case the M curve is the median of Y, whereas in the second it
is the median of log Y.  The S curves from the en and men models for log
Y have the character of a CV for Y.

Note that fractional polynomial transformations of xvar are adjusted such
that the transformed value is 0 at the mean of xvar.

Examples

. use foothemi.dta

. generate y = log(foot)

. xriml y gawks, fp(m:-2 -2, s:1) dist(en)

. xriml y gawks, fp(m:-2 -2, s:1, g:fix 0) dist(men) se

. xriml foot gawks, fp(m:powers 2 2, s:powers 2) dist(pn)

. xriml foot gawks, fp(m:2 2) dist(pn) saving(g1, replace)

. xriml foot gawks, fp(m:2 2) dist(pn) cv

. xriml foot gawks, fp(m:1, s:-1, g:0) dist(en) nooutofsample

Authors

Patrick Royston, MRC Clinical Trials Unit, London.
patrick.royston@ctu.mrc.ac.uk

Eileen Wright, Macclesfield

Also see

Manual:  [R] fracpoly

Online:  fracpoly, xrigls (when installed)
```