{smcl}
{* 26nov2008}{...}
{cmd:help xriml}{right:Patrick Royston}
{hline}
{title:Reference Interval Estimation by Maximum Likelihood}
{title:Syntax}
{p 8 12 2}
{cmd:xriml}
{it:yvar}
[{it:xvar}]
{ifin}
{weight}
{cmd:,}
{cmdab:di:st(}{it:distribution_code}{cmd:)}
[{it:major_options}
{it:minor_options}]
{synoptset 30 tabbed}{...}
{synopthdr}
{synoptline}
{syntab :{it:major}}
{synopt :{opt ce:ntile(numlist)}}defines the required centiles of {it:yvar}|{it:xvar}{p_end}
{synopt :{opt fp(terms)}}specifies the fractional polynomial power(s) in {it:xvar}
for the M, S, G and D regression models{p_end}
{syntab :{it:minor}}
{synopt :{opt ce:ns(censvar)}}defines a censoring variable{p_end}
{synopt :{opt cova:rs(covar_list)}}includes variables as predictors in the regression models
for the M, S, G and D curves{p_end}
{synopt :{opt cv}}parametrizes the S-curve to be a coefficient of variation{p_end}
{synopt :{opt in:it(terms)}}specifies initial values for the G and D curves{p_end}
{synopt :{opt lt:olerance(#)}}is a convergence criterion for the iterative fitting process{p_end}
{synopt :{opt nod:etail}}suppresses display of details of the iterative fitting algorithm{p_end}
{synopt :{opt nogr:aph}}suppresses the plot of the age-specific reference interval{p_end}
{synopt :{opt noout:ofsample}}restricts prediction of M, S, G and D curves and Z-scores
to the estimation sample{p_end}
{synopt :{opt off:set(varname)}}adds {it:varname} to the M-curve of the model{p_end}
{synopt :{opt plot(plot)}}provides a way to add other plots to the
generated graph; see help {help plot_option:plot option}{p_end}
{synopt :{cmd:saving(}{it:filename} [{cmd:, replace}]{cmd:)}}saves the graph to a file{p_end}
{synopt :{opt scat:ter(scatter_options)}}are options of {help scatter}{p_end}
{synopt :{opt se}}produces standard errors of the M, S, G and D curves, and reference intervals{p_end}
{synopt :{it:line_options}}options allowed with {help line}{p_end}
{synoptline}
{p2colreset}{...}
{pstd}
where {it:distribution_code} is one of {cmd:n}|{cmd:en}|{cmd:men}|{cmd:pn}|{cmd:mpn}|{cmd:sl}.
{title:Description}
{pstd}
{opt xriml} calculates cross-sectional reference intervals for {it:yvar}, which is assumed
to follow one of six possible distributions. The parameters are estimated by
maximum likelihood.
{pstd}
If {it:xvar} is specified, reference intervals for {it:yvar} conditional on {it:xvar} are
estimated. Typically, {it:xvar} is age. The parameters of the distribution are
modelled as functions of {it:xvar} using fractional polynomials (see {help fracpoly}).
{pstd}
{opt xriml} without variables or options displays the results of the most recent
estimation.
{pstd}
{bf:NOTE}: the default prediction behaviour of {cmd:xriml} changed at version 6.0.0.
The {opt all} option has been replaced with a {opt nooutofsample} option; please
see the description of the latter under {it:Minor options} below.
{title:Options}
{dlgtab:Major options}
{phang}
{opt distribution(distribution_code)} is required. Valid values of {it:distribution_code} are Normal
({opt n}), exponential-Normal ({opt en}), modulus-exponential-Normal ({opt men}),
power-Normal (or Box-Cox) ({opt pn}), modulus power-Normal ({opt mpn}) and shifted
(or three-parameter) lognormal ({opt sl}).
{phang}
{opt centile(numlist)} defines the required centiles of {it:yvar}|{it:xvar}.
Default {it:numlist} is {opt 3 97} (i.e. a 94% reference interval).
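{pmore}
As an illustrative sketch, reusing the {cmd:foot} and {cmd:gawks} variables from the
Examples section (the model and the choice of centiles are assumptions made for
illustration, not recommendations), a 90% reference interval could be requested with:
{phang2}
{cmd:. xriml foot gawks, dist(pn) fp(m:2 2) centile(5 95)}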
{phang}
{cmd:fp(}[{cmd:m:}{it:term}] [{cmd:, s:}{it:term}] [{cmd:, g:}{it:term}] [{cmd:, d:}{it:term}]{cmd:)}
specifies the fractional polynomial power(s) in {it:xvar} for the M, S, G and
(for the four-parameter distributions only) D regression models.
{pmore}
{it:term} is of the form
[{cmd:powers}] {it:#} [{it:#} ...]|{cmd:fix} {it:#}. The word {cmd:powers} is optional.
The powers should be separated by spaces, for example
{cmd:fp(m:powers 0 1, s:powers 2)}, or equivalently {cmd:fp(m:0 1, s:2)}. If
no {it:term} is specified for a curve, that curve is assumed to be
a constant ({cmd:_cons}) estimated from the data.
{pmore}
{cmd:fix} {it:#} implies that the corresponding curve is NOT to be estimated from the
data, but is to be fixed at {it:#}. {cmd:fix} is valid only with {cmd:g:} and {cmd:d:}.
{pmore}
Default: constants for each curve (M, S, G; D if applicable).
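{pmore}
A minimal sketch of {cmd:fix}, again assuming the {cmd:foot} and {cmd:gawks} variables
from the Examples section and arbitrarily chosen powers: the command below fits the
{opt mpn} model with the D curve held at 1 rather than estimated, which according to
the Remarks corresponds to the `parent' {opt pn} distribution.
{phang2}
{cmd:. xriml foot gawks, dist(mpn) fp(m:2 2, s:1, d:fix 1)}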
{dlgtab:Minor options}
{phang}
{opt cens(censvar)} defines {it:censvar} as the censoring variable for data in which
some observations are left- ({it:censvar} = -1) or right- ({it:censvar} = 1) censored.
Uncensored observations have {it:censvar} = 0.
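{pmore}
A sketch of how such a variable might be built. The variables {cmd:conc} and {cmd:age},
the detection limit of 0.5, and the chosen model are all hypothetical; values of
{cmd:conc} at or below the limit are treated as left-censored.
{phang2}
{cmd:. generate byte censvar = cond(conc <= 0.5, -1, 0) if !missing(conc)}
{phang2}
{cmd:. xriml conc age, dist(en) fp(m:1) cens(censvar)}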
{phang}
{cmd:covars(}[{cmd:m:}{it:mcovars}] [{cmd:, s:}{it:scovars}] [{cmd:, g:}{it:gcovars}]
[{cmd:, d:}{it:dcovars}]{cmd:)} includes {it:mcovars}
({it:scovars}, {it:gcovars}, {it:dcovars}) variables as predictors in the regression model
for the M (S, G, D if applicable) curves.
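{pmore}
For illustration only, assuming a hypothetical covariate {cmd:sex} alongside the
{cmd:foot} and {cmd:gawks} variables from the Examples section, a covariate could be
added to the M and S curves like this:
{phang2}
{cmd:. xriml foot gawks, dist(pn) fp(m:2 2) covars(m:sex, s:sex)}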
{phang}
{opt cv} parametrizes the S-curve to be a coefficient of variation (CV, standard
deviation divided by median), rather than a standard deviation.
{phang}
{cmd:init(}[{cmd:g:}{it:#}] [{cmd:, d:}{it:#}]{cmd:)}
specifies initial values for the G ({cmd:g:}) and (where applicable) D ({cmd:d:})
parameter curves. Defaults are shown below.

        Distribution   Default # for G   Default # for D
        ---------------------------------------------------
        {cmd:n}              N/A               N/A
        {cmd:en}             0.01              N/A
        {cmd:men}            -0.2              1
        {cmd:pn}             1                 N/A
        {cmd:mpn}            1                 1
        {cmd:sl}             0                 N/A
        ---------------------------------------------------

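{pmore}
If the defaults lead to slow or failed convergence, other starting values can be
tried. The sketch below assumes the {cmd:foot} and {cmd:gawks} variables from the
Examples section; the starting values 0.5 and 0.8 are arbitrary and serve only to
show the syntax.
{phang2}
{cmd:. xriml foot gawks, dist(mpn) fp(m:2 2) init(g:0.5, d:0.8)}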
{phang}
{opt ltolerance(#)} is a convergence criterion for the iterative fitting process. For
convergence, the difference between the final two values of the log
likelihood must be less than {it:#}. Default {it:#} is 0.001.
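{pmore}
A tighter criterion than the default might be requested as follows (a sketch using
the {cmd:foot} and {cmd:gawks} variables from the Examples section; the value 0.00001
is an arbitrary illustration):
{phang2}
{cmd:. xriml foot gawks, dist(pn) fp(m:2 2) ltolerance(0.00001)}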
{phang}
{opt nodetail} suppresses display of the steps of the iterative fitting algorithm
and of the estimated regression coefficients and confidence intervals.
{phang}
{opt nograph} suppresses the default plot of {it:yvar} against {it:xvar} with fitted median and
reference limits.
{phang}
{opt nooutofsample} restricts prediction of M, S, G and D curves, standard errors
(when specified) and Z-scores to the estimation sample. The default is to predict
in-sample and out-of-sample for all available observations of {it:xvar} and {it:yvar}.
{phang}
{opt offset(varname)} offsets {it:varname}, that is, {it:varname} is added to the M-curve of
the model.
{phang}
{opt plot(plot)} provides a way to add other plots to the
generated graph; see help {help plot_option:plot option}.
{phang}
{cmd:saving(}{it:filename} [{cmd:, replace}]{cmd:)} saves the graph to a file
(see {opt nograph}).
{phang}
{opt scatter(scatter_options)} are options of {help scatter}.
These should be specified to control the rendering of the original data
points.
{phang}
{opt se} produces standard errors of the M, S, G (and if applicable, D) curves.
Standard errors of the estimated reference limits are also calculated.
Warning: This option is computationally intensive when determining SEs of
centiles, and may take considerable time on a slow computer and/or with a
large dataset.
{phang}
{it:line_options} are any of the options allowed with {help line}.
These should be specified to control the rendering of the smoothed lines
or the overall graph.
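{pmore}
A sketch combining {opt scatter()} with {it:line_options}, assuming the {cmd:foot} and
{cmd:gawks} variables from the Examples section. The particular choices (hollow
circle markers, medium-thick lines, and the title text) are illustrative only;
{cmd:msymbol()} is an option of {help scatter}, while {cmd:lwidth()} and {cmd:title()} are
options allowed with {help line}.
{phang2}
{cmd:. xriml foot gawks, dist(pn) fp(m:2 2) scatter(msymbol(oh)) lwidth(medthick) title("Fitted reference interval")}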
{title:Remarks}
{pstd}
All the models fitted by {opt xriml} are defined by transformations of the original
data towards a Normal distribution (the `identity transformation' in the case
of the Normal model). The shape parameter(s) of the resulting distributions
may either be estimated from the data or fixed by the user.
{pstd}
Estimation is by maximum likelihood and is iterative. For the three-parameter
models, the fit should converge within about 4-8 iterations. For the
four-parameter models, about 5-15 iterations are needed in most cases.
{pstd}
The {opt pn} and {opt mpn} models may be used only with data which are positive in value.
The restriction does not apply to any of the other models.
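{pstd}
One way to check this requirement before fitting, sketched here for the {cmd:foot}
variable from the Examples section, is to assert that all nonmissing values are
positive; {cmd:assert} stops with an error if any observation fails the check.
{phang}
{cmd:. assert foot > 0 if !missing(foot)}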
{pstd}
Each of the {opt en}, {opt pn} and {opt sl} distributions has three parameters known as M (mu, the
median), S (sigma, the scale factor) and G (gamma, generic name for the shape
parameter). M is modelled as a fractional polynomial (FP) function of {it:xvar}. S
and G may also be modelled as FP functions of {it:xvar}, or may be treated as
constants to be estimated from the data.
{pstd}
The {opt mpn} (modulus power-Normal) and {opt men} (modulus exponential-Normal)
distributions are governed by four parameters, M, S, G and D. There are two shape
parameters, G (gamma) and D (delta). Delta = 1 gives the `parent' {opt pn} and {opt en}
(power-Normal and exponential-Normal) distributions respectively. If delta < 1
the distribution has longer tails than the corresponding `parent' distribution,
and vice versa for delta > 1. The distributions with gamma = 1 for the {opt mpn} and
gamma = 0 for the {opt men} are symmetric.
{pstd}
The {opt en} ({opt men}) and {opt pn} ({opt mpn}) models are essentially identical in that if Y has a
{opt pn} ({opt mpn}) distribution, then log Y has an {opt en} ({opt men}) distribution. However, the
parameter values from the two models will differ, since in the first case the M
curve is the median of Y, whereas in the second it is the median of log Y. The
S curves from the {opt en} and {opt men} models for log Y have the character of a CV for Y.
{pstd}
Note that fractional polynomial transformations of {it:xvar} are adjusted such
that the transformed value is 0 at the mean of {it:xvar}.
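{pstd}
A rough sketch of this adjustment for the powers (-2 -2) used in the Examples, with
{cmd:gawks} as {it:xvar}. It assumes the usual fractional polynomial convention that
a repeated power p contributes the terms x^p and x^p*ln(x), and it ignores any
rescaling of {it:xvar} that {cmd:xriml} may apply internally. The names {cmd:xbar},
{cmd:gawks_1} and {cmd:gawks_2} are hypothetical and need not match the variables that
{cmd:xriml} itself creates.
{phang}
{cmd:. summarize gawks, meanonly}
{phang}
{cmd:. scalar xbar = r(mean)}
{phang}
{cmd:. generate double gawks_1 = gawks^(-2) - xbar^(-2)}
{phang}
{cmd:. generate double gawks_2 = gawks^(-2)*ln(gawks) - xbar^(-2)*ln(xbar)}
{pstd}
Both generated variables are 0 where {cmd:gawks} equals its mean, so under this
sketch the constant term of each curve is the value of the curve at the mean of
{it:xvar}.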
{title:Examples}
{phang}
{cmd:. use foothemi.dta}
{phang}
{cmd:. generate y = log(foot)}
{phang}
{cmd:. xriml y gawks, fp(m:-2 -2, s:1) dist(en)}
{phang}
{cmd:. xriml y gawks, fp(m:-2 -2, s:1, g:fix 0) dist(men) se}
{phang}
{cmd:. xriml foot gawks, fp(m:powers 2 2, s:powers 2) dist(pn)}
{phang}
{cmd:. xriml foot gawks, fp(m:2 2) dist(pn) saving(g1, replace)}
{phang}
{cmd:. xriml foot gawks, fp(m:2 2) dist(pn) cv}
{phang}
{cmd:. xriml foot gawks, fp(m:1, s:-1, g:0) dist(en) nooutofsample}
{title:Author}
{pstd}
Patrick Royston, MRC Clinical Trials Unit, London.
patrick.royston@ctu.mrc.ac.uk
{pstd}
Eileen Wright, Macclesfield
{title:Also see}
{psee}
Manual: {bf:[R] fracpoly}
{psee}
Online: {help fracpoly}, {help xrigls} (when installed)
{p_end}