Reference Interval Estimation by Maximum Likelihood
Syntax
xriml yvar [xvar] [if] [in] [weight], dist(distribution_code) [major_options minor_options]

options                    Description
-------------------------------------------------------------------------
major
  centile(numlist)         defines the required centiles of yvar|xvar
  fp(terms)                specifies the fractional polynomial power(s) in
                             xvar for the M, S, G and D regression models

minor
  cens(censvar)            defines a censoring variable
  covars(covar_list)       includes variables as predictors in the
                             regression models for the M, S, G and D curves
  cv                       parametrizes the S-curve to be a coefficient of
                             variation
  init(terms)              specifies initial values for the G and D curves
  ltolerance(#)            is a convergence criterion for the iterative
                             fitting process
  nodetail                 suppresses display of details of the iterative
                             fitting algorithm
  nograph                  suppresses the plot of the age-specific
                             reference interval
  nooutofsample            restricts prediction of M, S, G and D curves
                             and Z-scores to the estimation sample
  offset(varname)          adds varname to the M-curve of the model
  plot(plot)               provides a way to add other plots to the
                             generated graph; see help plot option
  scatter(scatter_options) are options of scatter
  se                       produces standard errors of the M, S, G and D
                             curves, and reference intervals
  line_options             options allowed with line
-------------------------------------------------------------------------
where distribution_code is one of n|en|men|pn|mpn|sl.
Description
xriml calculates cross-sectional reference intervals for yvar, which is assumed to follow one of six possible distributions. The parameters of the chosen distribution are estimated by maximum likelihood.
If xvar is specified, reference intervals for yvar conditional on xvar are estimated. Typically, xvar is age. The parameters of the distribution are modelled as functions of xvar using fractional polynomials (see fracpoly).
xriml without variables or options displays the results of the most recent estimation.
NOTE: the default prediction behaviour of xriml changed at version 6.0.0. The all option has been replaced with a nooutofsample option; please see the description of the latter under Minor options below.
Options
        +---------------+
    ----+ Major options +----------------------------------------------------
distribution(distribution_code) is NOT optional. Valid distribution_codes are Normal (n), exponential-Normal (en), modulus-exponential-Normal (men), power-Normal (or Box-Cox) (pn), modulus power-Normal (mpn) and shifted (or three-parameter) lognormal (sl).
centile(numlist) defines the required centiles of yvar|xvar. Default numlist is 3 97 (i.e. a 94% reference interval).
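As an illustration of what the default centiles mean, the following Python sketch (not part of xriml) computes the coverage of a centile pair and the corresponding limits under the simplest case, the Normal model, where a centile is M + z*S. The function name and the values of M and S are hypothetical.

```python
from statistics import NormalDist

def normal_reference_limits(m, s, centiles=(3, 97)):
    """Reference limits M + z*S under a Normal model (illustrative only)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [m + std_normal.inv_cdf(c / 100) * s for c in centiles]

centiles = (3, 97)
# Percentage of the reference population lying between the two centiles
coverage = max(centiles) - min(centiles)

# Hypothetical median (M) and standard deviation (S) at one value of xvar
lower, upper = normal_reference_limits(m=50.0, s=5.0, centiles=centiles)
```

For the other distributions the limits are not symmetric about M in general, so this sketch applies to dist(n) only.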
fp([m:term] [, s:term] [, g:term] [, d:term]) specifies the fractional polynomial power(s) in xvar for the M, S, G and (for the four-parameter distributions only) D regression models.
term is of the form [powers] # [# ...] | fix #. The word powers is optional. Powers are separated by spaces, for example fp(m:powers 0 1, s:powers 2), or equivalently fp(m:0 1, s:2). If neither powers nor fix is specified for a curve, that curve is assumed to be a constant (_cons) estimated from the data.
fix # implies that the corresponding curve is NOT to be estimated from the data, but is to be fixed at #. fix is valid only with g: and d:.
Default: constants for each curve (M, S, G; D if applicable).
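To make the meaning of the powers concrete, here is a minimal Python sketch of the usual fractional polynomial convention: power 0 denotes ln(x), and a repeated power adds a term multiplied by a further factor of ln(x). It is an illustration of that convention, not xriml's own code, and the function name fp_terms is hypothetical.

```python
import math

def fp_terms(x, powers):
    """Fractional polynomial terms of a positive x for a list of powers.

    Power 0 means ln(x); each repeat of a power multiplies the basic
    term for that power by one more factor of ln(x).
    """
    if x <= 0:
        raise ValueError("fractional polynomials need x > 0")
    terms = []
    repeats = {}  # how many times each power has been seen so far
    for p in powers:
        base = math.log(x) if p == 0 else x ** p
        k = repeats.get(p, 0)
        terms.append(base * math.log(x) ** k)
        repeats[p] = k + 1
    return terms

# fp(m:0 1) gives the terms ln(x) and x
terms_0_1 = fp_terms(2.0, [0, 1])
# fp(m:-2 -2) gives the terms x^-2 and x^-2 * ln(x)
terms_m2_m2 = fp_terms(2.0, [-2, -2])
```

Each term receives its own regression coefficient in the corresponding curve.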
        +---------------+
    ----+ Minor options +----------------------------------------------------
cens(censvar) defines censvar as the censoring variable for data in which some observations are left- (censvar = -1) or right- (censvar = 1) censored. Uncensored observations have censvar = 0.
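This help file does not spell out how censored observations enter the likelihood. The Python sketch below shows the standard treatment for a Normal model, offered as an assumption about the general approach rather than a description of xriml's internals: an uncensored observation contributes its log density, a left-censored one the log probability of lying at or below the recorded value, and a right-censored one the log probability of lying above it. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def normal_loglik_contribution(y, m, s, censvar=0):
    """Log-likelihood contribution of one observation under a Normal model.

    censvar follows the coding above: -1 left-censored, 0 uncensored,
    1 right-censored. Assumed standard censored-data likelihood.
    """
    std_normal = NormalDist()
    z = (y - m) / s
    if censvar == 0:
        return math.log(std_normal.pdf(z) / s)      # log density of y
    if censvar == -1:
        return math.log(std_normal.cdf(z))          # log P(Y <= y)
    if censvar == 1:
        return math.log(1.0 - std_normal.cdf(z))    # log P(Y > y)
    raise ValueError("censvar must be -1, 0 or 1")
```

The total log likelihood is the sum of these contributions over the estimation sample.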
covars([m:mcovars] [, s:scovars] [, g:gcovars] [, d:dcovars]) includes mcovars (scovars, gcovars, dcovars) variables as predictors in the regression model for the M (S, G, D if applicable) curves.
cv parametrizes the S-curve to be a coefficient of variation (CV, standard deviation divided by median), rather than a standard deviation.
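A short worked example of the definition just given, in Python, with hypothetical numbers: if the S-curve is a CV, the implied standard deviation at a given value of xvar is the CV multiplied by the median there.

```python
def sd_from_cv(median, cv):
    """Standard deviation implied by a CV, where CV = SD / median."""
    return cv * median

# Hypothetical values: median 200 and CV 0.1 imply an SD of about 20
implied_sd = sd_from_cv(median=200.0, cv=0.1)
```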
init([g:#] [, d:#]) specifies initial values for the G (g:) and (where applicable) D (d:) parameter curves. Defaults are shown below.
Distribution   Default # for G   Default # for D
---------------------------------------------------
n              N/A               N/A
en             0.01              N/A
men            -0.2              1
pn             1                 N/A
mpn            1                 1
sl             0                 N/A
---------------------------------------------------
ltolerance(#) is a convergence criterion for the iterative fitting process. For convergence, the difference between the final two values of the log likelihood must be less than #. Default # is 0.001.
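The stopping rule can be sketched in Python as follows. The log-likelihood values are invented for illustration and the function name is hypothetical; only the comparison of successive values against the tolerance is taken from the description above.

```python
def iterations_to_converge(loglik_path, ltolerance=0.001):
    """Return the 1-based iteration at which two successive log
    likelihoods first differ by less than ltolerance, else None."""
    for i in range(1, len(loglik_path)):
        if abs(loglik_path[i] - loglik_path[i - 1]) < ltolerance:
            return i + 1
    return None

# Hypothetical log likelihoods from four iterations of a fit
path = [-1250.0, -1203.4, -1201.12, -1201.1195]
stopped_at = iterations_to_converge(path)                     # default tolerance
stricter = iterations_to_converge(path, ltolerance=0.000001)  # not yet met
```

A smaller value of # demands more iterations before the fit is declared converged.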
nodetail suppresses display of the steps of the iterative fitting algorithm and of the estimated regression coefficients and confidence intervals.
nograph suppresses the default plot of yvar against xvar with fitted median and reference limits.
nooutofsample restricts prediction of M, S, G and D curves, standard errors (when specified) and Z-scores to the estimation sample. The default is to predict in-sample and out-of-sample for all available observations of xvar and yvar.
offset(varname) specifies varname as an offset, that is, varname is added to the M-curve of the model.
plot(plot) provides a way to add other plots to the generated graph; see help plot option.
saving(filename [, replace]) saves the graph to a file (see nograph).
scatter(scatter_options) are options of scatter. These should be specified to control the rendering of the original data points.
se produces standard errors of the M, S, G (and if applicable, D) curves. Standard errors of the estimated reference limits are also calculated. Warning: This option is computationally intensive when determining SEs of centiles, and may take considerable time on a slow computer and/or with a large dataset.
line_options are any of the options allowed with line. These should be specified to control the rendering of the smoothed lines or the overall graph.
Remarks
All the models fitted by xriml are defined by transformations of the original data towards a Normal distribution (the `identity transformation' in the case of the Normal model). The shape parameter(s) of the resulting distributions may either be estimated from the data or fixed by the user.
Estimation is by maximum likelihood and is iterative. For the three-parameter models, the fit should converge within about 4-8 iterations. For the four-parameter models, about 5-15 iterations are needed in most cases.
The pn and mpn models may be used only with data which are positive in value. The restriction does not apply to any of the other models.
Each of the en, pn and sl distributions has 3 parameters known as M (mu, the median), S (sigma, the scale factor) and G (gamma, generic name for the shape parameter). M is modelled as a fractional polynomial (FP) function of xvar. S and G may also be modelled as FP functions of xvar, or may be treated as constants to be estimated from the data.
The mpn (modulus power-Normal) and men (modulus exponential-Normal) distributions are governed by four parameters, M, S, G and D. There are two shape parameters, G (gamma) and D (delta). Delta = 1 gives the `parent' pn and en (power-Normal and exponential-Normal) distributions respectively. If delta < 1 the distribution has longer tails than the corresponding `parent' distribution, and vice versa for delta > 1. The distributions with gamma = 1 for the mpn and gamma = 0 for the men are symmetric.
The en (men) and pn (mpn) models are essentially identical in that if Y has a pn (mpn) distribution, then log Y has an en (men) distribution. However, the parameter values from the two models will differ, since in the first case the M curve is the median of Y, whereas in the second it is the median of log Y. The S curves from the en and men models for log Y have the character of a CV for Y.
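The correspondence between the pn and en models can be checked numerically. The Python sketch below ASSUMES particular forms for the two transformations, which this help file does not state: Z = ((Y/M)^G - 1)/(G*S) for pn and Z = (exp(G*(W - M)/S) - 1)/G for en. Under those assumed forms, a pn model for Y with parameters (M, S, G) gives the same Z-scores as an en model for log Y with parameters (log M, S, G*S). All function names and numerical values are hypothetical.

```python
import math

def z_pn(y, m, s, g):
    """Assumed power-Normal Z-score, g != 0 (form not given in this file)."""
    return ((y / m) ** g - 1.0) / (g * s)

def z_en(w, m, s, g):
    """Assumed exponential-Normal Z-score, g != 0 (form not given in this file)."""
    return (math.exp(g * (w - m) / s) - 1.0) / g

# Hypothetical observation and pn parameters for Y
y, m, s, g = 180.0, 150.0, 0.12, 0.5

z_from_pn = z_pn(y, m, s, g)
# Same observation on the log scale, with the mapped en parameters
z_from_en = z_en(math.log(y), math.log(m), s, g * s)
```

Under these assumptions S is unchanged between the two fits while M and G differ, which is in line with the remark that the en S-curve for log Y behaves like a CV for Y.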
Note that fractional polynomial transformations of xvar are adjusted such that the transformed value is 0 at the mean of xvar.
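A minimal Python sketch of such an adjustment, assuming it is done by subtracting the value of the transformation at the mean of xvar (the help file states the result, not the mechanism). The function name and data are hypothetical.

```python
import math

def centred_fp_term(x_values, power):
    """One fractional polynomial term, shifted to be 0 at the mean of x.

    Assumption: centring subtracts the transformation evaluated at mean(x).
    """
    def transform(x):
        return math.log(x) if power == 0 else x ** power

    mean_x = sum(x_values) / len(x_values)
    shift = transform(mean_x)
    return [transform(x) - shift for x in x_values]

ages = [10.0, 20.0, 30.0]              # hypothetical xvar values, mean 20
linear = centred_fp_term(ages, 1)      # linear term, 0 at the mean
logged = centred_fp_term(ages, 0)      # ln(x) term, 0 at the mean
```

With this adjustment the constant (_cons) of each curve is the value of that curve at the mean of xvar.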
Examples
. use foothemi.dta
. generate y = log(foot)
. xriml y gawks, fp(m:-2 -2, s:1) dist(en)
. xriml y gawks, fp(m:-2 -2, s:1, g:fix 0) dist(men) se
. xriml foot gawks, fp(m:powers 2 2, s:powers 2) dist(pn)
. xriml foot gawks, fp(m:2 2) dist(pn) saving(g1, replace)
. xriml foot gawks, fp(m:2 2) dist(pn) cv
. xriml foot gawks, fp(m:1, s:-1, g:0) dist(en) nooutofsample
Author
Patrick Royston, MRC Clinical Trials Unit, London. patrick.royston@ctu.mrc.ac.uk
Eileen Wright, Macclesfield
Also see
Manual: [R] fracpoly
Online: fracpoly, xrigls (when installed)