------------------------------------------------------------------------------- help forstpmPatrick Royston -------------------------------------------------------------------------------

Flexible parametric models for survival-time data

stpm[varlist] [ifexp] [inrange],model_complexityscale(hazard|normal|odds)[allcluster(cluster_var)left(leftvar))mlmaxopts(ml_maximize_options)noconstantnolognoorthogoffset(offsetvar)q(orthog_matrix)spline(splinevar derivativevar)stratify(strat_varlist)technique(algorithm_spec)theta(est|#)]where

model_complexityisdf(#)orstpmdf(#)orknots([l|%]knotlist)orknots(u#1 #2 #3).

stpmis for use with survival-time data; see st. You must havestsetyour data before using this command; see help st. Weights are to be set bystset. All weight-types are supported.The syntax of predict following

stpmis

predict[type]newvarname[ifexp] [inrange] [,statisticat_stuffnoconstantnooffsetstdpstore(global_macro_name[#])time(#|vn)zero]

where

at_stuffisvarname #|vn[varname #|vn...])orat(@varname#|string)orat(+#)and

statisticis

xbindex (linear predictor for equation[xb]); the defaultcumoddslog cumulative odds functioncumhazardlog cumulative hazard functionnormalNormal deviate functionsplinefitted spline functiondzdyderivative of fitted spline function w.r.t. ln(_t)densitydensity functiondeviancedeviance residualshazardhazard functionmartingalemartingale residualssurvivalsurvival functionfailurefailure (cumulative incidence) functioncentile(#|varname)#th centile of survival time distribution (or centiles stored invarname)tvc(varname)time-varying coefficient forvarname(seeOptions for predict)All statistics are available both in and out of sample; type "

predict...if e(sample)..." if wanted only for the estimation sample. The default is linear prediction of the covariate part of the model, i.e. for statisticxb.Not all of the options for

predictare available with every statistic. The SE (stdpoption ofpredict) is not available for every statistic.

Description

stpmfits spline-based distributional models to right- or interval-censored data.varlistis a set of covariates. Multistate models are also supported.

Options for stpmNote that the complexity of the spline part of the model is defined by

df()orstpmdf()orknots(). Just one of these options must be specified.

allincludes out-of-sample observations in generated variables which contain the spline basis functions.

cluster(cluster_var))adjusts standard errors for intra-group correlation, where the grouping variable iscluster_var; impliesrobust

df(#)specifies the degrees of freedom for the natural spline function.#must be between 1 and 6. Theknots()option is not applicable and the knots are placed at the following centiles of the distribution of the uncensored log times:--------------------------- df Centile positions --------------------------- 1 (no knots) 2 50 3 33 67 4 25 50 75 5 20 40 60 80 6 17 33 50 67 83 >6 (not allowed) ---------------------------

knots([l|%]knotlist)(syntax 1) defines the internal knot positions for the spline. If you specifyknots(knotlist)thenknotlistshould consist of values of log time. If you specifyknots(lknotlist)then the values inknotlistare taken to be times and are automatically log transformed bystpm. (This is a convenience feature; it is easier to enter times than log times). If you specifyknots(%knotlist)then the values inknotlistare taken to be centile positions in the distribution of the uncensored log times.

knots(u#1 #2 #3)(syntax 2) also defines the internal knots for the spline.#1knots are assigned at random uniformly distributed positions between#2and#3, where#2is the lowest centile position of the uncensored log times you wish to entertain and#3is the highest. A suggested choice is#2=10,#3=90; knots are to be placed at random positions between the 10th and 90th centiles of the uncensored log times.

left(leftvar)specifies that some or all of the survival-time observations are interval-censored. The rules for specifying values ofleftvarand their meanings in terms of interval censoring are as follows:------------------------------------------------------------------- Value of

leftvar_d Meaning ------------------------------------------------------------------- . or _t 0 Right censored at _t . or _t 1 Event at _t 0 0 Right censored at _t 0 1 Interval censored, event in (0,_t] <_t 0 Late entry atleftvar, right censored at _t <_t 1 Interval censored, event in [leftvar,_t] -------------------------------------------------------------------Note that

stpmsupport datasets with late entry and interval censoring only when late entry is specified throughleftvarand the value of _d, as given in the above table.

mlmaxopts(ml_maximize_options)control the likelihood maximization process; seldom used. Thedifficultoption is probably the most important. See ml##ml_maxopts for details.

noconstantsuppresses the constant term in the[xb]equation.

nologsuppresses the iteration log while the model is fit.

noorthogsuppresses orthogonalisation of the spline basis functions. You should rarely need this option, which is provided for compatibility with early versions ofstpm, and for pedagngic reasons.

offset(offsetvar)defines the offset for the[xb]equation.offsetvaris added to the linear predictor.

q(orthog_matrix)is required with thespline()option, unless thenoorthogoption was used when the model was fitted. The Q-matrixorthog_matrixis a linear transformation between the regression coefficients on the orthogonalised and original spline basis functions. It may be stored following a run ofstpmby extracting it frome(Q):

stpm ...matrix q = e(Q)

scale(hazard|normal|odds)is not optional and specifies the scale of the model. Thehazard,normalandoddsoptions fit models on the scale of the log cumulative hazard, normal equivalent deviate or log cumulative odds of failure respectively.

spline(splinevar derivativevar)allows you to specify the baseline spline function and its derivative with respect to ln(_t). For a given model where the spline function has been estimated,splinevarcan be created by using for example

predictsplinevar, spline zero

derivativevarmay be created by using for example

predictderivativevar, dzdyNote also that the

noconstantoption ofpredictsubtracts the constant term from the spline function. This will ensure that if the model is refitted usingsplinevarandderivativevar, the constant will be reported as (close to) zero.

stpmdf(#)is identical todf().stpmdf()is for inputting the df required for stpm to mfp, which has its owndf()option.

stratify(strat_varlist)stratifies the spline functions according to the variables instrat_varlist. It will rarely make sense for the variables instrat_varlistnot to be among those invarlist, but this is not checked.

technique()specifies how the likelihood function is to be maximized byml. The following algorithms are currently implemented inml:nr(the default),bhhh,dfp,bfgs. For further details, see ml or the bookMaximum Likelihood Estimation with Stata, 2d Edition(Gould, Pitblado, and Sribney 2003).

theta(est|#)applies only withscale(odds)and estimates the transformation parameter (theta) or performs estimation with theta set to#. The transformation of the (baseline) survival function S_0(t) is theng_theta(S_0(t)) = ln(S_0(t)^(-theta)-1)/theta)

theta = 0 corresponds to the cumulative hazards model. With

theta(est), theta is estimated and presented on a log scale, i.e. ln(theta). Withtheta(#),#must be positive.

Options for predict

at(varname #|vn[varname #|vn...])(syntax 1) computes the various statistics at value(s) (#orvn)... of model covariatesvarname..., wherevnmeans "variable name". Theat()option is a convenient way of specifying out-of-sample prediction for some or all of the covariates in the model. Covariates instpm'svarlistbut NOT listed inat()are used in computing predicted values, unless thezerooption is specified, in which case adjustment is to value 0 of such predictors.

at(@varname #|string)(syntax 2) computes the various statistics at the mean of the covariate values for observations in which varname equals#(ifvarnameis numeric) or in whichvarnameequalsstring(ifvarnameis a string variable).varnamemay typically be an identification number, e.g. a row number of the dataset. Note thatvarnameis not usually one of the covariates in the model (i.e. it is not a member ofvarlist).

at(+#)(syntax 3) computes the various statistics after adding#to the linear predictor from the[xb]equation. This can be used to examine the effects of applying specific offsets to the predicted values on other statisics of interest.

deviancecreates deviance-like residuals. The same formula is used as in streg.

martingalecreates martingale-like residuals. The same formula is used as in streg.

noconstantremoves the constant (if any) in equation[xb]when computing a statistic of interest. This includes the fitted spline function using thesplineoption.

nooffsetis relevant only if you specifiedoffset()withstpm. It modifies the calculations made bypredict, xbso that they ignore the offset variable. The linear prediction for equation[xb]is treated as x*b rather than x*b + offset.

stdpcomputes the standard error of statisticsxb,cumhazard,cumodds,normal,dzdy, or of the log (note: log) survival time forcentile().stdpis also implemented forhazardwith hazard-scaled models (scale(hazard)) only, when it calculates the S.E. of the log (note: log) hazard function.stdpis not implemented for other statistics, but note that confidence intervals for the survival function may be found by back-transformation of confidence intervals for the cumulative hazard or odds or normal function, and those for the hazard function similarly, through the formula exp(ln(hazard) +/- z * SE(log hazard)).

splinecomputes the fitted spline function. This incorporates all covariate effects and by default is evaluated at the observed failure times. Use of thezerooption evaluates the spline with all covariate values set to zero (i.e. the baseline spline). If there are variables in the model with time-dependent effects, specified through thestratify()option, and thetime(#)option is also used, the result is effectively the prognostic index evaluated at time#.

store(global_macro_name[#])stores the contents ofnewvarnamein a global macro calledglobal_macro_name. Successive non-missing values are stored, separated by spaces. Optionally, rounding to#decimal places is allowed, to save space and make the results more readable. Thestore()option is not often used, but may be helpful when passing the output to another program.

time(#|vn)predicts at time#or at the time values invn. Iftime()is not specified, prediction is made for time _t.

tvc(varname)stands for "time-varying coefficient" and computes the estimated coefficient forvarname, a covariate instpm'svarlist. Ifvarnameis "time-fixed", thennewvarnamewill be a constant, namely[xb]_b[varname]. Ifvarnameis included instrat_varlist, thennewvarnamewill depend on _t and may be interpreted as the time-varying effect ofvarnameon the chosen scale of the model (proportional hazards, proportional odds or probit). For example, in a hazards-scale model (scale(hazard)),newvarnamemultiplied byvarnamewill give an estimate of the time-varying log cumulative hazard ratio forvarname(compared withvarname= 0) at every observed value ofvarname.newvarnamealone will give the log cumulative hazard ratio for a one-unit change invarname. Note that the time-varying log cumulative hazard ratio forvarnamewill NOT be identical to the time- varying log hazard ratio forvarname.

zeropredicts at zero values of covariates invarlistand similarly forstrat_varlist, ifstratify()is specified. See also optionat().

RemarksLet t denote time.

stpmworks by first calculating the survival function S(t) non-parametrically by using the Kaplan-Meier estimator. The procedure is illustrated for proportional hazards models, specified by optionscale(hazard). S(t) is converted to an estimate of the log cumulative hazard function Z(t) by the formulaZ(t) = ln(-ln S(t))

This Z(t) is then smoothed on ln(t) using regression splines with knots placed at certain quantiles of the distribution of t. The knot positions are chosen automatically if the spline complexity is specified by the

df()option, or manually by way of theknots()option. (Note that the knots are values of ln(t), not t.) Denote the predicted values of the log cumulative hazard function by Zhat(t). The density function f(t) isf(t) = -dS(t)/dt = dS/dZhat dZhat/dt = S(t) exp(Zhat) dZhat(t)/dt

dZhat(t)/dt is computed from the regression coefficients of the fitted spline function. A smoothed survival function is calculated as

Shat(t) = exp(-exp Zhat(t)).

The hazard function is calculated as f(t)/Shat(t).

If

varlistis specified, the baseline survival function (i.e. at zero values of the covariates) is used instead of the survival function of the raw observations. Withdf(1)a Weibull model is fitted.With

scale(normal), smoothing is of the Normal quantile function, invnorm(1-S(t)), instead of the log cumulative hazard function. Withdf(1)a lognormal model is fitted.With

scale(odds), smoothing is of the log odds of failure function, ln((1-S(t))/S(t)), instead of the log cumulative hazard function. Withdf(1)a log-logistic model is fitted.Estimation is performed by maximum likelihood. Optimisation uses the default technique (

nr, meaning Stata's version of Newton-Raphson iteration.As an example of output from

stpm, consider the following. A stratifying variablennhas two levels, 0 and 1. There are two covariates,x5aandx5b, with effects assumed to act proportionately on the cumulative hazard function:

. stpm x5a x5b nn, stratify(nn) scale(odds) df(2)initial: log likelihood = -1124.6562 rescale: log likelihood = -1124.6562 rescale eq: log likelihood = -1124.6562 Iteration 0: log likelihood = -1124.6562 Iteration 1: log likelihood = -1112.096 Iteration 2: log likelihood = -1111.5188 Iteration 3: log likelihood = -1111.5173 Iteration 4: log likelihood = -1111.5173 Number of obs = 1 > 289 Wald chi2(1) = 10 > .48 Log likelihood = -1111.5173 Prob > chi2 = 0.0 > 012 --------------------------------------------------------------------------- > --- _t | Coef. Std. Err. z P>|z| [95% Conf. Interv > al] ---------+----------------------------------------------------------------- > --- s0 | nn | -1.735629 .5360622 -3.238 0.001 -2.786292 -.6849 > 664 _cons | 3.416165 .4057033 8.420 0.000 2.621001 4.211 > 329 ---------+----------------------------------------------------------------- > --- s1 | nn | -.1167243 .046288 -2.522 0.012 -.2074471 -.0260 > 015 _cons | .1590071 .0349978 4.543 0.000 .0904127 .2276 > 016 ---------+----------------------------------------------------------------- > --- xb | x5a | .9971947 .1749101 5.701 0.000 .6543773 1.340 > 012 x5b | 1.844591 .2142359 8.610 0.000 1.424696 2.264 > 486 nn | -.5830707 .3702606 -1.575 0.115 -1.308768 .1426 > 267 _cons | -2.039426 .2662178 -7.661 0.000 -2.561203 -1.517 > 649 --------------------------------------------------------------------------- > --- Deviance = 2223.035 (1289 observations.)

Notes on the output1. The equations:

s0ands1represent the first and second spline basis functions, with[s0]_consand[s1]_consbeing the coefficients for value 0 of the stratifying variablenn, and[s0]_cons+[s0]_b[nn]and[s1]_cons+[s1]_b[nn]the coefficients for value 1 ofnn.2. If you wanted to test for the need to stratify by

nnyou could typetest [s0]nn [s1]nn. This would give a 2 degree of freedom Wald test of any difference between the spline functions fornn.3. The value of "Wald chi2(1)" should be ignored.

Reproducing the modelSometimes one wishes to reproduce a model, that is, to incorporate the linear predictor in

[xb]and the spline function representing the baseline distribution function without re-fitting any parameters. This can be done by using thespline,xbanddzdyoptions ofpredictto create the necessary variables, and storing the matrixe(Q)for later use. The model is reproduced by using thespline(),offset(),noconstantandq()options ofstpm. The reason for the use ofnoconstantwithpredict, xbis that the constant of the[xb]equation is already included in the spline function bypredict, splineand should not be introduced again. A specific example:.

stpmmodel_specifications.predict sp, spline zero.predict dz, dzdy.predict xb, xb noconstant.matrix Q = e(Q)To refit the model in the subset of the data for which

zequals 1 without re-estimating any parameters (only the deviance is displayed):.

stpm if z==1, offset(xb) spline(sp dz) q(Q) noconstantTo re-estimate the constant in the

[xb]equation in a subset of the data:.

stpm if z==1, offset(xb) spline(sp dz) q(Q)At present, the approach cannot be used to reproduce time-varying models, i.e. models fitted with the

stratify()option.

Stored results

stpmreturns the deviance (minus twice the maximised log likelihood) and the Akaike Information Criterion (AIC) in scalarse(dev)ande(aic)respectively. Additional quantities are stored ine()functions and may be inspected by usingereturn list.

Examples

. stpm, scale(hazard) df(1). stpm, df(3) scale(normal). stpm, df(3) scale(odds). stpm, df(3) scale(odds) theta(2). stpm, df(3) scale(odds) theta(est). stpm treatmnt, scale(odds) knots(%10 90). xi: stpm x1 x2 x3 i.pop, df(2) scale(o) stratify(i.pop). stpm x1 x2 x3, df(3) scale(h) left(left)

. predict s0, survival zero[baseline survival function]. predict surv3, survival time(3)[survival probs at time 3]. predict median, centile(50)[median survival, given covars]. predict semedian, centile(50) stdp. predict h0, hazard zero[baseline hazard function]. predict h1, hazard zero at(z 1). predict lnH, cumhazard at(@@id 25)[at covars for obs #id[25]]

. gen hr = h1/h0[hr is defined for all obs.]. predict bz, tvc(z)[time-varying coeff for z]. predict sbz, tvc(z) stdp[SE of bz]

. stpm, scale(hazard) df(2). gen times = _n in 1/10. predict surv, survival time(times) store(survlist 4). display "$survlist"

. stpm x1 x2 x3, df(3) scale(h). matrix q = e(Q). predict spline, spline zero noconstant. predict dspline, dzdy. stpm x1 x2 x3, spline(spline dspline) scale(h) q(q)

Auxiliary ado-filesfrac_spl, frac_s3b, mlsurvlf, stpm_p.

AuthorPatrick Royston, MRC Clinical Trials Unit, London. patrick.royston@ctu.mrc.ac.uk

ReferencesP. Royston. 2001. Flexible alternatives to the Cox model, and more. The Stata Journal 1:1-28.

P. Royston and M. K. B. Parmar. 2002. Flexible proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21: 2175-2197.

Also see