Title
[ST] stpm2 postestimation -- Post-estimation tools for stpm2
Description
The following standard post-estimation commands are available:
    command            description
    -------------------------------------------------------------------------
    INCLUDE help post_adjust2
    INCLUDE help post_estat
    INCLUDE help post_estimates
    INCLUDE help post_lincom
    INCLUDE help post_lrtest
    INCLUDE help post_nlcom
    predict            predictions, residuals etc
    INCLUDE help post_predictnl
    INCLUDE help post_test
    INCLUDE help post_testnl
    -------------------------------------------------------------------------
predict newvar [if] [in] [, statistic ]
Note: in the table below, vn is an abbreviation for varname.
    statistic                         Description
    -------------------------------------------------------------------------
    Main
      abc                             area between log hazard ratio curves
      at(vn # [vn # ...])             predict at values of specified covariates
      centile(#)                      #th centile of survival distribution
      ci                              calculate confidence intervals
      cumhazard                       cumulative hazard
      cumodds                         cumulative odds
      cure                            cure proportion
      density                         density function
      failure                         failure function
      hazard                          hazard function
      hrnumerator(vn # [vn # ...])    numerator for (time-dependent) hazard ratio
      hrdenominator(vn # [vn # ...])  denominator for (time-dependent) hazard ratio
      hdiff1(vn # [vn # ...])         1st hazard function for difference in hazard functions
      hdiff2(vn # [vn # ...])         2nd hazard function for difference in hazard functions
      martingale                      martingale residuals
      meansurv                        population averaged survival function
      normal                          standard normal deviate of survival function
      per(#)                          express hazard rates (and differences) per # person years
      rmst                            restricted mean survival time
      rsdst                           standard deviation of restricted survival time
      sdiff1(vn # [vn # ...])         1st survival curve for difference in survival functions
      sdiff2(vn # [vn # ...])         2nd survival curve for difference in survival functions
      stdp                            standard error of predicted function
      survival                        survival function
      timevar(vn)                     time variable used for predictions (default _t)
      tmax(#)                         upper bound of time for rmst and abc options
      tmin(#)                         lower bound of time for rmst and abc options
      tvc(vn)                         time-varying coefficient for varname
      uncured                         obtain survival and hazard functions for the 'uncured'
      xb                              the linear predictor
      xbnobaseline                    predicts the linear predictor, excluding the spline function
      zeros                           sets all covariates to zero (baseline prediction)

    Subsidiary
      centol(#)                       tolerance level when estimating centile
      deviance                        deviance residuals
      dxb                             derivative of linear predictor
      level(#)                        sets confidence level (default 95)
      startunc(#)                     sets starting value for Newton-Raphson algorithm for
                                      estimating a centile of the survival distribution of
                                      'uncured'
    -------------------------------------------------------------------------
Statistics are available both in and out of sample; type predict ... if e(sample) ... if wanted only for the estimation sample.
Options for predict
Note that if a relative survival model has been fitted by use of the bhazard() option then survival refers to relative survival and hazard refers to excess hazard.
    +------+
----+ Main +-------------------------------------------------------------
abc evaluates the area between a constant log hazard ratio and a time-dependent log hazard ratio curve. It integrates the difference between a log HR curve and a constant log HR over the time range between tmin() and tmax(). The constant HR is supplied by the hr0() option. The time-dependent log HR curve is determined according to hrnumerator(), which must therefore be specified. You may also specify hrdenominator(). The n(), at() and zeros options are valid with abc.
at(varname # [varname # ...]) requests that the covariates specified by the listed varname(s) be set to the listed # values. For example, at(x1 1 x3 50) would evaluate predictions at x1 = 1 and x3 = 50. This is a useful way to obtain out of sample predictions. Note that if at() is used together with zeros, all covariates not listed in at() are set to zero. If at() is used without zeros, all covariates not listed in at() are set to their sample values. See also zeros.
centile(#) gives the #th centile of the survival time distribution, calculated using a Newton-Raphson algorithm.
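Example: a minimal sketch of centile() with ci, assuming stpm2 is installed (it is a user-written command) and using the brcancer data from the Examples section. The variable name q25 is illustrative; the 25th centile is chosen only so that the estimate lies within the observed follow-up.

```stata
* Assumes stpm2 is installed; q25 is an illustrative variable name
webuse brcancer, clear
stset rectime, failure(censrec = 1)
stpm2 hormon, scale(hazard) df(4) eform

* 25th centile of the survival time distribution, with confidence limits
* stored in q25_lci and q25_uci
predict q25, centile(25) ci

* One summary row per treatment group
tabstat q25 q25_lci q25_uci, by(hormon)
```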
ci calculates a confidence interval for the requested statistic and stores the confidence limits in newvar_lci and newvar_uci.
cumhazard predicts the cumulative hazard function.
cumodds predicts the cumulative odds of failure function.
cure predicts the cure proportion after fitting a cure model.
density predicts the density function.
failure predicts the failure function, i.e. F(t) = 1 - S(t).
hazard predicts the hazard function.
hdiff1(varname # ...), hdiff2(varname # ...) predict the difference in hazard functions with the first hazard function defined by the covariate values listed for hdiff1() and the second by those listed for hdiff2(). By default, covariates not specified using either option are set to zero. Note that setting the remaining values of the covariates to zero may not always be sensible. If # is set to missing (.) then varname takes its observed values in the dataset.
Example: hdiff1(hormon 1) (without specifying hdiff2()) computes the difference in predicted hazard functions at hormon = 1 compared with hormon = 0.
Example: hdiff1(hormon 2) hdiff2(hormon 1) computes the difference in predicted hazard functions at hormon = 2 compared with hormon = 1.
Example: hdiff1(hormon 2 age 50) hdiff2(hormon 1 age 30) computes the difference in predicted hazard functions at hormon = 2 and age = 50 compared with hormon = 1 and age = 30.
hrdenominator(varname # ...) specifies the denominator of the hazard ratio. By default all covariates other than varname and any other variables mentioned are set to zero. See cautionary note in hrnumerator. If # is set to missing (.) then varname takes its observed values in the dataset.
hrnumerator(varname # ...) predicts the (time-dependent) hazard ratio, with the numerator of the hazard ratio defined by the listed covariate values. By default all covariates other than varname and any other variables mentioned are set to zero. Note that setting the remaining values of the covariates to zero may not always be sensible, particularly with models other than on the cumulative hazard scale, or when more than one variable has a time-dependent effect. If # is set to missing (.) then varname takes its observed values in the dataset.
martingale calculates martingale residuals.
meansurv calculates the population average survival curve. Note that this differs from the predicted survival curve at the mean of all the covariates in the model. A predicted survival curve is obtained for each subject and all the survival curves are averaged. The process can be computationally intensive. It is recommended that the timevar() option is used to reduce the number of survival times at which the survival curves are averaged. Combining this option with the at() option enables adjusted survival curves to be estimated.
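Example: a hedged sketch of meansurv combined with timevar() and at(), assuming stpm2 is installed and using the brcancer data. The names tt, ms0 and ms1 and the choice of 50 time points between 50 and 2500 days are illustrative assumptions.

```stata
* Assumes stpm2 is installed; tt, ms0 and ms1 are illustrative names
webuse brcancer, clear
stset rectime, failure(censrec = 1)
stpm2 hormon x1, scale(hazard) df(4)

* 50 equally spaced time points (days), stored in the first 50 rows
range tt 50 2500 50

* Average the individual survival curves with hormon set to 0 and then
* to 1 for every subject; x1 keeps its observed values
predict ms0, meansurv timevar(tt) at(hormon 0)
predict ms1, meansurv timevar(tt) at(hormon 1)

twoway line ms0 ms1 tt, sort ytitle("Population-averaged survival")
```

Restricting the averaging to the 50 values of tt rather than every observed _t keeps the computation short.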
n(#) [rmst only] defines the number of evaluation points for integrating the estimated survival function(s) with respect to time. The larger # is, the more accurate is the estimated restricted mean survival time, but the longer the calculation takes. There is no gain by setting # above 5000. Default # is 1000.
normal predicts the standard normal deviate of the survival function.
per(#) expresses hazard rates and differences in hazard rates per # person years.
rmst evaluates the mean or restricted mean survival time. This is done by integrating the predicted survival curve from 0 to tmax(#); see also the n() and tmax() options. Note that the at() and zeros options are valid with rmst.
rsdst evaluates the standard deviation of the (restricted) survival time. For a single sample the SE of the restricted mean survival time may be estimated by dividing the SD by the square root of the number of observations. See also the rmst, n() and tmax() options. Note that the at() and zeros options are valid with rsdst.
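Example: a minimal sketch of rmst and rsdst with at() and tmax(), assuming stpm2 is installed and using the brcancer data, in which analysis time is in days. The variable names and the 1825-day horizon are illustrative.

```stata
* Assumes stpm2 is installed; variable names are illustrative
webuse brcancer, clear
stset rectime, failure(censrec = 1)
stpm2 hormon, scale(hazard) df(4)

* Restricted mean survival time up to 1825 days (about 5 years)
predict rmst0, rmst at(hormon 0) tmax(1825)
predict rmst1, rmst at(hormon 1) tmax(1825)

* Standard deviation of the restricted survival time for hormon = 1
predict rsd1, rsdst at(hormon 1) tmax(1825)

summarize rmst0 rmst1 rsd1
```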
sdiff1(varname # ...), sdiff2(varname # ...) predict the difference in survival curves with the first survival curve defined by the covariate values listed for sdiff1() and the second by those listed for sdiff2(). By default, covariates not specified using either option are set to zero. Note that setting the remaining values of the covariates to zero may not always be sensible. If # is set to missing (.) then varname takes its observed values in the dataset.
Example: sdiff1(hormon 1) (without specifying sdiff2()) computes the difference in predicted survival curves at hormon = 1 compared with hormon = 0.
Example: sdiff1(hormon 2) sdiff2(hormon 1) computes the difference in predicted survival curves at hormon = 2 compared with hormon = 1.
Example: sdiff1(hormon 2 age 50) sdiff2(hormon 1 age 30) computes the difference in predicted survival curves at hormon = 2 and age = 50 compared with hormon = 1 and age = 30.
stdp calculates the standard error of the prediction and stores it in newvar_se. Only available for the xb, dxb, xbnobaseline and rmst options.
survival predicts the survival function.
timevar(varname) defines the variable used as time in the predictions. Default varname is _t. This is useful for large datasets where, for plotting purposes, predictions are needed at only a limited number of time points (200, say). Some caution is needed when using this option: if varname is non-missing in only the first 200 rows, predictions are made at whatever covariate values are in those 200 rows of data. This can be avoided by using the at() option and/or the zeros option to define the covariate patterns for which you require the predictions.
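Example: a hedged sketch of timevar() used with at() so that the covariate pattern is fixed explicitly, assuming stpm2 is installed and using the brcancer data. The names tt and s_h1 and the grid of 200 time points are illustrative.

```stata
* Assumes stpm2 is installed; tt and s_h1 are illustrative names
webuse brcancer, clear
stset rectime, failure(censrec = 1)
stpm2 hormon x1, scale(hazard) df(4)

* 200 time points for plotting, held in the first 200 rows
range tt 10 2500 200

* Fix the covariate pattern so that the covariate values that happen to
* sit in the first 200 rows are not used
predict s_h1, survival timevar(tt) at(hormon 1 x1 60) ci

twoway (rarea s_h1_lci s_h1_uci tt, sort) (line s_h1 tt, sort), legend(off)
```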
tmax(#) [rmst and abc only] defines the upper limit of time over which the integration of the estimated survival function is to be conducted. Default # is 0, meaning an upper limit as close to t = infinity as is reasonable (in fact, using the estimated 99.999999th centile of the survival distribution).
tmin(#) [rmst and abc only] defines the lower bound of time over which the integration of the estimated survival function is to be conducted. Default # is -1, taken as 0 and meaning a lower bound of 0.
tvc(varname) stands for "time-varying coefficient" and computes the estimated coefficient for varname, a covariate in stpm2's varlist. If varname is "time-fixed", then newvarname will be a constant. If varname is included in the tvc() option, then newvarname will depend on _t and may be interpreted as the time-varying effect of varname on the chosen scale of the model (proportional hazards, proportional odds or probit). For example, in a hazards-scale model (scale(hazard)), newvarname multiplied by varname will be an estimate of the time-varying log cumulative hazard ratio for varname (compared with varname = 0) at every observed value of varname. newvarname alone will give the log cumulative hazard ratio for a one-unit change in varname. Note that the time-varying log cumulative hazard ratio for varname will NOT be identical to the time-varying log hazard ratio for varname.
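Example: a minimal sketch of the tvc() prediction after a model with a time-dependent effect, assuming stpm2 is installed and using the brcancer data. The variable name b_hormon is illustrative.

```stata
* Assumes stpm2 is installed; b_hormon is an illustrative name
webuse brcancer, clear
stset rectime, failure(censrec = 1)
stpm2 hormon, scale(hazard) df(4) tvc(hormon) dftvc(3)

* Time-varying coefficient for hormon: on this scale, the log cumulative
* hazard ratio for a one-unit change in hormon at each value of _t
predict b_hormon, tvc(hormon)

line b_hormon _t, sort
```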
uncured can be used after fitting a cure model. It can be used with the survival, hazard or centile() options to obtain predictions for the 'uncured' group.
xb predicts the linear predictor, including the spline function.
xbnobaseline predicts the linear predictor, excluding the spline function - i.e. only the time-fixed part of the model.
zeros sets all covariates to zero (baseline prediction). For example, predict s0, survival zeros calculates the baseline survival function. See also at().
    +------------+
----+ Subsidiary +-------------------------------------------------------
centol(#) defines the tolerance when searching for the predicted survival time at a given centile of the survival distribution. Default # is 0.0001.
deviance calculates deviance residuals.
dxb calculates the derivative of the linear predictor.
level(#) sets the confidence level; default is level(95) or as set by set level.
startunc(#) sets the starting value for the Newton-Raphson algorithm for estimating a centile of the survival time distribution of the 'uncured'; default is the 12.5th centile of the observed follow-up times.
Examples
Setup
    webuse brcancer
    stset rectime, failure(censrec = 1)
Proportional hazards model
    stpm2 hormon, scale(hazard) df(4) eform
    predict h, hazard ci
    predict s, survival ci
Time-dependent effects on cumulative hazard scale
    stpm2 hormon, scale(hazard) df(4) tvc(hormon) dftvc(3)
    predict hr, hrnumerator(hormon 1) ci
    predict survdiff, sdiff1(hormon 1) ci
    predict hazarddiff, hdiff1(hormon 1) ci
Use of at() option
    stpm2 hormon x1, scale(hazard) df(4) tvc(hormon) dftvc(3)
    predict s60h1, survival at(hormon 1 x1 60) ci
Also see
Online: [ST] stpm2