help stpm2 postestimation
also see: stpm2
-------------------------------------------------------------------------------
Title
[ST] stpm2 postestimation -- Post-estimation tools for stpm2
Description
The following standard post-estimation commands are available:
command description
-------------------------------------------------------------------------
INCLUDE help post_adjust2
INCLUDE help post_estat
INCLUDE help post_estimates
INCLUDE help post_lincom
INCLUDE help post_lrtest
INCLUDE help post_nlcom
predict predictions, residuals etc
INCLUDE help post_predictnl
INCLUDE help post_test
INCLUDE help post_testnl
-------------------------------------------------------------------------
predict newvar [if] [in] [, statistic ]
Note: in the table below, vn is an abbreviation for varname.
statistic Description
-------------------------------------------------------------------------
Main
abc area between log hazard ratio curves
at(vn # [vn # ...]) predict at values of specified
covariates
centile(#) #th centile of survival distribution
ci calculate confidence intervals
cumhazard cumulative hazard
cumodds cumulative odds
cure cure proportion
density density function
failure failure function
hazard hazard function
hrnumerator(vn # [vn # ...]) numerator for (time-dependent) hazard
ratio
hrdenominator(vn # [vn # ...]) denominator for (time-dependent) hazard
ratio
hdiff1(vn # [vn # ...]) 1st hazard function for difference in
hazard functions
hdiff2(vn # [vn # ...]) 2nd hazard function for difference in
hazard functions
martingale martingale residuals
meansurv population averaged survival function
normal standard normal deviate of survival
function
per(#) express hazard rates (and differences)
per # person years
rmst restricted mean survival time
rsdst standard deviation of restricted
survival time
sdiff1(vn # [vn # ...]) 1st survival curve for difference in
survival functions
sdiff2(vn # [vn # ...]) 2nd survival curve for difference in
survival functions
stdp standard error of predicted function
survival survival function
timevar(vn) time variable used for predictions
(default _t)
tmax(#) upper bound of time for rmst and abc
options
tmin(#) lower bound of time for rmst and abc
options
tvc(vn) time-varying coefficient for varname
uncured obtain survival and hazard functions for
the 'uncured'
xb the linear predictor
xbnobaseline predicts the linear predictor, excluding
the spline function
zeros sets all covariates to zero (baseline
prediction)
Subsidiary
centol(#) tolerance level when estimating centile
deviance deviance residuals
dxb derivative of linear predictor
level(#) sets confidence level (default 95)
startunc(#) sets starting value for Newton-Raphson
algorithm for estimating a centile of
the survival distribution of 'uncured'
-------------------------------------------------------------------------
Statistics are available both in and out of sample; type predict ... if
e(sample) ... if wanted only for the estimation sample.
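Example: a minimal sketch of restricting predictions to the estimation
sample. It assumes the brcancer data used under Examples; the variable
name s_in is illustrative only.
    * setup as in the Examples section
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4)
    * survival predictions for observations used in estimation only
    predict s_in if e(sample), survival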
Options for predict
Note that if a relative survival model has been fitted by use of the
bhazard() option then survival refers to relative survival and hazard
refers to excess hazard.
+------+
----+ Main +-------------------------------------------------------------
abc evaluates the area between a constant log hazard ratio and a
time-dependent log hazard ratio curve. It integrates the difference
between a log HR curve and a constant log HR over the time range
between tmin() and tmax(). The constant HR is supplied by the hr0()
option. The time-dependent log HR curve is determined according to
hrnumerator(), which must therefore be specified. You may also
specify hrdenominator(). The n(), at() and zeros options are valid
with abc.
at(varname # [ varname # ...]) requests that the covariates specified by
the listed varname(s) be set to the listed # values. For example,
at(x1 1 x3 50) would evaluate predictions at x1 = 1 and x3 = 50. This
is a useful way to obtain out of sample predictions. Note that if at()
is used together with zeros all covariates not listed in at() are set
to zero. If at() is used without zeros then all covariates not listed
in at() are set to their sample values. See also zeros.
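Example: a sketch contrasting at() with and without zeros, assuming the
brcancer data used under Examples. The new variable names are
illustrative, and x1 = 0 is unlikely to be a sensible reference value in
practice.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon x1, scale(hazard) df(4)
    * hormon fixed at 1; x1 keeps each subject's observed value
    predict s_h1, survival at(hormon 1)
    * hormon fixed at 1; x1 (not listed in at()) is set to zero
    predict s_h1z, survival at(hormon 1) zeros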
centile(#) gives the #th centile of the survival time distribution,
calculated using a Newton-Raphson algorithm.
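Example: a sketch of a centile prediction, assuming the brcancer data
used under Examples. The choice of the 25th centile and the name c25 are
illustrative only.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4)
    * 25th centile of the survival time distribution for each observation
    predict c25, centile(25)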
ci calculates a confidence interval for the requested statistic and stores
the confidence limits in newvar_lci and newvar_uci.
cumhazard predicts the cumulative hazard function.
cumodds predicts the cumulative odds of failure function.
cure predicts the cure proportion after fitting a cure model.
density predicts the density function.
failure predicts the failure function, i.e. F(t) = 1 - S(t).
hazard predicts the hazard function.
hdiff1(varname # ...), hdiff2(varname # ...) predict the difference in
hazard functions with the first hazard function defined by the
covariate values listed for hdiff1() and the second by those listed
for hdiff2(). By default, covariates not specified using either
option are set to zero. Note that setting the remaining values of the
covariates to zero may not always be sensible. If # is set to missing
(.) then varname takes its observed values in the dataset.
Example: hdiff1(hormon 1) (without specifying hdiff2()) computes the
difference in predicted hazard functions at hormon = 1 compared with
hormon = 0.
Example: hdiff1(hormon 2) hdiff2(hormon 1) computes the difference in
predicted hazard functions at hormon = 2 compared with hormon = 1.
Example: hdiff1(hormon 2 age 50) hdiff2(hormon 1 age 30) computes the
difference in predicted hazard functions at hormon = 2 and age = 50
compared with hormon = 1 and age = 30.
hrdenominator(varname # ...) specifies the denominator of the hazard
ratio. By default all covariates other than varname and any other
variables mentioned are set to zero. See cautionary note in
hrnumerator. If # is set to missing (.) then varname takes its
observed values in the dataset.
hrnumerator(varname # ...) predicts the (time-dependent) hazard ratio,
with the listed covariate values defining the numerator. By default all
covariates other than varname and any other variables mentioned are set
to zero. Note that setting the remaining values of the covariates to
zero may not always be sensible, particularly with models on a scale
other than the cumulative hazard scale, or when more than one variable
has a time-dependent effect. If # is set to missing (.) then varname
takes its observed values in the dataset.
martingale calculates martingale residuals.
meansurv calculates the population average survival curve. Note this
differs from the predicted survival curve at the mean of all the
covariates in the model. A predicted survival curve is obtained for
each subject and all the survival curves are averaged. The process
can be computationally intensive. It is recommended that the
timevar() option is used to reduce the number of survival times at
which the survival curves are averaged. Combining this option with
the at() option enables adjusted survival curves to be estimated.
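Example: a sketch of adjusted survival curves using meansurv with at()
and timevar(), assuming the brcancer data used under Examples. The time
grid (50 points between 1 and 2500) and the variable names are arbitrary
illustrative choices.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon x1, scale(hazard) df(4)
    * short grid of times to keep the averaging cheap
    range tt 1 2500 50
    * average over the observed x1 values with hormon forced to 1, then 0
    predict ms_h1, meansurv at(hormon 1) timevar(tt)
    predict ms_h0, meansurv at(hormon 0) timevar(tt)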
n(#) [rmst only] defines the number of evaluation points for integrating
the estimated survival function(s) with respect to time. The larger
# is, the more accurate is the estimated restricted mean survival
time, but the longer the calculation takes. There is no gain by
setting # above 5000. Default # is 1000.
normal predicts the standard normal deviate of the survival function.
per(#) expresses hazard rates and differences in hazard rates per #
person-years.
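Example: a sketch of rescaling predicted hazards, assuming the brcancer
data used under Examples. The name h1000 is illustrative. If analysis
time is recorded in a unit other than years (days are assumed here for
rectime), the result would be per 1000 of that unit of person-time.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4)
    * hazard per 1000 units of person-time, with confidence limits
    * stored in h1000_lci and h1000_uci
    predict h1000, hazard per(1000) ci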
rmst evaluates the mean or restricted mean survival time. This is done by
integrating the predicted survival curve from 0 to tmax(#); see also
the n() and tmax() options. Note that the at() and zeros options are
valid with rmst.
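Example: a sketch of a restricted mean survival time prediction,
assuming the brcancer data used under Examples. The upper limit of 1825
(5 years, if time is in days) and the name rm_h1 are illustrative
assumptions.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4)
    * integrate the predicted survival curve for hormon = 1 up to tmax()
    predict rm_h1, rmst at(hormon 1) tmax(1825)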
rsdst evaluates the standard deviation of the (restricted) survival time.
For a single sample the SE of the restricted mean survival time may
be estimated by dividing the SD by the square root of the number of
observations. See also the rmst, n() and tmax() options. Note that
the at() and zeros options are valid with rsdst.
sdiff1(varname # ...), sdiff2(varname # ...) predict the difference in
survival curves with the first survival curve defined by the
covariate values listed for sdiff1() and the second by those listed
for sdiff2(). By default, covariates not specified using either
option are set to zero. Note that setting the remaining values of the
covariates to zero may not always be sensible. If # is set to missing
(.) then varname takes its observed values in the dataset.
Example: sdiff1(hormon 1) (without specifying sdiff2()) computes the
difference in predicted survival curves at hormon = 1 compared with
hormon = 0.
Example: sdiff1(hormon 2) sdiff2(hormon 1) computes the difference in
predicted survival curves at hormon = 2 compared with hormon = 1.
Example: sdiff1(hormon 2 age 50) sdiff2(hormon 1 age 30) computes the
difference in predicted survival curves at hormon = 2 and age = 50
compared with hormon = 1 and age = 30.
stdp calculates standard error of prediction and stores it in newvar_se.
Only available for the xb, dxb, xbnobaseline and rmst options.
survival predicts the survival function.
timevar(varname) defines the variable used as time in the predictions.
Default varname is _t. This is useful for large datasets where for
plotting purposes predictions are only needed for 200 observations
for example. Note that some caution should be taken when using this
option as predictions may be made at whatever covariate values are in
the first 200 rows of data. This can be avoided by using the at()
option and/or the zeros option to define the covariate patterns for
which you require the predictions.
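Example: a sketch of predicting on a reduced time grid while fixing the
covariate pattern with at(), assuming the brcancer data used under
Examples. The grid of 200 points and the variable names are illustrative
only.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4)
    * 200 equally spaced times, stored in the first 200 rows
    range tgrid 1 2500 200
    * at() fixes hormon, so the covariate values in those rows are not used
    predict s_grid, survival at(hormon 1) timevar(tgrid) ci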
tmax(#) [rmst and abc only] defines the upper limit of time over which
the integration of the estimated survival function is to be
conducted. Default # is 0, meaning an upper limit as close to t =
infinity as is reasonable (in fact, using the estimated 99.999999th
centile of the survival distribution).
tmin(#) [rmst and abc only] defines the lower bound of time over which
the integration of the estimated survival function is to be
conducted. Default # is -1, taken as 0 and meaning a lower bound of
0.
tvc(varname) stands for "time-varying coefficient" and computes the
estimated coefficient for varname, a covariate in stpm2's varlist. If
varname is "time-fixed", then newvarname will be a constant. If
varname is included in the tvc() option, then newvarname will depend
on _t and may be interpreted as the time-varying effect of varname on
the chosen scale of the model (proportional hazards, proportional
odds or probit). For example, in a hazards-scale model
(scale(hazard)), newvarname multiplied by varname will be an estimate
of the time-varying log cumulative hazard ratio for varname (compared
with varname = 0) at every observed value of varname. newvarname
alone will give the log cumulative hazard ratio for a one-unit change
in varname. Note that the time-varying log cumulative hazard ratio
for varname will NOT be identical to the time-varying log hazard
ratio for varname.
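Example: a sketch of extracting a time-varying coefficient, assuming the
brcancer data used under Examples. The name b_hormon is illustrative.
    webuse brcancer, clear
    stset rectime, failure(censrec = 1)
    stpm2 hormon, scale(hazard) df(4) tvc(hormon) dftvc(3)
    * time-varying log cumulative hazard ratio for a one-unit change
    * in hormon
    predict b_hormon, tvc(hormon)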
uncured can be used after fitting a cure model. It can be used with the
survival, hazard or centile() options to obtain predictions for the
'uncured' group.
xb predicts the linear predictor, including the spline function.
xbnobaseline predicts the linear predictor, excluding the spline function
- i.e. only the time-fixed part of the model.
zeros sets all covariates to zero (baseline prediction). For example,
predict s0, survival zeros calculates the baseline survival function.
See also at().
+------------+
----+ Subsidiary +-------------------------------------------------------
centol(#) defines the tolerance when searching for predicted survival time
at a given centile of the survival distribution. Default # is 0.0001.
deviance calculates deviance residuals.
dxb calculates the derivative of the linear predictor.
level(#) sets the confidence level; default is level(95) or as set by
set level.
startunc(#) sets the starting value for the Newton-Raphson algorithm for
estimating a centile of the survival time distribution of the 'uncured';
the default is the 12.5th centile of the observed follow-up times.
Examples
Setup
webuse brcancer
stset rectime, failure(censrec = 1)
Proportional hazards model
stpm2 hormon, scale(hazard) df(4) eform
predict h, hazard ci
predict s, survival ci
Time-dependent effects on cumulative hazard scale
stpm2 hormon, scale(hazard) df(4) tvc(hormon) dftvc(3)
predict hr, hrnumerator(hormon 1) ci
predict survdiff, sdiff1(hormon 1) ci
predict hazarddiff, hdiff1(hormon 1) ci
Use of at() option
stpm2 hormon x1, scale(hazard) df(4) tvc(hormon) dftvc(3)
predict s60h1, survival at(hormon 1 x1 60) ci
Also see
Online: [ST] stpm2