Title
survsim -- Simulate complex survival data
Syntax
survsim newvarname1 [newvarname2] [, options]
    options                        Description
    -------------------------------------------------------------------------
    n(int)                         number of survival times to generate; default is _N
    lambdas(numlist)               scale parameters
    gammas(numlist)                shape parameters
    distribution(exponential)      exponential survival distribution
    distribution(gompertz)         Gompertz survival distribution
    distribution(weibull)          Weibull survival distribution (default)
    covariates(vn # [# ...] ...)   baseline covariates
    tde(vn # [# ...] ...)          time-dependent effects

    2-component mixture
    mixture                        simulate survival times from a 2-component mixture model
    pmix(real)                     mixture parameter; default is 0.5

    Competing risks
    cr                             simulate survival times from the all-cause distribution of cause-specific hazards
    ncr(int)                       number of competing risks

    Newton-Raphson scheme
    centol(real)                   tolerance for the Newton-Raphson scheme; default is 0.0001
    showdiff                       display the maximum difference in estimates between iterations
    -------------------------------------------------------------------------
    Abbreviation: vn = varname
Description
survsim simulates survival times from parametric distributions. Distributions include the exponential, Gompertz and Weibull. Newton-Raphson iterations are used to generate survival times under a 2-component mixture model or a cause-specific hazards model for competing risks.

Non-proportional hazards can be included with all models except a mixture model; under an exponential or Weibull model covariates are interacted with log time, under a Gompertz model covariates are interacted with time. Baseline covariates can be included in all models.

newvarname1 specifies the new variable name to contain the generated survival times. newvarname2 is required when generating competing risks data to create the status indicator.
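For the standard models (no mixture, no competing risks), survival times can be generated in closed form by inverting the survival function, as described by Bender et al. (2005). The sketch below does this by hand for a Weibull and a Gompertz model, to show what the scale and shape parameters control. It assumes the hazards h(t) = lambda*gamma*t^(gamma-1) for the Weibull and h(t) = lambda*exp(gamma*t) for the Gompertz. These parameterizations, and the variable names u, t_weib and t_gomp, are assumptions for illustration and are not taken from survsim's own code.

```stata
* Hand-rolled inversion sketch; not survsim's own code
clear
set seed 12345
set obs 1000
generate trt = rbinomial(1, 0.5)
generate double u = runiform()      // U ~ Uniform(0,1), plays the role of S(t)

* Weibull, assuming S(t) = exp(-lambda * t^gamma * exp(xb))
* lambda = 0.1, gamma = 1.5, log hazard ratio for trt = -0.5
generate double t_weib = (-ln(u) / (0.1 * exp(-0.5 * trt)))^(1 / 1.5)

* Gompertz, assuming S(t) = exp(-(lambda/gamma) * (exp(gamma*t) - 1) * exp(xb))
* lambda = 0.1, gamma = 0.05, same log hazard ratio
generate double t_gomp = (1 / 0.05) * ln(1 - 0.05 * ln(u) / (0.1 * exp(-0.5 * trt)))

summarize t_weib t_gomp
```

Under a mixture or competing risks model the survival function has no closed-form inverse, which is why an iterative scheme is needed in those cases.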
Options
n(int) specifies the number of survival times to generate. Default is _N.
lambdas(numlist) defines the scale parameters of the parametric distribution(s). The number of values required depends on the model choice. By default a single value is required, corresponding to a standard parametric distribution. Under a mixture model 2 values are required. Under a competing risks model, cr, the number of values is defined by ncr.
gammas(numlist) defines the shape parameters of the parametric distribution(s). The number of entries must equal that of lambdas.
distribution(string) specifies the parametric survival distribution to use. exponential, gompertz or weibull can be used, with weibull the default.
covariates(varname # [# ...] ...) defines baseline covariates to be included in the linear predictor of the survival model, along with the value of the corresponding coefficient. For example, a treatment variable coded 0/1 can be included, with a log hazard ratio of 0.5, by covariates(treat 0.5). Variable treat must be in the dataset before survsim is run. If cr is used with ncr(4), then a coefficient for each covariate must be supplied for each competing risk, e.g. covariates(treat 0.5 -0.2 0.1 0.25).
tde(varname # [# ...] ...) creates non-proportional hazards by interacting covariates with log time under an exponential or Weibull model, or with time under a Gompertz model. This option is not available under mixture models. Values should be entered as tde(trt 0.5), for example.

        +---------------+
    ----+ Mixture model +----------------------------------------------------
mixture specifies that survival times are simulated from a 2-component mixture model. lambdas and gammas must be of length 2.
pmix(real) defines the mixture parameter. Default is 0.5.
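As a hedged illustration of what the mixture options control, the survival function of a 2-component Weibull mixture can be written as below, where p stands for pmix and the subscripts index the two entries of lambdas and gammas. The Weibull form of each component, and the convention that p weights the first component, are assumptions for this sketch.

```latex
S(t) = p \exp\left(-\lambda_1 t^{\gamma_1}\right) + (1 - p) \exp\left(-\lambda_2 t^{\gamma_2}\right)
```

Setting S(t) equal to a uniform draw u cannot be solved for t in closed form, so the solution has to be found numerically.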
        +-----------------+
    ----+ Competing risks +--------------------------------------------------
cr specifies that survival times are simulated from the all-cause distribution implied by ncr cause-specific hazards.
ncr(int) defines the number of competing risks. lambdas and gammas must be of length ncr.
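The approach described by Beyersmann et al. (2009) can be summarized as follows: an event time is drawn from the all-cause distribution, and the cause is then chosen with probability proportional to the cause-specific hazards at that time. The notation below, with K standing for ncr, is introduced here for illustration; how survsim codes the status indicator is not stated in this help file.

```latex
h(t) = \sum_{k=1}^{K} h_k(t), \qquad
S(t) = \exp\left(-\sum_{k=1}^{K} H_k(t)\right), \qquad
P(\text{cause} = k \mid T = t) = \frac{h_k(t)}{\sum_{j=1}^{K} h_j(t)}
```

Here h_k(t) and H_k(t) are the hazard and cumulative hazard of cause k, each with its own scale, shape and covariate coefficients.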
        +-----------------------+
    ----+ Newton-Raphson scheme +--------------------------------------------
centol(real) specifies the tolerance of the Newton-Raphson scheme. Default is 0.0001.
showdiff displays the maximum difference in estimates between iterations. This can be used to monitor convergence.
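For orientation, a textbook Newton-Raphson update for solving S(t) = u is sketched below, with one equation per observation i and iteration index m. This is a generic form: the help file does not state which function or time scale survsim iterates on, so the exact update and stopping rule shown here are assumptions.

```latex
t_i^{(m+1)} = t_i^{(m)} - \frac{S\left(t_i^{(m)}\right) - u_i}{S'\left(t_i^{(m)}\right)}, \qquad
\text{stop when } \max_i \left| t_i^{(m+1)} - t_i^{(m)} \right| < \text{centol}
```

Under this reading, the quantity reported by showdiff corresponds to the maximum on the left-hand side of the stopping rule.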
Remarks
On rare occasions the Newton-Raphson scheme may not converge. If this happens, try different parameter values or a different tolerance, centol(), and use showdiff to monitor the differences between iterations.
Examples
Generate times from a Weibull model including a binary treatment variable, with log(hazard ratio) = -0.5, and censoring after 5 years:

    . clear
    . set obs 1000
    . gen trt = rbinomial(1,0.5)
    . survsim stime, lambdas(0.1) gammas(1.5) cov(trt -0.5)
    . gen died = stime <= 5
    . replace stime = 5 if died == 0
    . stset stime, f(died = 1)
    . streg trt, dist(weibull) nohr

Generate times from a Gompertz model:

    . survsim stime, n(1000) lambdas(0.1) gammas(0.05) dist(gompertz)

Generate times from a 2-component mixture Weibull model:

    . survsim stime, n(1000) mixture lambdas(0.1 0.05) gammas(1 1.5) pmix(0.5)

Generate times from a competing risks model with 4 cause-specific Weibull hazards and 4 cause-specific treatment effects:

    . survsim stime status, n(1000) cr ncr(4) lambdas(0.1 0.05 0.1 0.05) gammas(0.5 1.5 1 1.2) cov(trt 0.2 0.1 -0.1 0.4)

Generate times from a Weibull model with diminishing treatment effect:

    . survsim stime, n(1000) lambdas(0.1) gammas(1.5) cov(trt -0.5) tde(trt 0.05)
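To see why the last example gives a diminishing treatment effect: under a Weibull model the tde() coefficient multiplies log time, so the implied log hazard ratio for trt at time t is -0.5 + 0.05*log(t). The short loop below evaluates this at a few time points chosen for illustration; that the two coefficients combine additively in this way is an assumption based on the description of tde() above.

```stata
* Implied log hazard ratio for trt at selected times
* (Weibull model: covariate interacted with log time)
foreach t of numlist 0.5 1 2 5 10 {
    display "t = `t'   log HR = " %6.3f (-0.5 + 0.05 * ln(`t'))
}
```

At t = 1 the log hazard ratio equals the cov() coefficient, -0.5, and it moves toward zero as t increases.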
Author
Michael J. Crowther, University of Leicester, United Kingdom. mjc76@le.ac.uk.
Please report any errors you may find.
References
Bender, R.; Augustin, T. and Blettner, M. Generating survival times to simulate Cox proportional hazards models. Stat Med, 2005, 24, 1713-1723.
Beyersmann, J.; Latouche, A.; Buchholz, A. and Schumacher, M. Simulating competing risks data in survival analysis. Stat Med, 2009, 28, 956-971.