-------------------------------------------------------------------------------
help for stexpect                                                            EC
-------------------------------------------------------------------------------

Compute and tabulate Expected Survival

stexpect [newvarname] [if exp] [in range], ratevar(varname) [output(filename [,replace]) method(#) at(numlist) by(varlist) npoints(#) nolist]

stexpect is for use with survival-time data; see help st. You must stset your data using the id() option before using this command; see help stset.

stexpect tabulates the expected survival probability. You must specify the reference rate variable in ratevar(varname).

Description

Expected survival is the survival that the study subjects would experience if they were subject to the rates of a reference population. The resulting overall curve is usually added to the Kaplan-Meier plot of the study group for visual comparison between these subjects and the population at large. Before estimating the expected survival, the follow-up time of the study group has to be defined, and the definition differs among the three common methods of computing it.

In the Ederer or "exact" method, the subjects in the study cohort are assumed to be neither censored nor dead before the end of a stated follow-up time. Let us assume that the subjects of the cohort are enrolled from 1985 to 1990 and that we want to estimate the expected survival up to ten years from the start of the follow-up. If entrydate is a decimal date, we set the time variable as follows:

. gen survtime = entrydate + 10
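What a fixed follow-up of this kind implies can be sketched on toy data. As the method is usually described in the literature, each subject contributes an individual expected survival derived from the reference rates over the whole stated period, and these are averaged across subjects. The dataset, the variable names (cumhaz, s_i, expected) and the rates below are invented for illustration only; this is not taken from the stexpect source code and may differ from what the command does internally.

```stata
* Toy data (invented): annual reference rates for two subjects over 3 years
clear
input id year rate
1 1 0.010
1 2 0.012
1 3 0.015
2 1 0.030
2 2 0.034
2 3 0.040
end

* Cumulative hazard for each subject, accumulated year by year
bysort id (year): gen cumhaz = sum(rate)

* Individual expected survival at the end of each year
gen s_i = exp(-cumhaz)

* Average across subjects at each year (no subject leaves the risk set)
collapse (mean) expected = s_i, by(year)
list, clean noobs
```

Because nobody is removed from the risk set, every subject carries the same weight at every time point in this sketch.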

In the Hakulinen method, the follow-up time is the actual censoring time for those patients who are censored and the maximum potential follow-up for those who die, i.e. the most optimistic last contact date. In the previous example, suppose that the follow-up was closed on June 30, 2000, that status is 1 if the subject dies and 0 otherwise, and that exitdate is the actual end of follow-up. Then

. gen survtime = cond(status,2000.5,exitdate)

The Conditional method (or Ederer II) is simpler in this respect, because the follow-up time is the actual one, whether the subject dies or not, i.e.:

. gen survtime = exitdate
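The three definitions above can be compared side by side on a small invented dataset. The variable names entrydate, exitdate and status follow the text; the names surv_ederer, surv_hakulinen and surv_conditional and all the values are hypothetical and serve only to show how the follow-up time differs by method.

```stata
* Toy data (invented): decimal dates, status = 1 if the subject died
clear
input id entrydate exitdate status
1 1985.2 1992.7 1
2 1987.5 2000.5 0
3 1990.0 1996.3 0
end

* Ederer: everyone is followed for the stated ten years
gen surv_ederer = entrydate + 10

* Hakulinen: closing date for deaths, actual exit for censored subjects
gen surv_hakulinen = cond(status, 2000.5, exitdate)

* Conditional (Ederer II): actual exit for everyone
gen surv_conditional = exitdate

list, clean noobs
```

Only one of these variables would then be passed to stset, depending on the method requested.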

Combining the observed survival function produced by sts generate with the expected survival estimated by stexpect, it is easy to obtain the relative survival function, the preferred survival measure in cancer registry activity. With population-based cancer registry data, which frequently involve a large number of cases, the memory requirements of stexpect can exceed the available RAM. In this case the expected survival function can still be estimated using the npoints(#) option.
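As a minimal sketch of this combination, relative survival is the ratio of observed to expected survival at the same time point. The values and the variable names (t, observed, expected, relsurv) below are invented. In practice the observed and expected estimates come from different files and would first have to be aligned on a common set of times.

```stata
* Toy data (invented): observed and expected survival at matching times
clear
input t observed expected
1  0.90 0.97
5  0.62 0.85
10 0.40 0.70
end

* Relative survival = observed survival / expected survival
gen relsurv = observed / expected
list, clean noobs
```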

Options

ratevar(varname) is required. It specifies the reference rate variable.

output(filename [,replace]) saves a dataset in filename. The file contains the expected survival probability, the time to which it refers and the number at risk. If newvarname is not specified, expected survival is stored in a variable named Survexp.

method(#) defines the method to be used according to the following numerical codes:
    1 = Ederer
    2 = Conditional (Ederer II)
    3 = Hakulinen
Hakulinen is the default.

at(numlist) specifies the times at which the expected survival is to be computed. It is required if method(1) (Ederer) is chosen. With the other methods, if at(numlist) is omitted, an estimate is returned for each unique survival time.

by(varlist) specifies up to 5 by variables and produces separate estimates for each group identified by equal values of the by() variable(s) taking on integer or string values.

npoints(#) specifies the number of points at which to calculate intermediate results, evenly spaced over the range of the survival times. By default the calculation is done at each unique follow-up time, so stexpect may require large amounts of memory. For large datasets, specifying npoints(#) can reduce the amount of memory and computation required.

nolist suppresses the output. This is used only when saving results to a file specified by output().

Example

The following example uses the mgus data, downloadable from Prof. T. Therneau's web page at http://www.mayo.edu/hsr/people/therneau/book/book.html. The file survexp.us, freely available within an R package, has been used as the source of the US reference rates. They have been transformed into annual rates and saved as a Stata data file named usrate. Both files are included in the package.

The example illustrates how to plot the observed survival and the expected survival estimated using both Ederer's method and Hakulinen's method. When the Conditional method is applied, the Conditional and Hakulinen expected survival are plotted together with the observed survival.

Observed Survival

.use mgusexample,clear
.stset time, failure(status) scale(365.25)
.sts gen kaplan = s,by(sex)
.gen kapmale=kaplan if sex==1
.gen kapfemale=kaplan if sex==2
.label var kapmal "S(males)"
.label var kapfem "S(females)"
.preserve

Ederer Method

In this example the stated follow-up time is 30 years.

.gen survederer=30 *365.25
.stset survederer, f(status) id(id) scale(365.25) noshow

Usually, to use reference population rates, the data have to be split by age band and calendar year and then merged with a suitable file where the reference rates are stored.

.stsplit fu, at(0(1)30)
.gen year = yeardiagnosis + fu
.gen age = agediagnosis + fu
.sort year age sex
.merge year age sex using usrate, uniqus nokeep

Now expected survival is saved in a file named mgusederer.

.stexpect ederer, ratevar(rate) at(0(1)30) out(mgusederer,replace) method(1) by(sex)

Note that t_exp is the time at which expected survival has been estimated.

.use mgusederer,clear
.gen edermale = ederer if sex==1
.gen ederfemale = ederer if sex==2
.save mgusederer,replace
.restore,preserve

The two files are joined.

.append using mgusederer

.twoway (line kapmale kapfemale _t, sort c(J J) clc(blue*1.3 red*1.3)) ///
        (lowess edermale t_exp, bw(.3) clc(blue*1.3)) ///
        (lowess ederfemale t_exp, bw(.3) clc(red*1.3)), ///
        xti("Years of Follow Up") yti("Survival") xla(0(5)30) ///
        legend(label(3 "Expected males") label(4 "Expected females") pos(7) ring(0) col(1)) ///
        t1t("MGUS example") t2t("Ederer Method")

-------------------------------------------------------------------------------

Hakulinen Method

In this study the follow-up was closed on 1 August 1990. The potential follow-up for subjects who die is therefore the difference between this date and the date of diagnosis.

.use mgusexample,clear
.gen survhakulinen = cond(status,mdy(8,1,1990) - datediagnosis, time)
.stset survhakulinen, f(status) id(id) scale(365.25) noshow
.stsplit fu, at(0(1)30)
.gen year = yeardiagnosis + fu
.gen age = agediagnosis + fu
.sort year age sex
.merge year age sex using usrate, uniqus nokeep
.stexpect hakulinen, ratevar(rate) at(0(1)30) out(mgushakulinen,replace) by(sex)
.use mgushakulinen,clear
.gen hakulmale = hakulinen if sex==1
.gen hakulfemale = hakulinen if sex==2
.save mgushakulinen,replace
.restore, preserve
.append using mgushakulinen
.twoway (line kapmal kapfem _t, sort c(J J) clc(blue*1.3 red*1.3)) ///
        (lowess hakulmale t_exp, bw(.3) clc(blue*1.3)) ///
        (lowess hakulfemale t_exp, bw(.3) clc(red*1.3)), ///
        xti("Years of Follow Up") yti("Survival") xla(0(5)30) ///
        legend(label(3 "Expected males") label(4 "Expected females") pos(7) ring(0) col(1)) ///
        t1t("MGUS example") t2t("Hakulinen Method")

-------------------------------------------------------------------------------

Conditional Method

.use mgusexample,clear

Actual Follow up

.stset time, f(status) id(id) scale(365.25) noshow
.stsplit fu, at(0(1)30)
.gen year = yeardiagnosis + fu
.gen age = agediagnosis + fu
.sort year age sex
.merge year age sex using usrate, uniqus nokeep
.stexpect conditional, ratevar(rate) at(0(1)30) out(mgusconditional,replace) by(sex) method(2)
.use mgusconditional,clear
.gen condmale = conditional if sex==1
.gen condfemale = conditional if sex==2
.save mgusconditional,replace
.restore
.append using mgushakulinen
.append using mgusconditional
.twoway (line kapmale kapfemale _t,sort c(J J) clc(blue*1.3 red*1.3)) ///
        (lowess hakulmale t_exp, bw(.3) clc(blue*1.3)) ///
        (lowess hakulfemale t_exp, bw(.3) clc(red*1.3)) ///
        (lowess condmale t_exp, bw(.3) clc(black) clp(shortdash)) ///
        (lowess condfemale t_exp, bw(.3) clc(black) clp(shortdash)), ///
        xti("Years of Follow Up") yti("Survival") xla(0(5)30) ///
        legend(label(3 "Hakulinen males") label(4 "Hakulinen females") ///
        label(5 "Conditional males") label(6 "Conditional females") pos(2) ring(0) col(1)) ///
        t1t("MGUS example") t2t("Conditional vs Hakulinen Method")

To run this example, place mgusexample.dta and usrate.dta in one of the directories listed in your adopath.


References

Therneau TM, Grambsch PM. Modeling Survival Data: Extending the Cox Model, pp. 261-287. Springer, 2000.

Author

Enzo Coviello, Department of Prevention ASL BA/1, Minervino Murge, IT. Email: enzo.coviello@tin.it

Also see

Manual: [ST] sts generate, [ST] strate