{smcl}
{* *! version 1.0.0}{...}
{vieweralsosee "stpm3 postestimation" "help stpm3_postestimation"}{...}
{vieweralsosee "stpm3 extended varlist" "help stpm3_extfunctions"}{...}
{vieweralsosee "stpm3" "help stpm3"}{...}

{title:A guide to predictions for {cmd:stpm3}}

{pstd}
{cmd:stpm3} has a range of useful post-estimation utilities. 
This help file shows how to obtain various types of predictions conditional on covariate patterns.

{pstd}
For more examples see 
{bf:{browse "https://pclambert.net/software/stpm3/":https://pclambert.net/software/stpm3/}}.

{dlgtab:Load data}

{pstd}
The predictions are easiest to explain through examples. The best way to use this help file is to start with
no dataset in memory, load the data, and then click through the examples in order.

{pstd}{cmd:use https://www.pclambert.net/data/rott2b}{p_end}
{pstd}{cmd:stset os, failure(osi = 1) scale(12) exit(time 120)}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 0":click to load data (clear current data first)})}{p_end}

{dlgtab:Example 1 - simple model with a single covariate}

{pstd}
The model will include one binary covariate, {bf:hormon}.
Predictions of the survival will be made at each level of {bf:hormon}.
This can be done through the use of two {cmd:at()} options. 

{pstd}{cmd:stpm3 i.hormon, scale(lncumhazard) df(5) eform}{p_end}
{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(0 10)}{p_end}
{pstd}{bf:frame stpm3_pred {c -(}}{p_end}
{p 6}{bf:line S0 S1 tt}{p_end}
{p 6}{bf:list in 1/10}{p_end}
{pstd}{bf:{c )-}}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1a":click to run.})}{p_end}

{pstd}
This has saved the predictions to a new frame named {it:stpm3_pred}.
If you try running the command again, you will get an error as the frame now exists.
You could use {cmd:frame(, replace)} or write to a different frame.
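
{pstd}
Another route, sketched here with Stata's built-in frame commands rather than any {cmd:stpm3} option,
is to list the frames in memory and drop the old one before predicting again.
This assumes the frame {it:stpm3_pred} was created by the example above and is not the current frame.

{pstd}{cmd:frames dir}{p_end}
{pstd}{cmd:frame drop stpm3_pred}{p_end}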

{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(0 10) frame(p, replace)}{p_end}
{pstd}{bf:frame p: describe}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1b":click to run.})}{p_end}

{pstd}
Frame {cmd:p} contains a time variable {bf:tt} with
100 equally spaced values between 0 and 10. The predictions, together with the lower and upper bounds
of their 95% confidence intervals, are given at each value of time.

{pstd}
Instead of using {bf:timevar(0 10)} we could have created a time variable and then used
the {bf:timevar}({it:varname}) option.

{pstd}{cmd:range tt 0 10 100}{p_end}
{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(tt) frame(p, replace)}{p_end}
{pstd}{bf:frame p: list in 1/10}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1c":click to run.})}{p_end}

{pstd}
It can be useful to have more conveniently spaced time points. You can use the {bf:step()} suboption to define the gap between
the time points. Below we use {cmd:step(0.1)}, which enables us to list predictions at specific years.

{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(0 10, step(0.1)) frame(p, replace)}{p_end}
{pstd}{bf:frame p: list if inlist(tt,1,5,10), noobs}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1d":click to run.})}{p_end}

{pstd}
The same time points could have been obtained with the {bf:n(101)} suboption, since a step of 0.1 between 0 and 10
gives 101 values. This is shown below, where we also change the name of the time variable using the {bf:gen()} suboption.

{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(0 10, n(101) gen(time)) frame(p, replace)}{p_end}
{pstd}{bf:frame p: list in 1/10 }{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1e":click to run.})}{p_end}

{pstd}
You do not have to save data to a frame. If you use the {bf:merge} option 
the predictions will be merged with the current dataset.

{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0) at2(hormon 1) timevar(0 10, step(0.1) gen(t)) merge}{p_end}
{pstd}{bf:list t S0* S1* in 1/10}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 1f":click to run.})}{p_end}
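
{pstd}
As a quick check of the merged predictions, the sketch below plots them with Stata's built-in {cmd:line} command.
It assumes the {bf:merge} example above has just been run, so that {bf:S0}, {bf:S1} and {bf:t} exist in the current dataset.

{pstd}{cmd:line S0 S1 t, sort}{p_end}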

{dlgtab:Example 2 - Including more covariates}

{pstd}
Now we will fit a model with three covariates.
This will include the binary covariate {bf:hormon}, the continuous covariate {bf:age} and the categorical covariate {bf:size}.
{bf:age} will be modelled using natural splines with 3 df (4 knots)
using {helpb stpm3_extfunctions:stpm3 extended functions}.
We will also include an interaction between {bf:hormon} and {bf:age}.

{pstd}{cmd:stpm3 i.hormon##@ns(age,df(3)) i.size, scale(lncumhazard) df(5) eform}{p_end}
{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0 age 60 size 3) at2(hormon 1 age 60 size 3) timevar(0 10, step(0.1)) frame(p, replace)}{p_end}
{pstd}{bf:frame p {c -(}}{p_end}
{p 6}{bf:line S0 S1 tt}{p_end}
{p 6}{bf:list in 1/10}{p_end}
{pstd}{bf:{c )-}}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 2a":click to run.})}{p_end}

{pstd}
Although the model is complex, containing non-linear effects and interaction terms,
the prediction syntax is relatively easy. The only covariates in the model
are {cmd:age}, {cmd:size} and {cmd:hormon}, so all the {cmd:predict} command needs is the
age and the levels of {cmd:size} and {cmd:hormon} you want a prediction for.
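
{pstd}
Before choosing values for the {cmd:at()} options, it can help to check which values the covariates take in the data.
A minimal sketch using standard Stata commands (these are not part of {cmd:stpm3}):

{pstd}{cmd:tabulate size hormon}{p_end}
{pstd}{cmd:summarize age, detail}{p_end}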

{pstd}
If your model contains time-dependent effects using the {cmd:tvc()} option,
then the prediction command stays the same.

{pstd}{cmd:stpm3 i.hormon##@ns(age,df(3)) i.size, scale(lncumhazard) df(5) tvc(@ns(age,df(3))) dftvc(3)}{p_end}
{pstd}{cmd:predict S0 S1, surv ci at1(hormon 0 age 60 size 3) at2(hormon 1 age 60 size 3) timevar(0 10, step(0.1)) frame(p, replace)}{p_end}
{pstd}{bf:frame p {c -(}}{p_end}
{p 6}{bf:line S0 S1 tt}{p_end}
{p 6}{bf:list in 1/10}{p_end}
{pstd}{bf:{c )-}}{p_end}
{pstd}{it:({stata "stpm3_predictions_eg 2b":click to run.})}{p_end}