{smcl}
{* *! version 30aug2024}{...}

{pstd}{ul:Partially-linear model - Detailed example with stacking regression using {help pystacked}}{p_end}

{pstd}Preparation: we load the data, define global macros, and set the seed.{p_end}

{phang2}. {stata "use https://github.com/aahrens1/ddml/raw/master/data/sipp1991.dta, clear"}{p_end}
{phang2}. {stata "global Y net_tfa"}{p_end}
{phang2}. {stata "global D e401"}{p_end}
{phang2}. {stata "global X tw age inc fsize educ db marr twoearn pira hown"}{p_end}
{phang2}. {stata "set seed 42"}{p_end}

{pstd}We next initialize the ddml estimation and select the model. {it:partial} refers to the partially linear model. The model will be stored in a Mata object with the default name "m0" unless otherwise specified using the {opt mname(name)} option.{p_end}

{pstd}We set the number of random folds to 2 so that the model runs quickly. The default is {opt kfolds(5)}. We recommend considering at least 5-10 folds, and more if your sample size is small.{p_end}

{pstd}We also recommend re-running the model multiple times on different random folds; see the {opt reps(integer)} option. Here we set the number of repetitions to 2, again only so that the model runs quickly.{p_end}

{phang2}. {stata "ddml init partial, kfolds(2) reps(2)"}{p_end}

{pstd}Stacking regression is a simple and powerful method for combining predictions from multiple learners. Here we use {help pystacked} with the partially linear model, but it can be used with any model supported by {cmd:ddml}.{p_end}

{pstd}Note: the additional support provided by {opt ddml} for {help pystacked} (see {help ddml##pystacked:above}) is available only if, as in this example, {help pystacked} is the only learner for each conditional expectation. Multiple learners are provided to {help pystacked}, not directly to {opt ddml}.{p_end}

{pstd}Add supervised machine learners for estimating conditional expectations. The first learner in the stacked ensemble is OLS. We also use cross-validated lasso, ridge, and two random forests with different settings, which we save in the following macros:{p_end}

{phang2}. {stata "global rflow max_features(5) min_samples_leaf(1) max_samples(.7)"}{p_end}
{phang2}. {stata "global rfhigh max_features(5) min_samples_leaf(10) max_samples(.7)"}{p_end}

{phang2}. {stata "ddml E[Y|X]: pystacked $Y $X || method(ols) || method(lassocv) || method(ridgecv) || method(rf) opt($rflow) || method(rf) opt($rfhigh), type(reg)"}{p_end}
{phang2}. {stata "ddml E[D|X]: pystacked $D $X || method(ols) || method(lassocv) || method(ridgecv) || method(rf) opt($rflow) || method(rf) opt($rfhigh), type(reg)"}{p_end}

{pstd}Note: options before ":" and after the first comma refer to {cmd:ddml}; options after the final comma refer to the estimation command. Make sure not to confuse the two types of options.{p_end}

{pstd}Check that the learners were added correctly:{p_end}

{phang2}. {stata "ddml desc, learners"}{p_end}

{pstd}Cross-fitting: the learners are iteratively fitted on the training data. This step may take a while, depending on the number of learners, repetitions, folds, etc. In addition to the standard stacking done by {help pystacked}, we also request short-stacking, which is done by {opt ddml}. Whereas stacking relies on (out-of-sample) cross-validated predicted values to obtain the relative weights for the base learners, short-stacking uses the (out-of-sample) cross-fitted predicted values.{p_end}

{phang2}. {stata "ddml crossfit, shortstack"}{p_end}

{pstd}Finally, we estimate the coefficients of interest.{p_end}

{phang2}. {stata "ddml estimate, robust"}{p_end}
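
{pstd}The estimation results are stored in the model and can be displayed again later without re-estimating. A minimal sketch using the {opt replay} option ({opt mname(m0)} is the default model name and could be omitted here):{p_end}

{phang2}. {stata "ddml estimate, mname(m0) replay"}{p_end}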
{stata "ddml estimate, robust"}{p_end} {pstd}Examine the standard ({cmd:pystacked}) stacking weights as well as the {opt ddml} short-stacking weights.{p_end} {phang2}. {stata "ddml extract, show(stweights)"}{p_end} {phang2}. {stata "ddml extract, show(ssweights)"}{p_end} {pstd}Replicate the {opt ddml estimate} short-stacking results for resample 2 by hand, using the estimated conditional expectations generated by {opt ddml}, and compare using {opt ddml estimate, replay}:{p_end} {phang2}. {stata "cap drop Yresid"}{p_end} {phang2}. {stata "cap drop Dresid"}{p_end} {phang2}. {stata "gen double Yresid = $Y - Y_net_tfa_ss_2"}{p_end} {phang2}. {stata "gen double Dresid = $D - D_e401_ss_2"}{p_end} {phang2}. {stata "regress Yresid Dresid, robust"}{p_end} {phang2}. {stata "ddml estimate, mname(m0) spec(ss) rep(2) notable replay"}{p_end} {pstd}Obtain the estimated coefficient using ridge - the 3rd {help pystacked} learner - as the only learner for the 2nd cross-fit estimation (resample 2), using the estimated conditional expectations generated by {opt ddml} and {help pystacked}. This can be done using {opt ddml estimate} with the {opt y(.)} and {opt d(.)} options: "L3" means the 3rd learner and "_2" means resample 2. Then replicate by hand.{p_end} {phang2}. {stata "ddml estimate, y(Y1_pystacked_L3_2) d(D1_pystacked_L3_2) robust"}{p_end} {phang2}. {stata "cap drop Yresid"}{p_end} {phang2}. {stata "cap drop Dresid"}{p_end} {phang2}. {stata "gen double Yresid = $Y - Y1_pystacked_L3_2"}{p_end} {phang2}. {stata "gen double Dresid = $D - D1_pystacked_L3_2"}{p_end} {phang2}. {stata "regress Yresid Dresid, robust"}{p_end}