{smcl}
{* *! version 30aug2024}{...}
{pstd}{ul:Partially-linear model with {help pystacked} and stacking}:{p_end}

{pstd}Preparation: load the data, define global macros, set the seed and initialize the model.{p_end}

{phang2}. {stata "use https://github.com/aahrens1/ddml/raw/master/data/sipp1991.dta, clear"}{p_end}
{phang2}. {stata "global Y net_tfa"}{p_end}
{phang2}. {stata "global D e401"}{p_end}
{phang2}. {stata "global X tw age inc fsize educ db marr twoearn pira hown"}{p_end}
{phang2}. {stata "set seed 42"}{p_end}
{phang2}. {stata "ddml init partial, kfolds(2) reps(2)"}{p_end}

{pstd}Add supervised machine learners for estimating conditional expectations.
For simplicity, we use {help pystacked}'s default learners:
OLS, cross-validated lasso, and gradient boosting.{p_end}

{phang2}. {stata "ddml E[Y|X]: pystacked $Y $X"}{p_end}
{phang2}. {stata "ddml E[D|X]: pystacked $D $X"}{p_end}

{pstd}Cross-fitting and estimation: The learners are iteratively fitted on the training data
to obtain the estimated conditional expectations,
and then the causal coefficient of interest is estimated
along with heteroskedasticity-consistent standard errors.
Note that the initial stacking is specified at the {help ddml crossfit:cross-fitting} stage.
In addition to the standard stacking done by {helpb pystacked},
also request short-stacking and pooled-stacking to be done by {opt ddml}.{p_end}

{phang2}. {stata "ddml crossfit, shortstack poolstack"}{p_end}
{phang2}. {stata "ddml estimate, robust"}{p_end}

{pstd}Examine the standard ({cmd:pystacked}) stacking weights
as well as the {opt ddml} short-stacking and pooled-stacking weights.{p_end}

{phang2}. {stata "ddml extract, show(stweights)"}{p_end}
{phang2}. {stata "ddml extract, show(ssweights)"}{p_end}
{phang2}. {stata "ddml extract, show(psweights)"}{p_end}

{pstd}Re-stack without cross-fitting, using the single-best learner
instead of the default constrained non-negative least squares.
We do this using the {help ddml estimate} command.
Since no stacking method is specified, re-stacking will be done for all three methods.{p_end}

{phang2}. {stata "ddml estimate, robust finalest(singlebest)"}{p_end}

{pstd}As above, but request short-stacking only at the cross-fitting stage.
Note the speed improvement.{p_end}

{phang2}. {stata "ddml crossfit, shortstack nostdstack"}{p_end}
{phang2}. {stata "ddml estimate, robust"}{p_end}

{pstd}Re-stack the above without cross-fitting, using OLS as the final estimator.
Use the option {opt shortstack} since only these results are re-stacked.{p_end}

{phang2}. {stata "ddml estimate, robust shortstack finalest(ols)"}{p_end}

{pstd}{ul:Extended example with specified {help pystacked} learners and settings}:{p_end}

{pstd}Same example as above, but specify the base learners explicitly.
We again make use of {help pystacked} integration,
so there is a single call to {help pystacked} for each conditional expectation.
The first learner in the stacked ensemble is OLS.
We also use cross-validated lasso, cross-validated ridge, and two random forests with different settings.
The settings are stored in macros for readability.{p_end}

{phang2}. {stata "ddml init partial, kfolds(2) reps(2)"}{p_end}
{phang2}. {stata "global rflow max_features(5) min_samples_leaf(1) max_samples(.7)"}{p_end}
{phang2}. {stata "global rfhigh max_features(5) min_samples_leaf(10) max_samples(.7)"}{p_end}
{phang2}. {stata "ddml E[Y|X]: pystacked $Y $X || method(ols) || method(lassocv) || method(ridgecv) || method(rf) opt($rflow) || method(rf) opt($rfhigh), type(reg)"}{p_end}
{phang2}. {stata "ddml E[D|X]: pystacked $D $X || method(ols) || method(lassocv) || method(ridgecv) || method(rf) opt($rflow) || method(rf) opt($rfhigh), type(reg)"}{p_end}

{pstd}Note: Options before ":" and after the first comma refer to {cmd:ddml}.
Options that come after the final comma refer to the estimation command.
Take care not to confuse the two types of options.{p_end}

{pstd}The learners are iteratively fitted on the training data.
In addition to the standard stacking done by {helpb pystacked},
also request short-stacking to be done by {opt ddml}.
Finally, estimate the coefficients of interest.{p_end}

{phang2}. {stata "ddml crossfit, shortstack"}{p_end}
{phang2}. {stata "ddml estimate, robust"}{p_end}

{pstd}Examine the standard ({cmd:pystacked}) stacking weights
as well as the {opt ddml} short-stacking weights.{p_end}

{phang2}. {stata "ddml extract, show(stweights)"}{p_end}
{phang2}. {stata "ddml extract, show(ssweights)"}{p_end}
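
{pstd}A possible follow-up, sketched here under the assumption that the re-stacking options
from the first example carry over unchanged to the extended model:
re-stack the short-stacked results without cross-fitting,
using the single-best learner as the final estimator,
and then inspect the resulting short-stacking weights.
Only options already used above ({opt shortstack}, {opt finalest(singlebest)}, {opt show(ssweights)}) appear here.{p_end}

{phang2}. {stata "ddml estimate, robust shortstack finalest(singlebest)"}{p_end}
{phang2}. {stata "ddml extract, show(ssweights)"}{p_end}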