1.0 is a slack parameter; gamma controls the confidence level. The alternative formula lambda0 = 2c*sqrt(N)*sqrt(2*log(2p/(gamma/log(N)))) is available with the {opt lalt} option; with the cluster-lasso, the number of clusters (saved in {opt e(N_clust)}) replaces N. The constants c and gamma can be set using the {opt c(real)} and {opt gamma(real)} options. The denominator of the gamma fraction can be set with the {opt gammad(real)} option; some papers in the literature set the denominator to 1, i.e., {opt gammad(1)} and lambda0=2c*sqrt(N)*invnormal(1-gamma/(2p)). The {opt xdep} option is another alternative that implements an "X-dependent" penalty level lambda0; see Belloni and Chernozhukov ({helpb rlasso##BC2011:2011}) and Belloni et al. ({helpb rlasso##BCH2013:2013}) for discussion. The {opt c(real)} option is ignored when {opt xdep} is specified. {pstd} The default lambda for the lasso is lambda0*rmse, where rmse is an estimate of the standard deviation of the error term. The sqrt-lasso differs from the standard lasso in that the penalty term lambda is pivotal in the homoskedastic case and does not depend on the error variance. The default for the sqrt-lasso is lambda=lambda0=c*sqrt(N)*invnormal(1-(gamma/log(N))/(2*p)) (note the absence of the factor of "2" vs. the lasso lambda). {marker loadings}{...} {title:Penalty loadings} {pstd} As is standard in the lasso literature, regressors are standardized to have unit variance. By default, standardization is achieved by incorporating the standard deviations of the regressors into the penalty loadings. In the default homoskedastic case, the penalty loadings are the vector of standard deviations of the regressors. The standardized penalty loadings are the penalty loadings normalized by the SDs of the regressors. In the homoskedastic case the standardized penalty loadings are a vector of 1s. 
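To make the penalty-level formulas above concrete, here is a minimal Python sketch of the default and {opt lalt} formulas (an illustration only, not the {opt rlasso} implementation; N, p, c and gamma are supplied by the user):

```python
from math import log, sqrt
from statistics import NormalDist

def lambda0_lasso(N, p, c, gamma):
    """Default lasso penalty level:
    lambda0 = 2c*sqrt(N)*invnormal(1 - (gamma/log(N))/(2p))."""
    return 2 * c * sqrt(N) * NormalDist().inv_cdf(1 - (gamma / log(N)) / (2 * p))

def lambda0_sqrt_lasso(N, p, c, gamma):
    """sqrt-lasso default: the same expression without the factor of 2."""
    return c * sqrt(N) * NormalDist().inv_cdf(1 - (gamma / log(N)) / (2 * p))

def lambda0_lalt(N, p, c, gamma):
    """Alternative formula selected by the lalt option:
    lambda0 = 2c*sqrt(N)*sqrt(2*log(2p/(gamma/log(N))))."""
    return 2 * c * sqrt(N) * sqrt(2 * log(2 * p / (gamma / log(N))))
```

For the lasso the penalty actually applied is lambda0*rmse; the sqrt-lasso uses lambda0 directly.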
{opt rlasso} saves the vector of penalty loadings, the vector of standardized penalty loadings, and the vector of SDs of the regressors X in {opt e(.)} macros. {pstd} Penalty loadings are constructed after the partialling-out of unpenalized regressors and/or the FE (fixed-effects) transformation, if applicable. An alternative to partialling-out unpenalized regressors with the {opt partial(varlist)} option is to give them penalty loadings of zero with the {opt pnotpen(varlist)} option. By the Frisch-Waugh-Lovell Theorem for the lasso (Yamada {helpb rlasso##Yam2017:2017}), the estimated lasso coefficients are the same in theory (but see {helpb rlasso##notpen:below}) whether the unpenalized regressors are partialled-out or given zero penalty loadings, so long as the same penalty loadings are used for the penalized regressors in both cases. Note that the calculation of the penalty loadings in both the {opt partial(.)} and {opt pnotpen(.)} cases involves adjustments for the partialled-out variables. This is different from the {opt lasso2} handling of unpenalized variables specified in the {opt lasso2} option {opt notpen(.)}, where no such adjustment of the penalty loadings is made (and is why the two no-penalization options are named differently). {pstd} Regressor-specific penalty loadings for the heteroskedastic and clustered cases are derived following the methods described in Belloni et al. ({helpb rlasso##BCCH2012:2012}, {helpb rlasso##BCH2013:2013}, {helpb rlasso##BCH2014:2014}, {helpb rlasso##BCH2015:2015}, {helpb rlasso##BCHK2016:2016}). The penalty loadings for the heteroskedastic-robust case have elements of the form sqrt[avg(x^2e^2)]/sqrt[avg(e^2)] where x is a (demeaned) regressor, e is the residual, and sqrt[avg(e^2)] is the root mean squared error; the standardized penalty loadings have elements sqrt[avg(x^2e^2)]/(sqrt[avg(x^2)]sqrt[avg(e^2)]) where the sqrt[avg(x^2)] in the denominator is SD(x), the standard deviation of x. 
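The heteroskedasticity-robust loadings just described can be sketched in Python (a sketch of the formulas above, not the {opt rlasso} code; X is the regressor matrix and e the residual vector):

```python
import numpy as np

def het_penalty_loadings(X, e):
    """Heteroskedasticity-robust penalty loadings as in the text:
    element j is sqrt(avg(x_j^2 * e^2)) / sqrt(avg(e^2)), with x_j demeaned;
    the standardized loadings additionally divide by SD(x_j)."""
    Xd = X - X.mean(axis=0)                       # demean the regressors
    rmse = np.sqrt(np.mean(e ** 2))               # sqrt[avg(e^2)]
    psi = np.sqrt(np.mean(Xd ** 2 * (e ** 2)[:, None], axis=0)) / rmse
    sd_x = np.sqrt(np.mean(Xd ** 2, axis=0))      # SD(x_j)
    return psi, psi / sd_x                        # loadings, standardized
```

Under homoskedasticity the standardized loadings reduce to a vector of 1s, the benchmark noted above.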
These formulas correspond to the presentation of penalty loadings in Belloni et al. ({helpb rlasso##BCW2014:2014}; see Algorithm 1 but note that in their presentation, the predictors x are assumed already to be standardized). NB: in the presentation we use here, the penalty loadings for the lasso and sqrt-lasso are the same; what differs is the overall penalty term lambda. {pstd} The cluster-robust case is similar to the heteroskedastic case except in two respects. First, the numerator sqrt[avg(x^2e^2)] in the heteroskedastic case is replaced by sqrt[avg(u_i^2)], where (using the notation of the Stata manual's discussion of the {mansection P _robust:_robust} command) u_i is the sum of x_ij*e_ij over the j members of cluster i; see Belloni et al. ({helpb rlasso##BCHK2016:2016}). Second, the penalty loadings incorporate the term 1/sqrt(Tbar) where Tbar is the average cluster size. Again in the presentation used here, the cluster-lasso and cluster-sqrt-lasso penalty loadings are the same. The unit vector is again the benchmark for the standardized penalty loadings. NB: also following {helpb _robust}, the denominator of avg(u_i^2) and Tbar is (N_clust-1). {pstd} The {opt center} option centers the x_ij*e_ij terms (or, in the cluster-lasso case, the u_i terms) prior to calculating the penalty loadings. {marker supscore}{...} {title:Sup-score test of joint significance} {pstd} {opt rlasso} with the {opt supscore} option reports a test of the null hypothesis H0: beta_1 = ... = beta_p = 0, i.e., a test of the joint significance of the regressors (or, alternatively, a test that H0: s=0; of the full set of p regressors, none is in the true model). The test follows Chernozhukov et al. ({helpb rlasso##CCK2013:2013}, Appendix M); see also Belloni et al. ({helpb rlasso##BCCH2012:2012}, {helpb rlasso##BCH2013:2013}). (The variables are assumed to be rescaled to be centered and with unit variance.) 
{pstd} If the null hypothesis is correct and the rest of the model is well-specified (including the assumption that the regressors are orthogonal to the disturbance e), then E(e*x_j) = E((y-beta_0)*x_j) = 0, j=1...p where beta_0 is the intercept. The sup-score statistic is S=N*max_j(abs(avg((y-b_0)*x_j))/(sqrt(avg(((y-b_0)*x_j)^2)))), where: (a) the numerator abs(avg((y-b_0)*x_j)) is the absolute value of the average score for regressor x_j and b_0 is the sample mean of y; (b) the denominator sqrt(avg(((y-b_0)*x_j)^2)) is the sample standard deviation of the score; (c) the statistic is N times the maximum across the p regressors of the ratio of (a) to (b). {pstd} The p-value for the sup-score test is obtained by a multiplier bootstrap procedure simulating the statistic W, defined as W=N*max_j(abs(avg((y-b_0)*x_j*u))/(sqrt(avg(((y-b_0)*x_j)^2)))) where u is an iid standard normal variate independent of the data. The {opt ssnumsim(int)} option controls the number of simulated draws (default=500); {opt ssnumsim(0)} requests that the sup-score statistic be reported without a simulation-based p-value. {opt rlasso} also reports a conservative critical value (asymptotic bound) as per Belloni et al. ({helpb rlasso##BCCH2012:2012}, {helpb rlasso##BCH2013:2013}), defined as c*sqrt(N)*invnormal(1-gamma/(2p)) where gamma is the same gamma as in the penalty level lambda, set by the {opt gamma(real)} option (default=0.10). Note that the critical value is identical to the sqrt-lasso lambda with option {opt gammad(1)}. {marker computation}{...} {title:Computational notes} {pstd} A computational alternative to the default of standardizing "on the fly" (i.e., incorporating the standardization into the lasso penalty loadings) is to standardize all variables to have unit variance prior to computing the lasso coefficients. This can be done using the {opt prestd} option. The results are equivalent in theory. 
The {opt prestd} option can lead to improved numerical precision or more stable results in the case of difficult problems; the cost is a (typically small) amount of computation time required to standardize the data. {marker notpen}{...} {pstd} Either the {opt partial(varlist)} option or the {opt pnotpen(varlist)} option can be used for variables that should not be penalized by the lasso. The options are equivalent in theory (see above), but numerical results can differ in practice because of the different calculation methods used. Partialling-out variables can lead to improved numerical precision or more stable results in the case of difficult problems vs. specifying the variables as unpenalized, but may be slower in terms of computation time. {pstd} By default the constant (if present) is not penalized; this is equivalent to mean-centering prior to estimation. The {opt partial(varlist)} option also partials out the constant (if present); to partial out just the constant, specify {opt partial(_cons)}. Both {opt partial(.)} and {opt fe} mean-center the data; the {opt nocons} option is redundant in this case and may not be specified with these options. If the {opt nocons} option is specified, an intercept is not included in the model, but the penalty loadings (see {help rlasso##loadings:above}) are still estimated using mean-centered regressors. {pstd} The {opt prestd} and {opt pnotpen(varlist)} vs. {opt partial(varlist)} options can be used as simple checks for numerical stability by comparing results that should be equivalent in theory. {pstd} The {opt fe} fixed-effects option is equivalent to (but computationally faster and more accurate than) specifying unpenalized panel-specific dummies. The fixed-effects ("within") transformation removes the constant along with the fixed effects. The panel variable used by the {opt fe} option is the panel variable set by {helpb xtset}. 
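As a footnote to the {help rlasso##supscore:sup-score test} described above, the statistic S and its multiplier-bootstrap p-value can be sketched in Python (an illustration of the formulas in that section, assuming suitably rescaled data; not the {opt rlasso} implementation):

```python
import numpy as np

def sup_score(y, X, num_sim=500, seed=12345):
    """Sup-score statistic and multiplier-bootstrap p-value, following
    the formulas in the text.  The scores are (y - b_0)*x_j with b_0
    the sample mean of y; each W replicate multiplies the scores by an
    iid standard normal draw u."""
    rng = np.random.default_rng(seed)
    N = len(y)
    score = (y - y.mean())[:, None] * X            # (y - b_0) * x_j
    denom = np.sqrt(np.mean(score ** 2, axis=0))   # SD of each score
    S = N * np.max(np.abs(score.mean(axis=0)) / denom)
    W = np.empty(num_sim)
    for s in range(num_sim):
        u = rng.standard_normal(N)                 # multiplier draws
        W[s] = N * np.max(np.abs((score * u[:, None]).mean(axis=0)) / denom)
    return S, np.mean(W >= S)                      # statistic, p-value
```

A large S relative to the simulated W distribution (a small p-value) is evidence against H0: beta_1 = ... = beta_p = 0.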
{marker misc}{...} {title:Miscellaneous} {pstd} By default {opt rlasso} reports only the set of selected variables and their lasso and post-lasso coefficients; the omitted coefficients are not reported in the regression output. The {opt postall} and {opt displayall} options allow the full coefficient vector (with coefficients of unselected variables set to zero) to be either posted in {opt e(b)} or displayed as output. {pstd} {opt rlasso}, like the lasso in general, accommodates possibly perfectly-collinear sets of regressors. Stata's {helpb fvvarlist:factor variables} are supported by {opt rlasso} (as well as by {helpb lasso2}). Users therefore have the option of specifying as regressors one or more complete sets of factor variables or interactions with no base levels using the {it:ibn} prefix. This can be interpreted as allowing {opt rlasso} to choose the members of the base category. {pstd} The choice of whether to use {opt partial(varlist)} or {opt pnotpen(varlist)} will depend on the circumstances faced by the user. The {opt partial(varlist)} option can be helpful in dealing with data that have scaling problems or collinearity issues; in these cases it can be more accurate and/or achieve convergence faster than the {opt pnotpen(varlist)} option. The {opt pnotpen(varlist)} option will sometimes be faster because it avoids using the pre-estimation transformation employed by {opt partial(varlist)}. The two options can be used simultaneously (but not for the same variables). {pstd} The treatment of standardization, penalization and partialling-out in {opt rlasso} differs from that of {opt lasso2}. In the {opt rlasso} treatment, standardization incorporates the partialling-out of regressors listed in the {opt pnotpen(varlist)} list as well as those in the {opt partial(varlist)} list. 
This is in order to maintain the equivalence of the lasso estimator irrespective of which option is used for unpenalized variables (see the discussion of the Frisch-Waugh-Lovell Theorem for the lasso above). In the {opt lasso2} treatment, standardization takes place after the partialling-out of only the regressors listed in the {opt notpen(varlist)} option. In other words, {opt rlasso} adjusts the penalty loadings for any unpenalized variables; {opt lasso2} does not. For further details, see {helpb lasso2}. {pstd} Belloni et al. ({helpb rlasso##BCW2014:2014}) recommend using gamma=0.05 (instead of the {opt rlasso} default of 0.1) with the sqrt-lasso. {pstd} The initial overhead for fixed-effects estimation and/or partialling out and/or pre-estimation standardization (creating temporary variables and then transforming the data) can be noticeable for large datasets. For problems that involve looping over data, users may wish to first transform the data by hand. {pstd} If a small number of correlations is set using the {opt corrnum(int)} option, users may want to increase the number of penalty loadings iterations from the default of 2 to something higher using the {opt maxupsiter(int)} option. {pstd} The sup-score p-value is obtained by simulation, which can be time-consuming for large datasets. To skip this and use only the conservative (asymptotic bound) critical value, set the number of simulations to zero with the {opt ssnumsim(0)} option. {marker examples}{...} {title:Examples using prostate cancer data from Hastie et al. ({helpb rlasso##HTF2009:2009})} {pstd}Load prostate cancer data.{p_end} {phang2}. {stata "clear"}{p_end} {phang2}. {stata "insheet using https://web.stanford.edu/~hastie/ElemStatLearn/datasets/prostate.data, tab"}{p_end} {pstd}Estimate lasso using data-driven lambda penalty; default homoskedasticity case.{p_end} {phang2}. {stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45"}{p_end} {pstd}Use square-root lasso instead.{p_end} {phang2}. 
{stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45, sqrt"}{p_end} {pstd}Illustrate relationships between lambda, lambda0 and penalty loadings:{p_end} {pstd}Basic usage: homoskedastic case, lasso{p_end} {phang2}. {stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45"}{p_end} {pstd}lambda=lambda0*rmse is the lasso penalty; incorporates the estimate of the error variance{p_end} {pstd}default lambda0 is 2c*sqrt(N)*invnormal(1-(gamma/log(N))/(2*p)){p_end} {phang2}. {stata "di e(lambda)"}{p_end} {phang2}. {stata "di e(lambda0)"}{p_end} {pstd}In the homoskedastic case, penalty loadings are the vector of SDs of penalized regressors{p_end} {phang2}. {stata "mat list e(eUps)"}{p_end} {pstd}...and the standardized penalty loadings are a vector of 1s.{p_end} {phang2}. {stata "mat list e(sUps)"}{p_end} {pstd}Heteroskedastic case, lasso{p_end} {phang2}. {stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45, robust"}{p_end} {pstd}lambda and lambda0 are the same as for the homoskedastic case{p_end} {phang2}. {stata "di e(lambda)"}{p_end} {phang2}. {stata "di e(lambda0)"}{p_end} {pstd}Penalty loadings account for heteroskedasticity as well as incorporating SD(x){p_end} {phang2}. {stata "mat list e(eUps)"}{p_end} {pstd}...and the standardized penalty loadings are not a vector of 1s.{p_end} {phang2}. {stata "mat list e(sUps)"}{p_end} {pstd}Homoskedastic case, sqrt-lasso{p_end} {phang2}. {stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45, sqrt"}{p_end} {pstd}with the sqrt-lasso, the default lambda=lambda0=c*sqrt(N)*invnormal(1-(gamma/log(N))/(2*p));{p_end} {pstd}note the difference by a factor of 2 vs. the standard lasso lambda0{p_end} {phang2}. {stata "di e(lambda)"}{p_end} {phang2}. {stata "di e(lambda0)"}{p_end} {pstd}{opt rlasso} vs. {opt lasso2} (if installed){p_end} {phang2}. 
{stata "rlasso lpsa lcavol lweight age lbph svi lcp gleason pgg45"}{p_end} {pstd}lambda=lambda0*rmse is the lasso penalty; incorporates the estimate of the error variance{p_end} {pstd}default lambda0 is 2c*sqrt(N)*invnormal(1-(gamma/log(N))/(2*p)){p_end} {phang2}. {stata "di %8.5f e(lambda)"}{p_end} {pstd}Replicate {opt rlasso} estimates using {opt rlasso} lambda and {opt lasso2}{p_end} {phang2}. {stata "lasso2 lpsa lcavol lweight age lbph svi lcp gleason pgg45, lambda(44.34953)"}{p_end} {title:Examples using data from Acemoglu-Johnson-Robinson ({helpb rlasso##AJR2001:2001})} {pstd}Load and reorder AJR data for Table 6 and Table 8 (datasets need to be in current directory).{p_end} {phang2}. {stata "clear"}{p_end} {phang2}. {browse "https://economics.mit.edu/files/5138":(click to download maketable6.zip from economics.mit.edu)}{p_end} {phang2}. {stata "unzipfile maketable6"}{p_end} {phang2}. {browse "https://economics.mit.edu/files/5140":(click to download maketable8.zip from economics.mit.edu)}{p_end} {phang2}. {stata "unzipfile maketable8"}{p_end} {phang2}. {stata "use maketable6"}{p_end} {phang2}. {stata "merge 1:1 shortnam using maketable8"}{p_end} {phang2}. {stata "keep if baseco==1"}{p_end} {phang2}. {stata "order shortnam logpgp95 avexpr lat_abst logem4 edes1975 avelf, first"}{p_end} {phang2}. {stata "order indtime euro1900 democ1 cons1 democ00a cons00a, last"}{p_end} {pstd}Basic usage:{p_end} {phang2}. {stata "rlasso logpgp95 lat_abst edes1975 avelf temp* humid* steplow-oilres"}{p_end} {pstd}Heteroskedastic-robust penalty loadings:{p_end} {phang2}. {stata "rlasso logpgp95 lat_abst edes1975 avelf temp* humid* steplow-oilres, robust"}{p_end} {pstd}Partialling-out vs. non-penalization:{p_end} {phang2}. {stata "rlasso logpgp95 lat_abst edes1975 avelf temp* humid* steplow-oilres, partial(lat_abst)"}{p_end} {phang2}. 
{stata "rlasso logpgp95 lat_abst edes1975 avelf temp* humid* steplow-oilres, pnotpen(lat_abst)"}{p_end} {pstd}Request sup-score test (H0: all betas=0):{p_end} {phang2}. {stata "rlasso logpgp95 lat_abst edes1975 avelf temp* humid* steplow-oilres, supscore"}{p_end} {title:Examples using data from Angrist-Krueger ({helpb rlasso##AK1991:1991})} {pstd}Load AK data and rename variables (dataset needs to be in current directory). NB: this is a large dataset (330k observations) and estimations may take some time to run on some installations.{p_end} {phang2}. {stata "clear"}{p_end} {phang2}. {browse "https://economics.mit.edu/files/397":(click to download asciiqob.zip from economics.mit.edu)}{p_end} {phang2}. {stata "unzipfile asciiqob.zip"}{p_end} {phang2}. {stata "infix lnwage 1-9 educ 10-20 yob 21-31 qob 32-42 pob 43-53 using asciiqob.txt"}{p_end} {phang2}. {stata "xtset pob"}{p_end} {pstd}State (place of birth) fixed effects; regressors are year of birth, quarter of birth and QOBxYOB.{p_end} {phang2}. {stata "rlasso educ i.yob##i.qob, fe"}{p_end} {pstd}As above but with explicit penalized state dummies and all categories (no base category) for all factor vars.{p_end} {pstd}Note that the (unpenalized) constant is reported.{p_end} {phang2}. {stata "rlasso educ ibn.yob##ibn.qob ibn.pob"}{p_end} {pstd}State fixed effects; regressors are YOB, QOB and QOBxYOB; cluster on state.{p_end} {phang2}. {stata "rlasso educ i.yob##i.qob, fe cluster(pob)"}{p_end} {title:Example using data from Belloni et al. ({helpb rlasso##BCH2015:2015})} {pstd}Load dataset on eminent domain (available at journal website).{p_end} {phang2}. {stata "clear"}{p_end} {phang2}. {stata "import excel using CSExampleData.xlsx, first"}{p_end} {pstd}Settings used in Belloni et al. ({helpb rlasso##BCH2015:2015}) - results as in text discussion (p=147):{p_end} {phang2}. {stata "rlasso NumProCase Z* BA BL DF, robust lalt corrnum(0) maxupsiter(100)"}{p_end} {phang2}. 
{stata "di e(p)"}{p_end} {pstd}Settings used in Belloni et al. ({helpb rlasso##BCH2015:2015}) - results as in journal replication file (p=144):{p_end} {phang2}. {stata "rlasso NumProCase Z*, robust lalt corrnum(0) maxupsiter(100)"}{p_end} {phang2}. {stata "di e(p)"}{p_end} {marker saved_results}{...} {title:Saved results} {pstd} {cmd:rlasso} saves the following in {cmd:e()}: {synoptset 19 tabbed}{...} {p2col 5 19 23 2: scalars}{p_end} {synopt:{cmd:e(N)}}sample size{p_end} {synopt:{cmd:e(N_clust)}}number of clusters in cluster-robust estimation{p_end} {synopt:{cmd:e(N_g)}}number of groups in fixed-effects model{p_end} {synopt:{cmd:e(p)}}number of penalized regressors in model{p_end} {synopt:{cmd:e(s)}}number of selected regressors{p_end} {synopt:{cmd:e(s0)}}number of selected and unpenalized regressors including constant (if present){p_end} {synopt:{cmd:e(lambda0)}}penalty level excluding rmse (default = 2c*sqrt(N)*invnormal(1-(gamma/log(N))/(2*p))){p_end} {synopt:{cmd:e(lambda)}}lasso: penalty level including rmse (=lambda0*rmse); sqrt-lasso: lambda=lambda0{p_end} {synopt:{cmd:e(slambda)}}standardized lambda; equiv to lambda used on standardized data; lasso: slambda=lambda/SD(depvar); sqrt-lasso: slambda=lambda0{p_end} {synopt:{cmd:e(c)}}parameter in penalty level lambda{p_end} {synopt:{cmd:e(gamma)}}parameter in penalty level lambda{p_end} {synopt:{cmd:e(gammad)}}parameter in penalty level lambda{p_end} {synopt:{cmd:e(niter)}}number of iterations for shooting algorithm{p_end} {synopt:{cmd:e(maxiter)}}max number of iterations for shooting algorithm{p_end} {synopt:{cmd:e(nupsiter)}}number of iterations for loadings algorithm{p_end} {synopt:{cmd:e(maxupsiter)}}max iterations for loadings algorithm{p_end} {synopt:{cmd:e(rmse)}}rmse using lasso residuals{p_end} {synopt:{cmd:e(rmseOLS)}}rmse using post-lasso residuals{p_end} {synopt:{cmd:e(cons)}}=1 if constant in model, =0 otherwise{p_end} {synopt:{cmd:e(fe)}}=1 if fixed-effects model, =0 otherwise{p_end} 
{synopt:{cmd:e(center)}}=1 if moments have been centered{p_end} {synopt:{cmd:e(supscore)}}sup-score statistic{p_end} {synopt:{cmd:e(supscore_p)}}sup-score p-value{p_end} {synopt:{cmd:e(supscore_cv)}}sup-score critical value (asymptotic bound){p_end} {synoptset 19 tabbed}{...} {p2col 5 19 23 2: macros}{p_end} {synopt:{cmd:e(cmd)}}rlasso{p_end} {synopt:{cmd:e(depvar)}}name of dependent variable{p_end} {synopt:{cmd:e(varX)}}all regressors{p_end} {synopt:{cmd:e(varXmodel)}}penalized regressors{p_end} {synopt:{cmd:e(pnotpen)}}unpenalized regressors{p_end} {synopt:{cmd:e(partial)}}partialled-out regressors{p_end} {synopt:{cmd:e(selected)}}selected and penalized regressors{p_end} {synopt:{cmd:e(selected0)}}all selected regressors including unpenalized and constant (if present){p_end} {synopt:{cmd:e(method)}}lasso or sqrt-lasso{p_end} {synopt:{cmd:e(estimator)}}lasso, sqrt-lasso or post-lasso posted in e(b){p_end} {synopt:{cmd:e(robust)}}heteroskedastic-robust penalty loadings{p_end} {synopt:{cmd:e(clustvar)}}variable defining clusters for cluster-robust penalty loadings{p_end} {synopt:{cmd:e(ivar)}}variable defining groups for fixed-effects model{p_end} {synoptset 19 tabbed}{...} {p2col 5 19 23 2: matrices}{p_end} {synopt:{cmd:e(b)}}posted coefficient vector{p_end} {synopt:{cmd:e(beta)}}lasso or sqrt-lasso coefficient vector{p_end} {synopt:{cmd:e(betaOLS)}}post-lasso coefficient vector{p_end} {synopt:{cmd:e(betaAll)}}full lasso or sqrt-lasso coefficient vector including omitted, factor base variables, etc.{p_end} {synopt:{cmd:e(betaAllOLS)}}full post-lasso coefficient vector including omitted, factor base variables, etc.{p_end} {synopt:{cmd:e(eUps)}}estimated penalty loadings{p_end} {synopt:{cmd:e(sUps)}}standardized penalty loadings (vector of 1s in the homoskedastic case){p_end} {synoptset 19 tabbed}{...} {p2col 5 19 23 2: functions}{p_end} {synopt:{cmd:e(sample)}}estimation sample{p_end} {p2colreset}{...} {marker references}{...} {title:References} {marker AJR2001}{...} 
{phang} Acemoglu, D., Johnson, S. and Robinson, J.A. 2001. The colonial origins of comparative development: An empirical investigation. {it:American Economic Review}, 91(5):1369-1401. {browse "https://economics.mit.edu/files/4123":https://economics.mit.edu/files/4123} {p_end} {marker AK1991}{...} {phang} Angrist, J. and Krueger, A. 1991. Does compulsory school attendance affect schooling and earnings? {it:Quarterly Journal of Economics} 106(4):979-1014. {browse "http://www.jstor.org/stable/2937954":http://www.jstor.org/stable/2937954} {p_end} {marker BC2011}{...} {phang} Belloni, A. and Chernozhukov, V. 2011. High-dimensional sparse econometric models: An introduction. In Alquier, P., Gautier, E. and Stoltz, G. (eds.), Inverse problems and high-dimensional estimation. Lecture notes in statistics, vol. 203. Springer, Berlin, Heidelberg. {browse "https://arxiv.org/pdf/1106.5242.pdf":https://arxiv.org/pdf/1106.5242.pdf} {p_end} {marker BCW2011}{...} {phang} Belloni, A., Chernozhukov, V. and Wang, L. 2011. Square-root lasso: Pivotal recovery of sparse signals via conic programming. {it:Biometrika} 98:791-806. {browse "https://doi.org/10.1214/14-AOS1204"} {p_end} {marker BCCH2012}{...} {phang} Belloni, A., Chen, D., Chernozhukov, V. and Hansen, C. 2012. Sparse models and methods for optimal instruments with an application to eminent domain. {it:Econometrica} 80(6):2369-2429. {browse "http://onlinelibrary.wiley.com/doi/10.3982/ECTA9626/abstract"} {p_end} {marker BCH2013}{...} {phang} Belloni, A., Chernozhukov, V. and Hansen, C. 2013. Inference for high-dimensional sparse econometric models. In {it:Advances in Economics and Econometrics: 10th World Congress}, Vol. 3: Econometrics, Cambridge University Press: Cambridge, 245-295. {browse "http://arxiv.org/abs/1201.0220"} {p_end} {marker BCH2014}{...} {phang} Belloni, A., Chernozhukov, V. and Hansen, C. 2014. Inference on treatment effects after selection among high-dimensional controls. 
{it:Review of Economic Studies} 81:608-650. {browse "https://doi.org/10.1093/restud/rdt044"} {p_end} {marker BCH2015}{...} {phang} Belloni, A., Chernozhukov, V. and Hansen, C. 2015. High-dimensional methods and inference on structural and treatment effects. {it:Journal of Economic Perspectives} 28(2):29-50. {browse "http://www.aeaweb.org/articles.php?doi=10.1257/jep.28.2.29"} {p_end} {marker BCHK2016}{...} {phang} Belloni, A., Chernozhukov, V., Hansen, C. and Kozbur, D. 2016. Inference in high dimensional panel models with an application to gun control. {it:Journal of Business and Economic Statistics} 34(4):590-605. {browse "http://amstat.tandfonline.com/doi/full/10.1080/07350015.2015.1102733"} {p_end} {marker BCW2014}{...} {phang} Belloni, A., Chernozhukov, V. and Wang, L. 2014. Pivotal estimation via square-root-lasso in nonparametric regression. {it:Annals of Statistics} 42(2):757-788. {browse "https://doi.org/10.1214/14-AOS1204"} {p_end} {marker CCK2013}{...} {phang} Chernozhukov, V., Chetverikov, D. and Kato, K. 2013. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. {it:Annals of Statistics} 41(6):2786-2819. {browse "https://projecteuclid.org/euclid.aos/1387313390"} {p_end} {marker FHT2010}{...} {phang} Friedman, J., Hastie, T. and Tibshirani, R. 2010. Regularization paths for generalized linear models via coordinate descent. {it:Journal of Statistical Software} 33(1):1-22. {browse "https://doi.org/10.18637/jss.v033.i01"} {p_end} {marker FU1998}{...} {phang} Fu, W.J. 1998. Penalized regressions: The bridge versus the lasso. {it:Journal of Computational and Graphical Statistics} 7(3):397-416. {browse "http://www.tandfonline.com/doi/abs/10.1080/10618600.1998.10474784"} {p_end} {marker HTF2009}{...} {phang} Hastie, T., Tibshirani, R. and Friedman, J. 2009. {it:The elements of statistical learning} (2nd ed.). New York: Springer-Verlag. 
{browse "https://web.stanford.edu/~hastie/ElemStatLearn/":https://web.stanford.edu/~hastie/ElemStatLearn/} {p_end} {marker SCH2016}{...} {phang} Spindler, M., Chernozhukov, V. and Hansen, C. 2016. High-dimensional metrics. {browse "https://cran.r-project.org/package=hdm":https://cran.r-project.org/package=hdm}. {p_end} {marker Tib1996}{...} {phang} Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. {it:Journal of the Royal Statistical Society. Series B (Methodological)} 58(1):267-288. {browse "https://doi.org/10.2307/2346178"} {p_end} {marker Yam2017}{...} {phang} Yamada, H. 2017. The Frisch-Waugh-Lovell Theorem for the lasso and the ridge regression. {it:Communications in Statistics - Theory and Methods} 46(21):10897-10902. {browse "http://dx.doi.org/10.1080/03610926.2016.1252403"} {p_end} {marker acknowledgements}{...} {title:Acknowledgements} {p}Thanks to Alexandre Belloni for providing Matlab code for the square-root-lasso. {marker citation}{...} {title:Citation of rlasso} {pstd}{opt rlasso} is not an official Stata command. It is a free contribution to the research community, like a paper. Please cite it as such: {p_end} {phang}Ahrens, A., Hansen, C.B., Schaffer, M.E. 2017. rlasso: Program for lasso and sqrt-lasso estimation with data-driven penalization. {browse "http://ideas.repec.org/c/boc/bocode/xxxxx.html"}{p_end} {title:Authors} Achim Ahrens, Economic and Social Research Institute, Ireland achim.ahrens@esri.ie Christian B. Hansen, University of Chicago, USA Christian.Hansen@chicagobooth.edu Mark E. Schaffer, Heriot-Watt University, UK m.e.schaffer@hw.ac.uk {title:Also see} {p 7 14 2} Help: {helpb lasso2}, {helpb cvlasso}, {help pdslasso}, {help ivlasso} (if installed){p_end}