Title
Least Angle Regression, Forward Stagewise Regression, Lasso estimation
Syntax
lars [varlist] [if] [in] [ , options]
options              Description
-------------------------------------------------------------------------
Main
  algorithm(string)  specifies the algorithm to be used in estimation, default is lars.
  eps(#)             is the small number taken as the "machine zero", the default is 0.000001.
  graph              specifies whether a graph of the entire model sequence is plotted.
  gopt(string)       specifies options to be included in the graph.
  nooutput           do not display any output.

Prediction
  type(string)       specifies that coefficients are produced from a single realisation of the algorithm.
  mode(string)       specifies the mode of the prediction, the default is step.
  s(#)               specifies the value at which the prediction is made, the default is 0.5 that means halfway between steps 0 and 1.
-------------------------------------------------------------------------
Description
Least Angle Regression is a model-building algorithm that considers parsimony as well as prediction accuracy. The method is described in detail in Efron, Hastie, Johnstone and Tibshirani (2004), published in The Annals of Statistics. Their motivation was to provide a computationally simpler algorithm for the Lasso and Forward Stagewise regression.
Stepwise regression has attracted many criticisms, among them that it is a "greedy" algorithm and that its regression coefficients are too large. Ridge regression is one method of model-building that shrinks the coefficients by constraining the sum of the squared coefficients to be less than some constant. The Lasso is similar, but the constraint is that the sum of the absolute values of the coefficients is less than a constant. One implication is that the solution will contain coefficients that are exactly 0, and hence has the property of parsimony, i.e. a simpler model.
The method implemented here is Least Angle Regression, but the same algorithm can be used to obtain the Lasso solution or the Forward Stagewise solution. The command is a nearly complete port of the LARS package written by Hastie and Efron. Not everything has been translated, so if anyone spots a missing feature or a bug, just email me.
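The two constraints can be written side by side. This is a sketch in standard notation that does not appear elsewhere in this help file: y_i is the response, x_ij the covariates, beta_j the coefficients and t the constant bounding the penalty.

```latex
\begin{aligned}
\text{Ridge:}\quad & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
  \quad\text{subject to}\quad \sum_{j=1}^{p}\beta_j^{2} \le t \\
\text{Lasso:}\quad & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
  \quad\text{subject to}\quad \sum_{j=1}^{p}\lvert\beta_j\rvert \le t
\end{aligned}
```

The only difference is the form of the bound: squared coefficients for ridge regression, absolute values for the Lasso. The absolute-value bound is what allows some coefficients to be exactly 0.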
Options
        +------+
    ----+ Main +-------------------------------------------------------------
algorithm(string) specifies the algorithm to be used in estimation. There are three choices: least angle regression (lars, the default), Lasso (lasso) and Forward Stagewise (stagewise). The connection between these is discussed in Efron et al. (2004).
eps(#) is the small number taken as the "machine zero", the default is 0.000001.
type(string) specifies that coefficients are produced from a single realisation of the algorithm. The string must be "coefficients", which is the default. The original code also produced fitted values; this will be implemented in the future.
mode(string) specifies the mode of the prediction; the default is step.
s(#) specifies the value at which the prediction is made; the default is 4.1, which means one tenth of the way between steps 4 and 5.
graph specifies that a graph of the entire model sequence is plotted. The default graph, and the only one implemented at the moment, shows the coefficients on the y-axis and the steps on the x-axis.
gopt(string) specifies options to be included in the automatic graph.
nooutput suppresses all displayed output. The estimation still occurs and the usual output is placed in the saved results.
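One way to read a fractional s() in step mode is as linear interpolation between the coefficient vectors at adjacent steps. This is a sketch that assumes the command follows the interpolation used by the prediction routine in the original LARS package; the notation (a superscript (k) for the coefficients after step k) is introduced here for illustration only.

```latex
\hat{\beta}(s) \;=\; \hat{\beta}^{(k)} + (s - k)\,\bigl(\hat{\beta}^{(k+1)} - \hat{\beta}^{(k)}\bigr),
\qquad k = \lfloor s \rfloor
```

Under that assumption, s(4.1) would give the step 4 coefficients plus 0.1 times the change between steps 4 and 5.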
Examples
The command can be demonstrated with the examples below. First load the auto dataset; note that any data currently in memory is lost.
sysuse auto, clear
Use lars to find the least angle regression solution. The coefficients are displayed for the model with the lowest Cp statistic.
lars price weight length mpg turn rep78 headroom trunk displacement gear_ratio foreign
Now the Lasso estimation, which in this case gives the same answer as least angle regression; adding the graph option (abbreviated here to g) plots the model sequence.
lars price weight length mpg turn rep78 headroom trunk displacement gear_ratio foreign, a(lasso) g
Forward stagewise again gives the same solution, so this example instead demonstrates passing extra options to the graph.
lars price weight length mpg turn rep78 headroom trunk displacement gear_ratio foreign, a(stagewise) g gopt(title(stagewise))
Saved Results
r(RSS)       the residual sums of squares for each step
r(R2)        the r-squared values
r(newbetas)  the coefficients from the prediction part
r(cp)        the Cp statistic for each step
r(normx)     the sum of squares for the covariates, i.e. the normalising constants.
r(beta)      the beta coefficients for each step
r(sbeta)     the beta coefficients multiplied by the normx matrix
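The saved results can be inspected after a quiet run. The following is a minimal sketch, assuming the results are returned as matrices in r() as the list above suggests; the shorter covariate list and the matrix names cp and r2 are chosen here for illustration only.

```stata
* Load the example data (any data in memory is lost)
sysuse auto, clear

* Run least angle regression without displaying output
lars price weight length mpg, nooutput

* Copy two of the saved results before another command overwrites r()
* (assumes r(cp) and r(R2) are matrices)
matrix cp = r(cp)
matrix r2 = r(R2)

* Show the Cp statistic and R-squared along the model sequence
matrix list cp
matrix list r2
```

Copying the results into named matrices straight away is a precaution, since results held in r() are replaced by the next command that saves its own.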
Author
Adrian Mander, MRC Biostatistics Unit, Cambridge, UK.
Email adrian.mander@mrc-bsu.cam.ac.uk
See Also
Related commands: