-------------------------------------------------------------------------------
help: ridgereg                                                 dialog: ridgereg
-------------------------------------------------------------------------------

----+ Title +------------------------------------------------------------------

ridgereg: OLS-Ridge Regression Models and Diagnostic Tests

----+ Table of Contents +------------------------------------------------------

Syntax
Description
Ridge Model Options
Weight Options
Weighted Variable Type Options
Options
Model Selection Diagnostic Criteria
Multicollinearity Diagnostic Tests
Saved Results
References
Examples
Author

----+ Syntax +-----------------------------------------------------------------

ridgereg depvar indepvars [if] [in] , model(orr|grr1|grr2|grr3) [ weights(yh|yh2|abse|e2|le2|x|xi|x2|xi2) wvar(varname) kr(#) lmcol diag dn tolog mfx(lin, log) noconstant predict(new_var) resid(new_var) coll level(#) ]

----+ Description +------------------------------------------------------------

ridgereg estimates OLS-Ridge Regression models and computes many diagnostic tests, i.e., Multicollinearity Tests, Model Selection Diagnostic Criteria, and Marginal Effects and Elasticities.

R2, adjusted R2, and the F-test are obtained in four ways:
1- Buse (1973) R2.
2- Raw Moments R2.
3- Squared correlation between predicted (Yh) and observed dependent variable (Y).
4- Ratio of variances between predicted (Yh) and observed dependent variable (Y).

- Adjusted R2: R2_a = 1 - (1 - R2)*(N-1)/(N-K-1).
- F-Test: F = R2/(1-R2) * (N-K-1)/K.
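
The two formulas above can be checked with a short sketch (an illustration, not part of ridgereg; the sample values in the usage note are taken from the example output later in this help file):

```python
def adj_r2(r2, n, k):
    """Adjusted R2: R2_a = 1 - (1 - R2)*(N - 1)/(N - K - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_test(r2, n, k):
    """F statistic: F = R2/(1 - R2) * (N - K - 1)/K."""
    return r2 / (1.0 - r2) * (n - k - 1) / k
```

For example, the Buse R2 of 0.9527 with N = 20 and K = 3 reproduces (up to rounding) the adjusted R2 of 0.9438 and the F-test of about 107.4 shown in the example output.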

----+ Ridge Model Options +----------------------------------------------------

kr(#) Ridge k value, must be in the range (0 < k < 1).

If kr(0) is specified with model(orr), model(grr1), model(grr2), or model(grr3), the model reduces to an OLS regression.

model(orr) : Ordinary Ridge Regression [Judge et al. (1988, p.878) eq. 21.4.2].
model(grr1): Generalized Ridge Regression [Judge et al. (1988, p.881) eq. 21.4.12].
model(grr2): Iterative Generalized Ridge [Judge et al. (1988, p.881) eq. 21.4.12].
model(grr3): Adaptive Generalized Ridge [Strawderman (1978)].

ridgereg estimates Ordinary Ridge regression as a multicollinearity remediation method. The general forms of the ridge coefficient vector and covariance matrix are:

Br = inv[X'X + kI] X'Y

Cov=Sig^2 * inv[X'X + kI] (X'X) inv[X'X + kI]

where:
Br   = Ridge Coefficients Vector (k x 1).
Cov  = Ridge Covariance Matrix (k x k).
Y    = Dependent Variable Vector (N x 1).
X    = Independent Variables Matrix (N x k).
k    = Ridge Value (0 < k < 1).
I    = Diagonal Matrix of Cross Product Matrix (Xs'Xs).
Xs   = Standardized Variables Matrix in Deviation from Mean.
Sig2 = (Y - X*Br)'(Y - X*Br)/DF.
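
The estimator above can be sketched in a few lines of numpy (a simplified illustration: here I is a plain identity matrix, whereas ridgereg builds it from the standardized cross-product matrix Xs'Xs):

```python
import numpy as np

def ridge_fit(X, y, k):
    """Ridge estimator and covariance, following the formulas above:
       Br  = inv(X'X + kI) X'Y
       Cov = Sig2 * inv(X'X + kI) (X'X) inv(X'X + kI)
    Simplified sketch: I is the plain identity here."""
    n, p = X.shape
    XtX = X.T @ X
    A = np.linalg.inv(XtX + k * np.eye(p))
    Br = A @ X.T @ y                 # ridge coefficients
    e = y - X @ Br
    sig2 = (e @ e) / (n - p)         # Sig2 = (Y - X*Br)'(Y - X*Br)/DF
    Cov = sig2 * A @ XtX @ A         # sandwich covariance
    return Br, Cov
```

With k = 0 this reduces to the OLS estimator, matching the kr(0) note above; increasing k shrinks the coefficient vector.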

----+ Weight Options +---------------------------------------------------------

wvar(varname) Weighted Variable Name

----+ Weighted Variable Type Options +-----------------------------------------

weights Options   Description
------------------------------------------------------------
weights(yh)       Yh       - Predicted Value
weights(yh2)      Yh^2     - Predicted Value Squared
weights(abse)     abs(E)   - Absolute Value of Residual
weights(e2)       E^2      - Residual Squared
weights(le2)      log(E^2) - Log Residual Squared
weights(x)        (x)      - Variable
weights(xi)       (1/x)    - Inverse Variable
weights(x2)       (x^2)    - Squared Variable
weights(xi2)      (1/x^2)  - Inverse Squared Variable

----+ Options +----------------------------------------------------------------

dn Use (N) divisor instead of (N-K) for Degrees of Freedom (DF)

noconstant Exclude Constant Term from Equation

predict(new_var) Predicted values variable

resid(new_var) Residuals values variable

mfx(lin, log) Functional form: linear model (lin) or log-log model (log), used to compute Marginal Effects and Elasticities.
- In the linear model: marginal effects are the coefficients (Bm), and elasticities are (Es = Bm X/Y).
- In the log-log model: elasticities are the coefficients (Es), and marginal effects are (Bm = Es Y/X).
- mfx(log) must be combined with the tolog option, to transform the variables to log form.
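
The conversions between marginal effects and elasticities at the sample means can be sketched as follows (hypothetical helper names, for illustration only; not part of ridgereg):

```python
def elasticity_lin(bm, xbar, ybar):
    """Linear model: Es = Bm * X/Y, evaluated at the sample means."""
    return bm * xbar / ybar

def mfx_log(es, xbar, ybar):
    """Log-log model: Bm = Es * Y/X, evaluated at the sample means."""
    return es * ybar / xbar
```

For instance, applying elasticity_lin to the x1 coefficient (1.0588), the x1 mean (52.5840), and the dependent-variable mean (72.4650) from the example output reproduces the reported elasticity of 0.7683.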

tolog Transform depvar and indepvars to log form in memory for the log-log regression, without losing the original data variables.

coll Keep collinear variables; default is removing collinear variables.

----+ Model Selection Diagnostic Criteria +------------------------------------

diag Model Selection Diagnostic Criteria

- Log Likelihood Function                          LLF
- Akaike Information Criterion (1974)              AIC
- Akaike Information Criterion (1973)              Log AIC
- Schwarz Criterion (1978)                         SC
- Schwarz Criterion (1978)                         Log SC
- Amemiya Prediction Criterion (1969)              FPE
- Hannan-Quinn Criterion (1979)                    HQ
- Rice Criterion (1984)                            Rice
- Shibata Criterion (1981)                         Shibata
- Craven-Wahba Generalized Cross Validation (1979) GCV

----+ Multicollinearity Diagnostic Tests +-------------------------------------

lmcol Multicollinearity Diagnostic Tests:
 * Correlation Matrix
 * Multicollinearity Diagnostic Criteria
 * Farrar-Glauber Multicollinearity Tests
   Ho: No Multicollinearity - Ha: Multicollinearity
   (1) Farrar-Glauber Multicollinearity Chi2-Test
   (2) Farrar-Glauber Multicollinearity F-Test
   (3) Farrar-Glauber Multicollinearity t-Test
 * Multicollinearity Ranges
 * Determinant of |X'X|
 * Theil R2 Multicollinearity Effect:
   - Gleason-Staelin Q0
   - Heo Range Q1

- Multicollinearity Detection:
1. A high F statistic or R2 leads to rejection of the joint hypothesis that all of the coefficients are zero, while the individual t-statistics are low.
2. High simple correlation coefficients are sufficient, but not necessary, for multicollinearity.
3. One can compute the condition number, that is, the ratio of the largest to the smallest root of the matrix X'X. This may not always be useful, as the standard errors of the estimates depend on the ratios of elements of the characteristic vectors to the roots.
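
Point 3 can be sketched in a few lines (an illustration of the idea, not the lmcol implementation):

```python
import numpy as np

def condition_number(X):
    """Ratio of the largest to the smallest characteristic root of X'X.
    Values far above 1 flag multicollinearity."""
    roots = np.linalg.eigvalsh(X.T @ X)  # eigenvalues in ascending order
    return roots.max() / roots.min()
```

An orthonormal design gives a condition number of 1, while two nearly proportional regressors drive it up by many orders of magnitude.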

- Multicollinearity Remediation:
1. Use prior information or restrictions on the coefficients. One clever way to do this was developed by Theil and Goldberger. See tgmixed, and Theil (1971, pp. 347-352).
2. Use additional data sources. This does not mean more of the same; it means pooling cross-section and time-series data.
3. Transform the data, for example by inversion or differencing.
4. Use a principal components estimator. This involves using a weighted average of the regressors, rather than all of the regressors.
5. Another alternative regression technique is ridge regression. This involves putting extra weight on the main diagonal of X'X.
6. Drop troublesome RHS variables. This begs the question of specification error.

----+ Saved Results +----------------------------------------------------------

ridgereg saves the following in e():

Scalars:
e(N)        number of observations
e(r2bu)     R-squared (Buse 1973)
e(r2bu_a)   Adjusted R-squared (Buse 1973)
e(r2raw)    Raw Moments R2
e(r2raw_a)  Adjusted Raw Moments R2
e(f)        F-test
e(fp)       F-test P-Value
e(wald)     Wald-test
e(waldp)    Wald-test P-Value
e(r2h)      R2 Between Predicted (Yh) and Observed DepVar (Y)
e(r2h_a)    Adjusted r2h
e(fh)       F-test due to r2h
e(fhp)      F-test due to r2h P-Value

e(llf)      Log Likelihood Function LLF
e(aic)      Akaike Information Criterion (1974) AIC
e(laic)     Akaike Information Criterion (1973) Log AIC
e(sc)       Schwarz Criterion (1978) SC
e(lsc)      Schwarz Criterion (1978) Log SC
e(fpe)      Amemiya Prediction Criterion (1969) FPE
e(hq)       Hannan-Quinn Criterion (1979) HQ
e(rice)     Rice Criterion (1984) Rice
e(shibata)  Shibata Criterion (1981) Shibata
e(gcv)      Craven-Wahba Generalized Cross Validation (1979) GCV

Matrices:
e(b)        coefficient vector
e(V)        variance-covariance matrix of the estimators
e(mfxlin)   Marginal Effect and Elasticity in Lin Form
e(mfxlog)   Marginal Effect and Elasticity in Log Form

----+ References +-------------------------------------------------------------

D. Belsley (1991) "Conditioning Diagnostics, Collinearity and Weak Data in Regression", John Wiley & Sons, Inc., New York, USA.

D. Belsley, E. Kuh, and R. Welsch (1980) "Regression Diagnostics: Identifying Influential Data and Sources of Collinearity", John Wiley & Sons, Inc., New York, USA.

Damodar Gujarati (1995) "Basic Econometrics" 3rd Edition, McGraw Hill, New York, USA.

Evagelia, Mitsaki (2011) "Ridge Regression Analysis of Collinear Data", http://www.stat-athens.aueb.gr/~jpan/diatrives/Mitsaki/chapter2.pdf

Farrar, D. and Glauber, R. (1967) "Multicollinearity in Regression Analysis: the Problem Revisited", Review of Economics and Statistics, 49; 92-107.

Greene, William (1993) "Econometric Analysis", 2nd ed., Macmillan Publishing Company Inc., New York, USA; 616-618.

Greene, William (2007) "Econometric Analysis", 6th ed., Upper Saddle River, NJ: Prentice-Hall; 387-388.

Hoerl A. E. (1962) "Application of Ridge Analysis to Regression Problems", Chemical Engineering Progress, 58; 54-59.

Hoerl, A. E. and R. W. Kennard (1970a) "Ridge Regression: Biased Estimation for Non-Orthogonal Problems", Technometrics, 12; 55-67.

Hoerl, A. E. and R. W. Kennard (1970b) "Ridge Regression: Applications to Non-Orthogonal Problems", Technometrics, 12; 69-82.

Hoerl, A. E., R. W. Kennard, and K. Baldwin (1975) "Ridge Regression: Some Simulations", Communications in Statistics, A, 4; 105-123.

Hoerl, A. E. and R. W. Kennard (1976) "Ridge Regression: Iterative Estimation of the Biasing Parameter", Communications in Statistics, A, 5; 77-88.

Marquardt D.W. (1970) "Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation", Technometrics, 12; 591-612.

Marquardt D.W. & R. Snee (1975) "Ridge Regression in Practice", The American Statistician, 29; 3-19.

Pidot, George (1969) "A Principal Components Analysis of the Determinants of Local Government Fiscal Patterns", Review of Economics and Statistics, Vol. 51; 176-188.

Rencher, Alvin C. (1998) "Multivariate Statistical Inference and Applications", John Wiley & Sons, Inc., New York, USA; 21-22.

Strawderman, W. E. (1978) "Minimax Adaptive Generalized Ridge Regression Estimators", Journal American Statistical Association, 73; 623-627.

Theil, Henri (1971) "Principles of Econometrics", John Wiley & Sons, Inc., New York, USA.

----+ Examples +---------------------------------------------------------------

(1) An example of the ridge regression models is described in [Judge et al. (1988, p.882)], and of the Theil R2 Multicollinearity Effect in [Judge et al. (1988, p.872)], for the Klein-Goldberger data.

clear all

sysuse ridgereg1.dta, clear

ridgereg y x1 x2 x3 , model(orr) kr(0.5) mfx(lin) lmcol diag

ridgereg y x1 x2 x3 , model(orr) kr(0.5) mfx(lin) weights(x) wvar(x1)

ridgereg y x1 x2 x3 , model(grr1) mfx(lin)

ridgereg y x1 x2 x3 , model(grr2) mfx(lin)

ridgereg y x1 x2 x3 , model(grr3) mfx(lin)

(2) An example of the Gleason-Staelin and Heo Multicollinearity Ranges is described in [Rencher (1998, pp. 20-22)].

clear all

sysuse ridgereg2.dta, clear

ridgereg y x1 x2 x3 x4 x5 , model(orr) lmcol

(3) An example of the Farrar-Glauber Multicollinearity Chi2, F, and t Tests is described in [Evagelia (2011, chap.2, p.23)].

clear all

sysuse ridgereg3.dta, clear

ridgereg y x1 x2 x3 x4 x5 x6 , model(orr) lmcol

-------------------------------------------------------------------------------

. clear all
. sysuse ridgereg1.dta , clear
. ridgereg y x1 x2 x3 , model(orr) kr(0) diag lmcol mfx(lin)

==============================================================================
* (OLS) Ridge Regression - Ordinary Ridge Regression
==============================================================================
  y = x1 + x2 + x3
------------------------------------------------------------------------------
  Ridge k Value      =  0.00000  | Ordinary Ridge Regression
------------------------------------------------------------------------------
  Sample Size        =  20
  Wald Test          = 322.1130  | P-Value > Chi2(3)        =  0.0000
  F-Test             = 107.3710  | P-Value > F(3 , 16)      =  0.0000
  (Buse 1973) R2     =   0.9527  | Raw Moments R2           =  0.9971
  (Buse 1973) R2 Adj =   0.9438  | Raw Moments R2 Adj       =  0.9965
  Root MSE (Sigma)   =   4.5272  | Log Likelihood Function  = -56.3495
------------------------------------------------------------------------------
- R2h = 0.9527  R2h Adj = 0.9438  F-Test = 107.37  P-Value > F(3 , 16)  0.0000
- R2v = 0.9527  R2v Adj = 0.9438  F-Test = 107.37  P-Value > F(3 , 16)  0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.058783    .173579     6.10   0.000     .6908121    1.426754
          x2 |   .4522435   .6557569     0.69   0.500    -.9378991    1.842386
          x3 |   .1211505   1.087042     0.11   0.913    -2.183275    2.425576
       _cons |   8.132845   8.921103     0.91   0.375    -10.77905    27.04474
------------------------------------------------------------------------------

==============================================================================
* OLS Model Selection Diagnostic Criteria - Model= (orr)
==============================================================================
- Log Likelihood Function                     LLF     = -56.3495
- Akaike Final Prediction Error               AIC     =  22.1330
- Schwarz Criterion                           SC      =  25.6984
- Akaike Information Criterion                ln AIC  =   3.0971
- Schwarz Criterion                           ln SC   =   3.2464
- Amemiya Prediction Criterion                FPE     =  23.5700
- Hannan-Quinn Criterion                      HQ      =  22.7878
- Rice Criterion                              Rice    =  23.4236
- Shibata Criterion                           Shibata =  21.3155
- Craven-Wahba Generalized Cross Validation   GCV     =  22.6941
------------------------------------------------------------------------------

==============================================================================
*** Multicollinearity Diagnostic Tests - Model= (orr)
==============================================================================

* Correlation Matrix (obs=20)

             |       x1       x2       x3
-------------+---------------------------
          x1 |   1.0000
          x2 |   0.7185   1.0000
          x3 |   0.9152   0.6306   1.0000

* Multicollinearity Diagnostic Criteria
+-------------------------------------------------------------------------------+
|  Var  |  Eigenval |  C_Number |  C_Index  |    VIF    |   1/VIF   |  R2_xi,X  |
|-------+-----------+-----------+-----------+-----------+-----------+-----------|
|  x1   |   2.5160  |   1.0000  |   1.0000  |   7.7349  |   0.1293  |   0.8707  |
|  x2   |   0.4081  |   6.1651  |   2.4830  |   2.0862  |   0.4793  |   0.5207  |
|  x3   |   0.0758  |  33.1767  |   5.7599  |   6.2127  |   0.1610  |   0.8390  |
+-------------------------------------------------------------------------------+

* Farrar-Glauber Multicollinearity Tests
  Ho: No Multicollinearity - Ha: Multicollinearity
--------------------------------------------------

* (1) Farrar-Glauber Multicollinearity Chi2-Test:
  Chi2 Test = 43.8210     P-Value > Chi2(3)  0.0000

* (2) Farrar-Glauber Multicollinearity F-Test:
+--------------------------------------------------------+
|  Variable  |  F_Test  |   DF1    |   DF2    | P_Value  |
|------------+----------+----------+----------+----------|
|     x1     |  57.246  |  17.000  |   3.000  |   0.003  |
|     x2     |   9.233  |  17.000  |   3.000  |   0.046  |
|     x3     |  44.308  |  17.000  |   3.000  |   0.005  |
+--------------------------------------------------------+

* (3) Farrar-Glauber Multicollinearity t-Test:
+-------------------------------------+
| Variable |   x1   |   x2   |   x3   |
|----------+--------+--------+--------|
|    x1    |    .   |        |        |
|    x2    |  4.259 |    .   |        |
|    x3    |  9.362 |  3.350 |    .   |
+-------------------------------------+

* |X'X| Determinant:
  |X'X| = 0 Multicollinearity - |X'X| = 1 No Multicollinearity
  |X'X| Determinant: (0 < 0.0779 < 1)
---------------------------------------------------------------

* Theil R2 Multicollinearity Effect:
  R2 = 0 No Multicollinearity - R2 = 1 Multicollinearity
- Theil R2: (0 < 0.8412 < 1)
---------------------------------------------------------------

* Multicollinearity Range:
  Q = 0 No Multicollinearity - Q = 1 Multicollinearity
- Gleason-Staelin Q0: (0 < 0.7641 < 1)
1- Heo Range Q1: (0 < 0.8581 < 1)
2- Heo Range Q2: (0 < 0.8129 < 1)
3- Heo Range Q3: (0 < 0.7209 < 1)
4- Heo Range Q4: (0 < 0.7681 < 1)
5- Heo Range Q5: (0 < 0.8798 < 1)
6- Heo Range Q6: (0 < 0.7435 < 1)
------------------------------------------------------------------------------

* Marginal Effect - Elasticity (Model= orr): Linear *

+---------------------------------------------------------------------------+
|  Variable  | Marginal_Effect(B) |   Elasticity(Es)   |        Mean        |
|------------+--------------------+--------------------+--------------------|
|     x1     |       1.0588       |       0.7683       |      52.5840       |
|     x2     |       0.4522       |       0.1106       |      17.7245       |
|     x3     |       0.1212       |       0.0088       |       5.2935       |
+---------------------------------------------------------------------------+
  Mean of Dependent Variable = 72.4650

----+ Author +-----------------------------------------------------------------

Emad Abd Elmessih Shehata
Professor (PhD Economics)
Agricultural Research Center - Agricultural Economics Research Institute - Egypt
Email: emadstat@hotmail.com
WebPage: http://emadstat.110mb.com/stata.htm
WebPage at IDEAS: http://ideas.repec.org/f/psh494.html
WebPage at EconPapers: http://econpapers.repec.org/RAS/psh494.htm

----+ RIDGEREG Citation +------------------------------------------------------

Shehata, Emad Abd Elmessih (2012) RIDGEREG: "OLS-Ridge Regression Models and Diagnostic Tests"

http://ideas.repec.org/c/boc/bocode/s457347.html

http://econpapers.repec.org/software/bocbocode/s457347.htm

Online Help:

* Econometric Regression Models:

* (1) (OLS) Ordinary Least Squares Regression Models:
 olsreg     OLS Econometric Ridge & Weighted Regression Models: Stata Module Toolkit
 ridgereg   OLS Ridge Regression Models
 gmmreg     OLS Generalized Method of Moments (GMM): Ridge & Weighted Regression
 chowreg    OLS Structural Change Regressions and Chow Test
---------------------------------------------------------------------------
* (2) (2SLS-IV) Two-Stage Least Squares & Instrumental Variables Regression Models:
 reg2       2SLS-IV Econometric Ridge & Weighted Regression Models: Stata Module Toolkit
 gmmreg2    2SLS-IV Generalized Method of Moments (GMM): Ridge & Weighted Regression
 limlreg2   Limited-Information Maximum Likelihood (LIML) IV Regression
 meloreg2   Minimum Expected Loss (MELO) IV Regression
 ridgereg2  Ridge 2SLS-LIML-GMM-MELO-Fuller-kClass IV Regression
 ridge2sls  Two-Stage Least Squares Ridge Regression
 ridgegmm   Generalized Method of Moments (GMM) IV Ridge Regression
 ridgeliml  Limited-Information Maximum Likelihood (LIML) IV Ridge Regression
 ridgemelo  Minimum Expected Loss (MELO) IV Ridge Regression
---------------------------------------------------------------------------
* (3) Panel Data Regression Models:
 regxt      Panel Data Econometric Ridge & Weighted Regression Models: Stata Module Toolkit
 xtregdhp   Han-Philips (2010) Linear Dynamic Panel Data Regression
 xtregam    Amemiya Random-Effects Panel Data: Ridge & Weighted Regression
 xtregbem   Between-Effects Panel Data: Ridge & Weighted Regression
 xtregbn    Balestra-Nerlove Random-Effects Panel Data: Ridge & Weighted Regression
 xtregfem   Fixed-Effects Panel Data: Ridge & Weighted Regression
 xtregmle   Trevor Breusch MLE Random-Effects Panel Data: Ridge & Weighted Regression
 xtregrem   Fuller-Battese GLS Random-Effects Panel Data: Ridge & Weighted Regression
 xtregsam   Swamy-Arora Random-Effects Panel Data: Ridge & Weighted Regression
 xtregwem   Within-Effects Panel Data: Ridge & Weighted Regression
 xtregwhm   Wallace-Hussain Random-Effects Panel Data: Ridge & Weighted Regression
 xtreghet   MLE Random-Effects Multiplicative Heteroscedasticity Panel Data Regression
---------------------------------------------------------------------------
* (4) (MLE) Maximum Likelihood Estimation Regression Models:
 mlereg     MLE Econometric Regression Models: Stata Module Toolkit
 mleregn    MLE Normal Regression
 mleregln   MLE Log Normal Regression
 mlereghn   MLE Half Normal Regression
 mlerege    MLE Exponential Regression
 mleregle   MLE Log Exponential Regression
 mleregg    MLE Gamma Regression
 mlereglg   MLE Log Gamma Regression
 mlereggg   MLE Generalized Gamma Regression
 mlereglgg  MLE Log Generalized Gamma Regression
 mleregb    MLE Beta Regression
 mleregev   MLE Extreme Value Regression
 mleregw    MLE Weibull Regression
 mlereglw   MLE Log Weibull Regression
 mleregilg  MLE Inverse Log Gauss Regression
---------------------------------------------------------------------------
* (5) Autocorrelation Regression Models:
 autoreg    Autoregressive Least Squares Regression Models: Stata Module Toolkit
 alsmle     Beach-Mackinnon AR(1) Autoregressive Maximum Likelihood Estimation Regression
 automle    Beach-Mackinnon AR(1) Autoregressive Maximum Likelihood Estimation Regression
 autopagan  Pagan AR(p) Conditional Autoregressive Least Squares Regression
 autoyw     Yule-Walker AR(p) Unconditional Autoregressive Least Squares Regression
 autopw     Prais-Winsten AR(p) Autoregressive Least Squares Regression
 autoco     Cochrane-Orcutt AR(p) Autoregressive Least Squares Regression
 autofair   Fair AR(1) Autoregressive Least Squares Regression
---------------------------------------------------------------------------
* (6) Heteroscedasticity Regression Models:
 hetdep     MLE Dependent Variable Heteroscedasticity
 hetmult    MLE Multiplicative Heteroscedasticity Regression
 hetstd     MLE Standard Deviation Heteroscedasticity Regression
 hetvar     MLE Variance Deviation Heteroscedasticity Regression
 glsreg     Generalized Least Squares Regression
---------------------------------------------------------------------------
* (7) Non Normality Regression Models:
 robgme     MLE Robust Generalized Multivariate Error t Distribution
 bcchreg    Classical Box-Cox Multiplicative Heteroscedasticity Regression
 bccreg     Classical Box-Cox Regression
 bcereg     Extended Box-Cox Regression
---------------------------------------------------------------------------
* (8) (NLS) Nonlinear Least Squares Regression Models:
 autonls    Non Linear Autoregressive Least Squares Regression
 qregnls    Non Linear Quantile Regression
---------------------------------------------------------------------------
* (9) Logit Regression Models:
 logithetm  Logit Multiplicative Heteroscedasticity Regression
 mnlogit    Multinomial Logit Regression
---------------------------------------------------------------------------
* (10) Probit Regression Models:
 probithetm Probit Multiplicative Heteroscedasticity Regression
 mnprobit   Multinomial Probit Regression
---------------------------------------------------------------------------
* (11) Tobit Regression Models:
 tobithetm  Tobit Multiplicative Heteroscedasticity Regression
---------------------------------------------------------------------------
* Multicollinearity Tests:
 lmcol      OLS Multicollinearity Diagnostic Tests
 fgtest     Farrar-Glauber Multicollinearity Chi2, F, t Tests
 theilr2    Theil R2 Multicollinearity Effect
---------------------------------------------------------------------------