{smcl}
{* version 1.1.0 28January2016}{...}
{cmd: help xtcce}{right: ...}
{hline}
{title:Title}
{phang}
{bf:xtcce} {hline 2} Common Correlated Effects Estimation for Static/Dynamic Panels with Cross-Sectional Dependence.
{title:Syntax}
{p 4 4 2}
{cmd:xtcce} {depvar} [{varlist1}] ([{varlist2}] = [{varlist3}]) {ifin}
[{cmd:,} {it:dynamic gmm pooled full weighted cov(varlist4) alags(#) res(resname)}]{p_end}
{p 4 4 2}Items in [brackets] are optional, {varlist1} contains any exogenous explanatory variables, {varlist2}
contains any endogenous variables, and lastly {varlist3} contains the instruments.
You must {cmd:xtset} your data before using {cmd:xtcce}; see {helpb xtset}.{p_end}
{title:Description}
{pstd}
{cmd:xtcce} is for large static or dynamic panel data models (medium to large N and T) that suffer from
cross-sectional dependence (also known as unobserved common factors or common shocks), slope heterogeneity,
and endogenous regressors. It implements the Pesaran (2006) Common Correlated Effects ('CCE') estimator for
static panel estimation, the Chudik & Pesaran (2015) Dynamic CCE estimator for dynamic panel estimation, and
finally the Neal (2015) 2SLS and GMM extensions of both models.
{p 4 4 2}Consider the following panel model:{p_end}
{p 4 4 2}y_it = rho_i*y_(it-1) + beta_i*x_it + mu_i + gamma_i*f_t + v_it{p_end}
{p 4 4 2}x_it = Gamma_i*f_t + e_it{p_end}
{p 4 4 2}where rho_i is the autoregressive coefficient for individual i, x_it is a NTxK matrix of regressors,
beta_i is a 1xK vector of coefficients for individual i, mu_i is the individual-specific fixed effect, f_t is a 1xM
vector of unobserved common factors, gamma_i and Gamma_i are the heterogeneous factor loadings, and v_it and
e_it are the idiosyncratic error terms.
{p 4 4 2}Since both the regressors x_it and the dependent variable y_it depend on the vector of unobserved
common factors f_t, pooled or mean group OLS will provide an inconsistent estimate of rho or beta. The
presence of unobserved common factors is one representation of cross-sectional dependence in panel data. The
idea of common correlated effects estimation, introduced in Pesaran (2006), is to approximate the projection
space of the unobserved common factors with the inclusion of cross section averages of the variables in the
regression equation.{p_end}
{p 4 4 2}Pesaran (2006) proposed the common correlated effects model (here called CCE-OLS) to consistently
estimate beta_i in the equation above when rho_i = 0 for all i and the regressors are strictly exogenous. It
can be estimated with pooled or mean group OLS, with the latter accounting for slope heterogeneity among panel
units. Chudik and Pesaran (2015) extended this model to allow for a dynamic specification (i.e. rho_i => 0)
and weakly exogenous regressors (here called DCCE-OLS). It achieves this by adding lags of the cross section
averages to the regression. Neal (2015) further extended the CCE/DCCE approach by estimating the regressions
equation(s) with 2SLS or GMM to account for endogenous regressors and improve the efficiency of the DCCE
estimator (please see Neal (2015) for Monte Carlo simulation results that demonstrate this), using further
lags of the variables to form the instrument set.{p_end}
{p 4 4 2}The result is a powerful and flexible suite of estimation options for large panel models that allows
for cross-sectional dependence, static or dynamic specifications, exogenous or endogenous regressors, fixed effects, and heterogeneous
slopes. Please see the references below for further information on the usefulness of these estimators.{p_end}
{title:References and Further Reading}
{p 4 4 2}Pesaran, M.H. (2006) "Estimation and inference in large heterogeneous panels with a multifactor error structure", Econometrica, 74(4), p.967-1012{p_end}
{p 4 4 2}Pesaran, M.H. and Chudik, A. (2015) "Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors", Journal of Econometrics, 188(2), p.393-420{p_end}
{p 4 4 2}Neal, T. (2015) "Estimating Heterogeneous Coefficients in Panel Data Models with Endogenous Regressors and Common Factors", Working Paper{p_end}
{title:Options}
{synoptset 20 tabbed}{...}
{synopthdr}
{synoptline}
{synopt:{opt dynamic}}Required option whenever one or more lag of the dependent variable is added as an
exogenous or endogenous regressor. It adds lags to the cross section averages to ensure consistency, as in
Chudik and Pesaran (2015). The number of lags is automatically set to T^(1/3), but can be changed manually
with the option {opt alags(#)} below. {p_end}
{synopt:{opt gmm}}Uses GMM estimation if endogenous and instrumental variables are present in the regression
equation. If this option is not enabled, 2SLS will be used in those instances.{p_end}
{synopt:{opt weighted}}Weights the mean group coefficient results by the standard errors of the individual
coefficients. This option is recommended as a robustness check, and whenever the distribution of beta
coefficients across panel units is volatile.{p_end}
{synopt:{opt full}}Shows regression output for each panel unit in addition to the mean group results at the end.
Has no effect if the option {opt pooled} is selected.{p_end}
{synopt:{opt pooled}}Uses pooled regression, as opposed to mean group regression which is the default. To account for
individual-specific factor loadings it interacts the id variable with the cross section averages.
Enabling this option is not recommended when slope heterogeneity is suspected.{p_end}
{synopt:{opt cov(varlist4)}}Adds cross section averages of {it: varlist4} to the regression, without including them
as regressors. This is recommended whenever it is suspected that the number of
unobserved common factors exceeds the number of variables in the model.{p_end}
{synopt:{opt alags(#)}}Changes the number of lags of the cross section averages used in dynamic models to #.
It has no effect if the option {opt dynamic} is not enabled. {p_end}
{synopt:{opt res(resname)}} Stores the residuals of the regressions to resname. {p_end}
{title:Examples}
{p 4 4 2}In cases where the regressor x is strictly exogenous:{p_end}
{p 4 4 2}CCE-OLS:{p_end}{phang}{cmd:. xtcce y x}
{p 4 4 2}DCCE-OLS:{p_end} {phang}{cmd:. xtcce y l.y x, dynamic}
{p 4 4 2}For efficiency improvements in the dynamic specification, one might use:{p_end}
{p 4 4 2}DCCE-2SLS: {p_end}{phang}{cmd:. xtcce y x (l.y = l(2/4).y), dynamic}
{p 4 4 2}DCCE-GMM:{p_end} {phang}{cmd:. xtcce y x (l.y = l(2/4).y), dynamic gmm}
{p 4 4 2}In cases where the regressor x is endogenous, one might use:{p_end}
{p 4 4 2}CCE-2SLS:{p_end} {phang}{cmd:. xtcce y (x = l(1/2).x l(1/2).y)}
{p 4 4 2}CCE-GMM{p_end} {phang}{cmd:. xtcce y (x = l(1/2).x l(1/2).y), gmm}
{p 4 4 2}DCCE-2SLS:{p_end} {phang}{cmd:. xtcce y (l.y x = l(2/3).y l(1/3).x), dynamic}
{p 4 4 2}DCCE-GMM{p_end} {phang}{cmd:. xtcce y (l.y x = l(2/3).y l(1/3).x), dynamic gmm}
{title:Tip}
{p 4 4 2}It is always worthwhile to check the sensitivity of results and try a variety of specifications.
Check the {cmd:e(bfull)} vector after estimation to see the degree of volatility in the estimated beta coefficient
across panel units. Use the option {opt full} to see if there are significant outliers or strange results in some of
the panel units, and then consider excluding them. Try a static and a dynamic version of the model. Try
treating either the lagged dependent variable or the regressors as endogenous, and use lags to form the
instrument set. See if the results vary significantly between 2SLS and GMM.{p_end}
{title:Known Issues}
{p 4 4 2}This command will not add cross section averages of any variables with time series operators in order to prevent them
clashing in dynamic models. This is only problematic with differenced variables (e.g. d.x) or when lags of
regressors are used and the contemporaneous observation is not (i.e. l(1/3).x but not x). In these situations, the
variables should be manually transformed prior to estimation.{p_end}
{p 4 4 2}GMM will not use a HAC weight matrix when the estimator is pooled (due to the limitation of the
command {cmd: ivregress}), in mean group regressions (the default) it will use a HAC weight matrix.{p_end}
{p 4 4 2}Errors will usually arise when mean group estimation is used and one or more of the panel units have
very small T. Exclude these panel units from the sample to solve this problem.{p_end}
{title:Saved results}
{pstd}{cmd:xtcce} saves the following in {cmd:e()}:
{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Scalars}{p_end}
{synopt:{cmd:e(N)}}Number of usable observations{p_end}
{synopt:{cmd:e(g_min)}}Fewest number of observations in an included
panel unit{p_end}
{synopt:{cmd:e(g_max)}}Largest number of observations in an included
panel unit{p_end}
{synopt:{cmd:e(g_avg)}}The average number of observations in an included
panel unit{p_end}
{synopt:{cmd:e(N_g)}}Number of panel units{p_end}
{synopt:{cmd:e(chi2)}}Chi-squared{p_end}
{synopt:{cmd:e(df_m)}}Model degrees of freedom{p_end}
{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Macros}{p_end}
{synopt:{cmd:e(ivar)}}Panel unit identification variable{p_end}
{synopt:{cmd:e(tvar)}}Time variable{p_end}
{synopt:{cmd:e(depvar)}}Dependent variable{p_end}
{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Matrices}{p_end}
{synopt:{cmd:e(b)}}Vector of coefficients{p_end}
{synopt:{cmd:e(V)}}Variance-covariance matrix of the estimates{p_end}
{synopt:{cmd:e(bfull)}}Complete matrix of the individual-level regression coefficients
(note: not available with option {opt pooled}){p_end}
{title:Author}
{pstd}Timothy Neal{p_end}
{pstd}School of Economics{p_end}
{pstd}University of New South Wales{p_end}
{pstd}Sydney, Australia{p_end}
{pstd}{browse "mailto:timothy.neal@unsw.edu.au":timothy.neal@unsw.edu.au} {p_end}
{pstd}{browse "https://sites.google.com/site/tjrneal/stata-code":https://sites.google.com/site/tjrneal/stata-code} {p_end}
{title:Also see}
{psee}
{space 2}Online: {helpb xtmg}, {helpb xtpedroni}, {helpb xtset}, {helpb xtpmg}
{p_end}