{smcl}
{hline}
{hi:help xtdcce2}{right: v. 4.7 - 06. June 2024}
{right:SJ18-3: st0536}
{right:SJ21-3: st0536_1}
{hline}
{title:Title}
{p 4 4}{cmd:xtdcce2} - estimating heterogeneous coefficient models using common correlated effects in a dynamic panel
with a large number of observations over groups and time periods.{p_end}
{title:Syntax}
{p 4 13}{cmd:xtdcce2} {depvar} [{indepvars}] [{varlist}2 = {varlist}_iv] {ifin} {cmd:,}
{cmdab:cr:osssectional(}{help varlist}{cmd:cr1 [,options1])}
[
{cmdab:globalcr:osssectional(}{help varlist}{cmd:cr2 [,options1])}
{cmdab:clustercr:osssectional(}{help varlist}{cmd:cr3 [,options1])}
{cmdab:p:ooled}({varlist})
{cmd:cr_lags}({it:string})
{cmdab:nocross:sectional}
{cmdab:ivreg2:options}({it:string})
{cmd:e_ivreg2}
{cmd:ivslow}
{cmdab:noi:sily}
{cmd:lr}({varlist})
{cmd:lr_options}({it:string})
{cmdab:pooledc:onstant}
{cmd:pooledvce}({it:string})
{cmdab:reportc:onstant}
{cmdab:noconst:ant}
{cmd:trend}
{cmdab:pooledt:rend}
{cmdab:jack:knife}
{cmdab:rec:ursive}
{cmdab:mgmis:sing}
{cmdab:expo:nent}
{cmdab:xtcse2:options(string)}
{cmd:nocd}
{cmdab:showi:ndividual}
{cmd:fullsample}
{cmd:fast}
{cmd:fast2}
{cmdab:blockdiag:use}
{cmdab:nodim:check}
{cmd:useqr}
{cmd:useinvsym}
{cmdab:NOOMIT:ted}]{p_end}
{p 4}{ul:Options for cross-sectional averages}{p_end}
{synoptset 20}{...}
{synopt:options1}Description{p_end}
{synoptline}
{synopt:{cmd:cr_lags(}{help numlist})}number of lags of cross-section averages{p_end}
{synopt:{cmdab:cl:ustercr(}{help varlist})}name of cluster variable(s), only for {cmd:clustercrosssectional}{p_end}
{synopt:{cmd:rcce}}use regularized CCE, see Juodis (2022) with the ER criterion from Ahn and Horenstein (2013) to select the number of eigenvectors for static panels, see {help xtdcce2##rcce:details}.{p_end}
{synopt:{cmd:rcce(options)}}as {cmd:rcce} but with further options options, see {help xtdcce2##options:Options}.{p_end}
{synopt:{cmdab:rccl:assifier}}perform test for rank condition, see De Vos et al. (2024). RC = 1 implies rank condition holds.{p_end}
{synopt:{cmdab:rccl:assifier(options)}}as {cmd:rctest} but with further options options, see {help xtdcce2##options:Options}.{p_end}
{synoptline}
{p2colreset}{...}
{p 4 4} where {varlist}2 are endogenous variables and {varlist}_iv the instruments.{p_end}
{p 4 4}Data has to be {cmd:xtset} before using {cmd:xtdcce2}; see {help tsset}.
{it:varlists} may contain time-series operators, see {help tsvarlist}, or factor variables, see {help fvvarlist}
and note on {help xtdcce2##collinearity:collinearity issues}.{break}
{cmd:xtdcce2} requires the {help moremata} package.{p_end}
{p 4 4}For a speed optimized version see {help xtdcce2fast}.{p_end}
{title:Contents}
{p 4}{help xtdcce2##description:Description}{p_end}
{p 4}{help xtdcce2##options:Options}{p_end}
{p 4}{help xtdcce2##model:Econometric and Empirical Model}{p_end}
{p 8}{help xtdcce2##EconometricModel:Econometric Model}{p_end}
{p 8}{help xtdcce2##EmpiricalModel:Empirical Model}{p_end}
{p 8}{help xtdcce2##R2:Coefficient of Determination (R2)}{p_end}
{p 8}{help xtdcce2##rcc:Rank Classifier}{p_end}
{p 8}{help xtdcce2##ic:Information Criteria}{p_end}
{p 8}{help xtdcce2##collinearity:Issues with Collinearity}{p_end}
{p 4}{help xtdcce2##SpeedLargePanels:Large panels and speed}{p_end}
{p 4}{help xtdcce2##saved_vales:Saved Values}{p_end}
{p 4}{help xtdcce2##postestimation: Postestimation commands}{p_end}
{p 8}{help xtdcce2##postestpredict: predict}{p_end}
{p 8}{help xtdcce2##postestestat: estat}{p_end}
{p 8}{help xtdcce2##postestboot: Bootstrap}{p_end}
{p 8}{help xtdcce2##postestic: Information Criteria}{p_end}
{p 4}{help xtdcce2##examples:Examples}{p_end}
{p 4}{help xtdcce2##references:References}{p_end}
{p 4}{help xtdcce2##about:About}{p_end}
{marker description}{title:Description}
{p 4 4}{cmd:xtdcce2} estimates a heterogeneous coefficient model in a large panel with dependence between cross sectional units.
A panel is large if the number of cross-sectional units (or groups) and the number of time periods are going to infinity.{break}
It fits the following estimation methods:{p_end}
{p 8 8}
i) The Mean Group Estimator (MG, Pesaran and Smith 1995).{break}
ii) The Common Correlated Effects Estimator (CCE, Pesaran 2006),{break}
iii) The Dynamic Common Correlated Effects Estimator (DCCE, Chudik and Pesaran 2015), and{p_end}
{p 4 4}For a dynamic model, several methods to estimate long run effects are possible:{p_end}
{p 8 8} a) The Pooled Mean Group Estimator (PMG, Shin et. al 1999) based on an Error Correction Model, {break}
b) The Cross-Sectional Augmented Distributed Lag (CS-DL, Chudik et. al 2016) estimator which directly estimates the long run coefficients
from a dynamic equation, and {break}
c) The Cross-Sectional ARDL (CS-ARDL, Chudik et. al 2016) estimator using an ARDL model.{p_end}
{p 4 4}For a further discussion see Ditzen (2019).
Additionally {cmd:xtdcce2} tests for cross sectional dependence (see {help xtcd2}) and
estimates the exponent of the cross sectional dependence alpha (see {help xtcse2}).
{cmd:xtdcce2} also implements the regularized CCE estimator from Juodis (2022) for static panels.
The rank condition can be checked in static panels using the classifier from De Vos et al. (2024).
Information criteria to select the optimal number of cross-section averages in static panels can be calculated using {cmd:estat}.
It also supports instrumental variable estimations (see {help ivreg2}).{p_end}
{marker options}{title:Options}
{p 4 8}{cmdab:cr:osssectional(}{help varlist}{cmd:cr1 [,cr_lags(#)} {cmd:rcce[(}{cmdab:c:riterion(string)} {cmdab:sc:ale}{cmd: npc(real)]])} defines the variables which are added as cross sectional averages to the equation.
Variables in {cmd:crosssectional()} may be included in {cmd:pooled()}, {cmd:exogenous_vars()}, {cmd:endogenous_vars()} and {cmd:lr()}.
Variables in {cmd: crosssectional()} are partialled out, the coefficients not estimated and reported.{p_end}
{p 8 8}{cmd:crosssectional}(_all) adds all variables as cross sectional averages.
No cross sectional averages are added if {cmd:crosssectional}(_none) is used, which is equivalent to {cmd:nocrosssectional}.
{cmd:crosssectional}() is a required option but can be substituted by {cmd:nocrosssectional}.{break}
If {cmd:cr(..., cr_lags())} is used, then the global option {cmd:cr_lags()} (see below) is ignored.{p_end}
{p 8 10}{cmd:rcce[}{cmdab:c:riterion(string)} {cmdab:sc:ale}{cmd: npc(real)]} invokes regularized cross-section averages from Juodis (2022) for {bf:static panels}.
The number of common factors in the cross-section averages is estimated and
then the respective number of eigenvectors from the cross-section averages is used. For more details see {help xtdcce2##rcce:details}.{break}
Detailed Options are:{p_end}
{col 14}{cmd:criterion(er|gr)} specifies criterion to identify number of common factors using the ER or GR criterion from Ahn and Horenstein (2013), see {help xtnumfac}.
{col 14}{cmd:scale} scales cross-section averages, see Juodis (2022).
{col 14}{cmd:npc(real)} specifies number of eigenvectors without estimating it. Cannot be combined with {cmd:criterion}.
{p 8 10}{cmdab:rccl:assifier[er gr }{cmdab:rep:lications(integer) }{cmdab:stand:ardize(integer) }{cmdab:random:shrinkage} {cmdab:noshrink:age]} performs the test of the rank condition, see De Vos et al. (2024).
RC = 1 implies the rank condition holds. CCE is inconsistent if the rank condition does not hold.
The classifier compares the rank of the matrix of cross-sectional averages of the data with the number of factors estimated from the data.
For the rank condition to hold, the number of factors has to be smaller than or equal to the rank of the matrix of cross-sectional averages.
The test can be applied for static panels only.{break}
Detailed Options are:{p_end}
{col 14}{cmd:er|gr} specifies criterion to identify number of common factors using the ER or GR criterion from Ahn and Horenstein (2013), see {help xtnumfac}. Default is {cmd:gr}.
{col 14}{cmdab:stand:ardize(integer)} standardize data prior to estimation of number of common factors, see see {help xtnumfac}. Default is {cmd:5}.
{col 14}{cmdab:rep:lications(integer)} sets number of replication for bootstrapping variance of the rank estimator of the unobserved matrix of average factor loadings.
{col 14}{cmdab:random:shrinkage} Instead of fold-over matrix, use matrix with entries drawn from random normal distribution.
{col 14}{cmdab:noshrink:age]} No shrinkage.
{p 4 8}{cmdab:globalcr:osssectional(}{help varlist}{cmd:cr1 [,cr_lags(#)])} define global cross-section averages.
{cmd:global} cross-section averages are cross-section averages based on observations which are excluded using {help if} statements.
If {cmd:cr(..., cr_lags())} is used, then the global option {cmd:cr_lags()} (see below) is ignored.{p_end}
{p 4 8}{cmdab:cluster:osssectional(}{help varlist}{cmd:cr1 [,cr_lags(#)] clustercr(varlist))} are clustered or local cross-section averages.
That is, the cross-section averages are the same for each realization of the variables defined in {cmd:clustercr()}.
For example, we have data observations regions of multiple countries, defined by variable {it:country}
Now we want to add cross-section averages for each country.
We can define those by using the option {cmd:clustercr(varlist , clustercr(country))}.{break}
If {cmd:cr(..., cr_lags())} is used, then the global option {cmd:cr_lags()} (see below) is ignored.{p_end}
{p 4 8}{cmdab:p:ooled}({varlist}) specifies variables which estimated coefficients are constrained to be equal across all cross sectional units.
Variables may occur in {indepvars}.
Variables in {cmd:exogenous_vars()}, {cmd:endogenous_vars()} and {cmd:lr()} may be pooled as well.{p_end}
{p 4 8 12}{cmd:cr_lags}({it:integers}) sets the number of lags of the cross sectional averages.
If not defined but {cmd:crosssectional()} contains a varlist, then only contemporaneous cross sectional averages are added but no lags.
{cmd:cr_lags(0)} is the equivalent.
The number of lags can be different for different variables, where the order is the same as defined in {cmd:cr()}.
For example if {cmd:cr(y x)} and only contemporaneous cross-sectional averages of y but 2 lags of x are added,
then {cmd:cr_lags(0 2)}.{p_end}
{p 4 8 12}{cmdab:nocross:sectional} suppresses adding any cross sectional averages
Results will be equivalent to the Mean Group estimator.{p_end}
{p 4 8 12}{cmdab:pooledc:onstant} restricts the constant term to be the same across all cross sectional units.{p_end}
{p 4 8 12}{cmdab:reportc:onstant} reports the constant term. If not specified the constant is partialled out.{p_end}
{p 4 8 12}{cmdab:noconst:ant} suppresses the constant term.{p_end}
{p 4 8 12}{cmdab:mgmis:sing} if it is not possible to estimate individual coefficient for a cross-section
because of missing data or perfect collinearity, individual coefficient is excluded for MG estimation.
Coefficient will still be displayed as zero in e(bi). {p_end}
{p 4 8 12}{cmd:pooledvce(}{it:type}{cmd:)} specifies the variance estimator for pooled regression.
The default is the non-parametric variance estimator from Pesaran (2006).
{it:type} can be {cmd:nw} for Newey West heteroscedasticity autocorrelation robust standard errors (Pesaran 2006) or
{cmd:wpn} for fixed T adjusted standard errors from Westerlund et. al (2019).
{cmd:ols} for standard OLS standard errors.{p_end}
{p 4 8}{cmd:xtdcce2} supports IV regressions using {help ivreg2}.
The IV specific options are:{break}
{cmdab:ivreg2:options}({it:string}) passes further options to {cmd:ivreg2}, see {help ivreg2##s_options:ivreg2, options}.{break}
{cmd:e_ivreg2} posts all available results from {cmd:ivreg2} in {cmd: e()} with prefix {it:ivreg2_}, see {help ivreg2##s_macros: ivreg2, macros}.{break}
{cmdab:noi:sily} displays output of {cmd:ivreg2}.{break}
{cmd:ivslow}: For the calculation of standard errors for pooled coefficients an auxiliary regressions is performed.
In case of an IV regression, xtdcce2 runs a simple IV regression for the auxiliary regressions.
this is faster.
If option is used {cmd:ivslow}, then xtdcce2 calls ivreg2 for the auxiliary regression.
This is advisable as soon as ivreg2 specific options are used.{p_end}
{p 4 8}{cmd:xtdcce2} is able to estimate long run coefficients.
Three models are supported:
The pooled mean group models (Shin et. al 1999), similar to {help xtpmg}, the CS-DL (see {help xtdcce2##csdl: xtdcce2, csdl})
and CS-ARDL method (see {help xtdcce2##ardl: xtdcce2, ardl}) as developed in Chudik et. al 2016.
No options for the CS-DL model are necessary.{p_end}
{p 8 8}{cmd:lr}({varlist}) specifies the variables to be included in the long-run cointegration vector.
The first variable(s) is/are the error-correction speed of adjustment term.
The default is to use the pmg model.
In this case each estimated coefficient is divided by the negative of the long-run cointegration vector (the first variable).
If the option {cmd:ardl} is used, then the long run coefficients are estimated as the sum over the coefficients relating to a variable,
divided by the sum of the coefficients of the dependent variable.{break}
{cmd:lr_options}({it:string}), options for the long run coefficients. Options are:{break}{break}{p_end}
{col 12}{cmd:ardl} estimates the CS-ARDL estimator. For further details see {help xtdcce2##ardl:xtdcce2, ardl}.
{col 12}{cmd:nodivide} coefficients are not divided by the error correction speed of adjustment vector. Equation (7) is estimated, see {help xtdcce2##pmg:xtdcce2, pmg}.
{col 12}{cmd:xtpmgnames} coefficient names in {cmd: e(b_p_mg)} (or {cmd: e(b_full)}) and {cmd: e(V_p_mg)} (or {cmd: e(V_full)}) match the name convention from {help xtpmg}.
{col 12}{cmd:nominus} does not subtract -1 from the coefficient of the dependent variable.
{p 4 8 12}{cmd:trend} adds a linear unit specific trend. May not be combined with {cmd:pooledtrend}.{p_end}
{p 4 8 12}{cmdab:pooledt:rend} adds a linear common trend. May not be combined with {cmd:trend}.{p_end}
{p 4 8}Two methods for small sample time series bias correction are supported:{break}
{cmdab:jack:knife} applies the jackknife bias correction method. May not be combined with {cmd:recursive}.{break}
{cmdab:rec:ursive} applies the recursive mean adjustment method. May not be combined with {cmd:jackknife}.{p_end}
{p 4 8 12}{cmdab:expo:nent} uses {help xtcse2} to estimate the exponent of the cross-sectional
dependence of the residuals. A value above 0.5 indicates cross-sectional dependence,
see {help xtcse2}.{p_end}
{p 4 8 12}{cmdab:xtcse2:options} passes options to {help xtcse2}, see {help xtcse2##options:xtcse2, options}.{p_end}
{p 4 8 12}{cmd: nocd} suppresses calculation of CD test. For details about the CD test see {help xtcd2}.{p_end}
{p 4 8 12}{cmdab:showi:ndividual} reports unit individual estimates in output.{p_end}
{p 4 8 12}{cmd:fullsample} uses entire sample available for calculation of cross sectional averages.
Any observations which are lost due to lags will be included in calculating the
cross sectional averages (but are not included in the estimation itself).
The option does {cmd:not} remove any {help if} statements.
This means, that if an {cmd:if} removes certain cross-sectional units from
the estimation sample, {cmd:xtdcce2} will not use those (as specified by {cmd:if}),
even if {cmd:fullsample} is used.
{p_end}
{p 4 8 12}{cmd:fast} omit calculation of unit specific standard errors.{p_end}
{p 4 8 12}{cmd:fast2} use {help xtdcce2fast} instead of {cmd:xtdcce2}.{p_end}
{p 4 8 12}{cmdab:blockdiag:use} uses {help mata blockdiag}
rather than an alternative algorithm.
{cmd: mata blockdiag} is slower, but might produce more stable results.{p_end}
{p 4 8 12}{cmdab:nodim:check} Does not check for dimension.
Before estimating a model, {cmd:xtdcce2} automatically checks if the
time dimension within each panel is long enough to run a mean group regression.
Panel units with an insufficient number are automatically dropped.
{p_end}
{p 4 8}{cmd:xtdcce2} checks for collinearity in three different ways.
It checks if matrix of the cross-sectional averages is of full rank.
After partialling out the cross-sectional averages, it checks if the entire model
across all cross-sectional units exhibits multicollinearity.
The final check is on a cross-sectional level.
The outcome of the checks influence which method is used to invert matrices.
If a check fails {cmd:xtdcce2} posts a warning message.
The default is {help mata cholinv:cholinv} and {help mata invsym:invsym}
if a matrix is of rank-deficient.
For a further discussion see {help xtdcce2##collinearity: collinearity issues}.{break}
The following options are available to alter the behaviour of {cmd:xtdcce2}
with respect to matrices of not full rank:{p_end}
{col 12}{cmd:useqr} calculates the generalized inverse via QR decomposition. This was the default for rank-deficient matrices for {cmd:xtdcce2} pre version 1.35.
{col 12}{cmd:useinvsym} calculates the generalized inverse via {help mata invsym}.
{col 12}{cmdab:noomit:ted} no omitted variable checks on the entire model.
{marker model}{title:Econometric and Empirical Model}
{p 2}{ul: Econometric Model}{p_end}
{marker EconometricModel}
{p 4}Assume the following dynamic panel data model with heterogeneous coefficients:{p_end}
{col 10} (1) {col 20} y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + x(i,t-1)*b3(i) + u(i,t)
{col 20} u(i,t) = g(i)*f(t) + e(i,t)
{p 4 4} where f(t) is an unobserved common factor loading, g(i) a heterogeneous factor loading, x(i,t) is a (1 x K) vector
and b2(i) and b3(i) the coefficient vectors.
The error e(i,t) is iid and the heterogeneous coefficients b1(i), b2(i) and b3(i) are randomly distributed around a common mean.
It is assumed that x(i,t) is strictly exogenous.
In the case of a static panel model (b1(i) = 0) Pesaran (2006) shows that average of the coefficients b0, b2 and b3
(for example for b2(mg) = 1/N sum(b2(i)))
can be consistently estimated by adding cross sectional means of the dependent and all independent variables.
The cross sectional means approximate the unobserved factors.
In a dynamic panel data model (b1(i) <> 0) pT lags of the cross sectional means are added to achieve consistency (Chudik and Pesaran 2015).
The mean group estimates for b1, b2 and b3 are consistently estimated as long as N,T and pT go to infinity.
This implies that the number of cross sectional units and time periods is assumed to grow with the same rate.
In an empirical setting this can be interpreted as N/T being constant.
A dataset with one dimension being large in comparison to the other would lead to inconsistent estimates, even if both dimension are large in numbers.
For example a financial dataset on stock markets returns on a monthly basis over 30 years (T=360) of 10,000 firms would not be sufficient.
While individually both dimension can be interpreted as large, they do not grow with the same rate and the ratio would not be constant.
Therefore an estimator relying on fixed T asymptotics and large N would be appropriate.
On the other hand a dataset with lets say N = 30 and T = 34 would qualify as appropriate, if N and T grow with the same rate.{p_end}
{p 4 4}The variance of the mean group coefficient b1(mg) is estimated as:{p_end}
{col 10} var(b(mg)) = 1/N sum(i=1,N) (b1(i) - b1(mg))^2
{p 4 4}or if the vector pi(mg) = (b0(mg),b1(mg)) as:{p_end}
{col 10} var(pi(mg)) = 1/N sum(i=1,N) (pi(i) - pi(mg))(p(i)-pi(mg))'
{p 2}{ul: Empirical Model}{p_end}
{marker EmpiricalModel}
{p 4 4}The empirical model of equation (1) without the lag of variable x is:{p_end}
{col 10}(2){col 20} y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + sum[d(i)*z(i,s)] + e(i,t),
{p 4 4} where z(i,s) is a (1 x K+1) vector including the cross sectional means at time s and the sum is over s=t...t-pT.
{cmd:xtdcce2} supports several different specifications of equation (2).{p_end}
{p 4 4} {cmd:xtdcce2} partials out the cross sectional means internally.
For consistency of the cross sectional specific estimates, the matrix z = (z(1,1),...,z(N,T)) has to be of full column rank.
This condition is checked for each cross section.
{cmd:xtdcce2} will return a warning if z is not full column rank.
It will, however, continue estimating the cross sectional specific coefficients and then calculate the mean group estimates,
see {help xtdcce2##collinearity: collinearity issues}.
The mean group estimates will be consistent.
For further reading see, Chudik, Pesaran (2015, Journal of Econometrics), Assumption 6 and page 398.{p_end}
{p 4 4} {cmd:xtdcce2} evaluates the rank condition further.
The rank condition can fail if the constant is partialled out and one or more variables are binary.
In this case {cmd:xtdcce2} restarts, however forces the constant to be calculated.{p_end}
{p 4 4}The following models can be estimated:{p_end}
{p 2}{ul: i) Mean Group}{p_end}
{p 4 4} If no cross sectional averages are added (d(i) = 0), then the estimator is the Mean Group Estimator as proposed by Pesaran and Smith (1995).
The estimated equation is:
{col 10}(3){col 20} y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + e(i,t).
{p 4 4} Equation (3) can be estimated by using the {cmd:nocross} option of {cmd:xtdcce2}. The model can be either static (b(1) = 0) or dynamic (b(1) <> 0).{p_end}
{p 4}{help xtdcce2##e_mg:See example}{p_end}
{p 2}{ul: ii) Common Correlated Effects}{p_end}
{p 4 4} The model in equation (3) does not account for unobserved common factors between units.
To do so, cross sectional averages are added in the fashion of Pesaran (2006):{p_end}
{col 10}(4){col 20} y(i,t) = b0(i) + x(i,t)*b2(i) + d(i)*z(i,t) + e(i,t).
{p 4 4} Equation (4) is the default equation of {cmd:xtdcce2}.
Including the dependent and independent variables in {cmd:crosssectional()} and setting {cmd:cr_lags(0)} leads to the same result.
{cmd:crosssectional()} defines the variables to be included in z(i,t).
Important to notice is, that b1(i) is set to zero. {p_end}
{p 4}{help xtdcce2##example_cce:See example}{p_end}
{p 2}{ul: iii) Dynamic Common Correlated Effects}{p_end}
{p 4 4} If a lag of the dependent variable is added, endogeneity occurs and adding solely contemporaneous cross sectional averages is not sufficient any longer to achieve consistency.
Chudik and Pesaran (2015) show that consistency is gained if pT lags of the cross sectional averages are added:{p_end}
{col 10}(5){col 20} y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + sum [d(i)*z(i,s)] + e(i,t).
{p 4 4} where s = t,...,t-pT. Equation (5) is estimated if the option {cmd:cr_lags()} contains a positive number.{p_end}
{p 4}{help xtdcce2##example_dcce:See example}{p_end}
{p 2}{ul: iv) Pooled Estimators}{p_end}
{p 4 4} Equations (3) - (5) can be constrained that the parameters are the same across units. Hence the equations become:{p_end}
{col 10}(3-p){col 20} y(i,t) = b0 + b1*y(i,t-1) + x(i,t)*b2 + e(i,t),
{col 10}(4-p){col 20} y(i,t) = b0 + x(i,t)*b2 + d(i)*z(i,t) + e(i,t),
{col 10}(5-p){col 20} y(i,t) = b0 + b1*y(i,t-1) + x(i,t)*b2 + sum [d(i)*z(i,s)] + e(i,t).
{p 4 4}Variables with pooled (homogenous) coefficients are specified using the {cmd:pooled({varlist})} option.
The constant is pooled by using the option {cmd:pooledconstant}. {p_end}
{p 4 4}In case of a pooled estimation {cmd:xtdcce2} offers the estimation of three different type of standard errors.
The default is the non-parametric variance estimator from Pesaran (2006), Eq. 67 - 69.
The non-parametric estimator takes the difference between the mean group and individual coefficients into account.
Therefore a mean group regression is performed in the background.
The first alternative estimator is the Newey West type heteroscedasticity and auto correlation robust standard errors from
Pesaran (2006), Eq. 51, 52 and 74.
The length of the window size is defined as round(4*(T/100)^(2/9)) and follows the convention in
for HAC standard errors (see Bai and Ng, 2004).
Another alternative is the fixed-T variance estimator from Westerlund et. al (2019), Eq. 10 and 11.
This estimator heteroscedasticity robust and allows for panels with a fixed time dimension.{p_end}
{p 4}{help xtdcce2##example_pooled:See example}{p_end}
{p 2}{ul: v) Instrumental Variables}{p_end}
{p 4 4}{cmd:xtdcce2} supports estimations of instrumental variables by using the {help ivreg2} package.
Endogenous variables (to be instrumented) are defined in {varlist}2 and their instruments are defined in {varlist}_iv.{p_end}
{p 4}{help xtdcce2##example_iv:See example}{p_end}
{marker pmg}{p 2}{ul: vi) Cross-Section Augmented Error Correction Models (CS-ECM)}{p_end}
{p 4 4} Two version of the CS-ECM are supported.
The Pooled Mean Group model, which is an intermediate between the mean group and a pooled estimation, Shin et. al (1999) differentiate between homogenous long run and heterogeneous short run effects.
Therefore the model includes mean group as well as pooled coefficients.
Equation (1) (without the lag of the explanatory variable x and for a better readability without the cross sectional averages) is transformed into an ARDL model:{p_end}
{col 10}(6){col 20}y(i,t) = phi(i)*(y(i,t-1) - w0(i) - x(i,t)*w2(i)) + g1(i)*[y(i,t)-y(i,t-1)] + [x(i,t) - x(i,t-1)] * g2(i) + e(i,t),
{p 4 4}where phi(i) is the cointegration vector, w(i) captures the long run effects and g1(i) and g2(i) the short run effects.
Shin et. al estimate the long run coefficients by ML and the short run coefficients by OLS.
{cmd:xtdcce2} estimates a slightly different version by OLS:{p_end}
{col 10}(7){col 20}y(i,t) = o0(i) + phi(i)*y(i,t-1) + x(i,t)*o2(i) + g1(i)*[y(i,t)-y(i,t-1)] + [x(i,t) - x(i,t-1)] * g2(i) + e(i,t),
{p 4 4}where w2(i) = - o2(i) / phi(i) and w0(i) = - o0(i)/phi(i).
Equation (7) is estimated by including the levels of y and x as long run variables
using the {cmd:lr({varlist})} and {cmd:pooled({varlist})} options and
adding the first differences as independent variables.
{cmd:xtdcce2} estimates equation (7) but automatically calculates estimates for w(i) = (w0(i),...,wk(i)).
The advantage estimating equation (7) by OLS is that it is possible to use IV regressions and add cross sectional averages to account for dependencies between units.
The variance/covariance matrix is calculated using the delta method,
for a further discussion, see Ditzen (2018).{p_end}
{p 4}{help xtdcce2##example_pmg:See example}{p_end}
{p 4 4}The restriction of homogenous long run relationships can be relaxed.
This is then an unrestricted CS-ECM.
The results are equivalent to an CS-ARDL if only one lag is used.{p_end}
{marker csdl}{p 2}{ul: vii) Cross-Section Augmented Distributed Lag (CS-DL)}{p_end}
{p 4 4}Chudik et. al (2016) show that the long run effect of variable x on variable y in equation (1) can be directly estimated.
Therefore they fit the following model, based on equation (1):{p_end}
{col 10}(8){col 20}y(i,t) = w0(i) + x(i,t) * w2(i) + delta(i) * (x(i,t) - x(i,t-1)) + sum [d(i)*z(i,s)] + e(i,t)
{p 4 4}where w2(i) is the long effect and sum [d(i)*z(i,s)] the cross-sectional averages with
an appropriate number of lags.
To account for the lags of the dependent variable, the corresponding number of first differences are added.
If the model is an ARDL(1,1), then only the first difference of the explanatory variable is added.
In the case of an ARDL(1,2) model, the first and the second difference are added.
The advantage of the CS-DL approach is, that no short run coefficients need to be estimated. {break}
A general ARDL(py,px) model is estimated by:{p_end}
{col 10}(8){col 20}y(i,t) = w0(i) + x(i,t) * w2(i) + sum(l=1,px) delta(i,l) * (x(i,t-l) - x(i,t-l-1)) + sum [d(i)*z(i,s)] + e(i,t)
{p 4 4}The mean group coefficients are calculated as the unweighted averages of all cross-sectional specific coefficient estimates.
The variance/covariance matrix is estimated as in the case of a Mean Group Estimation.{p_end}
{p 4}{help xtdcce2##example_csdl:See example}{p_end}
{marker ardl}{p 2}{ul: viii) Cross-Section Augmented ARDL (CS-ARDL)}{p_end}
{p 4 4}As an alternative approach the long run coefficients can be estimated by first estimating the short run coefficients and then the long run coefficients.
For a general ARDL(py,px) model including cross-sectional averages such as:{p_end}
{col 10}(9){col 20}y(i,t) = b0(i) + sum(l=1,py) b1(i,l) y(i,t-l) + sum(l=0,px) b2(i,l) x(i,t-l) + sum [d(i)*z(i,s)] + e(i,t),
{p 4 4}the long run coefficients for the independent variables are calculated as:{p_end}
{marker eq_10}{col 10}(10){col 20}w2(i) = sum(l=0,px) b2(i,l) / ( 1 - sum(l=1,py) b1(i,l))
{p 4 4}and for the dependent variable as:{p_end}
{col 10}(11){col 20}w1(i) = 1 - sum(l=1,py) b1(i,l).
{p 4 4}This is the CS-ARDL estimator in Chudik et. al (2016).
The variables belonging to w(1,i) need to be enclosed in parenthesis, or {help tsvarlist} need to be used.
For example coding {cmd:lr(y x L.x)} is equivalent to {cmd:lr(y (x lx))}, where
lx is a variable containing the first lag of x (lx = L.x).{break}
The disadvantage of this approach is, that py and px need to be known.
The variance/covariance matrix is calculated using the delta method, see Ditzen (2019).{p_end}
{p 4}{help xtdcce2##example_ardl:See example}{p_end}
{p 2}{ul: ix) Regularized CCE}{p_end}{marker rcce}
{p 4 4}The CCE approach can involve a large number of cross-section averages which is larger than the number of factors and
can lead to a non-trivial bias for the pooled and mean group estimator, see Karabiyik et. al. (2017).
Juodis (2022) propose a solution for linear static panels which uses singular value decomposition to remove redundant singular values in the cross-section averages.
The so-called rCCE method involves the following steps:
{p_end}
{p 12 14}1. Calculate cross-sectional averages.{p_end}
{p 12 14}2. Estimate number of common factors using the ER or GR criterion from Ahn and Horenstein (2013).{p_end}
{p 12 14}3. Replace the cross-sectional averages with eigenvectors from the cross-section averages. The eigenvectors are the eigenvectors of the largest eigenvalues and the number is obtained in step 2.{p_end}
{p 4 4}The method requires bootstrapped standard errors, see {help xtdcce2##postestboot:bootstrapping}.{p_end}
{p 4}{help xtdcce2##example_rcce:See example}{p_end}
{p 2}{ul:Coefficient of Determination (R2)}{p_end}
{marker R2}
{p 4 4}{cmd:xtdcce2} calculates up to three different coefficients of determination (R2).
It calculates the standard un-adjusted R2 and the adjusted R2 as common in the literature.
If all coefficients are either pooled or heterogeneous, {cmd:xtdcce2} calculates
an adjusted R2 following Holly et. al (2010); Eq. 3.14 and 3.15.
The R2 and adjusted R2 are calculated even if the pooled or mean group adjusted
R2 is calculated.
However the pooled or mean group adjusted R2 is displayed instead of the adjusted R2
if calculated.{p_end}
{p 4 4}In the case of a pure homogeneous model, the adjusted R2 is calculated as:{p_end}
{col 10}R2(CCEP) = 1 - s(p)^2 / s^2
{p 4 4}where {it:s(p)^2} is the error variance estimator from the pooled regressions
and {it:s^2} the overall error variance estimator. They are defined as{p_end}
{col 10}s(p)^2 = sum(i=1,N) e(i)'e(i) / [N ( T - k - 2) - k],
{col 10}s^2 = 1/(N (T -1)) sum(i=1,N) sum(t=1,T) (y(i,t) - ybar(i) )^2.
{p 4 4}k is the number of regressors, {cmd:e(i)} is a vector of residuals
and {cmd:ybar(i)} is the cross sectional specific mean of the dependent variable.{p_end}
{p 4 4}For mean group regressions the adjusted R2 is the mean of the cross-sectional
individual R2 weighted by the overall error variance:{p_end}
{col 10}R2(CCEMG) = 1 - s(mg)^2 / s^2
{col 10}s(mg)^2 = 1/N sum(i=1,N) e(i)'e(i) / [T - 2k - 2].
{p 2}{ul: Rank Classifier}{marker rcc}
{p 4 4}A key condition for a consistent estimation in the presence of common factors
and strong cross-sectional dependence is the so called Rank Condition in Pesaran (2016) and
Chudik and Pesaran (2015).
The rank condition implies that the rank of the average factor loadings is is smaller than the number of common factors.
Karabiyik et al. (2017) show that if rank condition fails, the CCE estimator is inconsistent.
In a empirical setting this implies that 1) the unknown number of unobserved common factors has to be equal or larger
than the rank of the unobserved average factor loadings;
2) cross-section averages with a zero loading can pose a problem if the number of cross-section averages is small.
{p_end}
{p 4 4}DeVos et al. (2024) propose an classifier which indicates if the rank conditions holds:{p_end}
{col 10}RC = 1 − I (g = 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2)}{p_end}
{p 4 4}{cmd:xtdcce2} internally estimates equation (7) and then recalculates the long run coefficients, such that estimation results for equation (8) are obtained.
Equation (7) can be estimated adding {cmd:nodivide} to {cmd:lr_options()}.
A second option is {cmd:xtpmgnames} in order to match the naming convention from {help xtpmg}.{p_end}
{p 8}{stata xtdcce2 d.c d.pi d.y if year >= 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2) lr_options(nodivide)}{p_end}
{p 8}{stata xtdcce2 d.c d.pi d.y if year >= 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2) lr_options(xtpmgnames)}{p_end}
{marker example_csdl}{p 4}{ul: Cross-Section Augmented Distributed Lag (CS-DL)}{p_end}
{p 4 4}Chudik et. al (2013) estimate the long run effects of public debt on output growth
(the data is available {browse www.econ.cam.ac.uk/people-files/faculty/km418/CMPR_Data.zip:here}
on {browse www.econ.cam.ac.uk/people-files/faculty/km418/research.html:Kamiar Mohaddes' personal webpage}).
In the dataset, the dependent variable is {it:d.y} and the independent variables are the inflation rate (dp) and debt to GDP ratio (d.gd).
For an ARDL(1,1,1) only the first difference of dp and d.gd are added as further covariates.
Only a contemporaneous lag of the cross-sectional averages (i.e. cr_lags(0)) of the dependent variable and
3 lags of the independent variables are added.
The lag structure is implemented by defining a {it:numlist} rather than a number in {cmd:cr_lags()}.
For the example here {cmd:cr_lags(0 3 3)} is used, where the first
number refers to the first variable defined in {cmd:cr()}, the second to the second etc.{p_end}
{p 4 4}To replicate the results in Table 18, the following command line is used:{p_end}
{p 8}{stata xtdcce2 d.y dp d.gd d.(dp d.gd), cr(d.y dp d.gd) cr_lags(0 3 3) fullsample}{p_end}
{p 4 4}For an ARDL(1,3,3) model the first and second lag are of the first differences are added by putting {cmd:L(0/2)} in front of the {cmd:d.(dp d.gd)}:{p_end}
{p 8}{stata xtdcce2 d.y dp d.gd L(0/2).d.(dp d.gd), cr(d.y dp d.gd) cr_lags(0 3 3) fullsample}{p_end}
{p 4 4}Note, the {cmd:fullsample} option is used to reproduce the results in Chudik et. al (2013).{p_end}
{marker example_ardl}{p 4}{ul: Cross-Section Augmented ARDL (CS-ARDL)}{p_end}
{p 4 4}Chudik et. al (2013) estimate besides the CS-DL model a CS-ARDL model.
To estimate this model all variables are treated as long run coefficients and thus added to {varlist} in {cmd:lr({varlist})}.
{cmd:xtdcce2} first estimates the short run coefficients and then calculates the long run coefficients,
following {help xtdcce2##eq_10:Equation 10}.
The option {cmd:lr_options(ardl)} is used to invoke the estimation of the long run coefficients.
Variables with the same base (i.e. forming the same long run coefficient) need to be either
enclosed in parenthesis or {help tsvarlist} operators need to be used.
In Table 17 an ARDL(1,1,1) model is estimated with three lags of the cross-sectional averages:{p_end}
{p 8}{stata xtdcce2 d.y , lr(L.d.y dp L.dp d.gd L.d.gd) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample} {p_end}
{p 4 4}{cmd:xtdcce2} calculates the long run effects identifying the variables by their base.
For example it recognizes that {it:dp} and {it:L.dp} relate to the same variable.
If the lag of {it:dp} is called {it:ldp}, then the variables need to be enclosed in parenthesis.{p_end}
{p 4 4}Estimating the same model but as an ARDL(3,3,3) and with enclosed parenthesis reads:{p_end}
{p 8}{stata xtdcce2 d.y , lr((L(1/3).d.y) (L(0/3).dp) (L(0/3).d.gd) ) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample}{p_end}
{p 4 4}which is equivalent to coding without parenthesis:{p_end}
{p 8}{stata xtdcce2 d.y , lr(L(1/3).d.y L(0/3).dp L(0/3).d.gd) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample}{p_end}
{marker example_rcce}{p 4}{ul: Regularized CCE and bootstrapping}{p_end}
{p 4 4}The regularized CCE approach is only possible for static models.
To estimate a static model of growth on human, physical capital and population growth, we can use:{p_end}
{p 8}{stata xtdcce2 log_rgdpo log_hc log_ck log_ngd , cr(log_rgdpo log_hc log_ck log_ngd, rcce)}{p_end}
{p 4 4}{cmd:xtdcce2} selects the first and second eigenvector of the cross-section averages and adds it as a variable.
The selection criterion is the ER criterion from Ahn and Horenstein (2013).
To use the GR criterion instead, the option {cmd:criterion(gr)} is used:{p_end}
{p 8}{stata xtdcce2 log_rgdpo log_hc log_ck log_ngd , cr(log_rgdpo log_hc log_ck log_ngd, rcce(criterion(gr)))}{p_end}
{p 4 4}Three regularized cross-section averages are added.
To ensure standard errors are correct, a bootstrap is run with a fixed seed:{p_end}
{p 8}{stata estat bootstrap, seed(123)}{p_end}
{p 4 4}To run a wild bootstrap and bootstrap confidence intervals, the options {cmd:wild} and {cmd:percentile} are added:{p_end}
{p 8}{stata estat bootstrap, seed(123) wild percentile}{p_end}
{p 4 4}Instead of specifying the criteria to estimate the number of eigenvectors of the rcce approach, we can hard set it using the option {cmd:npc()}:{p_end}
{p 8}{stata xtdcce2 log_rgdpo log_hc log_ck log_ngd , cr(log_rgdpo log_hc log_ck log_ngd, rcce(npc(3)))}{p_end}
{marker example_rcc}{p 4}{ul: Rank Condition Classifier}{p_end}
{p 4 4}The rank condition is key for a consistent estimation using the CCE estimator.
Rank condition implies that - loosely speaking - the estimated rank of the matrix of cross-sectional averages of the data has to be larger or equal to the rank of the factors.
To calculate the classifier from DeVos et al. (2024), option {cmd:rcclassifier} is added to the static model from the first
Example:
{p_end}
{p 8}{stata xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , cr(_all, rccl) reportc}{p_end}
{p 4 4}The estimated rank of the matrix of cross-sectional averages of the data is 3 and the rank of the factors is 1, thus the rank condition holds.{break}
Using the ER criterion to estimate the number of common factors in the cross-section averages and use fold-over matrix based on
random normal values to shrink dimension:{p_end}
{p 8}{stata xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , cr(_all, rccl(er random))}{p_end}
{p 4 4}The estimated rank of the averages factor loadings reduces to 2, but is still arger than the number of factors. Rank condition holds.{p_end}
{marker example_ic}{p 4}{ul: Information Criteria}{p_end}
{p 4 4}The Information Criteria from Margaritella and Westerlund (2023) can be used to identify the optimal set of cross-section averages in static panels.
We return to the model from the first Example:{p_end}
{p 8}{stata xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , cr(_all) reportc}{p_end}
{p 4 4}Obtain IC1 and IC2 for the current set of cross-section averages defined in {cmd:cr(_all)}:{p_end}
{p 8}{stata estat ic}{p_end}
{p 4 4}Next, calculate IC for all possible combination of cross-section and indicate the lowest ones:{p_end}
{p 8}{stata estat ic, sequential}{p_end}
{p 4 4}IC for three different sets of cross-section averages indicated by {cmd:model(}{it:(model1) (model2) ... (modelK)}{cmd:)}, where
{it:model1} is the reference model with the largest set of cross-section averages:{p_end}
{p 8}{stata estat ic, model( (d.log_rgdpo log_hc log_ck log_ngd) (log_hc log_ck log_ngd) (log_hc log_ck ) )}{p_end}
{p 4 4}Get IC for a single model of cross-section averages using option {cmd:single}.
Compare to output when option {cmd:single} not used, then all combinations are tried.{p_end}
{p 8}{stata estat ic, model(log_hc log_ck ) single}{p_end}
{p 8}{stata estat ic, model(log_hc log_ck log_ngd)}{p_end}
{marker references}{title:References}
{p 4 8}Ahn, S. C., & Horenstein, A. R. 2013.
Eigenvalue ratio test for the number of factors.
Econometrica, 81(3), 1203–1227.{p_end}
{p 4 8}Baum, C. F., M. E. Schaffer, and S. Stillman 2007.
Enhanced routines for instrumental variables/generalized method of moments estimation and testing.
Stata Journal 7(4): 465-506{p_end}
{p 4 8}Blackburne, E. F., and M. W. Frank. 2007.
Estimation of nonstationary heterogeneous panels.
Stata Journal 7(2): 197-208.{p_end}
{p 4 8}Chudik, A., K. Mohaddes, M. H. Pesaran, and M. Raissi. 2013.
Debt, Inflation and Growth: Robust Estimation of Long-Run Effects in Dynamic Panel Data Model.{p_end}
{p 4 8}Chudik, A., and M. H. Pesaran. 2015.
Common correlated effects estimation of heterogeneous dynamic panel data models with weakly exogenous regressors.
Journal of Econometrics 188(2): 393-420.{p_end}
{p 4 8}Chudik, A., K. Mohaddes, M. H. Pesaran, and M. Raissi. 2016.
Long-Run Effects in Large Heterogeneous Panel Data Models with Cross-Sectionally Correlated Errors
Essays in Honor of Aman Ullah. 85-135.{p_end}
{p 4 8}De Vos, I., G. Everaert and V. Sarafidis. 2024.
A method to evaluate the rank condition for CCE estimators.
Econometric Reviews 43(2-4).{p_end}
{p 4 8}Ditzen, J. 2018.
Estimating Dynamic Common Correlated Effcts in Stata.
The Stata Journal, 18:3, 585 - 617.{p_end}
{p 4 8}Ditzen, J. 2021.
Estimating long run effects and the exponent of cross-sectional dependence: an update to xtdcce2.
The Stata Journal 21:3.{p_end}
{p 4 8}Eberhardt, M. 2012.
Estimating panel time series models with heterogeneous slopes.
Stata Journal 12(1): 61-71.{p_end}
{p 4 8}Holly, S., Pesaran, M. H., Yamagata, T. 2010.
A spatio-temporal model of house prices in the USA.
Journal of Econometrics 158: 160 - 172.{p_end}
{p 4 8}Feenstra, R. C., R. Inklaar, and M. Timmer. 2015.
The Next Generation of the Penn World Table. American Economic Review. www.ggdc.net/pwt{p_end}
{p 4 8}Goncalves, S., & Perron, B. 2014.
Bootstrapping factor-augmented regression models.
Journal of Econometrics, 182(1), 156–173.{p_end}
{p 4 8}Jann, B. 2005.
moremata: Stata module (Mata) to provide various functions.
Available from http://ideas.repec.org/c/boc/bocode/s455001.html.{p_end}
{p 4 8}Juodis, A. 2022.
A regularization approach to common correlated effects estimation.
Journal of Applied Econometrics, 37(4), 788– 810.{p_end}
{p 4 8}Karabıyık, H., Reese, S., & Westerlund, J. 2017.
On the role of the rank condition in cce estimation of factor-augmented panel regressions.
Journal of Econometrics, 197(1), 60–64.{p_end}
{p 4 8}Margaritella, L., and J. Westerlund. 2023. Using Information Criteria to
Select Averages in CCE. The Econometrics Journal. 26(3): 405-421.{p_end}
{p 4 8}Pesaran, M. H. 2006.
Estimation and inference in large heterogeneous panels with a multifactor error structure.
Econometrica 74(4): 967-1012.{p_end}
{p 4 8}Pesaran, M. H., and R. Smith. 1995.
Econometrics Estimating long-run relationships from dynamic heterogeneous panels.
Journal of Econometrics 68: 79-113.{p_end}
{p 4 8}Roodman, D., Nielsen, M. Ø., MacKinnon, J. G., & Webb, M. D. 2019.
Fast and wild: Bootstrap inference in Stata using boottest.
The Stata Journal, 19(1), 4–60.{p_end}
{p 4 8}Shin, Y., M. H. Pesaran, and R. P. Smith. 1999.
Pooled Mean Group Estimation of Dynamic Heterogeneous Panels.
Journal of the American Statistical Association 94(446): 621-634.{p_end}
{p 4 8}Westerlund, J., Petrova, Y., Norkute, M. 2019. CCE in fixed-T panels.
Journal of Applied Econometrics: 1-6.{p_end}
{marker about}{title:Author}
{p 4}Jan Ditzen (Free University of Bozen-Bolzano){p_end}
{p 4}Email: {browse "mailto:jan.ditzen@unibz.it":jan.ditzen@unibz.it}{p_end}
{p 4}Web: {browse "www.jan.ditzen.net":www.jan.ditzen.net}{p_end}
{p 4 8}I am grateful to Arnab Bhattacharjee, David M. Drukker, Markus Eberhardt, Tullio Gregori,
Sebastian Kripfganz,
Erich Gundlach,
Achim Ahrens,
Kyle McNabb,
Vasilis Sarafidis,
Sean Holly and Mark Schaffer, to the participants of the
2016 Stata Users Group meeting in London, 2018 Stata User Group meeting in Zuerich,
and two anonymous referees of The Stata Journal for many valuable comments and suggestions.
All remaining errors are my own.{p_end}
{p 4}The routine to check for positive definite or singular matrices was provided by Mark Schaffer, Heriot-Watt University, Edinburgh, UK.{p_end}
{p 4 4}{cmd:xtdcce2} was formally called {cmd:xtdcce}.{p_end}
{p 4 8}Please cite as follows:{break}
Ditzen, J. 2018. xtdcce2: Estimating dynamic common correlated effects in Stata. The Stata Journal. 18:3, 585 - 617.
{p_end}
{p 4 8}The latest versions can be obtained via {stata "net from https://github.com/JanDitzen/xtdcce2"}.{p_end}
{marker ChangLog}{title:Version History}
{p 4 8}Version 4.6 to 4.7 - June 2024{p_end}
{p 8 10} - bug fixes when using predict with xtdcce2fast.{p_end}
{p 8 10} - factor variables for option cr().{p_end}
{p 8 10} - added rank classifier and information criteria to select CSA.{p_end}
{p 4 8}Version 4.5 to 4.6 - January 2024{p_end}
{p 8 10} - bug fix in rcce option when using unbalanced panels{p_end}
{p 8 10} - bug fix when combining absorb() with csa (thanks to Vasilis Sarafidis for pointing it out){p_end}
{p 4 8}Version 4.4 to 4.5 - May 2023{p_end}
{p 8 10} - supports new reghdfe version{p_end}
{p 4 8}Version 4.3 to 4.4 - May 2023{p_end}
{p 8 10} - fixed bug in trend option{p_end}
{p 4 8}Version 4.2 to 4.3 - May 2023{p_end}
{p 8 10} - fixed bug in var/cov estimation when using pooled coefficients and R matrix is zero{p_end}
{p 4 8}Version 4.1 to 4.2 - April 2023{p_end}
{p 8 10} - fixed bug in xtcd2 and xtcse2{p_end}
{p 4 8}Version 4. to 4.1 - March 2023{p_end}
{p 8 10} - fixed bug when using different lag lengths for CSA{p_end}
{p 4 8}Version 3.01 to 4 - Feb 2023{p_end}
{p 8 10} - bootstrap support{p_end}
{p 8 10} - added option {cmd:mgmissing}{p_end}
{p 8 10} - added option rcce{p_end}
{p 8 10} - added option {cmd:fast2}{p_end}
{p 8 10} - fixed error when pooled and ardl used.{p_end}
{p 4 8}Version 3.0 to 3.01{p_end}
{p 8 10} - error if abbreviation is cr() used fixed.{p_end}
{p 4 8}Version 2.0 to 3.0{p_end}
{p 8 10} - improved support for factor variables.{p_end}
{p 8 10} - fix for mm_which2.{p_end}
{p 8 10} - message for large panels.{p_end}
{p 8 10} - error in calculation for variances of cross-sectional unit specific coefficients{p_end}
{p 8 10} - fix predict program: partial now only in-sample and bug fixed when xb2 and reportc was used (thanks to Tullio Gregori for the pointers).{p_end}
{p 8 10} - added global and local cross-sectional averages{p_end}
{p 4 8}Version 1.34 to 2.0{p_end}
{p 8 10} - Bug fix in calculation of minimal T dimension, added option nodimcheck.{p_end}
{p 8 10} - Speed improvements (thanks to Achim Ahrens for the suggestions).{p_end}
{p 8 10} - Bug fixes for jackknife (thanks to Collin Rabe for the pointer).{p_end}
{p 8 10} - Bug fix in predict and if (thanks for Deniey A. Purwanto and Tullio Gregori for the pointers).{p_end}
{p 8 10} - Bug fix if binary variable used and constant partialled out.{p_end}
{p 8 10} - Bug fixed in calculation of R2, added adjusted R2 for pooled and MG regressions.{p_end}
{p 8 10} - Newey West and Westerlund, Petrova, Norkute standard errors for pooled regressions.{p_end}
{p 8 10} - invsym for rank deficient matrices.{p_end}
{p 8 10} - Added {cmd:xtcse2} support.{p_end}
{p 4 8}Version 1.33 to Version 1.34{p_end}
{p 8 10} - small bug fixes in code and help file.{p_end}
{p 4 8}Version 1.32 to Version 1.33{p_end}
{p 8 10} - bug in if statements fixed.{p_end}
{p 8 10} - noomitted added, bug in cr(_all) fixed.{p_end}
{p 8 10} - added option "replace" and "cfresiduals" to predict.{p_end}
{p 8 10} - CS-DL and CS-ARDL method added.{p_end}
{p 8 10} - Output as in Stata Journal Version.{p_end}
{p 4 8}Version 1.31 to Version 1.32{p_end}
{p 8 10} - bug number of groups fixed{p_end}
{p 8 10} - predict, residual produced different results within xtdcce2 and after if panel unbalanced or trend used (thanks to Tullio Gregori for the pointer).{p_end}
{p 8 10} - check for rank condition.{p_end}
{p 8 10} - several bugs fixed.{p_end}
{p 4 8}Version 1.2 to Version 1.31{p_end}
{p 8 10} - code for regression in Mata{p_end}
{p 8 10} - corrected standard errors for pooled coefficients, option cluster not necessary any longer. Please rerun estimations if used option pooled(){p_end}
{p 8 10} - Fixed errors in unbalanced panel{p_end}
{p 8 10} - option post_full removed, individual estimates are posted in e(bi) and e(Vi){p_end}
{p 8 10} - added option ivslow.{p_end}
{p 8 10} - legacy control for endogenous_var(), exogenous_var() and residuals().{p_end}
{title:Also see}
{p 4 4}See also: {help xtdcce2fast}, {help xtcd2}, {help xtcse2}, {help ivreg2}, {help xtmg}, {help xtpmg}, {help moremata}{p_end}