help xtcd -------------------------------------------------------------------------------


xtcd -- Investigating Variable/Residual Cross-Section Dependence


xtcd varlist [if] [in] [, resid off]


xtcd implements the Pesaran (2004) CD test for cross-section dependence in panel time-series data. The routine performs the same CD test as the xtcsd varname, pesaran command by De Hoyos and Sarafidis (2006) but (i) allows for multiple variable series to be tested at the same time and (ii) is not a post-estimation command: xtcd can be applied to variable series (e.g. pre-estimation analysis of cross-section dependence in the data) as well as to residuals, provided these have been previously computed as a separate variable series (see example below).


Cross-section dependence in macro panel data has received a lot of attention in the emerging panel time series literature over the past decade (for an introduction to panel time series see Eberhardt, 2009). This type of correlation may arise from globally common shocks with heterogeneous impact across countries, such as the oil crises in the 1970s or the global financial crisis from 2007 onwards. Alternatively it can be the result of local spillover effects between countries or regions. For a detailed discussion of the topic within cross-country empirics see Eberhardt and Teal (2011). For a survey and application of existing cross-section dependence tests refer to Moscone and Tosetti (2009).

Empirical Implementation

The Pesaran CD test uses the pairwise correlation coefficients between the time series of the panel members. In the example dataset with N=128 countries, for instance, these are the N(N-1)/2 = 8,128 distinct correlations between country i and country j, for i=1 to N-1 and j=i+1 to N. Denoting the estimated correlation coefficient between the time series for countries i and j as rho*_ij, the Pesaran CD statistic is computed as

CD = sqrt[2/(N(N-1))] * [SUM_(i=1 to N-1) SUM_(j=i+1 to N) sqrt(T_ij) rho*_ij]

where T_ij is the number of observations over which the correlation coefficient between i and j was computed. Since macro panel data is frequently unbalanced, we present only this general form of the equation, which is valid for both balanced and unbalanced panels. Under the null hypothesis of cross-section independence the statistic is distributed standard normal for T_ij>3 and N sufficiently large. The test is robust to nonstationarity (the spuriousness would show up in the averaging), parameter heterogeneity and structural breaks, and was shown to perform well even in small samples.
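The computation described above can be sketched outside Stata as follows. This is a minimal illustration in Python, not the code used by xtcd: the function names, the dictionary layout of the panel, and the min_obs cut-off (chosen here to mirror the T_ij>3 condition) are assumptions made for the example, as is the use of a standard Pearson correlation over the overlapping periods.

```python
import math
from itertools import combinations


def pairwise_corr(x, y):
    """Pearson correlation over the periods observed in both series.

    x and y map period -> value. Returns (rho, T_ij); rho is None when the
    correlation is undefined (fewer than 2 common periods or zero variance).
    """
    common = set(x) & set(y)
    t_ij = len(common)
    if t_ij < 2:
        return None, t_ij
    mean_x = sum(x[p] for p in common) / t_ij
    mean_y = sum(y[p] for p in common) / t_ij
    sxx = sum((x[p] - mean_x) ** 2 for p in common)
    syy = sum((y[p] - mean_y) ** 2 for p in common)
    sxy = sum((x[p] - mean_x) * (y[p] - mean_y) for p in common)
    if sxx == 0.0 or syy == 0.0:
        return None, t_ij
    return sxy / math.sqrt(sxx * syy), t_ij


def cd_statistic(panel, min_obs=4):
    """CD = sqrt(2/(N(N-1))) * sum over i<j of sqrt(T_ij) * rho_ij.

    panel maps unit -> {period: value}. Pairs with fewer than min_obs common
    periods are skipped (an assumption of this sketch). Returns the statistic
    and its two-sided p-value under the standard normal null distribution.
    """
    units = sorted(panel)
    n = len(units)
    total = 0.0
    for i, j in combinations(units, 2):
        rho, t_ij = pairwise_corr(panel[i], panel[j])
        if rho is None or t_ij < min_obs:
            continue
        total += math.sqrt(t_ij) * rho
    cd = math.sqrt(2.0 / (n * (n - 1))) * total
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))  # P(|Z| > |CD|)
    return cd, p_value


if __name__ == "__main__":
    # Hypothetical unbalanced toy panel: unit C is observed for fewer years.
    toy = {
        "A": {1961: 0.2, 1962: 0.5, 1963: 0.1, 1964: 0.9, 1965: 0.4},
        "B": {1961: 0.3, 1962: 0.6, 1963: 0.0, 1964: 0.8, 1965: 0.5},
        "C": {1962: 1.1, 1963: 0.7, 1964: 1.6, 1965: 1.0},
    }
    print(cd_statistic(toy))
```

Each pair contributes its own T_ij, which is how the single formula covers both balanced and unbalanced panels.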


resid identifies the data series tested as residuals. This triggers a small transformation of the series to allow for unbalancedness in the panel. Imagine a panel in which only 20 time-series observations are available for one group, whereas 40 are available for another. If these are residual series from group-specific regressions, as is often the case in panel time-series empirics, then each residual series averages to zero (or close to zero) over its own sample, i.e. over T_1=20 and T_2=40 observations respectively. Now imagine that the two series overlap for only T_12=10 years, because the first sample starts earlier and the second extends further towards the present. Over this T_12 horizon the residuals of the two samples need not average to zero. To avoid the resulting distortions, we first compute the deviations of each residual series from its time-series mean over the T_12 horizon and then compute the correlation coefficient. Note: this option only makes a difference if the residuals are from a heterogeneous panel model (see example). The computations presently take somewhat longer than for the standard approach.
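The T_1=20, T_2=40, T_12=10 case above can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the xtcd code: the function name overlap_corr and the two artificial residual series are hypothetical, and the uncentred coefficient (demean=False) is shown only as a contrast, not as a description of what xtcd computes without resid.

```python
import math


def overlap_corr(res_i, res_j, demean=True):
    """Correlation of two residual series over their common periods.

    res_i and res_j map period -> residual. With demean=True each series is
    first expressed as a deviation from its mean over the overlap, as the
    text describes; with demean=False the residuals are used as they are,
    i.e. treated as if they already averaged to zero over the overlap.
    Returns (rho, T_12).
    """
    common = sorted(set(res_i) & set(res_j))
    t_12 = len(common)
    a = [res_i[p] for p in common]
    b = [res_j[p] for p in common]
    if demean:
        mean_a = sum(a) / t_12
        mean_b = sum(b) / t_12
        a = [v - mean_a for v in a]
        b = [v - mean_b for v in b]
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return num / den, t_12


# Hypothetical residuals: each series averages to zero over its own sample,
# but not over the ten overlapping years 1971-1980.
res_1 = {year: year - 1970.5 for year in range(1961, 1981)}     # T_1 = 20
res_2 = {year: -(year - 1990.5) for year in range(1971, 2011)}  # T_2 = 40

rho_centred, t_12 = overlap_corr(res_1, res_2, demean=True)
rho_raw, _ = overlap_corr(res_1, res_2, demean=False)

if __name__ == "__main__":
    print(t_12, rho_centred, rho_raw)
```

In this constructed case the two series move in opposite directions over the overlap, so the centred coefficient is negative, while the uncentred one comes out positive only because both series sit above zero in those years.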

off turns off the output table.

Return values

Scalars
    r(N_g)          Number of panel members

Matrices
    r(nobs)         Total number of obs used in the correlations (N x (N-1) x T_ij)
    r(avgcorr)      Averaged correlation coefficient
    r(abscorr)      Averaged absolute correlation coefficient
    r(pesaran)      Pesaran CD-statistic
    r(numb_coeff)   Number of correlations computed
    r(avg_obs)      Average number of observations for each correlation

Macros
    r(varname)      Name(s) of variable or residual series tested


Download FAO production data for the agriculture sector in 128 countries (1961-2002, unbalanced). See Eberhardt and Teal (2010) for more details on data construction and deflation.

Variables used in illustration: ly log value-added per worker, ltr log tractors per worker, llive log livestock per worker, lf log fertilizer per worker, ln log land per worker (all with reference to the agricultural sector). Note that the dataset is very large, such that it would be advisable to increase the memory and matsize before loading the data.

    . clear
    . set mem 100m
    . set matsize 8000

Once the dataset is loaded into the program, set the panel dimensions: time variable - year, country identifier - clist2

    . tsset clist2 year

Investigate cross-section dependence in log agricultural value-added per worker

    . xtcd ly

Investigate cross-section dependence in all production function variables

    . xtcd ly ltr llive lf ln

Compute the residuals from an OLS production function with time fixed effects and test the residuals for cross-section dependence

    . xi: reg ly ltr llive lf ln i.year
    . predict ols_res if e(sample), res
    . xtcd ols_res, resid

Compute the residuals from a heterogeneous parameter production function using the Pesaran & Smith (1995) Mean Group estimator (xtmg if installed) with a country-specific linear trend, then test the residuals for cross-section dependence

    . xtmg ly ltr llive lf ln, trend robust res(mg_res)
    . xtcd mg_res, resid

Compute the residuals from a heterogeneous parameter production function using the Pesaran (2006) CCE Mean Group estimator (xtmg if installed), then test the residuals

    . xtmg ly ltr llive lf ln, cce robust res(cce_res)
    . xtcd cce_res, resid


Eberhardt, Markus (2009) 'Nonstationary Panel Econometrics and Common Factor Models: An Introductory Reader', unpublished mimeo, available from here.

Eberhardt, Markus and Francis Teal (2011) 'Econometrics for Grumblers: A New Look at the Literature on Cross-Country Growth Empirics', Journal of Economic Surveys, Vol. 25(1), pp.109-155.

Eberhardt, Markus and Francis Teal (2010) 'Mangos in the Tundra? Spatial Heterogeneity in Agricultural Productivity Analysis', Centre for the Study of African Economies, University of Oxford, unpublished working paper, available here.

Moscone, Francesco and Elisa Tosetti (2009) 'A Review And Comparison Of Tests Of Cross-Section Independence In Panels', Journal of Economic Surveys, Vol. 23(3), pp.528-561.

Pesaran, M. Hashem (2004) 'General Diagnostic Tests for Cross Section Dependence in Panels', IZA Discussion Paper No. 1240.

Pesaran, M. Hashem (2006) 'Estimation and inference in large heterogeneous panels with a multifactor error structure', Econometrica, Vol. 74(4), pp.967-1012.

Acknowledgements and Disclaimer

This routine builds to a very large extent on the existing code for the Pesaran (2004) CD test (xtcsd) by De Hoyos and Sarafidis (2006). Users should refer to their help file for more details and acknowledge these authors. Any errors are of course my own.


Markus Eberhardt
Centre for the Study of African Economies
Department of Economics
University of Oxford
Manor Road, Oxford OX1 3UQ

Also see

Online: help for xtcsd (if installed), xtmg (if installed)