-------------------------------------------------------------------------------
help for apc_ie
-------------------------------------------------------------------------------

Intrinsic estimator for age-period-cohort effects in generalized linear models

Syntax

apc_ie depvar [indepvars] [if exp] [in range] [weight] , [age(varname) period(varname) cohort(varname) generate(newvarname) glm_options ]

Replay syntax

apc_ie [, noheader level(cilevel) eform ]

apc_ie allows all varlists and weights that are allowed by glm.

Description

apc_ie estimates generalized linear models with age, period and cohort effects using the intrinsic estimator (IE) described by Yang, Fu and Land (2004). The structure of the design matrix -- i.e., the numbers of age groups and time periods -- may affect the estimates obtained from conventional CGLIM estimators. (See apc_cglim.) The IE employs a special principal components regression that removes the influence of the null (column) space of the design matrix on the estimator.

Methods

The IE estimates a constrained parameter vector that corresponds to the projection of the model parameter on the non-null (column) subspace of the design matrix. apc_ie computes this constrained parameter vector by a special principal components regression.
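
The projection idea can be illustrated outside Stata with a small Python sketch. It is not the apc_ie source code. It assumes a complete 3-by-3 age-by-period table, zero-sum (effect) coding with the last category of each set omitted, and an arbitrary made-up coefficient vector b. All names (effect_row, centered, project, b0) are hypothetical. The sketch checks that a vector of linear trends in age, period and cohort lies in the null space of the design matrix, and that removing the component along that vector gives the same result for any two parameter vectors with identical fitted values.

```python
# Illustrative sketch only; not the apc_ie implementation.
A, P = 3, 3            # assumed numbers of age groups and periods
C = A + P - 1          # cohorts implied by cohort = period - age

def effect_row(level, n_levels):
    """Zero-sum (effect) coding: the last level is coded -1 in every column."""
    if level == n_levels - 1:
        return [-1.0] * (n_levels - 1)
    return [1.0 if level == c else 0.0 for c in range(n_levels - 1)]

# One row per age-by-period cell: intercept, age, period, cohort columns.
X = []
for a in range(A):
    for p in range(P):
        k = p - a + (A - 1)          # cohort index, 0..C-1
        X.append([1.0] + effect_row(a, A) + effect_row(p, P) + effect_row(k, C))

def centered(n):
    """Levels 0..n-1 minus their mean, dropping the last (omitted) level."""
    mean = (n - 1) / 2.0
    return [i - mean for i in range(n - 1)]

# Candidate null vector: linear trends in age, period (negated) and cohort.
b0 = [0.0] + centered(A) + [-v for v in centered(P)] + centered(C)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(b):
    """Remove the component of b that lies along b0."""
    scale = dot(b, b0) / dot(b0, b0)
    return [x - scale * z for x, z in zip(b, b0)]

# Largest absolute entry of X*b0; zero means b0 is in the null space of X.
null_check = max(abs(dot(row, b0)) for row in X)

b = [2.0, 0.5, -0.1, 0.3, 0.2, 0.4, 0.1, -0.3, 0.0]   # made-up solution
b_shifted = [x + 1.7 * z for x, z in zip(b, b0)]       # same fitted values
b_ie = project(b)
b_ie_shifted = project(b_shifted)                      # agrees with b_ie
```

Because b and b_shifted differ only by a multiple of b0, they cannot be told apart from the data; the projection picks the one representative that has no component in the null space.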

The estimation algorithm first applies an orthonormal matrix transformation to the X'X matrix of the linear model (or its equivalent in a generalized linear model). This transformation produces the nonzero eigenvalues of the matrix and their corresponding eigenvectors. The principal components regression is then estimated using these eigenvectors as variables. After the principal components regression has been estimated, the orthonormal transformation is applied in reverse to transform the estimated regression coefficients back to the original age, period and cohort effects for ease of interpretation.

Instead of omitting one reference category from each set of indicator variables, the IE uses the constraint that the coefficients in each set sum to zero. The algorithm in apc_ie adds an indicator variable for each unique value of age, period and cohort to the list of independent variables, but omits one category for each of age, period and cohort for computational purposes. Because age+cohort=period, one additional indicator variable is redundant. Writing A, P and C for the numbers of unique values of age, period and cohort, apc_ie therefore replaces the A-1+P-1+C-1 indicator variables with the A-1+P-1+C-2 principal components that correspond to nonzero eigenvalues. After estimating the principal components regression, the IE uses the zero-sum constraints to obtain estimates for the omitted age, period and cohort categories.
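
The counting and the zero-sum step can be sketched in Python as follows. This is an illustration, not the apc_ie code. It assumes a complete age-by-period table with equally spaced groups, so that the number of cohorts is C=A+P-1. The function names are hypothetical, and placing the omitted category last is an arbitrary choice made for the example.

```python
# Illustrative sketch only; not the apc_ie implementation.

def component_counts(A, P):
    """Return (C, number of indicators, number of retained components).

    Assumes a complete age-by-period table, so C = A + P - 1.
    """
    C = A + P - 1
    indicators = (A - 1) + (P - 1) + (C - 1)   # one category omitted per set
    components = (A - 1) + (P - 1) + (C - 2)   # one zero eigenvalue dropped
    return C, indicators, components

def restore_omitted(estimated):
    """Append the omitted category so that the full set sums to zero."""
    return estimated + [-sum(estimated)]

counts = component_counts(3, 3)             # 3 age groups, 3 periods
full_age = restore_omitted([0.25, -0.75])   # made-up age coefficients
```

With 3 age groups and 3 periods there are 5 cohorts, 8 indicator variables and 7 retained principal components; the third age coefficient is 0.5, which makes the set sum to zero.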

Because the IE -- including the coefficients obtained from zero-sum constraints -- is a linear transformation of the coefficients in the principal components model, the variance-covariance matrix of the IE is V_ie=B*V_pc*B', where B is a transformation matrix and V_pc is the variance-covariance matrix of the principal components estimator. See glm for options for computing variance-covariance matrices in generalized linear models; apc_ie simply applies a transformation to whatever variance-covariance matrix you choose to compute.
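
As a minimal numerical illustration of V_ie=B*V_pc*B', the Python sketch below applies the formula to a single set of three categories. It is deliberately simplified: here B only appends the zero-sum coefficient, whereas the B used by apc_ie also reverses the eigenvector transformation. The matrix V_pc is made up, and the helper names are hypothetical.

```python
# Illustrative sketch only; not the apc_ie implementation.

def matmul(M, N):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Maps two free coefficients to three zero-sum coefficients:
# the third is minus the sum of the first two.
B = [[1.0, 0.0],
     [0.0, 1.0],
     [-1.0, -1.0]]

# Made-up variance-covariance matrix of the two free coefficients.
V_pc = [[0.04, 0.01],
        [0.01, 0.09]]

V_ie = matmul(matmul(B, V_pc), transpose(B))
```

The variance of the restored coefficient is 0.04+0.09+2*0.01=0.15, and each row of V_ie sums to zero because the three coefficients are constrained to sum to zero.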

The IE is always defined based on principal components of a design matrix that has one observation per age-by-period cell. Missing data or multiple observations per cell do not affect which principal components are used in calculating the IE. Likewise, the principal components do not depend on indepvars or weights, if any. However, if any of the indepvars are collinear with age, period or cohort, or if too many cells in the age-by-period matrix have no data, eliminating only one principal component does not resolve the identification problem. apc_ie returns an error when this happens.

Options

age(varname), period(varname) and cohort(varname) specify the age, period and cohort variables. At least two of these three must be specified. If all three are specified, they must satisfy age+cohort=period. If only two are specified, the missing variable is generated according to age+cohort=period.
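
The rule for these three options can be written out as a short Python sketch. It is an illustration of the identity age+cohort=period applied to a single observation, not the apc_ie code; the function name complete_apc and the error messages are hypothetical.

```python
# Illustrative sketch only; not the apc_ie implementation.

def complete_apc(age=None, period=None, cohort=None):
    """Fill in or check age, period and cohort using age + cohort = period."""
    n_given = sum(v is not None for v in (age, period, cohort))
    if n_given < 2:
        raise ValueError("specify at least two of age, period and cohort")
    if age is None:
        age = period - cohort
    elif period is None:
        period = age + cohort
    elif cohort is None:
        cohort = period - age
    elif age + cohort != period:
        raise ValueError("age + cohort must equal period")
    return age, period, cohort

filled = complete_apc(age=40, period=2000)   # cohort is generated
```

For an observation aged 40 in period 2000, the generated cohort is 1960.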

generate(newvarname) stores the age, period or cohort variable that apc_ie generates when only two of the three are specified. The values are saved in the new variable newvarname.

glm_options can be any valid options for glm.

References

Yang, Y., Fu, W., and Land, K. 2004. A Methodological Comparison of Age-Period-Cohort Models: The Intrinsic Estimator and Conventional Generalized Linear Models. Sociological Methodology 34(1), 75-110.

Authors

Sam Schulhofer-Wohl
Department of Economics
The University of Chicago
1126 E. 59th St.
Chicago, IL 60637
sschulh1@uchicago.edu

Yang Yang, Ph.D.
Department of Sociology
Population Research Center and Center on Aging at NORC
The University of Chicago
1126 E. 59th St.
Chicago, IL 60637
(O) 773-834-1113
yangy@uchicago.edu

Also see

Online: help for glm; apc_cglim (if installed).