{smcl}
{* 30Jun2006}{...}
{hline}
help for {hi:apc_ie}
{hline}

{title:Intrinsic estimator for age-period-cohort effects in generalized linear models}

{p 4}Syntax

{p 8 14}{cmd:apc_ie} {it:depvar} [{it:indepvars}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}] [{it:weight}] {cmd:,}
{bind:[{cmd:age(}{it:varname}{cmd:)}} {cmd:period(}{it:varname}{cmd:)}
{cmd:cohort(}{it:varname}{cmd:)} {cmdab:gen:erate(}{it:newvarname}{cmd:)}
{bind:{it:glm_options} ]}

{p 4}Replay syntax

{p 8 14}{cmd:apc_ie} {bind:[{cmd:,}} {cmdab:nohead:er} {cmdab:le:vel(}{it:cilevel}{cmd:)} {bind:{cmdab:ef:orm} ]}

{p 4}
{cmd:apc_ie} allows all {it:varlists} and {it:weights} that are allowed by {cmd:glm}.

{title:Description}

{p}
{cmd:apc_ie} estimates generalized linear models with age, period and cohort effects
using the intrinsic estimator (IE) described by Yang, Fu and Land (2004).
The structure of the design matrix -- i.e., the numbers of age groups and time periods --
may affect the estimates obtained from conventional CGLIM estimators.
(See {help apc_cglim}.)
The IE employs a special principal components regression that removes the influence of the
null (column) space of the design matrix on the estimator.

{title:Methods}

{p}
The IE estimates a constrained parameter vector that corresponds to the projection of the
model parameter on the non-null (column) subspace of the design matrix.
{cmd:apc_ie} computes this constrained parameter vector by a special principal components
regression.

{p}
The estimation algorithm proceeds by first applying an orthonormal matrix transformation
to the {it:X'X} matrix of the linear model (or its equivalent in a generalized linear model).
This transformation produces the nonzero eigenvalues and corresponding eigenvectors of the
matrix.
The principal components regression is then estimated using these eigenvectors as regressors.
After the principal components regression has been estimated, the orthonormal
transformation is applied in reverse to map the estimated coefficients back to the
original age, period and cohort effects, which are easier to interpret.

{p}
Instead of omitting one reference category from each set of indicator variables, the IE
uses the constraint that the sum of coefficients in each set is zero.
The algorithm in {cmd:apc_ie} adds an indicator variable for each unique value of
{it:age}, {it:period} and {it:cohort} to the list of independent variables, but omits one
category for each of {it:age}, {it:period} and {it:cohort} for computational purposes.
Because {it:age}+{it:cohort}={it:period}, one additional indicator variable is redundant.
{cmd:apc_ie} therefore replaces the {it:A-1+P-1+C-1} indicator variables with the
{it:A-1+P-1+C-2} principal components that correspond to nonzero eigenvalues.
After estimating the principal components regression, the IE uses the zero-sum
constraints to obtain estimates for the omitted age, period and cohort categories.

{p}
Because the IE -- including the coefficients obtained from zero-sum constraints -- is a
linear transformation of the coefficients in the principal components model, the
variance-covariance matrix of the IE is {it:V_ie=B*V_pc*B'}, where {it:B} is a
transformation matrix and {it:V_pc} is the variance-covariance matrix of the principal
components estimator.
See {help glm} for options for computing variance-covariance matrices in generalized
linear models; {cmd:apc_ie} simply applies the transformation to whatever
variance-covariance matrix you choose to compute.

{p}
The IE is always defined by the principal components of a design matrix that has one
observation per age-by-period cell.
Missing data or multiple observations per cell do not affect which principal components
are used in calculating the IE.
Likewise, the principal components do not depend on {it:indepvars} or {it:weights}, if any.
However, if any of the {it:indepvars} are collinear with {it:age}, {it:period} or
{it:cohort}, or if too many cells in the age-by-period matrix have no data, eliminating
only one principal component does not resolve the identification problem.
{cmd:apc_ie} exits with an error in that case.

{title:Options}

{p 0 4}
{cmd:age(}{it:varname}{cmd:)}, {cmd:period(}{it:varname}{cmd:)} and
{cmd:cohort(}{it:varname}{cmd:)} specify the {it:age}, {it:period} and {it:cohort}
variables.
At least two of these three must be specified.
If all three are specified, they must satisfy {it:age}+{it:cohort}={it:period}.
If only two are specified, the missing variable is generated according to
{it:age}+{it:cohort}={it:period}.

{p 0 4}
{cmd:generate(}{it:newvarname}{cmd:)} stores the generated values of {it:age},
{it:period} or {it:cohort}, whichever was not specified, in a new variable.

{p}
{it:glm_options} may be any options allowed by {cmd:glm}.

{title:References}

{p 0 4}Yang, Y., Fu, W., and Land, K. 2004.
A Methodological Comparison of Age-Period-Cohort Models: The Intrinsic Estimator and
Conventional Generalized Linear Models.
{it:Sociological Methodology} 34(1), 75-110.

{title:Authors}

Sam Schulhofer-Wohl
Department of Economics
The University of Chicago
1126 E. 59th St.
Chicago, IL 60637
sschulh1@uchicago.edu

Yang Yang, Ph.D.
Department of Sociology
Population Research Center and Center on Aging at NORC
The University of Chicago
1126 E. 59th St.
Chicago, IL 60637
(O) 773-834-1113
yangy@uchicago.edu

{title:Also see}

{p 0 19}
Online: help for {help glm}; {help apc_cglim} (if installed).{p_end}
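
{title:Examples}

{p}
The following is a minimal sketch of how the syntax above might be used, not output from
a tested session.
It assumes a hypothetical data set with one row per age-by-period cell and hypothetical
variables {it:deaths} (event count), {it:pop} (exposure), {it:age} and {it:year}, coded
so that {it:year}-{it:age} identifies the birth cohort.
{cmd:family()}, {cmd:link()} and {cmd:exposure()} are {cmd:glm} options passed through as
{it:glm_options}; the cohort variable is generated and stored in the hypothetical new
variable {it:cohort}.

{* hypothetical variable names: deaths, pop, age, year, cohort}{...}
{p 8 12}{inp:. apc_ie deaths, age(age) period(year) generate(cohort) family(poisson) link(log) exposure(pop)}{p_end}

{p}
Under the replay syntax, the most recent results could then be redisplayed with
exponentiated coefficients and 90% confidence intervals:

{p 8 12}{inp:. apc_ie, eform level(90)}{p_end}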