{smcl}
{* Copyright 2015 Brendan Halpin brendan.halpin@ul.ie }
{* Distribution is permitted under the terms of the GNU General Public Licence }
{* 17July2015}{...}
{cmd:help mict_impute}
{hline}

{title:Title}

{phang}
{hi:mict_impute} {hline 2} Carry out multiple imputations in categorical time-series data

{marker syntax}{...}
{title:Syntax}

{p 8 17 2}
{cmd:mict_impute} [{cmd:,} {it:options}]

{synoptset 20 tabbed}{...}
{synopthdr}
{synoptline}
{synopt:{opt maxg:ap}}Maximum length of internal gap to impute{p_end}
{synopt:{opt maxit:gap}}Maximum length of initial or terminal gap to impute{p_end}
{synopt:{opt nimp:}}Number of imputations{p_end}
{synopt:{opt off:set}}Offset for imputation sequence number{p_end}

{marker description}{...}
{title:Description}

{pstd}
{cmd:mict_impute} creates imputations in categorical cross-sectional time-series data such as lifecourse histories, as described in Halpin (2012, 2013).
It takes data prepared by {help mict_prep}, and imputes values for internal, initial and terminal gaps, in a manner that is sensitive to longitudinal consistency.
It imputes a single categorical state variable, which must be defined previously using the {cmd:mict_prep} command.
The options determine the maximum gap length that will be imputed (internal gaps: {opt maxg:ap}, default 12; initial and terminal gaps: {opt maxit:gap}, default 6), the number of imputations ({opt nimp}, default 5), and an offset for the imputation number ({opt off:set}, default 0).
For example, if the offset is 5, the imputations will be numbered from 6 upwards, which is useful when imputations are run in parallel.
{p_end}

{pstd}The command leaves the imputations in place, in wide format with their original variable names, with the variable {cmd:_mct_iter} storing the imputation number.
{p_end}

{marker remarks}{...}
{title:Remarks}

{pstd}{cmd:mict_impute} carries out multiple imputation for cross-sectional time-series data with a categorical state variable, in a manner that is longitudinally consistent.
It is difficult to achieve such longitudinal consistency with conventional approaches such as {help mi impute chained} or {help ice}.
Under the hood, it uses {help mi impute mlogit} in a monotone imputation sequence.{p_end}

{pstd}It is intended for data with many cases and a moderate duration (high N, moderate T), where (apart from missingness) T is the same for all cases.
It is best suited for situations where missingness tends to be autocorrelated, generating gaps rather than isolated missing observations.
The imputation models are also more effective where the average observed spell length is distinctly greater than one time unit.{p_end}

{pstd}For each imputation, {cmd:mict_impute} uses Stata's {cmd:mi impute}, but handles the logic of chaining imputations together on its own, in the manner described in Halpin (2012).
In brief, gaps are filled from their edges using prediction models that include (at a minimum) information on the nearest observed past and future timepoints.
This permits a monotone sequence of imputations, in contrast to chained imputation, which will utilise (at a minimum) information from the immediate prior and next states.
{p_end}

{title:Imputation models}

{pstd}
By default, very simple imputation models are used, with the next and last (not necessarily adjacent) observed states predicting the current state (in initial and terminal gaps, only the next or the last observed state, respectively, is used).
These simple models should be overridden by more adequate imputation models by re-defining the programs {cmd:mict_model_gap}, {cmd:mict_model_initial} and {cmd:mict_model_terminal}:{p_end}

{phang}{cmd:. capture program drop mict_model_gap}{p_end}
{phang}{cmd:. program define mict_model_gap}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_next i._mct_last _mct_before* _mct_after*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{phang}{cmd:. capture program drop mict_model_initial}{p_end}
{phang}{cmd:. 
program define mict_model_initial}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_next _mct_after*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{phang}{cmd:. capture program drop mict_model_terminal}{p_end}
{phang}{cmd:. program define mict_model_terminal}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_last _mct_before*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{pstd}These examples differ from the defaults by including the sets of variables {cmd:_mct_before*} and {cmd:_mct_after*} in the imputation models.
These variables are created by {cmd:mict_prep} and contain the proportion of observed time spent in each state, respectively before and after the current time.
{p_end}

{pstd}This strategy can also be used to fall back on a simpler model if the complex model will not converge in some cases:{p_end}

{phang}{cmd:. capture program drop mict_model_gap}{p_end}
{phang}{cmd:. program define mict_model_gap}{p_end}
{phang}{cmd:. di "Attempt first gap model"}{p_end}
{phang}{cmd:. capture mi impute mlogit _mct_state i._mct_next##c._mct_t i._mct_last##c._mct_t, add(1) force augment iterate(100)}{p_end}
{phang}{cmd:. if (_rc==430) {c -(}}{p_end}
{phang}{cmd:. di as error "NO CONVERGENCE, fitting simplest gap model"}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_next i._mct_last, add(1) force augment}{p_end}
{phang}{cmd:. {c )-}}{p_end}
{phang}{cmd:. else if _rc {c -(}}{p_end}
{phang}{cmd:. exit _rc}{p_end}
{phang}{cmd:. {c )-}}{p_end}
{phang}{cmd:. end}{p_end}

{pstd}In this example, we first try to fit a model in which the effect of the prior and subsequent states on the current state is allowed to change over time (through the interactions with {cmd:_mct_t}).
If this fails to converge in 100 iterations, a simpler model is fitted.
If this fails, or if the first model fails for a reason other than non-convergence, an error is signalled.
This facility is useful because imputation models occasionally fail to converge, depending on the values imputed in earlier iterations.{p_end}

{pstd}For longitudinal consistency, the models must contain at a minimum the variables {cmd:_mct_next} and {cmd:_mct_last} (as appropriate).
The built-in variables {cmd:_mct_before*} and {cmd:_mct_after*} are also available, as is {cmd:_mct_t}, the time index.
Other usable predictors include fixed variables (e.g., gender) and time-varying variables (e.g., state in another domain, such as fully observed parenthood status used to predict incompletely observed labour-market status).{p_end}

{pstd}The {help mi impute} options {cmd:add(1)}, {cmd:force} and {cmd:augment} are required.
Respectively, they cause {help mi impute} to carry out one imputation, to proceed even where some predictors are missing, and to use augmented multinomial logistic regression if perfect prediction is detected.{p_end}

{title:Author}

{pstd}Brendan Halpin, brendan.halpin@ul.ie{p_end}

{title:Examples}

{pstd}These examples use {cmd:mvadmar.dta}, a version of {cmd:mvad.dta} with runs of missing values imposed at random.
These datasets are provided as ancillary files.
The {cmd:mvad.dta} data is from McVicar and Anyadike-Danes (2002).{p_end}

{phang}{cmd:. use mvadmar}{p_end}
{phang}{cmd:. mict_prep state, id(id)}{p_end}

{phang}{cmd:. capture program drop mict_model_gap}{p_end}
{phang}{cmd:. program define mict_model_gap}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_next i._mct_last _mct_before* _mct_after*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{phang}{cmd:. capture program drop mict_model_initial}{p_end}
{phang}{cmd:. program define mict_model_initial}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_next _mct_after*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{phang}{cmd:. capture program drop mict_model_terminal}{p_end}
{phang}{cmd:. program define mict_model_terminal}{p_end}
{phang}{cmd:. mi impute mlogit _mct_state i._mct_last _mct_before*, add(1) force augment}{p_end}
{phang}{cmd:. end}{p_end}

{phang}{cmd:. 
mict_impute, nimp(10)}{p_end}

{pstd}See also ancillary files {cmd:mict_example1.do} and {cmd:mict_example2.do}.{p_end}

{marker references}{...}
{title:References}

{phang}Halpin, B. (2012) `Multiple Imputation for Life-Course Sequence Data', Dept of Sociology working paper WP2012-01, University of Limerick.
{browse "http://www.ul.ie/sociology/pubs/wp2012-01.pdf"}
{p_end}

{phang}Halpin, B. (2013) `Imputing Sequence Data: extensions to initial and terminal gaps', Dept of Sociology working paper WP2013-01, University of Limerick.
{browse "http://www.ul.ie/sociology/pubs/wp2013-01.pdf"}{p_end}

{phang}McVicar, D. and Anyadike-Danes, M. (2002) `Predicting Successful and Unsuccessful Transitions from School to Work Using Sequence Methods', Journal of the Royal Statistical Society (Series A), 165, pp. 317-334.{p_end}