{smcl}
{* *! version 1.0 20240927}{...}
{p2colset 1 25 27 2}{...}
{p2col:{bf:mi impute from} {hline 2}}Impute using an external imputation model{p_end}
{p2colreset}{...}

{marker syntax}{...}
{title:Syntax}

{p 8 19 2}
{cmd:mi} {cmdab:imp:ute} {cmdab:from} {it:ivar} [{it:{help if}}] [{cmd:,} {it:options}]

{synoptset 22 tabbed}{...}
{synopthdr:options}
{synoptline}
{syntab:Main}
{synopt:{cmd:b(}{it:matname}{cmd:)}}vector of regression coefficients used to impute{p_end}
{synopt:{cmd:v(}{it:matname}{cmd:)}}corresponding variance-covariance matrix used to impute{p_end}
{synopt:{cmd:qreg}}quantile regression model for the quantitative variable {it:ivar}{p_end}
{synopt:{cmd:mlogit}}multinomial logistic regression model for the categorical variable {it:ivar}{p_end}
{synopt:{cmd:logit}}logistic regression model for the binary variable {it:ivar}{p_end}
{synoptline}
{p 4 6 2}
You must {cmd:mi set} your data before using {cmd:mi} {cmd:impute} {cmd:from}; see {manhelp mi_set MI:mi set}.
Factor variables in {it:{help indepvars}} are not allowed.
{p_end}

{marker description}{...}
{title:Description}

{pstd}
{cmd:mi} {cmd:impute} {cmd:from} fills in missing values using an imputation model estimated in one or more studies.
A quantitative variable with missing values can be imputed by passing the estimated linear predictors of 99 quantile regression models (q=0.01(0.01)0.99).
A categorical variable with {it:k} levels and missing values can be imputed by passing the {it:k} linear predictors of a multinomial logistic regression model.
A binary (0/1) variable with missing values can be imputed by passing the linear predictor of a logistic regression model.
The command assumes that the variables used in the imputation model are available in the current data.
The command {helpb mi_impute_from_get:mi_impute_from_get} reads the files containing the estimated models, fits an inverse-variance weighted least-squares model to combine the regression coefficients of the imputation model across studies, and formats the matrices to be passed to {cmd:mi} {cmd:impute} {cmd:from}.

{marker options}{...}
{title:Options}

{dlgtab:Main}

{phang}
{cmd:b(}{it:matname}{cmd:)} specifies a vector of regression coefficients for the imputation model to be used for {it:ivar}.

{phang}
{cmd:v(}{it:matname}{cmd:)} specifies the variance-covariance matrix of the imputation model to be used for {it:ivar}.

{phang}
{cmd:qreg} specifies that the matrix {cmd:b(}{it:matname}{cmd:)} contains the 99 (q=0.01(0.01)0.99) linear predictors of the quantitative variable {it:ivar} to be imputed.

{phang}
{cmd:mlogit} specifies that the imputation model is a multinomial logistic regression model for the categorical variable {it:ivar}.

{phang}
{cmd:logit} specifies that the imputation model is a logistic regression model for the binary variable {it:ivar}.

{phang}
{cmd:add()}, {cmd:replace}, {cmd:rseed()}, {cmd:double}; see {manhelp mi_impute MI:mi impute}.

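{pstd}
The cross-study pooling described for {helpb mi_impute_from_get:mi_impute_from_get} can be pictured with a small matrix calculation for a single coefficient vector.
The following is a minimal sketch, assuming the combination amounts to multivariate inverse-variance (fixed-effect) weighting; the internals of the command may differ.
The matrices {cmd:b1}, {cmd:V1}, {cmd:b2}, and {cmd:V2} are made-up values for two hypothetical studies, not output from the example files.

```stata
* Hypothetical coefficient (row) vectors and covariance matrices from two studies
matrix b1 = (0.50, 1.20)
matrix V1 = (0.04, 0 \ 0, 0.09)
matrix b2 = (0.70, 1.00)
matrix V2 = (0.01, 0 \ 0, 0.09)

* Weights are the inverses of the study-specific covariance matrices
matrix W1 = invsym(V1)
matrix W2 = invsym(V2)

* Pooled covariance: inverse of the summed weights
matrix iV = invsym(W1 + W2)

* Pooled coefficients: weighted sum of the study coefficients, rescaled by iV
matrix ib = (b1*W1 + b2*W2)*iV

matrix list ib
matrix list iV
```

{pstd}
In this sketch, {cmd:ib} and {cmd:iV} play a role analogous to the matrices passed to {cmd:b()} and {cmd:v()}; the study with the smaller variance for a coefficient pulls the pooled value toward its own estimate.
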
{marker example}{...}
{title:Example #1: Quantitative variable 100% missing}

{stata "use http://www.stats4life.se/data/from/qreg_study_1, clear"}
{stata "mi set wide"}
{stata "mi register imputed z"}

// Get imputation model from one study (tab-delimited .txt)

{stata "mi_impute_from_get , b(e_qreg_b_s2) v(e_qreg_v_s2) colnames(y x c _cons) imodel(qreg) path(http://www.stats4life.se/data/from/)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(qreg)"}
{stata "mi estimate, post eform imp(1/10): logit y x c z"}

// Get imputation model from 4 different studies

{stata "mi_impute_from_get , b(e_qreg_b_s2 e_qreg_b_s3 e_qreg_b_s4 e_qreg_b_s5) v(e_qreg_v_s2 e_qreg_v_s3 e_qreg_v_s4 e_qreg_v_s5) colnames(y x c _cons) imodel(qreg) path(http://www.stats4life.se/data/from/)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(qreg)"}
{stata "mi estimate, post eform imp(11/20): logit y x c z"}

{title:Example #2: Categorical variable 100% missing}

{stata "use http://www.stats4life.se/data/from/study_mlogit, clear"}
{stata "mi set wide"}
{stata "mi register imputed z"}

// Get imputation model from one study (tab-delimited .txt)

{stata "mi_impute_from_get , b(e_mlogit_b_s2) v(e_mlogit_v_s2) values(0 1 2 3) colnames(y x c _cons) imodel(mlogit) path(http://www.stats4life.se/data/from/)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(mlogit)"}
{stata "mi estimate, post eform imp(1/10): logit y x c z"}

// Get imputation model from 4 different studies

{stata "mi_impute_from_get , b(e_mlogit_b_s2 e_mlogit_b_s3 e_mlogit_b_s4 e_mlogit_b_s5) v(e_mlogit_v_s2 e_mlogit_v_s3 e_mlogit_v_s4 e_mlogit_v_s5) colnames(y x c _cons) imodel(mlogit) path(http://www.stats4life.se/data/from/) values(0 1 2 3)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(mlogit)"}
{stata "mi estimate, post eform imp(11/20): logit y x c z"}

{title:Example #3: Binary variable 100% missing}

{stata "use http://www.stats4life.se/data/from/study_logit, clear"}
{stata "mi set wide"}
{stata "mi register imputed z"}

// Get imputation model from one study (tab-delimited .txt)

{stata "mi_impute_from_get , b(e_logit_b_s2) v(e_logit_v_s2) colnames(y x c _cons) imodel(logit) path(http://www.stats4life.se/data/from/)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(logit)"}
{stata "mi estimate, post eform imp(1/10): logit y x c z"}

// Get imputation model from 4 different studies

{stata "mi_impute_from_get , b(e_logit_b_s2 e_logit_b_s3 e_logit_b_s4 e_logit_b_s5) v(e_logit_v_s2 e_logit_v_s3 e_logit_v_s4 e_logit_v_s5) colnames(y x c _cons) imodel(logit) path(http://www.stats4life.se/data/from/)"}
{stata "mat ib = r(get_ib)"}
{stata "mat iV = r(get_iV)"}
{stata "mi impute from z , add(10) b(ib) v(iV) imodel(logit)"}
{stata "mi estimate, post eform imp(11/20): logit y x c z"}

{title:Reference}

{p 4 8 2}
Thiesmeier R, Bottai M, Orsini N. 2024. Systematically missing data in distributed research networks: multiple imputation when data cannot be pooled. {it:Journal of Statistical Computation and Simulation}.
{p_end}

{marker results}{...}
{title:Stored results}

{pstd}
{cmd:mi impute from} stores the following in {cmd:r()}:

{synoptset 25 tabbed}{...}
{p2col 5 25 29 2: Scalars}{p_end}
{synopt:{cmd:r(M)}}total number of imputations{p_end}
{synopt:{cmd:r(N)}}total number of observations{p_end}
{synopt:{cmd:r(N_incomplete)}}total number of missing observations{p_end}
{synopt:{cmd:r(M_add)}}number of added imputations{p_end}
{synopt:{cmd:r(M_update)}}number of updated imputations{p_end}
{synopt:{cmd:r(k_ivars)}}number of imputed variables (always {cmd:1}){p_end}

{synoptset 25 tabbed}{...}
{p2col 5 25 29 2: Macros}{p_end}
{synopt:{cmd:r(method)}}name of imputation method ({cmd:from}){p_end}
{synopt:{cmd:r(ivars)}}name of imputation variable{p_end}
{synopt:{cmd:r(rngstate)}}random-number state used{p_end}