/*** Title ----- {phang}{cmd:aprub} {hline 2} Estimate the upper bound on the average persuasion rate Syntax ------ > {cmd:aprub} _depvar_ _treatrvar_ _instrvar_ [_covariates_] [_if_] [_in_] [, {cmd:model}(_string_) {cmd:title}(_string_)] ### Options | _option_ | _Description_ | |-------------------|-------------------------| | {cmd:model}(_string_) | Regression model when _covariates_ are present | | {cmd:title}(_string_) | Title | Description ----------- __aprub__ estimates the upper bound on the average persuasion rate (APR). _varlist_ should include _depvar_ _treatrvar_ _instrvar_ _covariates_ in order. Here, _depvar_ is binary outcomes (_y_), _treatrvar_ is binary treatment (_t_), _instrvar_ is binary instruments (_z_), and _covariates_ (_x_) are optional. There are two cases: (i) _covariates_ are absent and (ii) _covariates_ are present. - Without _x_, the upper bound ({cmd:theta_U}) on the APR is defined by {cmd:theta_U} = {E[{it:A}|{it:z}=1] - E[{it:B}|{it:z}=0]}/{1 - E[{it:B}|{it:z}=0]}, where {it:A} = 1({it:y}=1,{it:t}=1)+1-1({it:t}=1) and {it:B} = 1({it:y}=1,{it:t}=0). The estimate and its standard error are obtained by the following procedure: 1. E[{it:A}|{it:z}=1] is estimated by regressing {it:A} on _z_. 2. E[{it:B}|{it:z}=0] is estimated by regressing {it:B} on _z_. 3. {cmd:theta_U} is computed using the estimates obtained above. 4. The standard error is computed via STATA command __nlcom__. - With _x_, the upper bound ({cmd:theta_U}) on the APR is defined by {cmd:theta_U} = E[{cmd:theta_U_num}({it:x})]/E[{cmd:theta_U_den}({it:x})], where {cmd:theta_U_num}({it:x}) = E[{it:A}|{it:z}=1,{it:x}] - E[{it:B}|{it:z}=0,{it:x}] and {cmd:theta_U_den}({it:x}) = 1 - E[{it:B}|{it:z}=0,{it:x}]. The estimate is obtained by the following procedure. If {cmd:model}("no_interaction") is selected (default choice), 1. E[{it:A}|{it:z}=1,{it:x}] is estimated by regressing {it:A} on _z_ and _x_. 2. E[{it:B}|{it:z}=0,{it:x}] is estimated by regressing {it:B} on _z_ and _x_. Alternatively, if {cmd:model}("interaction") is selected, 1. E[{it:A}|{it:z}=1,{it:x}] is estimated by regressing {it:A} on _x_ given _z_ = 1. 2. E[{it:B}|{it:z}=0,{it:x}] is estimated by regressing {it:B} on _x_ given _z_ = 0. Ater step 1, both options are followed by: {p 4 8 2}3. For each _x_ in the estimation sample, {cmd:theta_U_num}({it:x}) and {cmd:theta_U_den}({it:x}) are evaluated. {p 4 8 2}4. The estimates of {cmd:theta_U_num}({it:x}) and {cmd:theta_U_den}({it:x}) are averaged to estimate {cmd:theta_U}. When _covariates_ are present, the standard error is missing because an analytic formula for the standard error is complex. Bootstrap inference is implemented when this package's command __persuasio__ is called to conduct inference. Options ------- {cmd:model}(_string_) specifies a regression model. This option is only relevant when _x_ is present. The dependent variable is either {it:A} or {it:B}. The default option is "no_interaction" between _z_ and _x_. When "interaction" is selected, full interactions between _z_ and _x_ are allowed. {cmd:title}(_string_) specifies a title. Remarks ------- It is recommended to use this package's command __persuasio__ instead of calling __aprub__ directly. Examples -------- We first call the dataset included in the package. . use GKB_persuasio, clear The first example estimates the upper bound on the APR without covariates. . aprub voteddem_all readsome post The second example adds a covariate. . aprub voteddem_all readsome post MZwave2 The third example estimates the upper bound by the covariate. . by MZwave2,sort: aprub voteddem_all readsome post Stored results -------------- ### Scalars > __e(N)__: sample size > __e(ub_coef)__: estimate of the upper bound on the average persuasion rate > __e(ub_se)__: standard error of the upper bound on the average persuasion rate ### Macros > __e(outcome)__: variable name of the binary outcome variable > __e(treatment)__: variable name of the binary treatment variable > __e(instrument)__: variable name of the binary instrumental variable > __e(covariates)__: variable name(s) of the covariates if they exist > __e(model)__: regression model specification ("no_interaction" or "interaction") ### Functions: > __e(sample)__: 1 if the observations are used for estimation, and 0 otherwise. Authors ------- Sung Jae Jun, Penn State University, Sokbae Lee, Columbia University, License ------- GPL-3 References ---------- Sung Jae Jun and Sokbae Lee (2022), Identifying the Effect of Persuasion, [arXiv:1812.02276 [econ.EM]](https://arxiv.org/abs/1812.02276) Version ------- 0.2.1 20 November 2022 ***/ capture program drop aprub program aprub, eclass sortpreserve byable(recall) version 16.1 syntax varlist (min=3) [if] [in] [, model(string) title(string)] marksample touse gettoken Y varlist_without_Y : varlist gettoken T varlist_without_YT : varlist_without_Y gettoken Z X : varlist_without_YT quietly levelsof `Y' if "`r(levels)'" != "0 1" { display "`Y' is not a 0/1 variable" error 450 } quietly levelsof `T' if "`r(levels)'" != "0 1" { display "`T' is not a 0/1 variable" error 450 } quietly levelsof `Z' if "`r(levels)'" != "0 1" { display "`Z' is not a 0/1 variable" error 450 } display " " display as text "{hline 65}" display "{bf:aprub:} Estimating the Upper Bound on the Average Persuasion Rate" display as text "{hline 65}" display " " display " - Binary outcome: `Y'" display " - Binary treatment: `T'" display " - Binary instrument: `Z'" display " - Covariates (if exist): `X'" display " " * generate variables used in estimating persuation rates tempvar case_id a_u b_l gen `case_id' = _n /* generate temporary Case ID */ gen `a_u' = `Y'*`T' + (1-`T') gen `b_l' = `Y'*(1-`T') * if there are no covariates (X) if "`X'" == "" { quietly { reg `a_u' `Z' if `touse' local nobs = e(N) est store a_u_reg reg `b_l' `Z' if `touse' est store b_l_reg suest a_u_reg b_l_reg, vce(cluster `case_id') * estimate of the uppe bound for average persuation rate nlcom (upper_bound: ([a_u_reg_mean]_cons + [a_u_reg_mean]`Z' - [b_l_reg_mean]_cons)/(1-[b_l_reg_mean]_cons)) } tempname b V ub se matrix `b' = r(b) matrix `V' = r(V) ereturn post `b' `V', obs(`nobs') esample(`touse') ereturn display, nopv scalar `ub' = r(table)[1,1] scalar `se' = r(table)[2,1] display " " display "Note: It is recommended to use {bf:persuasio} for causal inference." display " " ereturn scalar ub_coef = `ub' ereturn scalar ub_se = `se' ereturn local outcome `Y' ereturn local treatment `T' ereturn local instrument `Z' estimates clear } * if there are covariates (X) if "`X'" != "" { tempvar yhat_a_u yhat_b_l yhat1 yhat0 thetahat_num thetahat_den thetahat if "`model'" == "" | "`model'" == "no_interaction" { quietly { tempname bhat b_coef reg `a_u' `Z' `X' if `touse' matrix `bhat' = e(b) scalar `b_coef' = `bhat'[1,1] predict `yhat_a_u' if `touse' gen `yhat1' = `yhat_a_u' + `b_coef' - `b_coef'*`Z' reg `b_l' `Z' `X' if `touse' matrix `bhat' = e(b) scalar `b_coef' = `bhat'[1,1] predict `yhat_b_l' if `touse' gen `yhat0' = `yhat_b_l' - `b_coef'*`Z' } } if "`model'" == "interaction" { quietly reg `a_u' `X' if `Z'==1 & `touse', robust quietly predict `yhat1' if `touse' quietly reg `b_l' `X' if `Z'==0 & `touse', robust quietly predict `yhat0' if `touse' } quietly replace `yhat1' = min(max(`yhat1',0),1) quietly replace `yhat0' = min(max(`yhat0',0),1) gen `thetahat_num' = `yhat1' - `yhat0' gen `thetahat_den' = 1 - `yhat0' tempname ub_num ub_den quietly sum `thetahat_num' if `touse' scalar `ub_num' = r(mean) quietly sum `thetahat_den' if `touse' scalar `ub_den' = r(mean) tempname upper_bound_coef local nobs = r(N) tempname b ub se scalar `ub' = `ub_num'/`ub_den' scalar `se' = . matrix `b' = `ub_num'/`ub_den' matrix colnames `b' = upper_bound ereturn post `b', obs(`nobs') esample(`touse') ereturn display, nopv display " " display "Notes: It is recommended to use {bf:persuasio} for causal inference." display " Standard errors are missing if covariates are present." display " " ereturn scalar ub_coef = `ub' ereturn scalar ub_se = `se' ereturn local outcome `Y' ereturn local treatment `T' ereturn local instrument `Z' ereturn local covariates `X' ereturn local model `model' } display "Reference: Jun and Lee (2022), arXiv:1812.02276 [econ.EM]" end