help cqiv
-------------------------------------------------------------------------------

Title

Censored quantile instrumental variable (regression)

Syntax

cqiv depvar [varlist] (endogvar = instrument) [if] [in] [weight] [, options]

options                Description
-------------------------------------------------------------------------
Model
  quantiles(numlist)   sets the quantile(s) (values between 0 and 100) at
                         which the model is estimated.
  censorpt(#)          censoring point of the dependent variable; default
                         is 0.
  top                  right censoring of the dependent variable; left
                         censoring is the default.
  uncensored           uncensored quantile IV (QIV) estimation.
  exogenous            censored quantile regression (CQR) with no
                         endogeneity.
  firststage(string)   determines the first stage estimation procedure,
                         where string is quantile (default),
                         distribution, or ols.
  exclude              excludes exogenous regressors other than
                         instruments from the first stage estimation.
  nquant(#)            determines the number of quantiles used in the
                         first stage estimation when the estimation
                         procedure is quantile; default is 50; it is
                         advisable to choose a value between 20 and 100.
  nthresh(#)           determines the number of thresholds used in the
                         first stage estimation when the estimation
                         procedure is distribution; default is 50; it is
                         advisable to choose a value between 20 and the
                         sample size.
  ldv1(string)         determines the limited dependent variable (LDV)
                         model used in the first stage estimation when
                         the estimation procedure is distribution, where
                         string is probit (default) or logit.
  ldv2(string)         determines the LDV model used in the first step of
                         the second stage estimation, where string is
                         probit (default) or logit.

CQIV estimation
  corner               calculates the (average) marginal quantile effects
                         for a censored dependent variable when the
                         censoring is due to economic reasons such as
                         corner solutions; only applicable to linear
                         models.
  drop1(#)             sets the percentage of observations q0 with
                         probabilities of censoring above the quantile
                         index that are dropped in the first step of the
                         second stage (see Chernozhukov, Fernandez-Val
                         and Kowalski (2010) for details); default is 10.
  drop2(#)             sets the percentage of observations q1 with
                         estimate of the conditional quantile above
                         (below for right censoring) that are dropped in
                         the second step of the second stage (see
                         Chernozhukov, Fernandez-Val and Kowalski (2010)
                         for details); default is 3.
  viewlog              shows the intermediate estimation results; default
                         is no log.

Inference
  confidence(string)   type of confidence intervals, where string is no
                         (no confidence intervals, the default), boot,
                         or weightboot.
  bootreps(#)          number of repetitions of the bootstrap or weighted
                         bootstrap; default is 100.
  setseed(#)           initial seed number for the repetitions of the
                         bootstrap or weighted bootstrap; default is 777.
  level(#)             sets the confidence level; default is 95.

Robustness check
  norobust             suppresses the robustness diagnostic test results.
-------------------------------------------------------------------------

cqiv allows weights, aweights and pweights; see weight. Regardless of the weight type specified by the user, pweight is automatically used for the probit or logit estimation in the procedure, and aweight for the quantile regression estimation. When confidence(weightboot) is specified, the product of the bootstrap weights and the user-specified weights is used as the weights in the bootstrap procedure.

Pressing Break in the middle of execution may leave the original dataset unrestored.
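
The weighting rule for confidence(weightboot) can be written compactly. This is only a sketch: the symbols w_i (user-specified weight of observation i), e_i (bootstrap weight), and n (number of observations) are notation introduced here for illustration. The bootstrap weights are taken to be draws from the standard exponential distribution, as stated under the Inference options. In a given bootstrap repetition, the weight applied to observation i would then be:

```latex
w_i^{*} = e_i \, w_i, \qquad e_i \sim \operatorname{Exp}(1), \qquad i = 1, \dots, n .
```

If no weights are specified, this presumably reduces to w_i^{*} = e_i, although the help text does not state this explicitly.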

Description

cqiv conducts censored quantile instrumental variable (CQIV) estimation. This command can implement both censored and uncensored quantile IV estimation either under exogeneity or endogeneity. The estimator proposed by Chernozhukov, Fernandez-Val and Kowalski (2010) is used if CQIV estimation is implemented. A parametric version of the estimator proposed by Lee (2007) is used if quantile IV estimation without censoring is implemented. The estimator proposed by Chernozhukov and Hong (2002) is used if censored quantile regression (CQR) is estimated without endogeneity. Note that all the variables in the parentheses of the syntax are those involved in the first stage estimation of CQIV and QIV.

Options

        +-------+
    ----+ Model +------------------------------------------------------------

quantiles(numlist) specifies the quantiles at which the model is estimated and should contain percentage numbers between 0 and 100. Note that this is not the list of quantiles for the first stage estimation with quantile specification.

censorpt(#) specifies the censoring point of the dependent variable; the default is 0. An inappropriately specified censoring point will generate errors in estimation.

top sets right censoring of the dependent variable; otherwise, left censoring is assumed as default.

uncensored selects uncensored quantile IV (QIV) estimation.

exogenous selects censored quantile regression (CQR) with no endogeneity, which is proposed by Chernozhukov and Hong (2002).

firststage(string) determines the first stage estimation procedure, where string is either quantile for quantile regression (the default), distribution for distribution regression (either probit or logit), or ols for OLS estimation. Note that firststage(distribution) can take a considerable amount of time to execute.

exclude excludes exogenous regressors other than instruments from the first stage estimation.

nquant(#) determines the number of quantiles used in the first stage estimation when the estimation procedure is quantile; the default is 50, that is, a total of 50 evenly spaced quantiles are chosen in the estimation. It is advisable to choose a value between 20 and 100.

nthresh(#) determines the number of thresholds used in the first stage estimation when the estimation procedure is distribution; the default is 50, that is, a total of 50 evenly spaced thresholds are chosen in the estimation. It is advisable to choose a value between 20 and the sample size; when the value is smaller than this range, the estimation may be subject to multicollinearity.

ldv1(string) determines the LDV model used in the first stage estimation when the estimation procedure is distribution, where string is either probit for probit estimation (the default), or logit for logit estimation.

ldv2(string) determines the LDV model used in the first step of the second stage estimation, where string is either probit (the default), or logit.

        +-----------------+
    ----+ CQIV estimation +--------------------------------------------------

corner calculates the (average) marginal quantile effects for a censored dependent variable when the censoring is due to economic reasons such as corner solutions. Under this option, the reported coefficients are the average corner solution marginal effects if the underlying function is linear in the endogenous variable. For each observation, if the predicted value of depvar is beyond the censoring point, the marginal effect is set to zero; otherwise, it is set to the coefficient. The reported average corner solution marginal effect averages the marginal effects over all observations. If the underlying function is nonlinear in the endogenous variable, average marginal effects must be calculated directly from the coefficients without the corner option. For details of the related concepts, see Section 2.1 of Chernozhukov, Fernandez-Val and Kowalski (2010). A relevant example can be found in the Examples section of this help file.

drop1(#) sets the percentage of observations q0 with probabilities of censoring above the quantile index that are dropped in the first step of the second stage (see Chernozhukov, Fernandez-Val and Kowalski (2010) for details); the default is 10.

drop2(#) sets the percentage of observations q1 with estimate of the conditional quantile above (below for right censoring) that are dropped in the second step of the second stage (see Chernozhukov, Fernandez-Val and Kowalski (2010) for details); the default is 3.

viewlog shows the intermediate estimation results; the default is no log.
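
The averaging performed by the corner option can be written out as a formula. This is a sketch for left censoring at C in a model that is linear in the endogenous variable. The notation is introduced here for illustration and is not taken from the command: \hat\beta(\tau) is the coefficient on the endogenous variable at quantile index \tau, \hat{Q}_i(\tau) is the predicted value of depvar for observation i, and n is the number of observations. Two assumptions are made: "beyond the censoring point" is read as "at or below C" (the treatment of the boundary case is a guess), and the average is unweighted.

```latex
\widehat{\mathrm{ME}}_i(\tau) = \hat\beta(\tau)\,\mathbf{1}\{\hat{Q}_i(\tau) > C\},
\qquad
\overline{\mathrm{ME}}(\tau)
  = \frac{1}{n}\sum_{i=1}^{n}\widehat{\mathrm{ME}}_i(\tau)
  = \hat\beta(\tau)\cdot\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{\hat{Q}_i(\tau) > C\}.
```

For right censoring (option top) the inequality would be reversed. As a worked case, if the predicted value exceeds C for 80 percent of the observations, the reported average effect under this reading is 0.8 times the coefficient.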

        +-----------+
    ----+ Inference +--------------------------------------------------------

confidence(string) specifies the type of confidence intervals. With string being no, which is the default, no confidence intervals are calculated. With string being boot or weightboot, either nonparametric bootstrap or weighted bootstrap (respectively) confidence intervals are calculated. The weights of the weighted bootstrap are generated from the standard exponential distribution. Note that confidence(boot) and confidence(weightboot) can take a considerable amount of time to execute.

bootreps(#) sets the number of repetitions of the bootstrap or weighted bootstrap if confidence(boot) or confidence(weightboot) is selected. The default number of repetitions is 100.

setseed(#) sets the initial seed number for the repetitions of the bootstrap or weighted bootstrap; the default is 777.

level(#) sets the confidence level; the default is 95.
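
The inference options can be combined in a single call. The sketch below reuses the alcoholengel data and variable names from the Examples section of this help file. The particular values (200 repetitions, seed 12345, 90 percent level) are arbitrary choices made here for illustration, not recommendations. The /// line continuation assumes the commands are run from a do-file.

```stata
* Load the example data, as in the Examples section
webuse set http://www.econ.yale.edu/~ak669/
webuse alcoholengel

* Nonparametric bootstrap confidence intervals at the median:
* 200 repetitions, a fixed seed for reproducibility, 90% confidence level
cqiv alcohol logexp2 nkids (logexp = logwages nkids), quantiles(50) ///
    confidence(boot) bootreps(200) setseed(12345) level(90)
```

Replacing confidence(boot) with confidence(weightboot) requests the weighted bootstrap instead. Either choice can take a considerable amount of time to execute, so a small bootreps() value may be useful for a first trial run.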

        +------------------+
    ----+ Robustness check +-------------------------------------------------

norobust suppresses the robustness diagnostic test results. There are no diagnostic test results to suppress when uncensored is specified.

Saved results

cqiv saves the following results in e():

Scalars
  e(obs)           Number of observations
  e(censorpt)      Censoring point
  e(drop1)         q0
  e(drop2)         q1
  e(bootreps)      Number of bootstrap or weighted bootstrap repetitions
  e(level)         Confidence level of the confidence intervals

Macros
  e(command)       Name of the command: cqiv
  e(regression)    Name of the implemented regression: either cqiv, qiv,
                     or cqr
  e(depvar)        Name of the dependent variable
  e(endogvar)      Name of the endogenous regressor
  e(instrument)    Names of the instrumental variables
  e(regressors)    Names of the regressors
  e(firststage)    Type of the first stage estimation
  e(confidence)    Type of confidence intervals

Matrices
  e(results)       Matrix containing the estimated coefficients, mean, and
                     lower and upper bounds of confidence intervals.
  e(quantiles)     Row vector containing the quantiles at which CQIV has
                     been estimated.
  e(robustcheck)   Matrix containing the robustness diagnostic test
                     results. (See Table B1 of Chernozhukov, Fernandez-Val
                     and Kowalski (2010).) Note that the entry complete
                     denotes whether all the steps are included in the
                     procedure; 1 when they are, and 0 otherwise. For
                     other entries consult the paper.
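
After estimation, the saved results can be inspected with Stata's standard commands for e() results. The following is a short sketch, assuming the alcoholengel example data have already been loaded as shown in the Examples section. The matrix name R is an arbitrary choice made here.

```stata
* Estimate at three quantiles (same call as in the Examples section)
cqiv alcohol logexp2 nkids (logexp = logwages nkids), quantiles(25 50 75)

* List everything cqiv left in e()
ereturn list

* Show individual saved results by name
display e(obs)
display "`e(regression)'"
matrix list e(quantiles)
matrix list e(results)

* Copy the results matrix for later use
matrix R = e(results)
```

Copying e(results) into a named matrix keeps the estimates available after a later estimation command overwrites e().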

Examples

. webuse set http://www.econ.yale.edu/~ak669/
    (to specify the URL from which the dataset will be obtained)
. webuse alcoholengel
    (to load the dataset from that URL; see Blundell, Chen and Kristensen (2007) for data descriptions)
. cqiv alcohol logexp2 nkids (logexp = logwages nkids), quantiles(25 50 75)
    (this generates part of the empirical results of Chernozhukov, Fernandez-Val and Kowalski (2010))

. cqiv alcohol logexp2 (logexp = logwages), quantiles(20 25 70(5)90) firststage(ols)

. cqiv alcohol logexp2 (logexp = logwages), firststage(distribution) ldv1(logit)

. cqiv alcohol logexp2 nkids (logexp = logwages nkids), uncensored
    (to run QIV)

. cqiv alcohol logexp logexp2 nkids, exogenous
    (to run CQR)

. cqiv alcohol logexp2 nkids (logexp = logwages nkids), confidence(weightboot) bootreps(10)

. cqiv alcohol nkids (logexp = logwages nkids), corner

Version requirements

This command requires Stata 10 or later.

Methods and Formulas

See Chernozhukov, Fernandez-Val and Kowalski (2010).

References

Blundell, Chen and Kristensen (2007): Semi-nonparametric IV Estimation of Shape-Invariant Engel Curves, Econometrica, 75(6), 1613-1669.

Chernozhukov, Fernandez-Val and Kowalski (2010): Quantile Regression with Censoring and Endogeneity, Boston University Department of Economics Working Paper 2009-012.

Chernozhukov and Hong (2002): Three-Step Censored Quantile Regression and Extramarital Affairs, Journal of the American Statistical Association, 97, 872-882.

Kowalski (2010): Censored Quantile Instrumental Variable Estimates of the Price Elasticity of Expenditure on Medical Care, NBER Working Paper 15085.

Lee (2007): Endogeneity in Quantile Regression Models: A Control Function Approach, Journal of Econometrics, 141, 1131-1158.

Remarks

This is a preliminary version. Please feel free to share your comments, bug reports, and suggestions for extensions. We thank Richard Blundell for sharing the data used in the examples above. The data were derived by Richard Blundell from the 1995 U.K. Family Expenditure Survey (FES), following the criteria set forth in Blundell, Chen and Kristensen (2007).

Disclaimer

THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

IN NO EVENT WILL THE COPYRIGHT HOLDERS OR THEIR EMPLOYERS, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THIS SOFTWARE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM.

Authors

Victor Chernozhukov, Ivan Fernandez-Val, Sukjin Han, and Amanda Kowalski
MIT, Boston University, and Yale University
vchern@mit.edu / ivanf@bu.edu / sukjin.han@yale.edu / amanda.kowalski@yale.edu
Latest Version: August 2011 / First Version: December 2010