Template-Type: ReDIF-Paper 1.0
Title: Drivers of COVID-19 deaths in the United States: A two-stage modeling approach
File-URL: http://repec.org/dsug2023/Baum_DEStataConf2023.pdf
Author-Name: Andrés Garcia-Suaza
Author-Workplace-Name: University del Rosario
Author-Person: pga253
Author-Name: Miguel Henry
Author-Workplace-Name: Greylock McKinnon Associates
Author-Person: phe668
Author-Name: Jesús Otero
Author-Workplace-Name: University del Rosario
Author-Person: pot11
Author-Name: Kit Baum
Author-Workplace-Name: Boston College
Author-Person: pba1
Abstract: Our empirical strategy exploits the availability of two years (January 2020 through January 2022) of daily data on the number of confirmed deaths and cases of COVID-19 in the 3,000 U.S. counties of the 48 contiguous states and the District of Columbia.
 In the first stage of the analysis, we use daily time-series data on COVID-19 cases and deaths to fit mixed models of deaths against lagged confirmed cases for each county. Because the resulting coefficients are county specific, they relax the homogeneity assumption that is implicit when the analysis is performed using geographically aggregated cross-section units.
 In the second stage of the analysis, we assume that these county estimates are functions of economic and sociodemographic factors that are taken as fixed over the course of the pandemic. Here we employ the novel one-covariate-at-a-time variable-selection algorithm proposed by Chudik et al. (2018) to guide the choice of regressors.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:01

Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel C. Schneider
Author-Workplace-Name: MPI for Demographic Research
Title: Discrete-time multistate regression models in Stata
File-URL: http://repec.org/dsug2023/germany23_Schneider.pdf
Abstract: Multistate life tables (MSLTs), or multistate survival models, have become a widely used analytical framework among epidemiologists, social scientists, and demographers.
 MSLTs can be cast in continuous time or discrete time. While the choice between the two approaches depends on the concrete research question and available data, discrete-time models have several appealing features: they are easy to apply; the computational cost is typically low; and today's empirical studies are frequently based on regularly spaced longitudinal data, which naturally suggests modeling in discrete time.
 Despite these appealing features, Stata community-contributed packages have so far been developed only for continuous-time models (Crowther and Lambert 2017; Metzger and Jones 2018) or for traditional demographic life-table calculations that do not allow for covariate adjustment (Muniz 2020). This presentation introduces the recently published Stata package dtms, which seeks to fill the gap in software availability for discrete-time multistate model estimation. The dtms package provides a well-documented and easy-to-apply set of commands that cover a large set of discrete-time MSLT techniques that currently exist in the literature. It also features inference based on newly derived asymptotic covariance matrices as well as inference on group contrasts.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:02

Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel Krähmer
Author-Workplace-Name: Ludwig-Maximilians-University, Munich
Title: mfcurve: Visualizing results from multifactorial designs
File-URL: http://repec.org/dsug2023/germany23_Krahmer.pdf
Abstract: Multifactorial designs are used to study the (joint) impact of two or more factors on an outcome.
 They typically occur in conjoint, choice, and factorial survey experiments but have recently gained increasing popularity in field experiments, too. Technically, they allow researchers to investigate moderation as an instance of treatment heterogeneity by crossing multiple treatments.
 Naturally, multifactorial designs quickly spawn a spiraling number of distinct treatment combinations: even a moderately complex design of two factors with three levels each yields 3^2 = 9 unique combinations. For more elaborate setups, full factorials can easily produce dozens of distinct combinations, rendering the visualization of results difficult.
 This presentation introduces the new Stata command mfcurve as a potential remedy. Mimicking the appearance of a specification curve, mfcurve produces a two-part chart: the graph’s upper panel displays average effects for all distinct treatment combinations; its lower panel indicates the presence or absence of any level given the respective treatment condition. Unlike existing visualization techniques, this enables researchers to plot and inspect results from multifactorial designs much more comprehensively. Highlighting potential applications, the presentation will demonstrate mfcurve’s most important features and options, which currently include replacing point estimates by box plots and testing results for statistical significance.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:03

Template-Type: ReDIF-Paper 1.0
Author-Name: Michael Bates
Author-Workplace-Name: University of California-Riverside
Author-Person: pba1462
Author-Name: Seolah Kim
Author-Workplace-Name: Albion College
File-URL: http://repec.org/dsug2023/germany23_Bates.pdf
Title: Estimating the price elasticity of gasoline demand in correlated random coefficient models with endogeneity
Abstract: We propose a per-cluster instrumental-variables approach (PCIV) for estimating correlated random coefficient models in the presence of contemporaneous endogeneity and two-way fixed effects.
 We use variation across clusters to estimate coefficients with homogeneous slopes (such as time effects) and within-cluster variation to estimate the cluster-specific heterogeneity directly. We then aggregate them to population averages. We demonstrate consistency, showing robustness over standard estimators, and provide analytic standard errors for robust inference. Basic implementation is straightforward using standard software such as Stata.
 In Monte Carlo simulation, PCIV performs relatively well against pooled 2SLS and fixed-effects IV (FEIV) with a finite number of clusters or finite observations per cluster. We apply PCIV in estimating the price elasticity of gasoline demand using state fuel taxes as instrumental variables. PCIV estimation allows for greater transparency of the underlying data. In our setting, we provide evidence of correlation between heterogeneity in the first and second stages, violating a key assumption underpinning consistency of standard estimators. We see significant divergence in the implicit weighting when applying FEIV from the natural weights applied in PCIV. Overlooking effect heterogeneity with standard estimators is consequential. Our estimated distribution of elasticities reveals significant heterogeneity and meaningful differences in estimated averages.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:04

Template-Type: ReDIF-Paper 1.0
Author-Name: Annalivia Polselli
Author-Workplace-Name: Essex University
Title: Influence analysis with panel data using Stata
File-URL: http://repec.org/dsug2023/germany23_Polselli.pdf
Abstract: The presence of anomalous cases in a dataset (for example, vertical outliers, good and bad leverage points) can severely affect least-squares estimates (coefficients or standard errors) that are sensitive to extreme cases by construction.
 Cook’s (1979) distance is usually used to detect such anomalies in cross-sectional data. This metric may fail to flag multiple atypical cases (Atkinson 1985; Chatterjee and Hadi 1988; Rousseeuw and Van Zomeren 1990), while a local approach overcomes this limitation (Lawrance 1995).
 I formalize statistical measures to quantify the degree of leverage and outlyingness of units in a panel-data framework. I hence develop a unitwise method to visually detect the type of anomaly, quantify its joint and conditional influence, and quantify the direction of the enhancing and masking effects. I conduct the proposed influence analysis using two community-contributed commands.
 First, xtinfluence calculates the joint and conditional influence of unit i on unit j and the relative enhancing and masking effects. A two-way scatter plot or the SSC heatplot can be used to visualize the influence exerted by each unit in the sample. Second, xtlvr2plot (a panel-data version of lvr2plot) produces unitwise plots displaying the average individual influence and the average normalized squared residual of unit i.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:05

Template-Type: ReDIF-Paper 1.0
Author-Name: Maik Hamjediers
Author-Workplace-Name: Humboldt Universität zu Berlin
Author-Name: Maximilian Sprengholz
Author-Workplace-Name: Humboldt Universität zu Berlin, Institut für Sozialwissenschaften
Author-Person: psp186
Title: Measuring associations and evaluating forecasts of categorical and discrete variables
File-URL: http://repec.org/dsug2023/germany23_Sprengholz_Hamjediers.pdf
Abstract: This presentation introduces a new Stata command, classify, that constructs a classification table and computes various measures of association between two categorical variables, as well as diagnostic scores of the accuracy of probabilistic and deterministic forecasts of a categorical (binary and multiclass ordinal or nominal) variable. We compiled a comprehensive list of about 200 coefficients, along with the synonymy and bibliography associated with them. In addition to the general measures, the command also computes the class-specific measures for each class as well as their macro and weighted averages.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:06

Template-Type: ReDIF-Paper 1.0
Author-Name: Jeff Pitblado
Author-Workplace-Name: StataCorp
Author-Email: jpitblado@stata.com
Title: Linking frames in Stata
Abstract: This presentation gives an overview of data frames in Stata.  I
	demonstrate the basics of working with multiple datasets in
	Stata.  I cover most of the -frames- suite of commands, touching
	on frame creation and management, linking frames, and copying
	variables from linked frames.  I also cover new frames features
	in Stata 18, saving a set of frames to a single file and alias
	variables.
File-URL: http://repec.org/dsug2023/Pitblado_DEStataConf2023.pdf
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:07

Template-Type: ReDIF-Paper 1.0
Author-Name: Joerg Luedicke
Author-Workplace-Name: StataCorp
Title: Causal Mediation Analysis using Stata
Abstract:    Causal inference is an essential goal in many research areas and aims
   at identifying and quantifying causal effects. By decomposing causal
   effects into direct and indirect effects, causal mediation provides
   further insight into underlying mechanisms through which causal
   effects operate. This talk presents the basic theoretical framework
   for causal mediation analysis and discusses a variety of examples
   using Stata's -mediate- command. Examples will include linear and
   generalized linear models using a variety of outcome and mediator
   variables as well as different types of treatments.
File-URL: http://repec.org/dsug2023/Luedicke_DEStataConf2023.pdf
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:08

Template-Type: ReDIF-Paper 1.0
Author-Name: Harald Tauchmann
Author-Workplace-Name: FAU Erlangen-Nürnberg
Author-Person: pta144
Title: lgrgtest: Lagrange multiplier test after constrained maximum-likelihood estimation using Stata
File-URL: http://repec.org/dsug2023/germany23_Tauchmann.pdf
Abstract: Besides the Wald and the likelihood-ratio test, the Lagrange multiplier test (Rao 1948; Aitchison and Silvey 1958; Silvey 1959), also known as the score test, is the third canonical approach to testing hypotheses after maximum likelihood estimation.
 While the Stata commands test and lrtest implement the former two, official Stata does not have a general command for implementing the latter. This presentation introduces the new community-contributed Stata postestimation command lgrgtest, which makes it straightforward to use the Lagrange multiplier test after constrained maximum-likelihood estimation.
 lgrgtest is intended to be compatible with all Stata estimation commands that use maximum likelihood and allow for the options constraints(), iterate(), and from() and obey Stata's standards for the syntax of estimation commands. lgrgtest can also be used after cnsreg. lgrgtest draws on Stata’s constraint command and the accompanying option constraints(), which only allows for imposing linear restrictions on a model. This results in the limitation of lgrgtest being confined to testing linear constraints only. A (partial) replication of Egger et al. (2011) illustrates the use of lgrgtest in applied empirical work.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:09

Template-Type: ReDIF-Paper 1.0
Author-Name: Lukas Fervers
Author-Workplace-Name: University of Cologne and Leibniz-Centre for Life-Long Learning
Title: Power boost or source of bias? Monte Carlo evidence on ML covariate adjustment in randomized trials in education
File-URL: http://repec.org/dsug2023/germany23_Fervers.pdf
Abstract: Statistical theory makes ambiguous predictions about covariate adjustment in randomized trials.
 While proponents highlight possible efficiency gains, opponents point to possible finite-sample bias, a loss of precision in the case of many and weak covariates, and the increasing danger of false-positive results due to repeated model specification. This theoretical reasoning suggests that machine learning (variable selection) methods may be promising tools to keep the advantages of covariate adjustment, while simultaneously protecting against its downsides.
 In this presentation, I rely on recent developments of machine learning methods for causal effects and their implementation in Stata to assess the performance of ML methods in randomized trials. I rely on real-world data and simulate treatment effects on a wide range of different data structures, including different outcomes and sample sizes. (Preliminary) results suggest that ML-adjusted estimates are unbiased and show considerable efficiency gains compared with unadjusted analysis.
 The results are fairly similar between different data structures used and robust to the choice of tuning parameters of the ML estimators. These results tend to support the more optimistic view on covariate adjustment and highlight the potential of ML methods in this field.
Creation-Date: 20230615
Handle: RePEc:boc:dsug23:10