Template-Type: ReDIF-Paper 1.0
Title: Recent developments in the fitting and assessment of flexible parametric survival models
File-URL: http://repec.org/dsug2024/Germany24_Lambert.pdf
Author-Name: Paul Lambert
Author-Workplace-Name: University of Leicester
Abstract: Flexible parametric survival models are an alternative to the Cox proportional hazards model and to more standard parametric models for the modeling of survival (time-to-event) data. They are flexible in that spline functions are used to model the baseline and potentially complex time-dependent effects. I will give a brief overview of the models and their advantages over the Cox model. However, I will concentrate on some recent developments. This will include the motivation for developing a new command to fit the models (stpm3), which makes it much simpler to fit more complex models with nonlinear functions, nonproportional hazards, and interactions, and which simplifies and extends postestimation predictions, particularly marginal (standardized) predictions. I will also describe some new postestimation tools that help in the evaluation of model fit and validation in prognostic models.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:01

Template-Type: ReDIF-Paper 1.0
Author-Name: Elena Yurkevich
Author-Workplace-Name: FAU Erlangen-Nürnberg
Author-Name: Harald Tauchmann
Author-Workplace-Name: FAU Erlangen-Nürnberg
Author-Person: pta144
Title: cfbinout and xtdhazard: Control-function estimation of binary-outcome models and the discrete-time hazard model
File-URL: http://repec.org/dsug2024/Germany24_Yurkevich.pdf
Abstract: We introduce the new community-contributed Stata commands cfbinout and xtdhazard. The former generalizes ivprobit, twostep by allowing discrete endogenous regressors and link functions other than the normal link, specifically logit and cloglog. In terms of the underlying econometric theory, cfbinout is guided by Wooldridge (2015). In terms of the implementation in Stata and Mata, cfbinout follows Terza (2017). xtdhazard is essentially a wrapper for either cfbinout or ivregress 2sls. When calling ivregress 2sls, xtdhazard implements the linear first-differences (or higher-order differences) instrumental-variables estimator suggested by Farbmacher and Tauchmann (2023) for dealing with time-invariant unobserved heterogeneity in the discrete-time hazard model. When calling cfbinout, xtdhazard implements, depending on the specified link function, several nonlinear counterparts of this estimator that are briefly discussed in the online supplement to Farbmacher and Tauchmann (2023). Using xtdhazard, rather than directly using ivregress 2sls, ivprobit, twostep, or cfbinout, simplifies the implementation of these estimators, because generating the numerous instruments required can be cumbersome, especially when using factor-variables syntax. In addition, xtdhazard performs several checks that may prevent ivregress 2sls and ivprobit, twostep from failing, and it reports issues such as perfect first-stage predictions. An (extended) replication of Cantoni (2012) illustrates the use of cfbinout and xtdhazard in applied empirical work.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:02
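[Editor's sketch] As context for the control-function approach of Wooldridge (2015) that the abstract cites, a minimal two-step sketch using only built-in Stata commands; it does not show cfbinout's actual syntax, and the variables y (binary outcome), d (endogenous regressor), z (instrument), and x (exogenous control) are hypothetical:

    * Step 1: first-stage regression of the endogenous regressor on the instrument and controls
    regress d z x
    predict double vhat, residuals

    * Step 2: include the first-stage residual as a control function in the outcome model
    logit y d x vhat

    * Note: second-step standard errors should be corrected for the first-step estimation,
    * for example by bootstrapping both steps together.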
Template-Type: ReDIF-Paper 1.0
Author-Name: Peter Krause
Author-Workplace-Name: DIW Berlin, SOEP
Title: Multidimensional well-being, deprivation, and inequality
Abstract: This presentation offers a brief summary of a set of Stata programs for extended multidimensional applications on well-being, deprivation, and inequality. The first section illustrates the underlying motivation with some empirical examples of decomposed multidimensional results. The second section, on multidimensional well-being and deprivation measurement, illustrates the conceptual background, based on the Alkire/Foster MPI framework (and the CPI of N. Rippin), which is also applied to well-being measurement and extended by a parameter-driven fixed-fuzzy approach, with several illustrations and further details on the options offered in the Stata deprivation and well-being programs. The third section, on multidimensional inequalities, refers to a multidimensional Gini-based row-first measurement framework with a special emphasis on multiple within- and between-group inequalities, including conceptual extensions to horizontal between-group applications and further details on the options offered in the Stata inequality program. Section four summarizes and invites advice and discussion.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:03

Template-Type: ReDIF-Paper 1.0
Author-Name: Wolfgang Langer
Author-Workplace-Name: Martin-Luther-University Halle-Wittenberg
File-URL: http://repec.org/dsug2024/Germeny24_Langer.pdf
Title: How to assess the fit of choice models with Stata?
Abstract: McFadden developed the conditional multinomial logit model in 1974, using it for rational choice modeling. Stata introduced it in version 3 in 1993. In 2007, Stata extended this model to asclogit and asmprobit, which can estimate the effects of alternative-specific and case-specific exogenous variables on the choice probability of the discrete alternatives. In 2021, Stata added the class of choice models, extending it to random-effects (mixed) and panel models. As it stands, Stata provides only a postestimation Wald chi-squared test to assess the overall model. However, although McFadden developed a pseudo-R-squared to assess the fit of the conditional logit model in 1974, Stata still does not provide it, even in version 18. Thus, I developed fit_cmlogit to calculate the McFadden pseudo-R-squared using a null model with alternative-specific constants to correct for the uneven distribution of alternatives. Furthermore, it calculates the corresponding likelihood-ratio chi-squared test, which is more reliable and conservative than the Wald test. The program uses the formulas of Hensher and Johnson (1981) and Ben-Akiva and Lerman (1985) for the McFadden pseudo-R-squared to correct for the number of exogenous variables and the number of alternatives faced. Train (2003) discusses these characteristics of the McFadden pseudo-R-squared in detail. Additionally, fit_cmlogit calculates the log-likelihood-based pseudo-R-squares developed by Maddala (1983, 1988), Cragg and Uhler (1970), and Aldrich and Nelson (1984). The last of these uses the correction formula proposed by Veall and Zimmermann (1994). An empirical example predicting voting behavior in the German federal election study of 1990 demonstrates the usefulness of the program for assessing the fit of logit choice models with alternative-specific and case-specific exogenous variables.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:04
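[Editor's note] For reference, the McFadden pseudo-R-squared and its adjusted version (Ben-Akiva and Lerman 1985) referred to in the abstract are commonly written as follows, where \ln L(\hat\beta) is the log likelihood of the fitted model, \ln L(0) that of the null model (here, one with alternative-specific constants only), and K the number of estimated parameters; this is the standard textbook formulation, not copied from the program's documentation:

    \rho^2 = 1 - \frac{\ln L(\hat\beta)}{\ln L(0)},
    \qquad
    \bar{\rho}^2 = 1 - \frac{\ln L(\hat\beta) - K}{\ln L(0)}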
Template-Type: ReDIF-Paper 1.0
Author-Name: Kristin MacDonald
Author-Workplace-Name: StataCorp
Title: Customizable tables
File-URL: http://repec.org/dsug2024/Germany24_MacDonald.pdf
Abstract: Presenting results effectively is a crucial step in statistical analyses, and creating tables is an important part of this step. Whether you need to create a cross-tabulation, a Table 1 reporting summary statistics, a table of regression results, or a highly customized table of results returned by multiple Stata commands, the tables features introduced in Stata 17 and Stata 18 provide the ease and flexibility for you to create, customize, and export your tables. In this presentation, I will demonstrate how to use the table, dtable, and etable commands to easily create a variety of tables. I will also show how to use the collect suite to build and customize tables and to create table styles with your favorite customizations that you can apply to any tables you create in the future. Finally, I will demonstrate how to export individual tables to Word, Excel, LaTeX, PDF, Markdown, and HTML and how to incorporate your tables into complete reports containing formatted text, graphs, and other Stata results.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:05
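[Editor's sketch] A minimal sketch of the kind of table workflow described above, assuming Stata 18 and the shipped auto dataset; the file names and the selection of options are illustrative only, not the presentation's actual examples:

    sysuse auto, clear

    * Table 1: summary statistics by group, exported to a Word document
    dtable price mpg weight, by(foreign) export(table1.docx, replace)

    * Table of regression results
    regress price mpg weight
    etable, export(results.docx, replace)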
Template-Type: ReDIF-Paper 1.0
Author-Name: Ben Jann
Author-Workplace-Name: University of Bern
Author-Person: pja61
Title: geoplot: A new command to draw maps
File-URL: http://repec.org/dsug2024/Germany24_Jann.pdf
Abstract: geoplot is a new command for drawing maps from shapefiles and other datasets. Multiple layers of elements such as regions, borders, lakes, roads, labels, and symbols can be freely combined, and the look of elements (for example, color) can be varied depending on the values of variables. Compared with previous solutions in Stata, geoplot provides more user convenience, more functionality, and more flexibility. In this talk, I will introduce the basic components of the command and illustrate its use with examples.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:06

Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel Krähmer
Author-Workplace-Name: Ludwig-Maximilians-Universität München
Title: repreport: Facilitating reproducible research in Stata
Abstract: In theory, Stata provides a stable computational environment and includes commands (for example, version) that are specifically designed to ensure reproducibility. In practice, however, users often lack the time or the knowledge to exploit this potential. Insights from an ongoing research project on reproducibility in the social sciences show that computational reproducibility is regularly impeded by researchers being unaware of which files (for example, datasets and do-files), software components (for example, ados), infrastructure (for example, directories), and information (for example, ReadMe files) are needed to enable reproduction. This presentation introduces the new Stata command repreport as a potential remedy. The command works like a log, with one key difference: instead of logging the entire analysis, repreport extracts specific pieces of information pertinent to reproduction (for example, the names and locations of datasets, ados, etc.) and compiles them into a concise reproduction report. Furthermore, the command includes an option for generating a full-fledged reproduction package containing all components needed for push-button reproducibility. While repreport adds little value for researchers whose workflow is already perfectly reproducible, it constitutes a powerful tool for those who strive to make their research in Stata more reproducible at (almost) no additional cost.
File-URL: http://repec.org/dsug2024/Germany24_Krahmer.pdf
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:07

Template-Type: ReDIF-Paper 1.0
Author-Name: Maarten L. Buis
Author-Workplace-Name: University of Konstanz
Author-Person: pbu92
Title: mkproject and boilerplate: Automate the beginning
Abstract: There is usually a set of commands that are included in every do-file a person writes, like clear all or log using. What those commands are can differ from person to person, but most people have such a standard set. Similarly, a project usually has a standard set of directories and files. Starting a new do-file or a new project thus involves a number of steps that could easily be automated. Automating has the advantage of reducing the amount of work you need to do. However, the more important advantage of automating the start of a do-file or project is that it makes it easier to maintain your own workflow: it is so easy to start “quick and dirty” and promise yourself that you will fix it “later”. If the start is automated, then you don’t need to fix it. The mkproject command automates the beginning of a project. It comes with a set of templates I find useful. A template contains all the actions (like creating subdirectories, creating files, and running other Stata commands) that mkproject will take when it creates a new project. Since everybody’s workflow is different, mkproject allows users to create their own templates. Similarly, the boilerplate command creates a new do-file with boilerplate code in it. It also comes with a set of templates, but users can create their own. This talk will illustrate the use of both mkproject and boilerplate and how to create your own templates.
Creation-Date: 20240612
File-URL: http://repec.org/dsug2024/Germany24_Buis.zip
Handle: RePEc:boc:dsug24:08

Template-Type: ReDIF-Paper 1.0
Author-Name: Daniel C. Schneider
Author-Workplace-Name: Max Planck Institute for Demographic Research, Rostock
Title: Data structures in Stata
File-URL: http://repec.org/dsug2024/germany23_Tauchmann.pdf
Abstract: This presentation starts out by enumerating and describing the main data structures in Stata (for example, datasets, frames, and matrices) and Mata (for example, string and numeric matrices and objects such as associative arrays). It analyzes ways in which data can be represented and coerced from one data container into another. After assessing the strengths and limitations of existing data containers, it muses on potential additions of new data structures and on enriching the functionality of existing data structures and their interplay. Moreover, data structures from other languages, such as Python lists, are described and examined for their potential introduction into Stata and Mata. The goal of the presentation is to stimulate a discussion among Stata users and developers about ways in which the capabilities of Stata’s data structures could be enhanced in order to ease data management and analysis and to open up new possibilities.
Creation-Date: 20240612
Handle: RePEc:boc:dsug24:09
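[Editor's sketch] As a small illustration of one Mata data structure mentioned in the last abstract, a minimal associative-array example using Mata's asarray() functions; the keys and values are invented for illustration:

    mata:
        A = asarray_create()            // associative array with string scalar keys
        asarray(A, "first", 1)          // store a value under a key
        asarray(A, "second", 2)
        asarray(A, "second")            // retrieve the value stored under "second"
        asarray_contains(A, "third")    // check whether a key exists (returns 0 here)
    end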