Template-Type: ReDIF-Paper 1.0
Title: Too much or too little? New tools for the CCE estimator
Abstract: This talk will cover new developments in the common correlated effects (CCE) literature and their implementation in Stata. First, I will discuss regularized CCE (rCCE; Juodis, 2022, Journal of Applied Econometrics). CCE is known to be sensitive to the selection of the number of cross-section averages; rCCE overcomes the problem by regularizing the cross-section averages. Second, I will discuss the test for the rank condition based on De Vos, Everaert, and Sarafidis (2024, Econometric Reviews). If the rank condition fails, CCE will be inconsistent, and therefore testing the condition is key for any empirical application. Finally, I will discuss the selection of cross-section averages using the information criteria from Karabiyik, Urbain, and Westerlund (2019, Journal of Applied Econometrics) and Margaritella and Westerlund (2023, Econometrics Journal).
Author-name: Jan Ditzen
Author-workplace-name: Freie Universität Bozen-Bolzano
Author-person: pdi434
File-URL: http://repec.org/neur2024/Northern_Europe24_Ditzen.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:01

Template-Type: ReDIF-Paper 1.0
Title: The SCCS design
Abstract: The self-controlled case series (SCCS) design, in contrast to standard epidemiological observational designs like the cohort and case–control designs, offers a more time- and cost-efficient approach because the standard designs require larger sample sizes. Further, the SCCS method automatically adjusts for known and unknown fixed confounders; the latter can be a significant challenge in standard designs. The SCCS method splits an observation period into one or more risk periods and one or more control periods. The risk periods are defined relative to an exposure event, whereas the observation period is either fixed or relative to the exposure event. Often, one adds time or age adjustments during the observation period. The basic idea is to compare incidence rates in the risk periods with those in the control periods while adjusting for time or age and for cases. The SCCS design originates from the desire to estimate the relative effect of vaccines, such as the MMR vaccine, on adverse events like meningitis. Compared with the classical designs, it is a matter of asking when instead of who. I will discuss the SCCS design and present the Stata command sccsdta, which transforms datasets of event and exposure times by case into datasets marked into risk and control periods as well as time or age periods. After the dataset transformation, the analysis is simple, using fixed-effect Poisson regression.
Author-name: Niels Henrik Bruun
Author-workplace-name: Aalborg University Hospital
File-URL: http://repec.org/neur2024/Northern_Europe24_Bruun.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:02

Template-Type: ReDIF-Paper 1.0
Title: Improving the speed and accuracy when fitting flexible parametric survival models on the log-hazard scale
Abstract: Flexible parametric survival models are an alternative to the Cox proportional hazards model and to more standard parametric models for the modeling of survival (time-to-event) data. They are flexible in that spline functions are used to model the baseline and potentially complex time-dependent effects. In this talk, I will discuss using splines on the log-hazard scale.
Models on this scale have some computational challenges because the hazard function must be integrated numerically during estimation. The numerical integration is required for all individuals and for each call to the likelihood, gradient, and Hessian functions and can therefore be slow in large datasets. In addition, the hazard function may have a singularity at t=0, which leads to precision issues. I will describe two recent updates to the stpm3 command that make these models faster to fit in large datasets and improve the accuracy of the numerical integration. First, the python option makes use of the mlad optimizer, which calls Python, leading to major speed gains in large datasets. Second, there are different options for numerical integration of the hazard function, including tanh-sinh quadrature, which is now the default when the hazard function has a singularity at t=0. This leads to more accurate estimates compared with the more standard Gauss–Legendre quadrature. These speed and accuracy improvements make these models more feasible to use in large datasets.
Author-name: Paul Lambert
Author-workplace-name: Cancer Registry of Norway–Norwegian Institute of Public Health
Author-workplace-name: Karolinska Institutet
File-URL: http://repec.org/neur2024/Northern_Europe24_Lambert.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:03

Template-Type: ReDIF-Paper 1.0
Title: Example of modeling survival with registry data to assist with clinical decision making
Abstract: The Cancer Registry of Norway contains several clinical registries with rich information on the diagnosis, treatment, and follow-up of cancer patients. Since 2013, the Clinical Registry for Gynecological Cancer has collected information on residual disease (RD) diameter following ovarian cancer surgery, which is prognostic for survival. Internationally, attaining an RD of 1 cm or less is considered “adequate” debulking. This cutoff has been widely used for making treatment decisions and is used to define high-risk patients in Norwegian treatment guidelines. However, few studies have evaluated ovarian cancer survival across continuous RD diameter. Using flexible parametric models, I compared excess mortality of stage III–IV ovarian cancer patients across continuous RD diameter using restricted cubic splines. This presentation is an example of how survival modeling with registry data can assist with clinical decision making.
Author-name: Cassie Trewin-Nybråten
Author-workplace-name: Cancer Registry of Norway–Norwegian Institute of Public Health
File-URL: http://repec.org/neur2024/Northern_Europe24_Trewin-Nybraten.pptx
File-Format: application/X-MS-Powerpoint
File-Function: presentation materials
Handle: RePEc:boc:neur24:04

Template-Type: ReDIF-Paper 1.0
Title: Limitations and comparison of the DFA, PP, and KPSS unit-root tests: Evidence for labor market variables in Mexico
Abstract: Unit-root tests have been a great contribution to time-series analysis by detecting whether a variable is stationary. However, these tests have limitations which, although known, often seem to go unnoticed when the tests are applied in time-series studies. For example, the Dickey–Fuller (DF) and Phillips–Perron (PP) tests may detect the presence of a unit root when the series does not actually have one.
This presentation reviews some of the criticisms that have been made of unit-root tests and then runs the three best-known unit-root tests (DFA, PP, and KPSS) in Stata on the main macroeconomic variables of Mexico, with the intention of analyzing, both graphically and technically, whether the series are stationary. The main conclusion is that unit-root tests are often more related to statistical than to economic issues.
Author-name: Ricardo Rodolfo Retamoza Yocupicio
Author-workplace-name: The National Autonomous University of Mexico
File-URL: http://repec.org/neur2024/Northern_Europe24_Rodolfo.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:05

Template-Type: ReDIF-Paper 1.0
Title: Using Stata with many datasets, methods, and variables
Abstract: Complex data management and extensive analysis of data can be challenging in research projects. Compared with a classical textbook example with one clean dataset and a few selected variables and models, medical research projects often involve many datasets in different formats and use a range of statistical methods and many variables and outcomes. Stata has features for keeping track of datasets, automating statistical analyses, and summarizing results. Some experiences and practical tips with commands such as import, foreach, putexcel, and dtable in combination with macros will be presented. These can be helpful for efficiently solving complex tasks, obtaining overviews of data and methods, and reporting statistical results to a multidisciplinary research group.
Author-name: Are Hugo Pripp
Author-workplace-name: Oslo Centre for Biostatistics and Epidemiology (OCBE)
File-URL: http://repec.org/neur2024/Northern_Europe24_Pripp.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:06

Template-Type: ReDIF-Paper 1.0
Title: Maps in Stata
Abstract: This interactive talk will provide an introduction to the packages and code required for producing high-quality maps in Stata. I will show how to import shapefiles, plot different layer types (points, lines, polygons), and generate different types of choropleth and bivariate maps. Some basic customization options will also be discussed.
Author-name: Asjad Naqvi
Author-workplace-name: Austrian Institute for Economic Research (WIFO)
Author-person: pna493
File-URL: http://repec.org/neur2024/Northern_Europe24_Naqvi1.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:07

Template-Type: ReDIF-Paper 1.0
Title: Causal inference with time-to-event outcomes under competing risk
Abstract: The occurrence of competing events often complicates the analysis of time-to-event outcomes. While the survival-analysis literature on methods for handling competing risks goes back a long way, there has also long been some confusion regarding the best approach and implementation when facing competing events in applied research. Recent advances in the use of estimands in causal inference have led to new developments and insights (and discussions) on how best to analyze time-to-event outcomes under competing risk. The role of classical statistical estimands is now better understood, and new causal estimands have been suggested for addressing more advanced causal questions.
In this talk, I will briefly review these developments and the estimation of the most basic estimands and discuss some extensions, such as when interest is in the effect of time-varying treatments.
Author-name: Jon Michael Gran
Author-workplace-name: Oslo Centre for Biostatistics and Epidemiology (OCBE)
File-URL: http://repec.org/neur2024/Northern_Europe24_Gran.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:08

Template-Type: ReDIF-Paper 1.0
Title: Extending standard reporting to improve communication of survival statistics
Abstract: Routine reporting of cancer patient survival is important, both to monitor the effectiveness of healthcare and to inform about prognosis following a cancer diagnosis. A range of different survival measures exist, each serving different purposes and targeting different audiences. It is important that routine publications expand on current practice and provide estimates for a wider range of survival measures. Using data from the Cancer Registry of Norway, we examine the feasibility of automated production of such statistics.
Author-name: Tor Åge Myklebust
Author-workplace-name: Cancer Registry of Norway–Norwegian Institute of Public Health
File-URL: http://repec.org/neur2024/Northern_Europe24_Myklebust.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:09

Template-Type: ReDIF-Paper 1.0
Title: Bayesian estimation of disclosure risks for synthetic time-to-event data
Abstract: Introduction: Generation of synthetic patient records can preserve the structure and statistical properties of the original data while maintaining privacy, providing access to high-quality data for research and innovation. Few synthesis methods account for the censoring mechanisms in time-to-event data, and formal privacy evaluations are often lacking. Improvements in synthetic data utility come with increased risks of privacy disclosure, necessitating a careful evaluation to obtain the proper balance. Methods: We generate synthetic time-to-event data based on colon cancer data from the Cancer Registry of Norway, using a sequence of conditional regression models and flexible parametric modeling of event times. Different levels of model complexity are used to investigate the impact on data utility and disclosure risk. The privacy risk is evaluated using Bayesian estimation of disclosure risks, which forms the basis for a differential privacy audit. Results: Including more interaction terms and increasing the degrees of freedom improve synthetic data utility but also elevate privacy risks. While certain interactions substantially improve utility, others reduce privacy without much utility gain. The most complex model displays near-optimal utility scores. Conclusions: The results demonstrate a clear tradeoff between synthetic data utility and privacy risks. Interestingly, the relationship is nonlinear, because certain modeling choices increase synthetic data utility with little privacy loss, and vice versa.
Author-name: Sigrid Leithe
Author-workplace-name: Cancer Registry of Norway–Norwegian Institute of Public Health
File-URL: http://repec.org/neur2024/Northern_Europe24_Leithe.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:10

Template-Type: ReDIF-Paper 1.0
Title: How can Stata enable federated computing for decentralized data analysis?
Abstract: Federated computing offers a transformative approach to data analysis, enabling the processing of distributed datasets without the need for centralization and thus aiming to preserve privacy and security. In this talk, I will explore how these principles can be applied within the Stata environment to address the growing challenges of data sharing and computational limits. I will highlight the current features in Stata that make federated computing possible, as well as the challenges and future directions, setting the stage for innovation in decentralized data analysis. By integrating federated computing with Stata, researchers can perform complex analyses on sensitive, geographically dispersed data while maintaining the software's robust statistical capabilities.
Author-name: Narasimha Raghavan
Author-workplace-name: Cancer Registry of Norway–Norwegian Institute of Public Health
File-URL: http://repec.org/neur2024/Northern_Europe24_Raghavan.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:11

Template-Type: ReDIF-Paper 1.0
Title: Causal mediation
Abstract: Causal inference aims to identify and quantify a causal effect. With traditional causal inference methods, we can estimate the overall effect of a treatment on an outcome. When we want to better understand a causal effect, we can use causal mediation analysis to decompose the effect into a direct effect of the treatment on the outcome and an indirect effect through another variable, the mediator. Causal mediation analysis can be performed in many situations: the outcome and mediator variables may be continuous, binary, or count, and the treatment variable may be binary, multivalued, or continuous. In this talk, I will introduce the framework for causal mediation analysis and demonstrate how to perform this analysis with the mediate command, which was introduced in Stata 18. Examples will include various combinations of outcome, mediator, and treatment types.
Author-name: Kristin MacDonald
Author-workplace-name: StataCorp LLC
File-URL: http://repec.org/neur2024/Northern_Europe24_MacDonald.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:12

Template-Type: ReDIF-Paper 1.0
Title: Multivariate random-effects meta-analysis for sparse data using smvmeta
Abstract: Multivariate meta-analysis is used to synthesize estimates of multiple quantities (“effect sizes”), such as risk factors or treatment effects, accounting for correlation and typically also heterogeneity. In the most general case, estimation can be intractable if data are sparse (for example, many risk factors but few studies) because the number of model parameters that must be estimated scales quadratically with the number of effect sizes. I will present a new meta-analysis model and Stata command, smvmeta, that make estimation tractable by modeling correlation and heterogeneity in a low-dimensional space via random projection and that provide more precise estimates than meta-regression (a reasonable alternative model that could be used when data are sparse). I will explain how to use smvmeta to analyze data from a recent meta-analysis of 23 risk factors for pain after total knee arthroplasty.
Author-name: Chris Rose
Author-workplace-name: Norwegian Institute of Public Health
File-URL: http://repec.org/neur2024/Northern_Europe24_Rose.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:13

Template-Type: ReDIF-Paper 1.0
Title: Advanced data visualizations with Stata, part VI: Visualizing more than two variables
Abstract: The presentation will showcase how Stata can be used to visualize data with more than two dimensions. It will introduce extensions to existing visualization packages and will also launch two new packages.
Author-name: Asjad Naqvi
Author-workplace-name: Austrian Institute for Economic Research (WIFO)
Author-person: pna493
File-URL: http://repec.org/neur2024/Northern_Europe24_Naqvi2.pdf
File-Format: application/pdf
File-Function: presentation materials
Handle: RePEc:boc:neur24:14