NASUG 2001, Boston

Meeting Summary

The first North American Stata Users Group meeting took place at Longwood Galleria Conference Center, Boston, MA on 12 and 13 March 2001. The meeting was organised by Christopher F. Baum (Boston College), Nicholas J. Cox (University of Durham), and Marcello Pagano (Harvard School of Public Health), with excellent logistic and other support from William Gould and David Drukker of Stata Corp. More than 40 people attended, including Joe Newton, Texas A&M, Editor of the Stata Technical Bulletin. A single stream of presentations was capped by the now traditional "Report to Users" by William Gould and an open session for users' wishes and grumbles (what they want in Stata and what they can't do or don't like at present). Following the pattern of previous user meetings, which have been held in the UK, Spain and the Netherlands, there was also an informal (Indian) meal on Monday evening and much enjoyable and useful interchange in the breaks between presentations.

For those who have never attended one of these meetings, let us briefly describe the format. The presentations are interspersed with relaxed discussion over coffee and food, and the first day's meeting was followed by the aforementioned dinner. Presentations ranged from descriptions of statistical methods, how to perform such analyses in Stata, and comparisons of the results so obtained with those of other packages, to descriptions of new user-programmed features and advice on the use of Stata in teaching.

In all of these meetings, user presentations have come first and Stata's presentations at the end. What makes each of these meetings a valuable experience is the high quality of the work that is presented by users for users. Bill Gould, in his talk at the end, mentioned that a goal of Stata has been to (re)open the development of statistical software to the users (Bill emphasizes that, originally, statistical software was written by the users). Based on the evidence presented in the meetings, Stata has had much success in meeting that goal.

The first session on Monday morning opened with a review of fitting GEE models in Stata by Nicholas Horton (Boston University), continued with an introduction, with software, to social network analysis (QAP or dyadic data) by William Simpson (Harvard Business School), and closed with Stas Kolenikov (U. North Carolina) discussing the use of ml to fit normal mixture composition models.

After a break, we returned to hear Jeremy Freese (U. Wisconsin) present his and Scott Long's post-estimation commands for use with regression models for categorical and count data, followed by Jeroen Weesie (U. of Utrecht, visiting StataCorp) discussing his work on a new command for testing for omitted variables, that is, for checking model specification.

We broke for lunch, and thereafter heard Michael Duggan (Suffolk U.) and Alicia Dowd (U. Mass.-Boston) discuss survey analysis, the main point being to compare the (correct) answers calculated by Stata's svy commands with those from the ex-post F deflator approach to adjusting results, which is widely used with SPSS and SAS. Michael Blasnik (Blasnik Associates) then spoke on using Stata ado-files to produce reams of tables based on many statistics calculated by the svy commands. Richard Goldstein then closed the session by showing how a class of multi-level models could be estimated in Stata.

After a final break for the day, Rino Bellocco (Karolinska Institutet) spoke on the analysis of longitudinal data and compared results produced by Stata, SAS and S-Plus. Harriet Griesinger (Wellesley Child Care Res.), in "Date and time tags for filenames in WinXX", provided a solution for a problem that originally appeared on Statalist. Kit Baum (Boston College) summarized his work on analyzing multifrequency panel data with Stata, basically an i x i x t dataset (sic), and introduced us to the concepts of "long-long" data, "long-wide" data, etc. Petia Petrova (Boston College) spoke on using cross-year family individual files from the PSID, which is one of the most popular datasets used by labor economists.

Such was the first day; we broke, some went for drinks, others to attend to personal matters, and we met some hours later at the Indian restaurant.

The second day opened with a talk by Phil Ender (UCLA) on teaching with Stata, in which he demonstrated real-time teaching tools to the delight of all. That was followed by David Kantor (Johns Hopkins Univ.) on three-valued logic (the talk that probably produced the most questions and comments). Following that, Nicholas J. Cox (U. of Durham) spoke on analyzing circular data with Stata (circular referring to the fact that there are 360 degrees in a circle, and an important point being to emphasize that geography is *NOT* about providing answers to questions such as "Why is Albany the state capital of New York?"). David Drukker (Stata Corp) then spoke on panel-data analysis, with an emphasis on the Arellano-Bond estimator.

After a break, the final formal session focused on Stata. Joe Newton (Texas A&M Univ. and Editor of the Stata Technical Bulletin) spoke on the STB and its future, and then Bill Gould (President, Stata Corp) spoke on Stata and its future, in a talk that he calls the "Report to Users" and that he first popularized in London. These talks tend to be fairly honest assessments of recent successes and failures (and hence he does not make his talking notes available). The conference closed with a session on "wishes and grumbles", moderated by Kit Baum, in which users tossed out their wishlists and pet peeves, and Messrs Gould and Drukker responded. A summary of the "wishes and grumbles" session by Jeroen Weesie is available from the Stata meeting proceedings website.

Revised 06 April 2001