-------------------------------------------------------------------------------
help for shufflevar
-------------------------------------------------------------------------------

Randomly shuffle variables

    shufflevar varlist [, Joint DROPold cluster(varname)]

Description

shufflevar takes varlist and, either jointly or for each variable separately, shuffles varlist relative to the rest of the dataset. Any remaining association between varlist and the rest of the dataset is therefore due to chance alone. As with the bootstrap or the Quadratic Assignment Procedure (QAP), one can build a distribution of results from this randomness and use it as a baseline against which to compare empirical results, especially for overall model-fit or clustering measures.
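
As a minimal sketch of what shuffling does (assuming the auto dataset shipped with Stata and the default varname_shuffled naming), the shuffled copy should keep the same mean and standard deviation as the original, because the values are only reordered, while its correlation with price should vary from run to run around zero:

    . sysuse auto, clear
    . shufflevar weight
    . summarize weight weight_shuffled
    . correlate price weight weight_shuffled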

Remarks

The program is intended for situations where it is hard to model error formally, either because the parameter is exotic or because the application violates the parameter's assumptions. For instance, the algorithm has been used by Fernandez et al. and by Zuckerman to interpret network data. The author wrote this implementation for use in interpreting st frailty models with widely varying cluster sizes, and others have suggested using the metric for adjacency matrices in spatial analysis.

Much like bsample, the shufflevar command is only really useful when worked into a forvalues loop or a program that records the results of each iteration using postfile. See the example code below for how to construct the loop.

To avoid confusion with the actual data, the shuffled variables are named varname_shuffled.

This command implements an algorithm used in two papers to analyze network data:

Fernandez, Roberto M., Emilio J. Castilla, and Paul Moore. 2000. "Social Capital at Work: Networks and Employment at a Phone Center." American Journal of Sociology 105:1288-1356.

Zuckerman, Ezra W. 2005. "Typecasting and Generalism in Firm and Market: Career-Based Career Concentration in the Feature Film Industry, 1935-1995." Research in the Sociology of Organizations 23:173-216.

Options

joint specifies that the variables in varlist will keep their actual relations to one another even as they are shuffled relative to the rest of the variables. If joint is omitted, each variable in varlist will be shuffled separately.

dropold specifies that the original (unshuffled) versions of varlist will be dropped.

cluster(varname) specifies that shuffling will occur by varname, that is, separately within each group defined by varname.
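
The difference between separate and joint shuffling can be checked with a short sketch (assuming the auto dataset and the default varname_shuffled naming; the explicit drop between calls is a precaution, not a documented requirement). After the first call, the correlation between weight_shuffled and length_shuffled should typically be near zero. After the joint call, it should match the correlation between weight and length:

    . sysuse auto, clear
    . shufflevar weight length
    . correlate weight length weight_shuffled length_shuffled
    . drop weight_shuffled length_shuffled
    . shufflevar weight length, joint
    . correlate weight length weight_shuffled length_shuffled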

Examples

    . sysuse auto, clear
    . regress price weight
    . local obs_r2=`e(r2)'
    . tempname memhold
    . tempfile results
    . postfile `memhold' r2 using "`results'"
    . forvalues i=1/100 {
    .     shufflevar weight, cluster(foreign)
    .     quietly regress price weight_shuffled
    .     post `memhold' (`e(r2)')
    . }
    . postclose `memhold'
    . use "`results'", clear
    . sum r2
    . disp "The observed R^2 of " `obs_r2' " is " (`obs_r2'-`r(mean)')/`r(sd)' " sigmas out on the" _newline "distribution of shuffled R^2s."
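
As a possible follow-up (a sketch, not part of the original example), the same results file can give an empirical comparison that does not assume the shuffled R^2s are normally distributed. It assumes the session state left by the example above, with the results file in memory and the local macro obs_r2 still defined:

    . quietly count if r2 >= `obs_r2'
    . display "Share of shuffled R^2s at or above the observed value: " r(N)/_N

With only 100 iterations this share is coarse; raising the upper limit of the forvalues loop gives a finer estimate.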

Author

Gabriel Rossman, UCLA
rossman@soc.ucla.edu

Also see

On-line: help for bsample, help for forvalues, help for postfile, help for program, help for permute