Center (or standardize) variables
center varlist [weight] [if exp] [in range] [, casewise { prefix(prefix) | generate(newvar) } theta(#|varname) standardize replace double nolabel addtolabel(string) meansave[(prefix|newvar)] sdsave[(prefix|newvar)] ]
by ... : may be used with center; see help by.
aweights and fweights are allowed; see help weights.
Description
center centers (or standardizes) the variables in varlist. For each varname in varlist, a new variable c_varname is created containing the centered (or standardized) values of that variable.
Use the by prefix to center/standardize groupwise.
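To make the description concrete, the following sketch reproduces centering, standardizing, and groupwise centering with built-in commands only. It is not the code center itself runs. The variable names mpg_c, mpg_z, mpg_gmean, and mpg_gc are hypothetical, and the standard deviation is assumed to be the one reported by summarize.

```stata
* Illustration with built-in commands; all new variable names are hypothetical
sysuse auto, clear

* Centered values: subtract the sample mean
* (r(mean) and r(sd) stay available until the next r-class command)
quietly summarize mpg
generate double mpg_c = mpg - r(mean)

* Standardized values: additionally divide by the sample standard deviation
generate double mpg_z = (mpg - r(mean)) / r(sd)

* Groupwise centering, analogous to using the by prefix
bysort foreign: egen double mpg_gmean = mean(mpg)
generate double mpg_gc = mpg - mpg_gmean

* The centered variables should have mean zero, the standardized one sd one
summarize mpg_c mpg_z mpg_gc
```

Requesting storage type double keeps rounding error small, which is presumably also the purpose of the double option.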
Options
addtolabel(string) specifies text to be appended to the labels of the new variables. The defaults are " (centered)" for centered variables and " (standardized)" for standardized variables.
casewise specifies that cases with missing values be excluded listwise, i.e., that the centering/standardization be based on the sample that is not missing for any of the variables in varlist. The default is to use all the nonmissing values for each variable.
double enforces storage type double.
generate(newvar) specifies the name of the new variable to be created. Note that center may only be applied to one variable at a time if generate() is specified.
meansave[(prefix|newvar)] saves variables containing the means. The new variables will be named prefixvarname. The default prefix is m_. Note that the saved means are not affected by theta(). If generate() is specified, meansave(newvar) saves the mean in newvar.
nolabel suppresses the assignment of labels to the new variables. The default is to use the label (or name) of the original variable with some text added to the end of it (see addtolabel).
prefix(prefix) specifies a prefix for the names of the centered/standardized variables. The default prefix is c_, so the new variables are named c_varname. prefix() is not allowed if generate() is specified.
replace permits center to overwrite existing variables.
sdsave[(prefix|newvar)] saves variables containing the standard deviations (allowed only if the standardize option is specified). The new variables will be named prefixvarname. The default prefix is sd_. If generate() is specified, sdsave(newvar) saves the standard deviation in newvar.
standardize creates a variable containing the standardized values (zero sample mean and unit sample variance). The default is to create a variable containing the centered values (zero sample mean).
theta(#|varname) may be used for quasi-demeaning. Before subtraction, the means are multiplied by # or by the values of varname, respectively, so that theta times the mean is subtracted from each value. This is sometimes used in the context of panel-data models (see Wooldridge, 2002:287, and the Methods and Formulas section in [XT] xtreg).
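As an illustration of the quasi-demeaning described for theta(), the sketch below computes x - theta*mean(x) within groups using built-in commands. It is not center's own code. The value 0.5, the use of foreign as a stand-in for a panel identifier, and the variable names mpg_mean and mpg_qd are all assumptions made for the example.

```stata
* Quasi-demeaning by hand; theta = 0.5 and all new names are hypothetical
sysuse auto, clear
local theta = 0.5

* Group means (foreign stands in for a panel identifier)
bysort foreign: egen double mpg_mean = mean(mpg)

* Scale the mean by theta before subtracting it
generate double mpg_qd = mpg - `theta' * mpg_mean

* Within each group, the result keeps (1 - theta) times the group mean
bysort foreign: summarize mpg mpg_qd
```

With theta equal to 1 this reduces to ordinary groupwise centering, and with theta equal to 0 the variable is left unchanged.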
Examples
. sysuse auto
. center mpg price weight
. center mpg price weight, prefix(z_) standardize
. bysort rep78: center mpg price weight, replace
. center mpg, generate(mpg0)
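A few further calls, sketched from the option descriptions above, show meansave, sdsave, and casewise in use. They assume that center is installed (for example from the SSC archive via ssc install center). The prefixes z_ and cw_ are arbitrary choices, and the names of the saved variables follow the documented defaults m_ and sd_.

```stata
* Assumes the user-written command center is installed
sysuse auto, clear

* Standardize and keep the means and standard deviations that were used;
* per the option descriptions this creates z_mpg, z_price,
* m_mpg, m_price, sd_mpg, and sd_price
center mpg price, standardize prefix(z_) meansave sdsave
summarize z_mpg z_price
list mpg z_mpg m_mpg sd_mpg in 1/5

* rep78 has missing values in auto.dta; with casewise, the means are based
* on the observations that are nonmissing for both mpg and rep78
center mpg rep78, casewise prefix(cw_)
summarize cw_mpg cw_rep78
```

Inspecting the results with summarize is a quick way to confirm that the standardized variables have mean zero and standard deviation one, up to rounding.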
References
Wooldridge, J.M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge (Mass.): The MIT Press.
Author
Ben Jann, ETH Zurich, jann@soz.gess.ethz.ch
I am grateful to Mark Schaffer, who provided the motivation for the theta option.
Also see
Manual: [R] summarize
On-line: help for summarize