help mata mm_colvar()


mm_colvar() -- Variance, by column


real rowvector mm_colvar(X [, w])

real matrix mm_meancolvar(X [, w])

real matrix mm_variance0(X [, w])

real matrix mm_meanvariance0(X [, w])

real matrix mm_mse(X, w, mu)

real rowvector mm_colmse(X, w, mu)

real matrix mm_sse(X, w, mu)

real rowvector mm_colsse(X, w, mu)


X: real matrix X (rows are observations, columns variables)

w: real colvector w

mu: real rowvector mu


mm_colvar(X, w) returns the variance of each column of X. Essentially,

mm_colvar(X, w) = diagonal(variance(X, w))'

variance() is documented in help for mean(). However, mm_colvar() does not compute the covariances and is therefore much faster than diagonal(variance()) when X contains more than one column. Note also that mm_colvar() omits missing values in X column by column, whereas variance() omits missing values casewise.
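The equivalence stated above can be checked with a short sketch. It assumes that the moremata package is installed and that X contains no missing values; the variable names v1 and v2 are arbitrary.

```mata
mata:
// simulated data: 1000 observations on 3 variables
X  = invnormal(uniform(1000, 3))

// column variances computed directly (no covariances are formed)
v1 = mm_colvar(X, 1)

// the same quantities taken from the full 3 x 3 variance matrix
v2 = diagonal(variance(X, 1))'

// maximum relative difference; should be close to zero
mreldif(v1, v2)
end
```

With missing values in X the two results can differ, because the two functions drop observations differently.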

mm_meancolvar(X, w) returns a matrix containing the mean and the variance of each column of X. (Means in row one, variances in row two.)

mm_variance0(X, w) returns the population variance matrix of X. mm_variance0() differs from official Stata's variance() (see help for mean()) in that it divides the deviation cross products by N instead of N-1, where N is the number of observations. Essentially,

mm_variance0(X, w) = variance(X, w) * (N-1)/N

However, mm_variance0() also produces correct results if N==1.
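A minimal sketch of this relation, assuming moremata is installed and X has no missing values. The names V0 and V1 are arbitrary, and the one-row matrix at the end is only an illustration of the N==1 case.

```mata
mata:
X  = invnormal(uniform(500, 2))
N  = rows(X)

// population variance: cross products divided by N
V0 = mm_variance0(X, 1)

// rescaled sample variance: cross products divided by N-1, times (N-1)/N
V1 = variance(X, 1) * (N-1)/N

// maximum relative difference; should be close to zero
mreldif(V0, V1)

// a single observation: the rescaling above is not usable because N-1 is 0,
// whereas the population variance of one observation is simply zero
mm_variance0((1, 2), 1)
end
```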

mm_meanvariance0(X, w) returns mean(X,w)\mm_variance0(X,w).

mm_mse(X, w, mu) computes the matrix of mean squared errors, where the errors are defined as X:-mu.

mm_colmse() computes mean squared errors by column.

mm_sse() and mm_colsse() compute the sum of squared errors.

w specifies the weights. Specify w as 1 to obtain unweighted results.


Examples for mm_colvar():

    : x = invnormal(uniform(10000,3))

    : mm_colvar(x, 1)
                     1             2             3
        +-------------------------------------------+
      1 |   1.00018384   1.002621747   1.003480729  |
        +-------------------------------------------+

    : mm_meancolvar(x, 1)
                      1              2              3
        +----------------------------------------------+
      1 |  -.0024994158   -.0091972878   -.0035865732  |
      2 |    1.00018384    1.002621747    1.003480729  |
        +----------------------------------------------+

The formula implemented in mm_mse() and mm_colmse() computes the mean squared error as the sum of squared errors divided by N, where N is the number of observations (or sum of weights if w!=1).
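For the unweighted case, the formula can be reproduced by hand as in the following sketch. It assumes moremata is installed and X has no missing values; mu = (0, 0) is a hypothetical target chosen for illustration, and E and mse_by_hand are arbitrary names.

```mata
mata:
X  = invnormal(uniform(1000, 2))
mu = (0, 0)                  // hypothetical target values, one per column
N  = rows(X)

// errors as defined for mm_mse(): X:-mu
E  = X :- mu

// sum of squared errors (and cross products) divided by N
mse_by_hand = cross(E, E) / N

// compare with mm_mse(); should be close to zero
mreldif(mm_mse(X, 1, mu), mse_by_hand)

// column-wise version: compare with mm_colmse(); should be close to zero
mreldif(mm_colmse(X, 1, mu), colsum(E:^2) / N)
end
```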


    mm_colvar(X, w):
             X:  n x k
             w:  n x 1 or 1 x 1
        result:  1 x k

    mm_meancolvar(X, w):
             X:  n x k
             w:  n x 1 or 1 x 1
        result:  2 x k

    mm_variance0(X, w):
             X:  n x k
             w:  n x 1 or 1 x 1
        result:  k x k

    mm_meanvariance0(X, w):
             X:  n x k
             w:  n x 1 or 1 x 1
        result:  (k+1) x k

    mm_mse(X, w, mu), mm_sse(X, w, mu):
             X:  n x k
             w:  n x 1 or 1 x 1
            mu:  1 x k
        result:  k x k

    mm_colmse(X, w, mu), mm_colsse(X, w, mu):
             X:  n x k
             w:  n x 1 or 1 x 1
            mu:  1 x k
        result:  1 x k


mm_variance0(), mm_meanvariance0(), mm_mse(), and mm_sse() omit from the calculation all rows in which X or w contains a missing value (casewise deletion). If every row contains a missing value, the returned result consists entirely of missing values.

In contrast, mm_colvar(), mm_meancolvar(), mm_colmse(), and mm_colsse() omit missing values separately for each column (i.e., not casewise).
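The difference between the two kinds of missing-value handling can be seen in a small sketch, assuming moremata is installed. The 4 x 2 matrix is made up for illustration and has one missing value in the second column.

```mata
mata:
// row 2 has a missing value in column 2 only
X = (1, 2 \ 3, . \ 5, 6 \ 7, 8)

// by column: column 1 uses all four rows, column 2 the three nonmissing rows
mm_colvar(X, 1)

// casewise: row 2 is dropped for both columns, so the variance of
// column 1 is based on rows 1, 3, and 4 only
diagonal(variance(X, 1))'

// casewise as well: the whole matrix is computed from rows 1, 3, and 4
mm_variance0(X, 1)
end
```

The first element of the first two results differs because column 1 is complete but loses an observation under casewise deletion.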

Source code

mm_colvar.mata, mm_variance0.mata, mm_mse.mata


Ben Jann, ETH Zurich

Also see

Online: help for [M-5] mean(), moremata