```help mata mm_collapse()
-------------------------------------------------------------------------------

Title

mm_collapse() -- Make matrix of summary statistics by subgroups

Syntax

real matrix mm_collapse(X, w, id [, f, ...])

real matrix _mm_collapse(X, w, id [, f, ...])

where

X:  real matrix containing data (rows are observations, columns
are variables)
w:  real colvector containing weights or 1
id:  real colvector containing subgroup ID variable
f:  pointer scalar containing address of the function to be used,
i.e. f = &functionname(); the default function is mean()
...:  up to 10 optional arguments to pass through to f

Description

mm_collapse() returns a matrix of summary statistics by subgroups. It is
similar to Stata's collapse.

X provides the data. Rows are observations and columns are variables.
Summary statistics are computed for each variable.

w specifies weights associated with the observations (rows) in X. Specify
w as 1 to obtain unweighted results.

id specifies the subgroup identification numbers associated with the
observations (rows) in X. Each distinct value in id defines a subgroup or
panel for which to compute the summary statistics.

The default is to compute arithmetic means using the mean() function.
Alternatively, specify f, where f is a pointer to a function, i.e.
f = &functionname() (see [M-2] ftof). For example, specify &variance() to
compute variances. f is assumed to return a real scalar and take a data
column vector as first argument and weights as second argument.

_mm_collapse() is analogous to mm_collapse() but assumes that X, w, and id
are already sorted by id. _mm_collapse() is faster and uses less memory
than mm_collapse().

The matrix returned by mm_collapse() or _mm_collapse() contains the
subgroup codes in the first column; the second and following columns, one
for each variable in X, contain the computed statistics.

Remarks

Examples:

. sysuse auto
(1978 Automobile Data)

. preserve

. collapse (mean) price turn, by(rep78)

. list

   +---------------------------+
   | rep78     price      turn |
   |---------------------------|
1. |     1   4,564.5        41 |
2. |     2   5,967.6    43.375 |
3. |     3   6,429.2   41.0667 |
4. |     4   6,071.5      38.5 |
5. |     5     5,913   35.6364 |
   |---------------------------|
6. |     .   6,430.4      37.6 |
   +---------------------------+

. restore

. mata: X  = st_data(., ("price", "turn"))

. mata: ID = st_data(., "rep78")

. mata: mm_collapse(X, 1, ID)
               1             2             3
  +-------------------------------------------+
1 |            1        4564.5            41  |
2 |            2      5967.625        43.375  |
3 |            3   6429.233333   41.06666667  |
4 |            4        6071.5          38.5  |
5 |            5          5913   35.63636364  |
6 |            .        6430.4          37.6  |
  +-------------------------------------------+

. mata: w = st_data(., "weight")

. mata: mm_collapse(X, w, ID)
               1             2             3
  +-------------------------------------------+
1 |            1   4608.601613   41.11935484  |
2 |            2   6230.200895   43.59932911  |
3 |            3    7003.15439   41.80276852  |
4 |            4    6240.01355   39.81842818  |
5 |            5   6287.482192   35.74011742  |
6 |            .   6736.482783   38.01827126  |
  +-------------------------------------------+

. mata: mm_collapse(X, w, ID, &mm_median())
        1      2      3
  +----------------------+
1 |     1   4934     42  |
2 |     2   5104     44  |
3 |     3   4816     42  |
4 |     4   5798     42  |
5 |     5   5719     36  |
6 |     .   4453     38  |
  +----------------------+

. mata: mm_collapse(X, w, ID, &mm_quantile(), .25)
        1      2      3
  +----------------------+
1 |     1   4195     40  |
2 |     2   4060     41  |
3 |     3   4482     40  |
4 |     4   4890     35  |
5 |     5   4425     35  |
6 |     .   4424     35  |
  +----------------------+

Conformability

mm_collapse(X, w, id, f, ...),
_mm_collapse(X, w, id, f, ...):
X:  n x k
w:  n x 1 or 1 x 1
id:  n x 1
f:  1 x 1
...:  (depending on f)
result:  g x (1 + k), where g is the number of subgroups

Diagnostics

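mm_collapse() and _mm_collapse() cannot be used with built-in functions,
because f cannot point to a built-in function. To apply a built-in
function, wrap it in a user-defined Mata function and pass the address of
the wrapper.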

mm_collapse() and _mm_collapse() return J(0, 1 + cols(X), .) if X and id
are void.

Source code

mm_collapse.mata

Author

Ben Jann, ETH Zurich, jannb@ethz.ch

Also see

```
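
The Diagnostics note on wrappers and the sorting requirement of _mm_collapse() can be illustrated with a minimal sketch. It assumes that the moremata package is installed and that sum() is a built-in function in your Mata version. The wrapper name my_total and the toy vectors X, ID, and p are hypothetical and do not come from the help file.

```mata
mata:
// Hypothetical wrapper (the name my_total is an assumption, not part of
// moremata). It follows the calling convention described in the help text:
// data column first, weights second, real scalar returned.
real scalar my_total(real colvector x, real colvector w)
{
    // w is either a weight vector matching x or the scalar 1;
    // the colon operator handles both cases.
    return(sum(x :* w))
}

// Small made-up data set: one variable, two subgroups, unsorted IDs.
X  = (1 \ 2 \ 3 \ 4 \ 5)
ID = (2 \ 1 \ 2 \ 1 \ 2)

// Unweighted group totals via the wrapper; mm_collapse() takes unsorted input.
R = mm_collapse(X, 1, ID, &my_total())
R

// _mm_collapse() assumes sorted input, so reorder X and ID by ID first.
p  = order(ID, 1)
R2 = _mm_collapse(X[p, .], 1, ID[p], &my_total())
R2
end
```

If w were a weight vector rather than 1, it would need the same reordering (w[p]) before the call to _mm_collapse(). Both calls should return the same matrix, with the subgroup codes in the first column.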