{smcl}
{hline}
{cmd:help groupfunction}{right:v2.2 28/Jan/2021}
{hline}

{title:Title}

{p2colset 5 24 26 2}{...}
{p2col :{cmd:groupfunction} {hline 1}}Replaces several basic collapse functions (mean, sum, variance, standard deviation, first, max, min). The command is several orders of magnitude faster than collapse when summarizing multiple vectors on larger datasets.{p_end}
{p2colreset}{...}

{title:Syntax}

{p 8 23 2}
{opt groupfunction} [if] [in] [aw pw fw] {cmd:,} {opt by(varlist)} {opt mean(varlist)} {opt sum(varlist)} {opt rawsum(varlist)} {opt first(varlist)} {opt min(varlist)} {opt max(varlist)} {opt count(varlist)} {opt sd(varlist)} {opt variance(varlist)} {opt gini(varlist)} {opt theil(varlist)} {opt xtile(varlist)} {opt nq(int)} {opt missing} {opt norestore} {opt slow} {opt merge}

{title:Description}

{pstd}
{cmd:groupfunction} replaces several {cmd:collapse} functions (mean, sum, variance, standard deviation, first, max, min). The command is several orders of magnitude faster than {cmd:collapse} when summarizing multiple vectors on larger datasets.

{title:Options}

{phang}
{opt by(varlist)} Variables that define the groups for which estimates are reported.

{phang}
{opt xtile(varlist)} Used together with nq(), it creates a quantile variable and adds it to the by() grouping.

{phang}
{opt nq(int)} Number of quantiles. This option only has an effect when xtile() is specified.

{phang}
{opt mean(varlist)} Calculates the means of the specified variables.

{phang}
{opt sum(varlist)} Calculates the total of the specified variables. If weights are specified, it returns the population-expanded total.

{phang}
{opt rawsum(varlist)} Calculates the total of the specified variables, ignoring any weights that are specified.

{phang}
{opt first(varlist)} Returns the first observation within each group specified in by().

{phang}
{opt min(varlist)} Returns the minimum value, by group, of the specified variables.

{phang}
{opt max(varlist)} Returns the maximum value, by group, of the specified variables.

{phang}
{opt count(varlist)} Returns the observation count, by group, of the specified variables.

{phang}
{opt sd(varlist)} Calculates the standard deviations of the specified variables.

{phang}
{opt variance(varlist)} Calculates the variances of the specified variables.

{phang}
{opt gini(varlist)} Calculates the Gini coefficient of the specified variables.

{phang}
{opt theil(varlist)} Calculates the Theil index of the specified variables.

{phang}
{opt missing} Only relevant for sum() and rawsum(). If every value of the sum/rawsum variable is missing within a by() group, the output for that group is missing instead of zero.

{phang}
{opt norestore} Drops all non-relevant variables before calculations to improve memory management.

{phang}
{opt slow} Use this option if you run into memory issues; values are then obtained one by one.

{phang}
{opt merge} Requests that the data not be collapsed; the new vectors are instead merged into the dataset in memory.

{title:Example}

{p 8 12}{stata "sysuse auto, clear" :. sysuse auto, clear}{p_end}
{p 8 12}{stata "groupfunction [aw=weight], mean(price) min(weight) by(foreign)" :. groupfunction [aw=weight], mean(price) min(weight) by(foreign)}{p_end}

{title:Example 2 (Time comparisons)}

	clear all
	set more off
	set obs 300000
	version 13
	set seed 458267

	gen region = int(runiform()*20)
	replace region = 1 if inrange(region,0,3)
	replace region = 2 if inrange(region,4,5)

	//Income per capita
	forval z=1/200{
		gen x_`z' = region*runiform()*4000 + rnormal()*200
	}

	forval z=1/300{
		gen y`z' = runiform()*100 + (rnormal()*20)^2
	}

	gen weight = abs(rnormal())

	//time collapse
	preserve
	timer on 1
	collapse (mean) y* x_* [aw=weight], by(region)
	timer off 1
	restore

	//time fcollapse (Weights not supported)
	preserve
	timer on 2
	fcollapse (mean) y* x_*, by(region)
	timer off 2
	restore

	//time groupfunction
	preserve
	timer on 3
	groupfunction [aw=weight], mean(y* x_*) by(region)
	timer off 3
	restore

	timer list

{title:Authors}

{pstd}
Paul Corral{break}
The World Bank - Poverty and Equity Global Practice {break}
Washington, DC{break}
pcorralrodas@worldbank.org{p_end}

{pstd}
Minh C. Nguyen{break}
The World Bank - Poverty and Equity Global Practice {break}
Washington, DC{break}
pcorralrodas@worldbank.org{p_end}

{pstd}
Joao Pedro Azevedo{break}
The World Bank - Poverty and Equity Global Practice {break}
Washington, DC{break}
jazevedo@worldbank.org{p_end}

{pstd}
Raul Andrés Castañeda provided valuable suggestions for this command.{p_end}
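
{title:Example 3 (Checking results against collapse)}

{pstd}
Since {cmd:groupfunction} is meant to replace {cmd:collapse}, one way to gain confidence in its output is to run the same weighted mean through both commands and compare the listings. The lines below are a minimal sketch that uses only the options shown in the first example. They assume that both commands treat analytic weights the same way when computing means, which this help file does not state explicitly, so treat any difference as something to investigate rather than as a bug.{p_end}

{p 8 12}{stata "sysuse auto, clear" :. sysuse auto, clear}{p_end}
{p 8 12}{stata "preserve" :. preserve}{p_end}
{p 8 12}{stata "collapse (mean) price [aw=weight], by(foreign)" :. collapse (mean) price [aw=weight], by(foreign)}{p_end}
{p 8 12}{stata "list" :. list}{p_end}
{p 8 12}{stata "restore" :. restore}{p_end}
{p 8 12}{stata "groupfunction [aw=weight], mean(price) by(foreign)" :. groupfunction [aw=weight], mean(price) by(foreign)}{p_end}
{p 8 12}{stata "list" :. list}{p_end}

{pstd}
The {cmd:preserve}/{cmd:restore} pair around {cmd:collapse} keeps the original data available for the second command. The final {cmd:groupfunction} call is not wrapped, so the grouped results stay in memory afterwards; wrap it as well if the original data are still needed.{p_end}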