help mog -------------------------------------------------------------------------------

Title

mog -- Produce one-way or two-way tables of means (or totals) and perform significance tests and quality control checks

Syntax

mog varname1 varname2 [varname3] [if] [in] [weight] [, options]

    options                   Description
    -------------------------------------------------------------------------
    Model
      total                   estimate totals instead of means

    SE/Cluster
      survey                  use svyset information for variance estimation
                                and weighting

    Reporting--estimates
      decimals(#)             number of decimals of estimates; default is 2
      round(#)                round estimates to multiples of this number (see
                                the round function); default is 0, meaning
                                no rounding

    Reporting--significance tests
      ref1(#)                 reference group to use for varname2; default is 1
      ref2(#)                 reference group to use for varname3; default is 1
      sl(#)                   significance level of tests; default is 0.05
      retest                  use estimates from the last run
      notest                  suppress the insertion of significance test
                                symbols in the table
      fwer                    display information on the family-wise error
                                rate

    Reporting--quality controls
      pubstand                show quality control symbols for each estimate
      pubdichot               use more conservative quality controls if
                                varname1 is dichotomous
      mincount(#)             minimum sample size of each estimate; default
                                is 15
      symbmincount(string)    symbol indicating that the sample size is too
                                low for an estimate (see mincount); default
                                is "X"
      cvwarning(#)            an estimate is graded "use with caution" if its
                                cv is larger than this value and less than
                                cvtoohigh; default is 1/6
      symbwarning(string)     symbol indicating that an estimate falls in the
                                situation described in option cvwarning;
                                default is "E"
      cvtoohigh(#)            an estimate is graded "too unreliable to use" if
                                its cv is larger than this value; default is
                                1/3
      symbtoohigh(string)     symbol indicating that an estimate falls in the
                                situation described in option cvtoohigh;
                                default is "F"

    Reporting--miscellaneous
      nodetail                display only the table
      cellwidth(#)            width of table cells
      varwidth(#)             width of the first column displaying varname2
                                labels; default is 15
      underscores             replace spaces with underscores in labels;
                                helps when copying data
    -------------------------------------------------------------------------
    varname1 is the variable for which you want to estimate means.
    varname2 is a categorical variable over which the means are grouped,
      creating a one-way table.
    varname3 is an optional second categorical variable over which the means
      are also grouped, creating a two-way table.
    The svy prefix is not allowed; use option survey instead.
    All weight types are allowed; see weight.
    Weights are not compatible with survey. If both are specified, the weight
      information is ignored.
    retest is compatible with changing any reporting option. It is useful
      when estimation is time consuming.

Description

This program calculates the mean of a variable across all combinations of the categories of up to two other categorical variables (the grouping variables) and produces a one-way or two-way table showing the means. It is essentially a "front-end" for the mean and total commands.

mog tests whether each cell's estimate differs significantly from the estimate for the reference category. Symbols are placed in the table to indicate the results of the tests.

mog also performs quality control tests on the sample sizes and coefficients of variation (1/t-ratio) of each estimate. Symbols are placed in the table to indicate if any of these checks have failed.
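As a rough illustration of the coefficient of variation referred to here, the following sketch computes the cv of one cell by hand after running mean directly. It assumes the auto dataset shipped with Stata and reads the estimate and its variance from e(b) and e(V). It only shows the arithmetic (standard error divided by the estimate) and is not a description of how mog computes the value internally.

```stata
* Sketch: cv = standard error / |estimate| = 1 / |t-ratio|
sysuse auto, clear
mean weight, over(foreign)

matrix b = e(b)              // row vector of cell means
matrix V = e(V)              // variance-covariance matrix of the means

scalar est = b[1,1]          // first cell (domestic cars)
scalar se  = sqrt(V[1,1])    // its standard error
scalar cv  = se / abs(est)   // coefficient of variation of that cell

display "estimate = " %9.1f est "   cv = " %6.4f cv
```

A cv computed this way can be compared with the cvwarning and cvtoohigh thresholds (defaults 1/6 and 1/3).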

Copy the fixed-width tables into a spreadsheet using Copy Table. The underscore option helps when copying tables whose labels contain spaces.

Like many Stata commands, mog redisplays the table when it is entered without arguments.

Options

ref1 and ref2 change the reference groups for the grouping variables varname2 and varname3. Note: do not enter the actual value of the variable; enter the integer that represents the ordinal rank of the category you want. For instance, if the grouping variable has values 1, 4, 5.5 and 6 and you want the second category to be the reference group, type ref1(2), NOT ref1(4). Similarly, if you want the category with value 5.5 to be the reference group, type ref1(3) (because it is the 3rd category when the values are sorted), NOT ref1(5.5).
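One way to look up the ordinal rank of each category before choosing ref1 or ref2 is sketched below. The variable names grpvalue and catrank are hypothetical, and the values are the ones from the example above; egen's group() function numbers the distinct values 1, 2, 3, ... in sorted order, which matches the ranking described here.

```stata
* Sketch: map category values to their ordinal ranks
clear
input double grpvalue
1
4
5.5
6
end

egen catrank = group(grpvalue)   // 1, 2, 3, 4 in sorted order of grpvalue
list grpvalue catrank, noobs

* The category with value 5.5 has rank 3, so it would be requested as ref1(3)
```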

round specifies the unit to which estimates are rounded. Use numbers that work with the round function. Note that even if an estimate is rounded to a multiple of 10 or more, it may still be displayed with the number of decimals specified by the decimals option.
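The interaction between rounding and displayed decimals can be seen with Stata's round() function directly. This is a minimal sketch with a hypothetical estimate; the display formats stand in for the decimals option and are not taken from mog's own code.

```stata
* Sketch: rounding to a multiple versus displayed decimals
scalar est = 3317.1154            // hypothetical estimate

display round(est, 100)           // rounded to the nearest multiple of 100
display %9.2f round(est, 100)     // same value, still shown with 2 decimals
display %9.0f round(est, 100)     // analogous to round(100) with decimals(0)
```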

sl specifies the significance level used for the tests. Tests are two-tailed t tests. The default is 0.05. Other possibilities include 0.01, 0.10, and 0.20. The program will function with any number between 0 and 1.
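For readers who want to see what a two-tailed t test at a given significance level amounts to, here is a small sketch. The t statistic and degrees of freedom are hypothetical numbers chosen for illustration; how mog obtains them is not shown here.

```stata
* Sketch: two-tailed t test decision at a chosen significance level
scalar tstat = 2.1      // hypothetical t statistic
scalar df    = 72       // hypothetical degrees of freedom
scalar sl    = 0.05     // significance level, as in sl(0.05)

scalar pval = 2 * ttail(df, abs(tstat))   // two-tailed p-value
display "p = " %6.4f pval
display cond(pval < sl, "significant at sl", "not significant at sl")
```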

mincount specifies the minimum sample size that each cell in the table must have. All cell counts are displayed in a table at the beginning of the output, and mog checks that each one is equal to or larger than mincount. If a count is smaller, a warning is displayed. If any cell count is less than 2, the program terminates because no variance can be estimated for that cell. Institutional (company, government body, etc.) guidelines often require a minimum sample size before an estimate can be published and made available for public use, in part so that no one (if the observations are people) can be identified. The default is 15.

cellwidth changes the number of columns taken up by each cell. Cell widths are automatically sized to the size of the estimates, the number of decimal places and the number of symbols. However, you may wish to increase the width in some cases so that the column labels are fully displayed.

underscore replaces spaces, " ", in the value labels of the grouping variables with underscores, "_". This helps the program into which the table is pasted recognize the columns properly. The default is to NOT make the replacement.

nodetail makes mog run without displaying any preliminary checks, estimation output, or test results. Only the final table will be displayed. This is useful to cut back on the amount of output shown when you have already checked the details and you wish to tweak other options. Also useful with the retest option. The default is to show the details.

retest reruns mog without re-estimating. mog first checks that the same underlying estimation just performed is being requested again, that is, that varname1 to varname3, the if and in qualifiers, the weight or survey option, and the total option are all unchanged. The command that performs the estimation is then skipped, which often saves a lot of time if it uses variance estimation with bootstrap weights. All reporting options are changeable, such as the reference group for testing (ref1 and ref2), the significance level of the tests (sl), and formatting options like decimals, cellwidth and underscores. The default is not to retest. Note: although the estimation command mean (or total) is not rerun, the estimation results from the last mog command must be the active estimation results in memory. This includes both the estimation results saved in e() and the mog results saved in r().

survey tells mog to use the survey design information declared with the svyset command. The survey option overrides any weights specified. Use one or the other, not both. The default is not to use the svyset information. The option is implemented by running the estimation command (mean or total) with the svy prefix.
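The kind of call that the survey option is described as making can be sketched with official commands only. The weight variable wt and the design declared here are hypothetical and exist purely so the example runs on the auto dataset; a real analysis would svyset the actual design variables.

```stata
* Sketch: estimating means with the svy prefix after svyset
sysuse auto, clear
generate double wt = 1            // hypothetical sampling weight, illustration only
svyset _n [pweight = wt]          // declare the (hypothetical) survey design

svy: mean weight, over(foreign)   // mean estimated with the svy prefix
```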

fwer computes the family-wise error rate, based on each test's p-value in the series of tests for the categories of a grouping variable. This may be useful when considering the error rates resulting from usage by a diverse, independent audience who tend to pick and choose what information they need. It provides a sense of the overall error rate for the set of tests in this context. The default is not to make the calculations. This is not to be confused with joint tests of significance, in which a set of test results is used together to develop a conclusion; see test for information on joint testing and the associated adjustments used in multiple tests (Bonferroni, Sidak, etc.). Use test after mog to conduct joint tests, if desired. The family-wise error rate is the probability that at least one test in a series of independently used tests incorrectly rejects the null. Formula: 1 - (1-pvalue_1)*...*(1-pvalue_i)*...*(1-pvalue_n), where i indexes the tests and n is the number of tests.
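The formula can be worked through by hand as follows. The three p-values are hypothetical, and the loop is only a sketch of the arithmetic, not mog's own code.

```stata
* Sketch: family-wise error rate = 1 - product of (1 - pvalue_i)
local pvals 0.01 0.04 0.20        // hypothetical p-values from three tests

scalar keepprob = 1               // running product of (1 - pvalue_i)
foreach p of local pvals {
    scalar keepprob = keepprob * (1 - `p')
}
scalar fwer = 1 - keepprob

display "FWER = " %6.4f fwer
```

With these values the product is 0.99 * 0.96 * 0.80 = 0.76032, so the family-wise error rate is about 0.24.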

pubstand places symbols in the table to indicate that an estimate has not passed confidentiality and reliability standards. Estimates with cell counts less than mincount have the symbol X (changeable using symbmincount) placed to their right. Estimates with coefficients of variation (1/t-ratio) greater than cvtoohigh have an F placed to their right. Estimates with cvs between cvwarning and cvtoohigh have an E placed to their right. The E and F symbols are changeable with symbwarning and symbtoohigh. An estimate can carry an X, an E, an F, or no symbol, but never more than one: F and E are mutually exclusive, and X replaces E and F.
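The precedence of the three symbols can be sketched on a small made-up dataset of cell counts and cvs. The variable names n, cv and symbol are hypothetical, the thresholds are the documented defaults, and the treatment of a cv exactly equal to a threshold is an assumption, since the text does not specify it.

```stata
* Sketch: assign at most one quality symbol per estimate
clear
input n cv
40 0.10
40 0.20
40 0.40
10 0.40
end

local mincount  15
local cvwarning = 1/6
local cvtoohigh = 1/3

generate str1 symbol = ""
replace symbol = "E" if cv > `cvwarning' & cv <= `cvtoohigh'   // use with caution
replace symbol = "F" if cv > `cvtoohigh'                       // too unreliable to use
replace symbol = "X" if n < `mincount'                         // X replaces E and F

list, noobs
```

Applying the X rule last is what makes it override E and F, as in the fourth row.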

pubdichot applies when varname1 is dichotomous, for instance an indicator variable showing whether an observation has a particular characteristic. With this option, the samples are sliced not just by the grouping variables but also by the dependent variable, varname1. This may be a requirement of your institutional quality assurance guidelines for total and proportion estimates. One example is the total number of people who are disabled, by age category and sex. With only the pubstand option, just the sample sizes in a tabulation by age and sex are checked against mincount for a minimum sample size. This may be acceptable for continuous dependent variables. However, when estimating totals of rare events, where perhaps only 2 people are disabled in a particular category, pubdichot checks this number and the number not disabled individually against mincount, instead of their sum (the overall sample size of the cell).
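The difference between the two sets of counts can be inspected with tabulate before running mog. This sketch uses the auto dataset, with foreign standing in for a dichotomous varname1 and a recoded rep78 as the grouping variable; it shows only which counts are at stake, not mog's internal check.

```stata
* Sketch: counts compared with mincount, with and without pubdichot
sysuse auto, clear
recode rep78 (1/3=3), generate(rep78new)

tabulate rep78new             // cell sizes, as checked under pubstand alone
tabulate rep78new foreign     // counts by value of the dichotomous variable

* Number of foreign cars in the first category: small even though the cell is large
count if rep78new == 3 & foreign == 1
```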

Examples

-------------------------------------------------------------------------------
Setup
    . sysuse auto

Estimate the mean of weight over foreign
    . mog weight foreign

With rounding and no decimals
    . mog weight foreign, dec(0) round(100)

Recode rep78
    . recode rep78 (1/3=3 "Good") (4=4 "Very Good") (5=5 "Exceptionally Good"), gen(rep78new)

Now estimate means by foreign and rep78new
    . mog weight foreign rep78new, dec(0) round(100)

Suppress details with nodetail
    . mog weight foreign rep78new, dec(0) round(100) nodetail

Change the reference group for varname3 (rep78new) to the 3rd category
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3)

Perform quality control checks
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3) pubstand

Change minimum sample size per estimate to 5
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3) pubstand mincount(5)

Increase the cellwidth to see all rep78new labels
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3) pubstand mincount(5) cellwidth(15)

Insert underscores to aid in pasting to a spreadsheet (use the Copy Table option in the Edit or shortcut menu)
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3) pubstand mincount(5) cellwidth(15) und

Do not perform significance tests
    . mog weight foreign rep78new, dec(0) round(100) nodetail ref2(3) pubstand mincount(5) cellwidth(15) und notest

Show only formatted numbers
    . mog weight foreign rep78new, dec(0) round(100) nodetail notest

-------------------------------------------------------------------------------
Setup
    . sysuse auto, clear

Make a dummy variable to indicate foreign cars
    . xi i.foreign

Recode rep78
    . recode rep78 (1/3=3), gen(rep78new)

Estimate the total of weight over foreign and rep78new
    . mog weight foreign rep78new, total dec(0)

Estimate the total of foreign over rep78new
    . mog _Iforeign_1 rep78new, total dec(0)
-------------------------------------------------------------------------------

Saved results

mog saves estimation results in e(). See mean or total for details on what is saved by the respective estimation commands.

mog saves one result in r():

    Macros
      r(cmdtext)              command as typed

Author

    Matt Hurst
    Statistics Canada
    matt.hurst@statcan.gc.ca
    Last revision: December 9, 2010

Also see

Manual: [R] mean; [R] total

Help: [R] mean, [R] mean postestimation; [R] summarize, [R] total, [R] tabulate, [R] test