-------------------------------------------------------------------------------
help for mrtab
-------------------------------------------------------------------------------

One- and two-way tables of multiple responses

One-way tables:

    mrtab varlist [weight] [if exp] [in range] [, { response(numlist) |
          condition(exp) } poly countall include includemissing casewise
          title(string) width(#) abbrev nolabel nonames format(%fmt)
          integer sort[(#)] descending generate(prefix) nofreq ]

Two-way tables:

    mrtab varlist [weight] [if exp] [in range] , by(varname) [ column row
          cell rcolumn rcell chi2 lrchi2 mtest[(spec)] mlrchi2 wrap
          one-way_options ]

by ... : may be used with mrtab; see help by.

fweights and aweights are allowed with mrtab; see help weights.

Description

mrtab tabulates multiple responses which are stored as a set of variables. For example, a survey question might be: "Which of the following devices do you have in your home?" The respondent is then given a list like "1. Television, 2. Dishwasher, 3. Computer, 4. Dry cleaner ..." and may mark any number of devices. Such information can be stored in several ways. Two of them can be handled by mrtab:

Indicator mode: Each item in the list is represented by an integer indicator variable (e.g., 1 = item was marked by the respondent, 0 = item was not marked).

Polytomous mode (option poly): Each response is represented by a polytomous variable (first response, second response, third response, ...; this is convenient if the list of possible response categories is open or half-open). With the polytomous approach, the response variables may take on either integer or string values.

In either case, mrtab will compute a one-way or a two-way table of the frequency distribution of the responses. For two-way tables mrtab also offers significance tests.

If the data are stored in numeric format according to the polytomous mode, the labels of the response categories are taken from the value labels of the first variable. It is therefore crucial that the first variable contain labels for all possible items.
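The two storage modes can be illustrated with a minimal sketch on made-up toy data. The variable names (tv, dishwasher, computer, dev1-dev3) and the value label device are hypothetical and chosen only for illustration. The sketch assumes that mrtab is installed (e.g., via ssc install mrtab).

```stata
* Indicator mode: one 0/1 variable per item (toy data, hypothetical names)
clear
input byte(tv dishwasher computer)
1 0 1
0 0 1
1 1 1
end
mrtab tv dishwasher computer, title(Devices at home)

* Polytomous mode: first, second, third response of the same respondents
clear
input byte(dev1 dev2 dev3)
1 3 .
3 . .
1 2 3
end
* Attach the labels to all variables; mrtab reads them from the first one
label define device 1 "Television" 2 "Dishwasher" 3 "Computer"
label values dev1 dev2 dev3 device
mrtab dev1-dev3, poly title(Devices at home)
```

Both calls describe the same three respondents, so the two tables should report the same response frequencies.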

There are several ways to determine the number of valid observations. By default, all cases with at least one valid response (as specified by response() or condition()) are taken into account. All other observations are treated as missing. In some situations, however, it is appropriate to include cases with zero responses (see the include and the includemissing option). Furthermore, one might want to consider cases with complete information only and neglect all cases with one or more missing values (option casewise).
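As a rough sketch of how these choices change the base, the same response set can be tabulated under each rule. The example uses the drugs.dta dataset from the Examples section. Only the reported numbers of valid and missing cases are expected to differ between the calls, apart from casewise, which may also change the frequencies.

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear

mrtab inco1-inco7                   // default: at least one valid response
mrtab inco1-inco7, include          // cases with zero responses are valid
mrtab inco1-inco7, includemissing   // cases missing on all variables are valid
mrtab inco1-inco7, casewise         // drop cases with any missing value
```

After each call, r(N) and r(N_miss) hold the number of valid and missing cases.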

Use mrgraph to produce plots of multiple response distributions.

Options

abbrev specifies that long response labels be abbreviated rather than wrapped.

by(varname) tabulates the distribution of responses against the categories of varname (two-way table). The by-variable may be string or numeric.

casewise specifies that cases with missing values for at least one of the response variables should be excluded listwise.

cell displays the relative frequency of each cell in a two-way table (base: total number of valid observations).

chi2 requests the calculation of an overall Pearson chi-square statistic for the hypothesis that the distribution of response patterns is independent of the values of the by-variable (not allowed if aweights are specified). That is, a standard chi2 test is applied to an expanded two-way table in which the rows represent unique combinations of responses.

column displays in each cell of a two-way table the relative frequency of that cell within its column (base: column total of observations).

condition(exp) is an alternative to the response() option. It specifies a true or false condition for the response values. The condition must include the wildcard @, which is replaced in turn by each variable name. If the data are stored according to the indicator mode, condition() specifies the scope of values that indicate a response to the item. condition() defaults to @==1 in this case. If the data are stored according to the polytomous mode, condition() specifies the scope of responses that are to be tabulated. The default is to tabulate every value observed for the response variables (except for missing values). In the case of string variables, the condition() option is obsolete. condition() may not be combined with response().
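A minimal sketch of the @ wildcard, using the crime1-crime5 variables of the drugs.dta example dataset. Under the substitution rule described above, the two calls are assumed to select the same responses (values 2 and 3). inlist() is a standard Stata function and is used here only to keep the condition a single expression.

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear

* Response values given as a numlist
mrtab crime1-crime5, include response(2 3)

* Same selection as a condition; @ stands for crime1, crime2, ... in turn
mrtab crime1-crime5, include condition(inlist(@, 2, 3))
```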

countall requests that repeated identical responses be added up. By default, repeated identical responses will only be counted once per case. If the data are in indicator mode, countall specifies that the observed values be interpreted as response counts. Notes: Significance tests may not be requested if countall is specified. Be careful when interpreting results that are labeled "percentage of cases"; although they reflect the mean number of responses per case, they cannot be interpreted as proportions.

descending specifies that the sort order be descending. The default is to sort in ascending order. This is only relevant if sort is specified.

format(%fmt) specifies the display format for relative frequencies.

generate(prefix) creates a set of indicator variables reflecting the observed responses. The variables will be labeled and named according to the prefix provided. If title(string) is specified, the first eight characters of string are inserted into the variable labels. If the chi2 and/or lrchi2 options are specified, generate will additionally return a composite string variable, prefixrp, which reflects response patterns (each unique combination of responses is represented by a string of zeros and ones).
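A short sketch of generate() on the drugs.dta example dataset. The prefix src_ is hypothetical. The exact names of the new variables are not spelled out here, so the sketch assumes that they start with the prefix and inspects them with a wildcard.

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear

* Tabulate and keep the responses as indicator variables
mrtab inco1-inco7, include title(Sources of income) generate(src_)

* Inspect whatever variables were created under the prefix
describe src_*
```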

include specifies that observations composed of zero responses be treated as valid. Only cases with "real" missings (., .a, .b, .c, ...) for all response variables will be excluded. Note that include will affect only the number of valid cases, i.e. both the absolute distribution of responses and the distribution relative to the total of responses will remain unchanged. In the case of string response variables, include specifies that cases with only empty strings ("") be treated as valid.

includemissing is an enhancement to include and specifies that cases be treated as valid even if all response variables are missing. includemissing implies include. Specifying includemissing in connection with casewise has the effect that cases with missing values for at least one of the response variables will be treated as valid cases composed of zero responses.

integer specifies the display of frequencies as integers even if aweights are applied.

lrchi2 requests the calculation of an overall likelihood-ratio chi-square statistic (as an alternative to chi2). Note that the lrchi2 option is not allowed if aweights are specified and that the statistic will not be calculated if there are empty cells.

mtest requests the calculation of separate Pearson chi-square statistics for each response category. That is, a test is carried out for each response category to establish whether the probability of observing the response depends on the values of the by-variable (this option is not allowed if aweights are specified). Multiple-test adjustments may be requested by specifying the method in parentheses (e.g., mtest(bonferroni)). See help _mtest.

mlrchi2 requests mtest to use the likelihood-ratio chi-square statistics instead of Pearson chi-square.
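The overall test and the separate per-category tests can be requested as in the following sketch, which uses the crime variables of the drugs.dta example dataset. The display line only formats the saved results r(chi2), r(df), and r(p) listed under Saved Results.

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear

* Overall Pearson test on the response patterns
mrtab crime1-crime5, include response(2 3) by(sex) column chi2
display "chi2(" r(df) ") = " %9.3f r(chi2) ", p = " %6.4f r(p)

* One test per response category, Bonferroni-adjusted p-values
mrtab crime1-crime5, include response(2 3) by(sex) column mtest(bonferroni)
matrix list r(mchi2)
```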

nofreq suppresses printing the frequencies (i.e., the whole frequency table will be suppressed unless cell, column, row, rcell or rcolumn is specified for two-way tables).

nolabel suppresses the printing of labels.

nonames suppresses the printing of variable names or category values in the left stub of the table, i.e. only the labels will be printed. (This option has no effect if the response variables are string variables.) Not allowed if the response variables are unlabeled or the nolabel option is specified.

poly specifies that the responses are stored according to the polytomous mode. If poly is not specified, mrtab assumes that the responses are stored according to the indicator mode. However, string response variables imply poly.

rcell displays the relative frequency of each cell in a two-way table (base: total number of responses).

rcolumn displays in each cell of a two-way table the relative frequency of that cell within its column (base: column total of responses).

response(numlist) specifies the (range of) response values. If the data are stored according to the indicator mode, response() specifies the value which indicates a response to the item. response() defaults to 1 in this case. Note that the indicator variables do not necessarily have to be dichotomous since a list or range of values may be specified. If the data are stored according to the polytomous mode, response() specifies the list or range of responses that are to be tabulated. The default is to tabulate every value observed for the response variables (except for missing values). In the case of string variables, the response() option is obsolete. response() may not be combined with condition().

row displays in each cell of a two-way table the relative frequency of that cell within its row (base: row total of responses; this is equal to the row total of observations unless countall is specified).

title(string) may be used to label the multiple response set. string will be printed at the head of the table.

sort displays the table rows in ascending order of frequency. In the case of a two-way table the sorting will correspond to the row totals unless a reference column is specified in parentheses. That is, sort(1) will sort in order of the frequencies in the first column (first by-group), sort(2) in order of the frequencies in the second column, and so on. Specify the descending option to sort in descending order.
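A brief sketch of the sorting options on the drugs.dta example dataset. The first call sorts a one-way table from the most to the least frequent response. The second sorts a two-way table by the frequencies in the first by-group.

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear

* One-way table, most frequent response first
mrtab inco1-inco7, include sort descending

* Two-way table, ordered by the first column of the by-variable
mrtab crime1-crime5, include response(2 3) by(sex) sort(1) descending
```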

width(#) specifies the maximum width (number of chars) used to display the labels of the responses. Labels that are too wide are wrapped (or abbreviated if the abbrev option is specified). The default width is 30. The minimum width is 11.

wrap requests that Stata take no action on wide two-way tables to make them readable. Unless wrap is specified, wide tables are broken into pieces to enhance readability.

Examples

Indicator mode:

. use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta
(1997 Survey Data on Swiss Drug Addicts)

. mrtab inco1-inco7, include title(Sources of income) width(24)

                               |                Pct. of     Pct. of
             Sources of income |      Freq.   responses       cases
-------------------------------+-----------------------------------
inco1 private support          |        226       12.83       23.25
      (partner, family,        |
      friends)                 |
inco2 public support           |        607       34.47       62.45
      (unemployment insurance, |
      social benefits)         |
inco3 drug dealing             |        293       16.64       30.14
inco4 housebreaking, theft,    |         50        2.84        5.14
      robbery                  |
inco5 prostitution             |         82        4.66        8.44
inco6 "mischeln"/begging       |        151        8.57       15.53
inco7 legal occupation         |        352       19.99       36.21
-------------------------------+-----------------------------------
                         Total |       1761      100.00      181.17

Valid cases: 972 Missing cases: 0
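The two percentage columns differ only in their base, which the following arithmetic sketch makes explicit. It recomputes the inco1 row and the total row from the figures printed in the table (226 responses for inco1, 1761 responses in total, 972 valid cases).

```stata
* Pct. of responses: share of inco1 among all 1761 responses
display %6.2f 100*226/1761

* Pct. of cases: share of the 972 valid cases that marked inco1
display %6.2f 100*226/972

* Total pct. of cases: mean number of responses per case, times 100
display %6.2f 100*1761/972
```

The three results correspond to the table entries 12.83, 23.25, and 181.17. The last figure exceeds 100 because respondents may mark more than one source of income.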

Polytomous mode:

. mrtab pinco1-pinco6, poly response(1/7) include title(Sources of income) width(27)

                               |                Pct. of     Pct. of
             Sources of income |      Freq.   responses       cases
-------------------------------+-----------------------------------
1 private support (partner,    |        226       12.83       23.25
  family, friends)             |
2 public support               |        607       34.47       62.45
  (unemployment insurance,     |
  social benefits)             |
3 drug dealing                 |        293       16.64       30.14
4 housebreaking, theft,        |         50        2.84        5.14
  robbery                      |
5 prostitution                 |         82        4.66        8.44
6 "mischeln"/begging           |        151        8.57       15.53
7 legal occupation             |        352       19.99       36.21
-------------------------------+-----------------------------------
                         Total |       1761      100.00      181.17

Valid cases: 972 Missing cases: 0

The response() option in indicator mode:

. codebook crime1

-----------------------------------------------------------------------
crime1                                                      hit someone
-----------------------------------------------------------------------

                  type:  numeric (byte)
                 label:  crime

                 range:  [0,3]                        units:  1
         unique values:  4                        missing .:  65/972

            tabulation:  Freq.   Numeric  Label
                           716         0  no
                            62         1  yes, as committer
                            97         2  yes, as victim
                            32         3  yes, both
                            65         .

. mrtab crime1-crime5, include response(2 3) title(Crime (as victim)) nonames

                               |                Pct. of     Pct. of
             Crime (as victim) |      Freq.   responses       cases
-------------------------------+-----------------------------------
hit someone                    |        129       41.08       14.22
use a weapon against someone   |         27        8.60        2.98
sexual harassment, rape        |         31        9.87        3.42
robbery (including drug theft) |         99       31.53       10.92
blackmail                      |         28        8.92        3.09
-------------------------------+-----------------------------------
                         Total |        314      100.00       34.62

Valid cases: 907 Missing cases: 65

Two-way table incl. tests:

. mrtab crime1-crime5, include response(2 3) title(Crime (as victim)) nonames width(18) by(sex) column mtest(bonferroni)

+------------------------+
| Key                    |
|------------------------|
| frequency of responses |
| column pct. of cases   |
+------------------------+

                   |   Sex of respondent    |
 Crime (as victim) |     female        male |     Total      chi2/p*
-------------------+------------------------+-----------
hit someone        |         35          94 |       129        0.014
                   |      14.46       14.16 |     14.24        1.000
-------------------+------------------------+-----------
use a weapon       |          4          22 |        26        1.754
against someone    |       1.65        3.31 |      2.87        0.927
-------------------+------------------------+-----------
sexual harassment, |         31           0 |        31       88.071
rape               |      12.81        0.00 |      3.42        0.000
-------------------+------------------------+-----------
robbery (including |         32          66 |        98        1.982
drug theft)        |      13.22        9.94 |     10.82        0.796
-------------------+------------------------+-----------
blackmail          |         14          14 |        28        8.005
                   |       5.79        2.11 |      3.09        0.023
-------------------+------------------------+-----------
             Total |        116         196 |       312
                   |      47.93       29.52 |     34.44
             Cases |        242         664 |       906

* Pearson chi2(1) / Bonferroni adjusted p-values

Valid cases: 906 Missing cases: 65

Saved Results

Scalars:

    r(N)          number of valid cases
    r(N_miss)     number of missing cases
    r(r)          number of response categories
    r(c)          number of by-groups if by() is specified
    r(chi2)       overall Pearson chi-squared if chi2 is specified
    r(p)          p-value of the overall Pearson chi-squared
    r(chi2_lr)    overall likelihood-ratio chi-squared if lrchi2 is specified
    r(p_lr)       p-value of the overall likelihood-ratio chi-squared
    r(df)         degrees of freedom of the overall chi2 tests

Macros:

    r(list)       list of the labels of the responses if available
    r(mode)       either "indicator" or "poly" depending on the mode of the
                  multiple response variables
    r(type)       either "numeric" or "string" depending on the storage type
                  of the multiple response variables
    r(bylist)     list of the labels of the by-groups if available
    r(bytype)     either "numeric" or "string" depending on the storage type
                  of the by-variable

Matrices:

    r(responses)  frequencies of responses
    r(cases)      cases in by-groups if by() is specified
    r(mchi2)      Pearson chi-squared and (adjusted) p-values of the separate
                  tests if mtest is specified
    r(mchi2_lr)   likelihood-ratio chi-squared and (adjusted) p-values of the
                  separate tests if mtest and mlrchi2 are specified
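A minimal sketch of how the saved results might be picked up after a call, using the drugs.dta example dataset. The matrix name R is hypothetical. return list shows everything mrtab left behind in r().

```stata
use http://fmwww.bc.edu/RePEc/bocode/d/drugs.dta, clear
mrtab inco1-inco7, include title(Sources of income)

* Overview of all scalars, macros and matrices in r()
return list

* Individual results
display "valid cases: " r(N) ", missing cases: " r(N_miss)
display "storage mode: `r(mode)'"

* Copy the frequency matrix before another r-class command overwrites it
matrix R = r(responses)
matrix list R
```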

Authors

Ben Jann, ETH Zurich, jann@soz.gess.ethz.ch

Hilde Schaeper, HIS, Hannover, schaeper@his.de

Also see

Manual: [R] tabulate

Online: help for mrgraph, _mrsvmat, tabulate, _mtest