help for chitest

Chi-square test for univariate frequency distributions

chitest observed [expected] [if exp] [in range] [ , nfit(#) count list_options ]


chitest works on either one or two variables.

By default the first variable is taken to contain observed frequencies, which must be zeros or positive integers. With the count option, the first variable is instead treated as a categorical variable whose distinct values are counted; the resulting counts are then used as the observed frequencies in the chi-square test.

The second variable, if specified, is taken to contain expected frequencies under some hypothesis, which must be positive. If the second variable is not specified, the expected frequencies are taken to be equal: each is set to the mean of the observed frequencies.
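
The default of equal expected frequencies can be spelled out by hand. This is a sketch only, assuming a variable obs that holds the observed frequencies; the name equalexp is made up for illustration. Given the description above, the second call should reproduce the results of the first.

```stata
* default: expected frequencies all equal to the mean observed frequency
chitest obs

* the same hypothesis with the expected frequencies made explicit
summarize obs, meanonly
gen double equalexp = r(mean) if !missing(obs)
chitest obs equalexp
```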

The display includes the Pearson chi-square statistic and its P-value for a test of the hypothesis, the likelihood-ratio chi-square statistic and its P-value, observed frequencies, expected frequencies, residuals (observed - expected), and Pearson residuals, defined as (observed - expected) / sqrt(expected).
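
The quantities listed above can be checked by hand. The following is a minimal sketch, not chitest's own code. It assumes variables obs and exp holding observed and expected frequencies, with every expected frequency positive and no parameters estimated from the data. The names pres, pres2, lrterm, X2, G2 and df are made up for illustration, and the likelihood-ratio statistic is assumed to take its usual form, 2 * sum of observed * ln(observed / expected), with zero observed frequencies contributing nothing.

```stata
* Pearson residuals and the Pearson chi-square statistic
gen double pres = (obs - exp) / sqrt(exp)
gen double pres2 = pres^2
summarize pres2, meanonly
scalar X2 = r(sum)

* likelihood-ratio chi-square; cells with obs == 0 contribute 0
gen double lrterm = cond(obs > 0, 2 * obs * ln(obs / exp), 0)
summarize lrterm, meanonly
scalar G2 = r(sum)

* degrees of freedom when no parameters are estimated: cells - 1
count if !missing(obs, exp)
scalar df = r(N) - 1

display "Pearson chi2 = " X2 "   P = " chi2tail(df, X2)
display "LR chi2      = " G2 "   P = " chi2tail(df, G2)
```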

Any cells with expected frequencies less than 5 are flagged.


nfit() indicates the number of parameters that have been estimated from the data. This number will be subtracted from (number of cells - 1) to give the number of degrees of freedom. The default is 0.

count instructs chitest to count the single variable specified, which is treated as a categorical variable. Note that count will not produce zero counts; that is, it cannot count what is not present in the data. In some problems it is easiest to insert zero counts by hand with chitesti. See the examples for one counting technique.

list_options are options of list.
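
As an illustration of nfit(), suppose the expected frequencies come from a Poisson distribution whose mean was estimated from the same data, so that one parameter has been fitted. A minimal sketch, assuming variables named count and Poisson that hold the observed and fitted frequencies (here count is a variable name, not the option):

```stata
chitest count Poisson, nfit(1)

* degrees of freedom should be (number of cells - 1) - 1
display "df reported: " r(df) "   cells - 1 - nfit: " r(k) - 1 - 1
```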


. chitest count Poisson, nfit(1)

. gen lastdigit = mod(price, 10)
. chitest lastdigit, count

. gen firstdigit = real(substr(string(myvar),1,1))
. gen obs = .
. qui forval i = 1/9 {
.         count if firstdigit == `i'
.         replace obs = r(N) in `i'
. }
. gen exp = _N * log10(1 + 1/_n) in 1/9
. chitest obs exp

Saved values

    r(k)         number of classes in distribution
    r(df)        degrees of freedom
    r(chi2)      Pearson chi-square
    r(p)         P-value of Pearson chi-square
    r(chi2_lr)   likelihood-ratio chi-square
    r(p_lr)      P-value of likelihood-ratio chi-square
    r(emean)     mean expected frequency
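
The saved values can be picked up immediately after the command has run. A short sketch, assuming variables obs and exp as in the examples above:

```stata
chitest obs exp

* list everything left behind in r()
return list

* compare the two statistics and recompute the Pearson P-value
display "Pearson chi2 = " r(chi2) "   LR chi2 = " r(chi2_lr)
display "P (Pearson)  = " chi2tail(r(df), r(chi2))
```

The recomputed P-value should agree with r(p), apart from display formatting.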


Nicholas J. Cox, University of Durham, U.K.
n.j.cox@durham.ac.uk


Benoit Dulong pointed towards a precision problem.

Also see

On-line: help for chitesti, tabulate