Chi-square test for univariate frequency distributions
chitesti #obs1 #obs2 [...] [ \ #exp1 #exp2 [...] ] [ , nfit(#) replace list_options ]
Description
The input for chitesti must consist of
either a single row of values
or two rows of values separated by a backslash \.
The first row is taken to be observed frequencies, which must be zeros or positive integers. The second row, if present, is taken to be expected frequencies under some hypothesis, which must be positive. These may be given either as numbers or as numeric expressions without embedded spaces. If the second row is not present, the expected frequencies are taken to be equal, i.e. equal to the mean of the observed frequencies.
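As a minimal sketch of the input rules just described, the three calls below should all test the same hypothesis of equal frequencies. The observed counts 10 20 30 are hypothetical and chosen only for illustration.

```stata
* With no second row, expected frequencies default to the mean of the
* observed frequencies: here (10 + 20 + 30) / 3 = 20 per cell.
chitesti 10 20 30

* The same test with the expected frequencies spelled out.
chitesti 10 20 30 \ 20 20 20

* Expected frequencies given as expressions: no spaces inside an expression.
chitesti 10 20 30 \ 60/3 60/3 60/3
```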
The display includes the Pearson chi-square statistic and its P-value for a test of the hypothesis, the likelihood-ratio chi-square statistic and its P-value, observed frequencies, expected frequencies, residuals (observed - expected), and Pearson residuals, defined as (observed - expected) / sqrt(expected).
Any cells with expected frequencies less than 5 are flagged.
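The quantities in the display can be reproduced by hand, which may help in checking results. The sketch below is not the code used by chitesti; it applies the usual textbook definitions to the Mendel data from the Examples. The variable and scalar names are illustrative, and the likelihood-ratio formula 2 * sum of observed * ln(observed / expected) is assumed to be the one intended here.

```stata
* Mendel's pea counts against the 9:3:3:1 hypothesis.
clear
input observed ratio
315 9
108 3
101 3
 32 1
end

* Expected frequencies: the total count shared out in the hypothesised ratio.
quietly summarize observed
scalar total = r(sum)
generate double expected = scalar(total) * ratio / 16

* Residuals and Pearson residuals, as defined in the text.
generate double residual = observed - expected
generate double pearson  = residual / sqrt(expected)

* Pearson statistic: sum of squared Pearson residuals.
generate double term_p = pearson^2
quietly summarize term_p
scalar chi2 = r(sum)

* Likelihood-ratio statistic; cells with zero observed count contribute 0.
generate double term_lr = cond(observed > 0, observed * ln(observed / expected), 0)
quietly summarize term_lr
scalar chi2_lr = 2 * r(sum)

* Degrees of freedom: number of cells - 1 (no fitted parameters here).
scalar df = _N - 1
display "Pearson chi2(" scalar(df) ") = " scalar(chi2) "  P = " chi2tail(scalar(df), scalar(chi2))
display "LR chi2(" scalar(df) ") = " scalar(chi2_lr) "  P = " chi2tail(scalar(df), scalar(chi2_lr))

* Cells with expected frequency below 5 (none in this example).
list observed expected residual pearson if expected < 5
```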
Options
nfit() indicates the number of parameters that have been estimated from the data. This number will be subtracted from (number of cells - 1) to give the number of degrees of freedom. The default is 0.
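A minimal sketch of nfit(), using hypothetical observed counts and hypothetical fitted frequencies for five classes. One parameter is assumed to have been estimated from the data, so the degrees of freedom should be (5 - 1) - 1 = 3 rather than 4.

```stata
* Hypothetical data: both rows sum to 100.
chitesti 12 25 31 20 12 \ 13.5 24.1 30.2 21.0 11.2 , nfit(1)

* Degrees of freedom actually used by the test.
display r(df)
```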
replace indicates that the observed and expected frequencies are to be left as the current data in place of whatever data were there. These variables will be called observed and expected.
list_options are options of list.
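A short sketch of replace, using the Mendel data from the Examples. Save any data in memory first, as they will be discarded. The variable name presid is illustrative.

```stata
* Caution: replace discards whatever data are currently in memory.
chitesti 315 108 101 32 \ 556*9/16 556*3/16 556*3/16 556*1/16 , replace

* The variables observed and expected are now available for further work,
* for example recomputing the Pearson residuals.
generate double presid = (observed - expected) / sqrt(expected)
list observed expected presid
```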
Examples
Breiman (1973, p.191) lists the frequencies of the digits 0 ... 9 in the first 608 decimal places of pi as 60 62 67 68 64 56 62 44 58 67.
. chitesti 60 62 67 68 64 56 62 44 58 67
Breiman (1973, p.191) also gives data from one of Mendel's experiments. He observed 315 round yellow peas, 108 round green peas, 101 wrinkled yellow peas and 32 wrinkled green peas. According to theory, the expected frequencies should be in the ratio 9:3:3:1.
. chitesti 315 108 101 32 \ 556*9/16 556*3/16 556*3/16 556*1/16
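To test whether the last digits of an integer variable myvar are equally frequent, the frequencies can be collected in a local macro and passed to chitesti.
. gen lastdigit = mod(myvar, 10)
. qui forval i = 0/9 {
.     count if lastdigit == `i'
.     local obs "`obs' `r(N)'"
. }
. chitesti `obs'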
Saved values
r(k)         number of classes in distribution
r(df)        degrees of freedom
r(chi2)      Pearson chi-square
r(p)         P-value of Pearson chi-square
r(chi2_lr)   likelihood-ratio chi-square
r(p_lr)      P-value of likelihood-ratio chi-square
r(emean)     mean expected frequency
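The saved values can be picked up immediately after the command, as in this sketch with the Mendel data. The display formats are a matter of taste.

```stata
quietly chitesti 315 108 101 32 \ 556*9/16 556*3/16 556*3/16 556*1/16

display "classes = " r(k) ", df = " r(df)
display "Pearson chi2 = " %6.3f r(chi2) ", P = " %6.4f r(p)
display "LR chi2 = " %6.3f r(chi2_lr) ", P = " %6.4f r(p_lr)

* The P-value can also be recomputed from the statistic and its df.
display chi2tail(r(df), r(chi2))
```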
Author
Nicholas J. Cox, University of Durham, U.K. n.j.cox@durham.ac.uk
Acknowledgements
Benoit Dulong pointed towards a precision problem.
References
Breiman, L. 1973. Statistics: with a view towards applications. Boston: Houghton Mifflin.
Also see
On-line: help for chitest, tabulate