{smcl}
{* 10jul2003/12mar2004}{...}
{hline}
help for {hi:chitest}
{hline}
{title:Chi-square test for univariate frequency distributions}
{p 8 17 2}
{cmd:chitest}
{it:observed}
[{it:expected}]
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[
{cmd:,}
{cmd:nfit(}{it:#}{cmd:)}
{cmd:count}
{it:list_options}
]
{title:Description}
{p 4 4 2}{cmd:chitest} works on either one or two variables.
{p 4 4 2}By default the first variable is taken to contain observed
frequencies, which must be zeros or positive integers. Optionally, the first
variable is treated as a variable with distinct values to be counted; the
observed frequencies are then used in a chi-square test.
{p 4 4 2}The second variable, if specified, is taken to contain expected
frequencies under some hypothesis, which must be positive. If the second
variable is not specified, the expected frequencies are taken to be equal,
i.e. equal to the mean of the observed frequencies.
{p 4 4 2}The display includes the Pearson chi-square statistic and its
{it:P}-value for a test of the hypothesis, the likelihood-ratio chi-square
statistic and its {it:P}-value, observed frequencies, expected frequencies,
residuals (observed - expected), and Pearson residuals, defined as (observed -
expected) / sqrt(expected).
{p 4 4 2}Any cells with expected frequencies less than 5 are flagged.
{title:Options}
{p 4 8 2}{cmd:nfit()} indicates the number of parameters that have been
estimated from the data. This number will be subtracted from (number of cells -
1) to give the number of degrees of freedom. The default is 0.
{p 4 8 2}{cmd:count} instructs {cmd:chitest} to count the single variable
specified, which is treated as a categorical variable. Note that {cmd:count}
will not produce zero counts; that is, it cannot count what is not present in
the data. In some problems it is easiest to insert zero counts by hand with
{help chitesti}. See the examples for one counting technique.
{p 4 8 2}{it:list_options} are options of {help list}.
{title:Examples}
{p 4 8 2}{cmd:. chitest count Poisson, nfit(1)}
{p 4 8 2}{cmd:. gen lastdigit = mod(price, 10)}{p_end}
{p 4 8 2}{cmd:. chitest lastdigit, count}
{p 4 8 2}{cmd:. gen firstdigit = real(substr(string(myvar),1,1))}{p_end}
{p 4 8 2}{cmd:. gen obs = .}{p_end}
{p 4 8 2}{cmd:. qui forval i = 1/9 {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 8}count if firstdigit == `i'}{p_end}
{p 4 8 2}{cmd:. {space 8}replace obs = r(N) in `i'}{p_end}
{p 4 8 2}{cmd:. {c )-}}{p_end}
{p 4 8 2}{cmd:. gen exp = _N * log10(1 + 1/_n) in 1/9}{p_end}
{p 4 8 2}{cmd:. chitest obs exp}
{title:Saved values}
r(k) number of classes in distribution
r(df) degrees of freedom
r(chi2) Pearson chi-square
r(p) {it:P}-value of Pearson chi-square
r(chi2_lr) likelihood-ratio chi-square
r(p_lr) {it:P}-value of likelihood-ratio chi-square
r(emean) mean expected frequency
{title:Author}
{p 4 4 2}Nicholas J. Cox, University of Durham, U.K.{break}
n.j.cox@durham.ac.uk
{title:Acknowledgements}
{p 4 4 2}Benoit Dulong pointed towards a precision problem.
{title:Also see}
{p 4 17 2}On-line: help for {help chitesti}, {help tabulate}