{smcl}
{* 11feb2003}{...}
{hline}
help for {cmd:ellip7} {right:(SSC: 20030211)}
{hline}
{title:Graphing confidence ellipses}
{title:[An update of {cmd:ellip} for Stata 7]}
{p 8 15}{cmd:ellip7} {it:yvar} [{it:xvar}] [{cmd:if}
{it:exp}] [{cmd:in} {it:range}] [{cmd:,}
{c -(}{cmd:means}|{cmd:coefs} [{cmdab:p:ool}{cmd:(}{it:#}{cmd:)}]{c )-}
{cmdab:c:onstant}{cmd:(}{it:string} [{it:#}]{cmd:)}
{cmdab:l:evel}{cmd:(}{it:#}{cmd:)}
{cmdab:g:enerate}{cmd:(}{it:ynewvar} {it:xnewvar}{cmd:)}
{cmdab:a:dd}{cmd:(}{it:yoldvar} {it:xoldvar}{cmd:)}
{cmdab:nogr:aph}
{cmd:replace}
{cmd:evr}{cmd:(}{it:#}{cmd:)}
{cmdab:np:oints}{cmd:(}{it:#}{cmd:)}
{cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}]
{cmdab:xf:ormat}[{cmd:(%}{it:fmt}{cmd:)}]
{it:graph_options}
]
{title:Description}
{p}{cmd:ellip7} graphs confidence ellipses for approximately normally
distributed data, and is an update of {cmd:ellip} to Stata 7. A confidence
ellipse is the boundary of an elliptical joint 100(1-alpha)% confidence region
for two parameters. In {cmd:ellip7}, the centering variables {it:yvar} and
{it:xvar} are two data variables or the first two independent variables after
an immediately preceding {cmd:regress}. If {cmd:coefs} is specified without
{it:xvar}, then the _cons in {cmd:regress} is used for {it:xvar}. The boundary
constant determines the size of the confidence ellipse.
{title:Options}
{p 0 4}{c -(}{cmd:means}|{cmd:coefs}{c )-} specifies how to center the
confidence ellipse. The default, {cmd:means}, centers the ellipse on the two
variable means, whereas {cmd:coefs} uses the first two regression coefficients from an
immediately preceding {cmd:regress}. If you restricted {cmd:regress} to a
portion of the data using {cmd:if} or {cmd:in}, then you will generally want to
use the same conditions with {cmd:coefs}.
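{p 4 4}For instance, a minimal sketch assuming hypothetical variables {it:dv},
{it:iv1}, {it:iv2}, and {it:group}, with the same {cmd:if} condition repeated
in both commands:{p_end}
{p 4 8}{inp:. regress dv iv1 iv2 if group == 1}{p_end}
{p 4 8}{inp:. ellip7 iv1 iv2 if group == 1, coefs}{p_end}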
{p 0 4}{cmdab:p:ool}{cmd:(}{it:#}{cmd:)} displays a confidence ellipse labeled
{cmd:bp} using all the data, a confidence ellipse labeled {cmd:b} using a
theoretically unproblematic subset, and {it:#} lines connecting #+1 dots
of fractionally pooled regression coefficients at 1/# intervals.
{cmd:pool}{cmd:(}{cmd:)} must be used with {cmd:if} or {cmd:in}, and with
{cmd:coefs}, {cmd:generate}{cmd:(}{cmd:)}, and {cmd:add}{cmd:(}{cmd:)}.
{cmd:pool}{cmd:(}{cmd:)} is incompatible with {cmd:by}{cmd:(}{cmd:)}.
{p 0 4}{cmdab:c:onstant}{cmd:(}{it:string} [{it:#}]{cmd:)} specifies the
boundary constant as a statname and an optional #. The overall default, which
is also the {cmd:means} default, is the standard deviation ellipse with
{cmd:constant}{cmd:(}{it:sd} {it:2}{cmd:)} or, squared,
{cmd:constant}{cmd:(}{it:sq} {it:4}{cmd:)}. The standard deviation ellipse is
also known as the covariance, concentration, data, error, or inertia ellipse.
It is the ellipse which is the most representative of the data points without
any a priori statistical assumptions concerning their origin. With the
statname {it:sq}, the joint confidence level in percent is
{bind:(1 - exp(-#/2)) * 100.} The default corresponds to 95% INDIVIDUAL
confidence intervals, or 86% JOINT confidence intervals. {it:sd} and {it:sq}
cannot be used with {cmd:level}{cmd:(}{cmd:)}. I have NOT implemented the
standard deviation curve for geographical data; see Gong (2002). The
{cmd:coefs} default is {cmd:constant}{cmd:(}{it:f} {it:2}{cmd:)}. The default
{it:#} is 4 for {it:sq}; otherwise it is 2. Available statistics are:
{p 8 22}{it:statname} {space 2} definition{p_end}
{hline 51}
{p 8 22}{cmd:sd} {space 8} standard deviation = #^2{p_end}
{p 8 22}{space 11} cannot be used with {cmd:level}{cmd:(}{cmd:)}{p_end}
{p 8 22}{cmd:sq} {space 8} squared standard deviation = sd^2 = #{p_end}
{p 8 22}{space 11} cannot be used with {cmd:level}{cmd:(}{cmd:)}{p_end}
{p 8 22}{cmd:tsq} {space 7} Hotelling one-sample T-squared{p_end}
{p 8 22}{cmd:hotel} {space 5} same as {cmd:tsq} = #(n-1)/(n-#) * F{p_end}
{p 8 22}{cmd:tsqn} {space 6} sample-adjusted tsq = tsq / n{p_end}
{p 8 22}{cmd:hoteln} {space 4} same as {cmd:tsqn}{p_end}
{p 8 22}{cmd:ptsqn} {space 5} Hotelling T-squared prediction or{p_end}
{p 8 22}{space 11} tolerance ellipse = tsqn * (n+1) / n {p_end}
{p 8 22}{cmd:photeln} {space 3} same as {cmd:ptsqn}{p_end}
{p 8 22}{cmd:chisq} {space 5} Chi-squared{p_end}
{p 8 22}{cmd:chisqn} {space 4} sample-adjusted Chi-squared = chisq / n{p_end}
{p 8 22}{cmd:f} {space 9} F = 2F * (#,n-#){p_end}
{p 8 22}{cmd:fadj} {space 6} F-adjusted = 2F * (2,n-#){p_end}
{p 8 22}{space 11} Defaults {cmd:f} {it:2} and {cmd:fadj} {it:2} are
equivalent{p_end}
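{p 4 4}As a rough check of the {it:sq} confidence-level formula given for this
option, the joint confidence level implied by a given {it:#} can be computed
directly with {cmd:display}. The values of {it:#} below are illustrations
only:{p_end}
{p 4 40}{inp:. display (1 - exp(-4/2)) * 100} {space 2}(default {it:sq} {it:4}, roughly 86%){p_end}
{p 4 40}{inp:. display (1 - exp(-1/2)) * 100} {space 2}({it:sq} {it:1}, roughly 39%){p_end}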
{p 0 4}{cmdab:l:evel}{cmd:(}{it:#}{cmd:)} specifies the confidence level, in
percent, for calculation of the confidence ellipse; the default {it:#} is 95.
{cmd:level}{cmd:(}{cmd:)} cannot be used with
{cmd:constant}{cmd:(}{it:sd}{cmd:)} and {cmd:constant}{cmd:(}{it:sq}{cmd:)}.
{p 0 4}{cmdab:g:enerate}{cmd:(}{it:ynewvar} {it:xnewvar}{cmd:)} generates two
new variables, {it:ynewvar} and {it:xnewvar}, which define the confidence
ellipse. If the current dataset contains fewer observations than the number of
points set in {cmd:npoints}{cmd:(}{cmd:)}, then the dataset is lengthened
accordingly with missing values, even if {it:ynewvar} and {it:xnewvar} are
temporary variables, and a warning message is displayed. {cmd:generate}{cmd:(}{cmd:)}
cannot be used with {cmd:by}{cmd:(}{cmd:)}.
{p 0 4}{cmdab:a:dd}{cmd:(}{it:yoldvar} {it:xoldvar}{cmd:)} adds an old
confidence ellipse to the new confidence ellipse. The result is two overlaid
confidence ellipses in the same graph. {cmd:add}{cmd:(}{cmd:)} may be used
with, but does not require, {cmd:generate}{cmd:(}{cmd:)}.
{p 0 4}{cmdab:nogr:aph} suppresses the display of the graph.
{p 0 4}{cmd:replace} replaces any existing variables in
{cmd:generate}{cmd:(}{cmd:)}.
{p 0 4}{cmd:evr}{cmd:(}{it:#}{cmd:)} specifies the error variance ratio, where
# is a floating point number between 0 and 10^36. The default is 1. evr(0)
corresponds to regression of x on y, evr(1) to orthogonal regression, and a
larger number, say evr(999), corresponds to regression of y on x. See McCartin
(n.d.).
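{p 4 4}A minimal sketch of how the three cases above might be compared,
assuming hypothetical variables {it:y} and {it:x}:{p_end}
{p 4 30}{inp:. ellip7 y x, evr(0)} {space 4}(regression of x on y){p_end}
{p 4 30}{inp:. ellip7 y x, evr(1)} {space 4}(orthogonal regression; the default){p_end}
{p 4 30}{inp:. ellip7 y x, evr(999)} {space 2}(close to regression of y on x){p_end}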
{p 0 4}{cmdab:np:oints}{cmd:(}{it:#}{cmd:)} specifies # points to be calculated
for the confidence ellipse. The default is 400. You seldom have to use this
option, but users with Small Stata may want to lower the number. If the
output looks jagged, try increasing the number.
{p 0 4}{cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] specifies the display format of
the y-axis. The default is to use a {cmd:%9.0g} format.
{p 4 4}{cmd:yformat} specifies that the y-axis uses the {it:yvar}'s display
format.
{p 4 4}{cmd:yformat(%}{it:fmt}{cmd:)} specifies the format to be used for the
y-axis (see help {help format}).
{p 0 4}{cmdab:xf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] specifies the display format of
the x-axis; see the {cmdab:yf:ormat}[{cmd:(%}{it:fmt}{cmd:)}] option above.
{p 0 4}{it:graph_options} are any options allowed with {cmd:graph, twoway},
including {cmd:by}{cmd:(}{it:varname}{cmd:)}. {cmd:by}{cmd:(}{cmd:)} is
incompatible with {cmd:pool}{cmd:(}{cmd:)}. {cmd:by}{cmd:(}{cmd:)} with many
groups may exceed the "width" of the dataset because of the {cmd:stack}
included in {cmd:ellip7}. Defaults are: c(l) s(.) t1(" ") t2(" ") l1({it:yvar})
b2({it:xvar}) [or c(ll) s(..) if {cmd:add}{cmd:(}{cmd:)} is specified, etc.],
and l1(Estimated {it:yvar}) and b2(Estimated {it:xvar}) if {cmd:coefs} is
specified.
{title:Remarks}
{p 4 4}The latest version for Stata 7 is version 1.3.1 of {cmd:ellip7}. The
last version for Stata 6 was version 1.2.0 of {cmd:ellip6}. To use the
{cmdab:p:ool}{cmd:(}{it:#}{cmd:)} option, you must have gphdt.ado and
gphsave.ado installed.
{p 4 4}{cmd:ellip7} is a graphics command, but {cmdab:g:enerate}{cmd:(}{cmd:)}
may lengthen the dataset. Only one statistic may be requested in
{cmd:constant}{cmd:(}{cmd:)}; a simple but limited workaround is to use
{cmd:add}{cmd:(}{cmd:)}.
{p 4 4}The {cmd:by}{cmd:(}{it:varname}{cmd:)} graph option bug has been fixed
in {cmd:ellip7} but not in {cmd:ellip6}. In {cmd:ellip7}, the graph option now
displays an ellipse for each value of {it:varname} in {cmd:by}{cmd:(}{cmd:)},
as expected. {cmd:ellip7} also introduces the {cmd:nograph} option and the
{it:sq} argument to {cmd:constant}{cmd:(}{cmd:)}.
{p 4 4}Stata 8 became available in January 2003. Stata 8 has a new graphics
programming language, and many new graphics features. For example, Stata 8 has
a new built-in method for overlaying graphs with a ||-separator and a
()-binding notation. To create overlaid confidence ellipses with Stata 7 or
Stata 6, I recommend Nick Cox's muxyplot.ado. That is, generate the
ellipse variables with the {cmd:gen}{cmd:(}{cmd:)} option, and then use
{cmd:muxyplot} {it:yvarlist} {it:xvarlist}. The complementary command
muxyplot with helpfile can be downloaded separately from SSC.
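{p 4 4}The workflow just described might look as follows. This is only a
sketch: {it:y}, {it:x}, and the generated variable names are hypothetical, and
the exact {cmd:muxyplot} syntax should be checked in its own help file.{p_end}
{p 4 8}{inp:. ellip7 y x, g(sdy sdx) nograph}{p_end}
{p 4 8}{inp:. ellip7 y x, c(hoteln) g(hy hx) nograph}{p_end}
{p 4 8}{inp:. muxyplot sdy hy sdx hx}{p_end}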
{p 4 4}Version 1.3.0 from 20030116 had two bugs which have been fixed in 1.3.1.
First, by() with coefs would not report the coefs results and, more importantly,
would incorrectly report the default (means) by-results instead; this bug does
not apply to ellip6 and ellip5, because they are not byable. Second, pool()
would only use two independent variables as part of its calculations even if
the immediately preceding regress command used more independent variables; this
bug still affects ellip6 (the original version of ellip/ellip5 never had the
bug, because it was used after fit rather than after regress).
{p 4 4}The author is currently developing the program in Stata 8. Please
contact the author if you want to contribute in any way.
{title:Examples}
{p 1 26}{inp:. ellip7 y x} {space 11} (graph sd ellipse){p_end}
{p 1 26}{inp:. ellip7 y x, g(sdy sdx)} {space 1}(graph and generate sd
ellipse){p_end}
{p 1 26}{inp:. ellip7 y x, c(hoteln) a(sdy sdx)}{break}
(overlaid graph of 95% Hotelling confidence ellipse and previous sd ellipse)
{p_end}
{p 1 26}{inp:. reg dv iv}{p_end}
{p 1 26}{inp:. ellip7 iv, coefs c(chisq)}{break}
(graphs a 95% Chi-square confidence ellipse around the regression coefficient
for iv and around _cons in the preceding regression){p_end}
{title:Author}
{p 5}Anders Alexandersson {p_end}
{p 5}ITS, Mississippi State University{p_end}
{p 5}Mississippi State, MS 39762{p_end}
{p 5}USA{p_end}
{title:References}
{p 5 10}Batschelet, E. 1981. Circular Statistics in Biology. London and New
York: Academic Press.
{p 5 10}Gong, J. 2002. Clarifying the Standard Deviational Ellipse. Geographical
Analysis 34(2): 155-167.
{p 5 10}Johnson, R., and D. Wichern. 2002. Applied Multivariate Statistical
Analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall.
{p 5 10}McCartin, B. n.d. A Geometric Characterization of Linear Regression.
Statistics: A Journal of Theoretical and Applied Statistics.
{title:Also see}
Manual: {hi:[R] graph}, {hi:[R] gph}
STB: {hi:STB-46 gr32}, {hi:STB-34 gr20}
{p 0 19}On-line: help for {help gphsave}, {help gphdt}, {help muxyplot}
(if installed){p_end}