{smcl}
{* 23-mar-Jan2005; rev 3-Oct-2005; 4-Jan-2006}
{hline}
help for {hi:assertky}
{hline}

{title:Sort the data and assert that the given varlist is a key for the dataset}

{p 8 17 2}
{cmd:assertky} {it:varlist} [{cmd:if} {it:exp}] [{cmd:in} {it:range}]
[{cmd:, stable gen_n(}{it:varname1}{cmd:)} {cmd:gen_N(}{it:varname2}{cmd:)}]

{p 4 4 2}
Alternative syntax:

{p 8 17 2}
{cmd:assertky} [{cmd:if} {it:exp}] [{cmd:in} {it:range}] {cmd:, basis(}{it:varlist}{cmd:)}
[{cmd:stable gen_n(}{it:varname1}{cmd:)} {cmd:gen_N(}{it:varname2}{cmd:)}]


{title:Description}

{p 4 4 2}
{cmd:assertky} will sort the data on {it:varlist} and test whether
{it:varlist} is a key for the dataset,
that is, whether the values in {it:varlist} uniquely identify observations.
This is useful when you wish to
sort the data and test whether {it:varlist} is a key in a single step,
such as in preparation for a {help merge} or certain {help by} operations.
If the test fails, {cmd:assertky} exits with an error condition.
{cmd:assertky} will leave the data sorted on {it:varlist} (regardless of whether the
test succeeds).


{title:Remarks}

{p 4 4 2}
You must use one of the two syntaxes shown; you may not combine them.  (The
first syntax may be easier to use; the second syntax is allowed
for backward-compatibility.)
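
{p 4 4 2}
As an illustration (borrowing the variable names used in the examples below),
the following two commands use the first and the second syntax, respectively,
and should request the same sort and test:

{p 4 8 2}{cmd:. assertky emplid effdate, stable}{p_end}
{p 4 8 2}{cmd:. assertky, basis(emplid effdate) stable}{p_end}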

{p 4 4 2}
The {cmd:if} and {cmd:in} qualifiers will probably be needed only rarely.
They are useful when
the key test fails on the entire dataset but might succeed on a specific subset.
If a qualifier is used, the dataset
is left sorted on {it:varlist}, with the excluded cases appearing
at the front of each set of constant values of {it:varlist}.


{title:Options}

{p 4 4 2}
{cmd:stable} specifies that you want a stable sort; cases that have the same
values in {it:varlist} (i.e., those that violate the key condition) will
appear, within sets of constant values of {it:varlist}, in the same order as
they were prior to the sort.  See {help sort}.  {cmd:stable} makes no difference
if the key test succeeds.

{p 4 4 2}
{cmd:gen_n(}{it:varname1}{cmd:)} specifies that if the key test fails,
then a variable will be generated that enumerates the cases within sets having
the same values in {it:varlist}.  Note that in order to have these values
set consistently, you should also specify {cmd:stable}.

{p 4 4 2}
{cmd:gen_N(}{it:varname2}{cmd:)} specifies that if the key test fails,
then a variable will be generated that reports the numbers of cases having
the same values in {it:varlist}.

{p 4 4 2}
Note that both {cmd:gen_n} and {cmd:gen_N} will generate the variables only if
the key test fails.  These options can be used to identify cases that
cause the test to fail (the "key violations").  ({cmd:gen_N} may be more
useful than {cmd:gen_n}.)
See the examples below.  Also, any cases excluded by an {cmd:if} or {cmd:in}
qualifier will receive a missing value.


{title:Examples}

{p 4 8 2}
{cmd:. assertky familyid person_no year}

{p 4 8 2}{cmd:. assertky emplid effdate}{p_end}
{p 4 8 2}{cmd:. merge emplid effdate using otherdataset}{p_end}

{p 4 8 2}{cmd:. assertky cust_no prod_serial_no}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. assertky cust_no prod_serial_no if status=="A"}{p_end}

{p 4 8 2}{cmd:. assertky emplid effdate}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. assertky emplid effdate, gen_N(N)}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. list if N>1, sepby(emplid effdate)}{p_end}
{p 8 8 2}/* shows all sets of key violations */{p_end}

{p 4 8 2}{cmd:. assertky emplid effdate, gen_n(n) gen_N(N)}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. list if N>1 & n==1}{p_end}
{p 8 8 2}/* shows one example from each set of key violations */{p_end}
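
{p 4 4 2}
A variation on the preceding example, sketched here on the assumption that the
original order of the duplicates is meaningful: adding {cmd:stable}, as advised
under Options, should keep the values of {cmd:n} consistent from run to run.

{p 4 8 2}{cmd:. assertky emplid effdate, stable gen_n(n) gen_N(N)}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. list emplid effdate n N if N>1, sepby(emplid effdate)}{p_end}
{p 8 8 2}/* shows each set of key violations, numbered in their original order */{p_end}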

{p 4 4 2}
Note that if an {cmd:if} or {cmd:in} qualifier is used in combination with
{cmd:gen_n} or {cmd:gen_N}, you should accommodate the possibility of missing
values in the generated variables:

{p 4 8 2}{cmd:. assertky emplid effdate if status=="A", gen_N(N)}{p_end}
{p 4 8 2}{err}varlist is not a key{p_end}
{p 4 8 2}{txt}{search r(459):r(459);}{p_end}
{p 4 8 2}{cmd:. list if N>1 & ~mi(N), sepby(emplid effdate)}{p_end}

{p 4 4 2}
Alternatively, you could code:

{p 4 8 2}{cmd:. list if N>1 & status=="A", sepby(emplid effdate)}{p_end}


{title:Further Remarks}

{p 4 4 2}
{cmd:assertky} is useful prior to a {help merge}, though {help sort} is
just as good, provided that the {cmd:merge} command is used with the {cmd:uniq} or
{cmd:uniqm} option.  ({cmd:assertky} was initially developed prior to the advent of these
options in {cmd:merge}, and one of the motivations for its development was
to facilitate ensuring the key condition in a {cmd:merge}.)

{p 4 4 2}
Another useful application is prior to a {cmd:by:} prefix command, where a
secondary sort varlist is used.  (The "secondary" variables are those that
appear in parentheses in a {cmd:by:} prefix command.  See {help by}.)
In that case, you will often
want to be sure that the variables, including the secondaries, put the data
into a unique sort order.  (In these instances, the primary {cmd:by:} variables
serve mainly to group the observations; the actual order of the groups is
unimportant, but the uniqueness of the sort on the secondary variable(s)
may be necessary for the correct functioning of the subsequent command.)  Example:

{p 4 8 2}{cmd:. assertky emplid effdate}{p_end}
{p 4 8 2}{cmd:. by emplid (effdate):} {cmd:gen int spellno = _n}{p_end}

{p 4 4 2}
In this situation, {cmd:assertky} is useful because {cmd:sort} (or {cmd:bysort})
alone is not enough to ensure a unique result.

{p 4 4 2}
{cmd:assertky} (without an {cmd:if} or {cmd:in} qualifier or options) is similar to {help isid},
but users may find it easier to use.  It is ostensibly equivalent to

{p 8 8 2}
{cmd:isid} {it:varlist}{cmd:, sort missok}

{p 4 4 2}
though the author has not verified that it is exactly equivalent.



{title:Author}

{p 4 4 2}
David Kantor.  Email {browse "mailto:kantor.d@att.net":kantor.d@att.net} if you observe any
problems.

{title:Also See}

{p 4 4 2}
{help isid}, {help duplicates}, {help funcdep} (part of the collapseunique package)