-------------------------------------------------------------------------------
help for datacheck
-------------------------------------------------------------------------------

Assert with report

datacheck true_or_false_condition [if condition] [in range] [, by(byvarlist) message(string) varshow(varlist) previous next flag nolist list_options]

Description

datacheck checks that a specified condition is true for every observation and reports any observations for which it is false.

Options

by(byvarlist) evaluates the condition under the by byvarlist: prefix. This allows, for example, conditions that refer to _n and _N within each group defined by byvarlist. The dataset must already be sorted by byvarlist. If this option is specified, list output is by default separated by byvarlist.

message(string) displays the given message string if any contradictions are found.

varshow(varlist) restricts list output to the variables in varlist. If this option is not specified, all variables in the dataset are listed.

previous and next list the observation immediately before and/or after each observation that contradicts the assertion, in addition to the contradicting observation itself. This can be especially useful when the data are in time order.

flag leaves behind a binary variable named _contra in the dataset, taking the value 1 for observations failing the check and 0 otherwise. Any existing _contra is dropped automatically the next time datacheck is run, and a new one is generated only if flag is specified again.

nolist suppresses output of list. Output is restricted to a brief report on contradictions.

list_options are options of list.
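
As an illustration of the flag option, the commands below sketch one way to work with the failing observations after the check has run. The variables id and age are borrowed from the examples in this help file; count and list are standard Stata commands, and _contra exists only because flag was specified.

. datacheck age < ., varshow(id age) message(Missing age) flag nolist
. count if _contra == 1
. list id age if _contra == 1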

Remarks

Unlike assert, datacheck does not exit with an error when the condition is contradicted. It only produces output, which must be displayed for the contradiction to be noticed. Because run suppresses output, always prefix this command with noisily in do-files that are executed with run.
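
A minimal sketch of this, assuming a hypothetical do-file named checks.do that contains the following line (the condition is taken from the examples in this help file):

    noisily datacheck age < ., varshow(id age) message(Missing age)

The do-file can then be executed with run, and the report from datacheck should still be displayed:

. run checks.do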

Examples

. datacheck age < ., varshow(id age) message(Missing age)

. datacheck drug == 3 if arm == 1, varshow(id drug arm) message(Wrong drugs)

. datacheck time > time[_n-1], varshow(id time) message(Dates do not follow) prev

. datacheck time == 0 if _n == 1, by(id) varshow(id time) message(Patient's first record is not at time 0)
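
Because by() requires the data to be sorted by byvarlist, a check such as the last one would typically be preceded by a sort. The following is a sketch only, assuming that each patient's records should be ordered by time within id:

. sort id time
. datacheck time == 0 if _n == 1, by(id) varshow(id time) message(First record is not at time 0)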

Author

Krishnan Bhaskaran
London School of Hygiene and Tropical Medicine
Keppel Street
London WC1E 7HT
krishnan.bhaskaran@lshtm.ac.uk

Also see

Manual: [D] assert