{smcl}
{* *! version 3.0.0 21 Apr 2026}{...}

{title:Title}

{pstd}
{hi:epicheck} {hline 2} Diagnose structural problems in episode-format diary files

{title:Syntax}

{p 8 16 2}
{cmd:epicheck}{cmd:,} {opt did(varlist)} [{opt quiet}]

{synoptset 24 tabbed}{...}
{synopthdr}
{synoptline}
{synopt:{opt did(varlist)}}variable(s) that uniquely identify each diary; required{p_end}
{synopt:{opt quiet}}suppress output when no issues are detected{p_end}
{synoptline}

{title:Description}

{pstd}
{cmd:epicheck} is a diagnostic command for {bf:episode-format} diary files.

{pstd}
Given a diary identifier and the variables {cmd:start} and {cmd:end}, it scans each diary for 
several types of structural problems that commonly arise in episode data. 
These include overlaps, gaps, zero-length episodes, and missing time boundaries.

{pstd}
The command is intended as a quality-checking step before analysis. Problems detected 
by {cmd:epicheck} should usually be investigated and, where appropriate, corrected 
before proceeding. In many workflows, these issues can later be addressed using a 
repair command such as {help epifix}.

{pstd}
A related command, {help tslotcheck}, performs analogous checks on {bf:calendar-format} files.

{title:Required variables}

{phang}
{cmd:start} must exist and contain the start minute of each episode.

{phang}
{cmd:end} must exist and contain the end minute of each episode.

{pstd}
The dataset should already be in episode format, with one row per episode.

{title:Arguments}

{phang}
{opt did(varlist)} specifies one or more variables that jointly identify each diary uniquely.

{title:Option}

{phang}
{opt quiet} suppresses output when no issues are detected.

{pstd}
If issues are found, the report is still printed even when {cmd:quiet} is specified.

{title:Issues checked}

{pstd}
{cmd:epicheck} checks for the following eight issue types:

{phang}
{bf:1. Full overlap}
Two or more episodes within a diary have the same {cmd:start} and {cmd:end} times.

{phang}
{bf:2. Nested episode}
One episode fully contains the next episode.

{phang}
{bf:3. Partial overlap}
Two consecutive episodes overlap in time without complete containment.

{phang}
{bf:4. Gap at min 0}
The first episode of a diary does not begin at minute 0.

{phang}
{bf:5. Gap at end of diary}
The final episode of a diary ends before minute 1440.

{phang}
{bf:6. Gap between episodes}
A positive gap exists between consecutive episodes.

{phang}
{bf:7. Row with start==end}
An episode has zero duration.

{phang}
{bf:8. Row with start==.|end==.}
At least one of {cmd:start} or {cmd:end} is missing.

{title:What the command creates}

{pstd}
When issues are detected, {cmd:epicheck} creates episode-level and diary-level flag variables in the dataset.

{synoptset 28 tabbed}{...}
{synopthdr:Output variables}
{synoptline}
{synopt:{cmd:__flag_case}}episode-level issue code: 0 = no issue, 1 to 8 = issue type{p_end}
{synopt:{cmd:__flag_diary}}diary-level flag: 1 if the diary contains any issue{p_end}
{synopt:{cmd:__flag_diary_1}}diary contains at least one full overlap{p_end}
{synopt:{cmd:__flag_diary_2}}diary contains at least one nested episode{p_end}
{synopt:{cmd:__flag_diary_3}}diary contains at least one partial overlap{p_end}
{synopt:{cmd:__flag_diary_4}}diary contains a gap at the beginning{p_end}
{synopt:{cmd:__flag_diary_5}}diary contains a gap at the end{p_end}
{synopt:{cmd:__flag_diary_6}}diary contains a gap between episodes{p_end}
{synopt:{cmd:__flag_diary_7}}diary contains at least one zero-length episode{p_end}
{synopt:{cmd:__flag_diary_8}}diary contains at least one row with missing {cmd:start} or {cmd:end}{p_end}
{synoptline}

{pstd}
If no issues are detected, these flag variables are dropped before the command finishes.

{title:How issue coding works}

{pstd}
Each episode is assigned at most one value in {cmd:__flag_case}. If a row could qualify for more than one issue type, the command applies the following priority order:

{phang2}
{cmd:8 > 7 > 1 > 2 > 3 > 4 > 5 > 6}

{pstd}
This means, for example, that a row with missing {cmd:start} or {cmd:end} is classified as issue 8 even if other problems would also apply.

{title:How missing values are handled}

{pstd}
Rows with missing values in one or more variables listed in {opt did()} are {bf:not dropped} from the dataset. However, they are ignored during structural issue detection, and the command reports how many such rows were found.

{pstd}
Rows with missing {cmd:start} or {cmd:end} are kept in the data and are classified as issue 8 in the final output.

{title:Output in the Results window}

{pstd}
If no issues are detected, {cmd:epicheck} prints:

{phang2}{cmd:No issues detected.}{p_end}

{pstd}
Otherwise, it prints a summary table showing: the number of flagged rows ({it:Cases}), number of diaries affected ({it:Diaries}), and 
percentage of diaries affected

{pstd}
A total row is shown at the bottom of the table.

{title:Dataset after running the command}

{pstd}
{cmd:epicheck} does not reshape or collapse the dataset. The file remains at the {bf:episode level}.

{pstd}
The command is diagnostic: it inspects the data, reports issues, and adds flag variables when needed.

{title:Examples}

{marker ex1}{...}
{bf:Example 1: Basic episode check}

{phang2}{cmd:. use mtus_hef, clear}{p_end}
{phang2}{cmd:. epicheck, did(hldid persid id)}{p_end}

{pstd}
If the file is structurally sound, the command reports that no issues were detected.

{marker ex2}{...}
{bf:Example 2: Quiet mode}

{phang2}{cmd:. epicheck, did(hldid persid id) quiet}{p_end}

{pstd}
This suppresses output only when no issues are present.

{marker ex3}{...}
{bf:Example 3: Investigate flagged rows}

{phang2}{cmd:. epicheck, did(hldid persid id)}{p_end}
{phang2}{cmd:. tab __flag_case}{p_end}
{phang2}{cmd:. list hldid persid id start end if __flag_case>0}{p_end}

{pstd}
This lets you inspect the problematic rows directly.

{title:Remarks}

{pstd}
{bf:1. Run before analysis}

{pstd}
It is good practice to run {cmd:epicheck} before commands that rely on structurally valid episode timing.

{pstd}
{bf:2. Flags are diagnostic, not corrections}

{pstd}
{cmd:epicheck} identifies problems but does not repair them. Use the flags to inspect the data or pass the file to a repair workflow.

{pstd}
{bf:3. Missing diary identifiers are treated separately}

{pstd}
Rows with missing values in {opt did()} are not used in the structural checks, because they cannot be assigned reliably to a diary.

{pstd}
{bf:4. Zero-length and missing-bound rows are preserved}

{pstd}
Unlike some structural checks that work on temporary filtered copies internally, the final output restores these rows and flags them in the original dataset.

{pstd}
{bf:5. No flag variables remain if the file is clean}

{pstd}
If the command finds no issues at all, the flag variables are removed before exit.

{title:Stored results}

{pstd}
{cmd:epicheck} does not store results in {cmd:r()} or {cmd:e()}. Results are returned through the report and, when relevant, through the created flag variables.

{title:Author}

{pstd}
Juana Lamote de Grignon-Pérez
{break}
Centre for Time Use Research (CTUR)

{title:Also see}

{pstd}
{help epifix} for repairing structural issues in episode files.

{pstd}
{help tslotcheck} for analogous checks in calendar-format files.