{smcl}
{* *! version 3.0.0 21 Apr 2026}{...}

{title:Title}

{pstd}
{hi:epitrans} {hline 2} Transform wide-interval diary rows into standard episode format

{title:Syntax}

{p 8 16 2}
{cmd:epitrans} {it:actvar}{cmd:,} {opt did(varlist)} {opt sim(varname)} {opt napi(#)} [{opt dur(varname)}]

{synoptset 24 tabbed}{...}
{synopthdr}
{synoptline}
{synopt:{it:actvar}}activity variable to be transformed; required{p_end}
{synopt:{opt did(varlist)}}variable(s) that uniquely identify each diary; required{p_end}
{synopt:{opt sim(varname)}}simultaneity flag (1 = simultaneous, 0 = sequential); required{p_end}
{synopt:{opt napi(#)}}maximum number of activity layers to create (2 to 6); required{p_end}
{synopt:{opt dur(varname)}}within-slot duration variable for unequal sequential splits; optional{p_end}
{synoptline}

{title:Description}

{pstd}
{cmd:epitrans} restructures data from {bf:wide-interval diary formats}, where multiple activities are listed in separate rows sharing the same time interval.

{pstd}
These formats are common in some surveys where respondents report several activities within a broad slot (for example 30 minutes or 1 hour), and the raw file stores one row per listed activity rather than one row per final episode.

{pstd}
{cmd:epitrans} converts this structure into a standard {bf:episode-format} file with clear {cmd:start} and {cmd:end} times and layered activity fields such as primary, secondary, and tertiary activities.

{pstd}
Sequential activities within the same slot are converted into consecutive non-overlapping sub-episodes. Simultaneous activities are consolidated into a single time window with multiple activity layers.


{title:Required variables}

{phang}
{cmd:start} must exist and contain the start minute of the interval or episode.

{phang}
{cmd:end} must exist and contain the end minute of the interval or episode.

{pstd}
The input data may already be in episode-style rows or may come from fixed slots converted into {cmd:start}/{cmd:end} using {help clock2min} or similar preparation.
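
{pstd}
As a minimal sketch of the fixed-slot case, assume a hypothetical variable {cmd:slot} that numbers 30-minute slots from 1 within each diary. The variable name, the slot length, and the choice of minute 0 as the start of the diary day are illustrative assumptions, not requirements of {cmd:epitrans}. The boundaries could then be derived as:

{phang2}{cmd:. generate int start = (slot - 1) * 30}{p_end}
{phang2}{cmd:. generate int end = slot * 30}{p_end}

{pstd}
Under these assumptions, slot 1 covers minutes 0 to 30 and slot 48 covers minutes 1410 to 1440.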

{title:Arguments}

{phang}
{it:actvar} is the activity variable. It may be numeric or string.

{phang}
{opt did(varlist)} specifies one or more variables that jointly identify each diary uniquely.

{phang}
{opt sim(varname)} specifies the simultaneity flag.

{pstd}
Expected coding is:

{phang2}{cmd:1} = simultaneous activity{p_end}
{phang2}{cmd:0} = sequential activity{p_end}

{phang}
{opt napi(#)} specifies the maximum number of activity layers to retain in the output. Allowed values are from {cmd:2} to {cmd:6}.

{phang}
{opt dur(varname)} specifies a numeric duration variable containing within-slot durations for sequential activities.

{pstd}
When supplied, these durations are used instead of equal splitting for sequential activities. Durations attached to simultaneous rows are ignored.
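
{pstd}
Before relying on {opt dur()}, it may be worth checking that the recorded durations add up to the slot length in blocks that contain only sequential rows. The sketch below borrows the names {cmd:id}, {cmd:sim}, and {cmd:durmins} from the examples in this file; {cmd:seqdur} and {cmd:anysim} are hypothetical helper variables. Whether {cmd:epitrans} itself requires the totals to match is not stated here, so treat this as a precaution only:

{phang2}{cmd:. egen double seqdur = total(durmins), by(id start end)}{p_end}
{phang2}{cmd:. egen byte anysim = max(sim), by(id start end)}{p_end}
{phang2}{cmd:. list id start end seqdur if anysim == 0 & seqdur != end - start}{p_end}

{pstd}
Any rows listed belong to all-sequential blocks whose durations do not sum to the interval length.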

{title:What the command creates}

{pstd}
{cmd:epitrans} transforms the data into an episode-format file and creates standard layered activity variables:

{synoptset 20 tabbed}{...}
{synopthdr:Output}
{synoptline}
{synopt:{cmd:__pri}}primary activity{p_end}
{synopt:{cmd:__sec}}secondary activity{p_end}
{synopt:{cmd:__ter}}tertiary activity{p_end}
{synopt:{cmd:__quat}}quaternary activity{p_end}
{synopt:{cmd:__fif}}fifth activity layer{p_end}
{synopt:{cmd:__six}}sixth activity layer{p_end}
{synoptline}

{pstd}
Only the layers required by {opt napi()} are kept. For example, {cmd:napi(3)} keeps the primary, secondary, and tertiary layers.

{pstd}
The original activity variable supplied to {cmd:epitrans} is removed and replaced by these standardised activity layers.

{title:How the transformation works}

{pstd}
Within each diary, {cmd:epitrans} groups rows that share the same {cmd:start}/{cmd:end} boundaries into a block.

{pstd}
Then:

{phang}
{bf:If all activities are sequential}{break}
The interval is split into consecutive sub-episodes.

{phang}
{bf:If all activities are simultaneous}{break}
The interval remains one episode, with activities stored across layers.

{phang}
{bf:If a mixture is present}{break}
Sequential rows are laid out as separate episodes, while simultaneous rows are layered within the relevant interval according to row order.

{pstd}
Incoming row order matters. Within each group of rows sharing an interval, the command uses the order of the rows to decide both the sequence of sub-episodes and the priority of activity layers.
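
{pstd}
As an illustration of the two pure cases, consider a made-up diary (all values hypothetical) with two sequential rows in the slot 0 to 60 and two simultaneous rows in the slot 60 to 120:

          id   start   end   act   sim
           1       0    60    10     0
           1       0    60    20     0
           1      60   120    30     1
           1      60   120    40     1

{phang2}{cmd:. epitrans act, did(id) sim(sim) napi(2)}{p_end}

{pstd}
Under the rules described above, and with default equal splitting, the slot 0 to 60 would be expected to become two episodes (0 to 30 with activity 10, then 30 to 60 with activity 20), while the slot 60 to 120 would remain one episode with 30 as the primary and 40 as the secondary activity. This is a reading of the documented behaviour, not captured output.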

{title:How duration splitting works}

{pstd}
By default, sequential activities sharing the same interval receive equal durations.

{pstd}
If equal division leaves remainder minutes, the earliest created sub-episode(s) receive the extra minute(s) so that the total duration matches the original interval exactly.
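
{pstd}
As a worked illustration, take a 30-minute slot with four sequential activities. The base length and the remainder can be checked directly:

{phang2}{cmd:. display floor(30/4)}{p_end}
{phang2}{cmd:. display mod(30, 4)}{p_end}

{pstd}
These return 7 and 2. Assuming the remainder is handed out one minute at a time starting from the first sub-episode (one reading of the rule above), the resulting lengths would be 8, 8, 7, and 7 minutes, which sum to 30.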

{pstd}
If {opt dur()} is supplied, those durations are used for sequential activities instead.

{title:Checks and warnings}

{pstd}
{cmd:epitrans} checks that:

{phang2}
- {cmd:start} and {cmd:end} exist{break}
- {cmd:napi()} is between 2 and 6{break}
- required variables are present

{pstd}
The command also reports summary tables describing the transformed overlap blocks.
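
{pstd}
Some conditions can also be verified by hand before calling the command. The lines below are a sketch using standard Stata commands. The flag name {cmd:simult} is taken from Example 1, and the conditions are assumptions about what a clean input file looks like, not a list of the checks {cmd:epitrans} performs:

{phang2}{cmd:. confirm numeric variable start end}{p_end}
{phang2}{cmd:. assert end > start}{p_end}
{phang2}{cmd:. assert inlist(simult, 0, 1)}{p_end}

{pstd}
The second line assumes that no interval wraps past the end of the diary day. If the data contain such intervals, that check would need adapting.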

{title:Dataset after running the command}

{pstd}
The dataset is left in {bf:episode format}, with corrected {cmd:start}/{cmd:end} timing and standardised activity layers.

{pstd}
The number of rows may increase or decrease depending on how many intervals are split or consolidated.

{title:Examples}

{marker ex1}{...}
{bf:Example 1: Basic transformation}

{phang2}{cmd:. epitrans act, did(hid pid) sim(simult) napi(3)}{p_end}

{pstd}
Creates the primary, secondary, and tertiary layers ({cmd:__pri}, {cmd:__sec}, {cmd:__ter}) for diaries identified by {cmd:hid} and {cmd:pid}.

{marker ex2}{...}
{bf:Example 2: Use recorded unequal durations}

{phang2}{cmd:. epitrans act, did(id) sim(sim) napi(2) dur(durmins)}{p_end}

{pstd}
Uses {cmd:durmins} to split sequential activities within intervals.

{marker ex3}{...}
{bf:Example 3: Check result afterward}

{phang2}{cmd:. epicheck, did(hid pid)}{p_end}

{pstd}
Run {help epicheck} after the transformation to confirm that the resulting file is structurally valid.

{title:Remarks}

{pstd}
{bf:1. Sort order matters}

{pstd}
Before running {cmd:epitrans}, sort carefully by diary identifier, then by interval ({cmd:start}/{cmd:end}), and then by any within-slot priority variable if one exists.
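
{pstd}
A minimal sketch of such a sort, assuming diary identifiers {cmd:hid} and {cmd:pid} (as in Example 1) and a hypothetical within-slot ordering variable {cmd:priority}:

{phang2}{cmd:. sort hid pid start end priority, stable}{p_end}

{pstd}
The {cmd:stable} option keeps tied rows in their existing order, which helps when the file order already reflects the order in which activities were reported.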

{pstd}
{bf:2. Use a major-activity indicator when available}

{pstd}
If the source data indicate which activity is the main one, sort so that the main activity appears first within each block.
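
{pstd}
One way to do this, assuming a hypothetical indicator {cmd:main} coded 1 for the main activity and 0 otherwise, is to sort on it in descending order while using a temporary row counter ({cmd:rowno}, also hypothetical) to preserve the existing order among the remaining rows:

{phang2}{cmd:. generate long rowno = _n}{p_end}
{phang2}{cmd:. gsort hid pid start end -main rowno}{p_end}
{phang2}{cmd:. drop rowno}{p_end}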

{pstd}
{bf:3. Choose {cmd:napi()} realistically}

{pstd}
Most datasets require only 2 or 3 layers. Larger values retain more simultaneous activities but produce wider files with more activity variables.

{pstd}
{bf:4. Equal splitting is a fallback}

{pstd}
If real within-slot durations are available, use {opt dur()} whenever possible.

{pstd}
{bf:5. Good harmonisation workflow}

{pstd}
A common sequence is:

{phang2}
raw file -> {help clock2min} -> {cmd:epitrans} -> {help epicheck} -> analysis

{title:Stored results}

{pstd}
{cmd:epitrans} does not store results in {cmd:r()} or {cmd:e()}. Results are returned through the transformed dataset and printed summaries.

{title:Author}

{pstd}
Juana Lamote de Grignon-Pérez
{break}
Centre for Time Use Research (CTUR)

{title:Also see}

{pstd}
{help epicheck} for diagnosing structural issues in episode files.

{pstd}
{help clock2min} for creating {cmd:start} and {cmd:end} from clock-style variables.