{smcl} {* *! version 3.0.0 21 Apr 2026}{...} {title:Title} {pstd} {hi:epigenx} {hline 2} Redefine an existing episode file using different episode boundaries {title:Syntax} {p 8 16 2} {cmd:epigenx} {it:varlist}{cmd:,} {opt did(varlist)} [{opt dst(#)} {opt nolabel}] {pstd} where {it:varlist} contains the variables that should define the new episodes. {synoptset 24 tabbed}{...} {synopthdr} {synoptline} {synopt:{opt did(varlist)}}variable(s) that uniquely identify each diary; required{p_end} {synopt:{opt dst(#)}}diary start hour on a 24-hour clock; required unless {cmd:nolabel} is used{p_end} {synopt:{opt nolabel}}do not create clock-time labels or {cmd:clockst}{p_end} {synoptline} {title:Description} {pstd} {cmd:epigenx} takes a dataset that is {bf:already in episode format} and creates a {bf:new episode file} using different episode boundaries. {pstd} A new episode is created whenever any variable listed in {it:varlist} changes value. {pstd} This is useful when an existing episode file is defined using many dimensions, but your research question focuses on only one or a few of them. {pstd} For example: {pmore} - redefine episodes using only {cmd:location} to study mobility - redefine episodes using only {cmd:Enjoy} to study mood changes across the day - redefine episodes using only {cmd:activity} to simplify a complex file - redefine episodes using {cmd:activity location} to retain both dimensions {pstd} In short: {pmore} {help epigen} = calendar file to episode file {cmd:epigenx} = episode file to a different episode file {title:Required variables} {phang} {cmd:start} must exist and contain the start minute of each episode. {phang} {cmd:end} must exist and contain the end minute of each episode. {pstd} The file should already contain one row per episode. {title:Arguments} {phang} {it:varlist} specifies the variables that define the new episodes. Consecutive existing episodes are merged whenever all supplied variables remain unchanged. {pstd} Variables in {it:varlist} may be numeric or string. {pstd} The current version allows up to {bf:20} variables in {it:varlist}. {phang} {opt did(varlist)} specifies one or more variables that jointly identify each diary uniquely. Variables may be numeric or string. {title:Options} {phang} {opt dst(#)} specifies the diary start hour using an integer from 0 to 23. {pstd} Examples: {pmore} {cmd:dst(0)} = diary begins at midnight {cmd:dst(4)} = diary begins at 04:00 {cmd:dst(18)} = diary begins at 18:00 {pstd} This option is used to create readable clock labels for {cmd:start} and {cmd:end}, and to generate {cmd:clockst}. {pstd} {opt dst()} is required unless {cmd:nolabel} is specified. {phang} {opt nolabel} suppresses creation of clock-time labels and suppresses {cmd:clockst}. {title:What the command creates} {pstd} {cmd:epigenx} returns a new episode file with one row per newly defined episode. {pstd} It creates or recreates the following variables: {synoptset 22 tabbed}{...} {synopthdr:Output} {synoptline} {synopt:{cmd:epnum}}episode number within diary{p_end} {synopt:{cmd:start}}start minute of episode{p_end} {synopt:{cmd:end}}end minute of episode{p_end} {synopt:{cmd:time}}episode duration in minutes ({cmd:end-start}){p_end} {synopt:{cmd:clockst}}clock-time start variable; omitted with {cmd:nolabel}{p_end} {synoptline} {pstd} The variables listed in {it:varlist} are retained as the defining diary fields of the new file. {title:How the command works} {pstd} Within each diary, {cmd:epigenx} compares consecutive existing episodes. {pstd} If all variables in {it:varlist} remain unchanged, adjacent episodes are merged into one longer episode. {pstd} If any supplied variable changes, a new episode begins. {pstd} This means the resulting file often contains {bf:fewer episodes} than the starting file, especially when simplifying a richly coded sequence file. {title:Checks and warnings} {pstd} {cmd:epigenx} checks that: {pmore} - {cmd:start} exists - {cmd:end} exists - the dataset appears to be in episode format - identifier variables are present {pstd} If observations contain missing values in any variable listed in {opt did()}, those observations may be dropped after warning messages. {title:Dataset after running the command} {pstd} The output remains an {bf:episode-level} file, but with newly defined episodes. {pstd} Variables listed in {opt did()}, timing variables, and the new episode number appear first in the dataset. {title:Examples} {marker ex1}{...} {bf:Example 1: Redefine episodes using enjoyment only} {pstd} Suppose the original file contains many short episodes, but the goal is to analyse changes in enjoyment across the day. {phang2}{cmd:. use UK2014, clear}{p_end} {phang2}{cmd:. gen start = tid*10 - 10}{p_end} {phang2}{cmd:. gen end = start + eptime}{p_end} {phang2}{cmd:. epicheck, did(serial pnum daynum)}{p_end} {phang2}{cmd:. epigenx Enjoy, did(serial pnum daynum) dst(4)}{p_end} {pstd} Adjacent episodes with the same enjoyment score are merged. {marker ex2}{...} {bf:Example 2: Redefine episodes using location} {phang2}{cmd:. epigenx where, did(pid day) dst(4)}{p_end} {pstd} Useful for analysing movement between places during the day. {marker ex3}{...} {bf:Example 3: Redefine episodes using activity and location} {phang2}{cmd:. epigenx activity where, did(pid day) dst(4)}{p_end} {pstd} A new episode begins when either activity or location changes. {marker ex4}{...} {bf:Example 4: Faster numeric-only output} {phang2}{cmd:. epigenx Enjoy, did(serial pnum daynum) nolabel}{p_end} {pstd} Use {cmd:nolabel} when you only need numeric times. {title:Remarks} {pstd} {bf:1. Use when the file is already episodic} {pstd} If your starting file is in calendar format (one row per slot), use {help epigen} instead. {pstd} {bf:2. Simplification often reduces file size} {pstd} If the original file uses many diary dimensions, redefining episodes with fewer variables often greatly reduces the number of rows. {pstd} {bf:3. Choose variables based on the research question} {pstd} Use only the dimensions relevant to your analysis. For example, mood research may only need enjoyment, while mobility research may only need location. {pstd} {bf:4. Consecutive identical episodes are merged} {pstd} If the same category reappears later after interruption, it becomes a new episode. Only adjacent episodes with identical values are merged. {pstd} {bf:5. {cmd:dst()} affects labels, not episode boundaries} {pstd} Episode boundaries come from {cmd:start} and {cmd:end}. The {cmd:dst()} option is used for clock labels and {cmd:clockst}. {title:Stored results} {pstd} {cmd:epigenx} does not store results in {cmd:r()} or {cmd:e()}. Results are returned through the transformed dataset. {title:Author} {pstd} Juana Lamote de Grignon-Pérez {break} Centre for Time Use Research (CTUR) {title:Also see} {pstd} {help epigen} to convert calendar files into episode files. {pstd} {help timealloc} for diary-level summaries from episode files.