{smcl} {* *! version 1.3.2 18oct2011}{...} {cmd:help prsnperd} {hline} {title:Title} {p2colset 5 17 16 2}{...} {p2col:{hi:prsnperd} {hline 2}}A utility for creating person-period datasets for discrete time longitudinal analyses{p_end} {p2colreset}{...} {title:Syntax} {p 8 18 2} {cmd:prsnperd} {it:id length-to-event} [{it:censor}] [{cmd:, {ul on}t{ul off}runcate(}{it:#}{cmd:)} {cmd:{ul on}p{ul off}retrunc(}{it:#}{cmd:)} {cmd:{ul on}cs{ul off}witch} {cmd:tvp(}{it:names}{cmd:)} {cmd:fev(}{it:name}{cmd:)} {cmd:copyleft}] {synoptset 28 tabbed}{...} {synopthdr} {synoptline} {syntab:Miscellaneous} {synopt :{opt t:runcate(#)}}truncate the maximum time of {it:length-to-event}{p_end} {synopt :{opt p:retrunc(#)}}ignore some initial time periods in the model{p_end} {synopt :{opt cs:witch}}invert {it:censor} coding{p_end} {synopt :{opt tvp(names)}}provide root names of flat-encoded time varying predictors{p_end} {synopt :{opt fev(name)}}provide root name of flat-encoded time varying event occurrence{p_end} {synopt :{opt copyleft}}display license information{p_end} {synoptline} {p2colreset}{...} {p 4 6 2} {title:Description} {pstd} {cmd:prsnperd} transforms a person-time dataset into a person-period dataset for discrete-time longitudinal analyses, for example, using {cmd:dthaz}. Input variables are {it:id}: the unique id number of each observed individual in the person-time dataset; {it:length-to-event}: the duration to event occurrence (in number of discrete time intervals since the study's Beginning of Time); and {it:censor}, which indicates censoring status of the observed individual (where 0 = not censored; and 1 = censored, unless the {cmd:{ul on}cs{ul off}witch} option is used). Given an input data set of this form, an output dataset is created with expanded observations and several new variables. {pstd} NOTE: individuals who were never observed to have experienced an event should be coded as having a {it:length-to-event} equal to their total time in the study, and should be censored. {pstd} Each individual observation within the person-time dataset is replaced with a number of new observations equal to {it:length-to-event} for that {it:id}. If there is no event occurrence for a given time period, the user is so notified. Within these new observations either one, or several new variables are created, depending on whether the survival analysis or growth-modeling syntax is used. If application is for growth modeling, then only the {it:_period} variable is created, otherwise all the following variables are produced. {p 4 12 2}{it:_period}{space 1}Specific time interval of this observation. Each {it:id} will have at least one observation with {it:_period} = 1. The maximum value for {it:_period} is equal to the maximum {it:length-to-event} of the person-time dataset (or to {cmd:{ul on}t{ul off}runcate} if specified).{p_end} {p 4 12 2}{it:_d1-_dX}{space 1}(Where X is the maximum value for {it:period}) These are indicator variables (i.e. "dummy variables") for the current period.{p_end} {p 4 12 2}{it:_Y}{space 6}_Y indicates event occurrence for the given period (where 0 = event did not happen and 1 = event happened). _Y is usefully employed as the outcome in event history models. As in a simple logit hazard model:{p_end} {p 12 16 2}{inp:. logit _Y d1-d8, nocons}{p_end} {p 12 12 2}produces an estimate of baseline hazard corresponding perfectly with the sample hazard where ^H(t_j) = 1/1+e^-(B_j). The estimate becomes more interesting when additional predictors are added thus:{p_end} {p 12 16 2}{inp:. logit _Y d1-d8 age, nocons or}{p_end} {p 12 12 2}Exploration of estimated differences in ^H(t_j) can therefore be modeled using standard nested models of multiple predictors. The {cmd:or} function provides estimated odds for hazard of event compared to non-event for each predictor. {p 4 12 2}{it:_status}{space 1}A categorical status variable for producing life-tables (where 1 = event occurred; 2 = event did not occur; and 3 = censored). Life tables with sample hazard can be created by using the following:{p_end} {p 12 16 2}{inp:. tabulate _period _status, row}{p_end} {title:Options} {dlgtab:Miscellaneous} {phang}{cmd:{ul on}t{ul off}runcate(}{it:#}{cmd:)} restricts the maximum value for {it:length-to-event}, censoring those observations with integer values greater than {cmd:{ul on}t{ul off}runcate}.{p_end} {p 4 4}NOTE:{space 3}Specifying values of {cmd:{ul on}t{ul off}runcate} greater than the maximum value of {it:length-to-event} (or specifying negative values) produces the same dataset as one with no value of {cmd:{ul on}t{ul off}runcate} specified.{p_end} {phang}{cmd:{ul on}p{ul off}retrunc(}{it:#}{cmd:)} discards early time periods from the new dataset. For example, when pre-truncating with a value of 2, the period that would be indicated by _d3 becomes _d1 instead, and the value of _period would be decreased by 2.{p_end} {p 4 4}NOTE:{space 3}Specifying values of {cmd:{ul on}t{ul off}runcate} greater than the one minus the maximum value of {it:length-to-event} (or specifying negative values) produces the same dataset as one with no value of {cmd:{ul on}t{ul off}runcate} specified. Also, {cmd:{ul on}t{ul off}runcate} and {cmd:{ul on}p{ul off}retrunc} cannot be combined when their values would result in fewer than two periods. Discrete time survival analyses conducted upon pre-truncated datasets are, in effect analyses conducted upon separate populations from the not pre-truncated datasets {it:if the conditional hazard during the pre-truncated periods is greater than zero}. The author suggests that an analyst may desire to perform a pre-truncated analysis either because there are no events during initial periods, or because she is interested in analyzing a surviving sub-population at a later starting period. However, in cases where events occurred during the pre-truncated periods, a survival analysis cannot be said to generalize to the population of the not pre-truncated dataset. In cases where events occur in initial periods, but at rates that are too few to provide reliable estimates for these periods, the analyst should both employ a sensitivity analysis to describe differences between models on pre-truncated and not pre-truncated datasets, but also examine the characteristics of anomalous individuals--qualitative data may particularly help illuminate how these persons differ from the majority of individuals who remain in the pre-truncated dataset.{p_end} {phang}{cmd:{ul on}cs{ul off}witch} tells {cmd:prsnperd} to expect that censored data are coded with 0 = censored, and 1 = event/failure.{p_end} {phang}{cmd:tvp(}{it:names}{cmd:)} generates variable(s) with the supplied name(s) if the names correspond precisely to prefixed portions of flat coded time varying predictors. Person-time data sets are often constructed with time-varying predictors encoded in such a format (for example, {it:predictor1}, {it:predictor2}, {it:predictor3}, {it:predictor4}, where the numeric suffix indicates which time-period the observation was made in). Missing values will not be imputed. The time-designation in the suffix must be ordered in the same manner as the periods of observation. {phang}{cmd:fev(}{it:name}{cmd:)} constructs variables named {it:length_to_event} and {it:censored} with appropriate values if event data are in a flat indicator format (for example, {it:event1 event2 event3 event4}), rather than in a single {it:length-to-event} variable by specifying the common portion of the event variables' names (for example {cmd:"}{it:event}{cmd:"}). This option assumes that all event variables share a common prefix ({it:name}), that {it:name} has values {cmd:0} (no event), {cmd:1} (event, or first event), or {cmd:.} (censored), and that there is no left-censoring of observations. The created variables will override supplied {it:length-to-event} and {it:censored} variables. {cmd:prsnperd} will exit with an error if it encounters left-censored data with the {cmd:fev} option. {cmd:fev} also expects that no data are middle-censored (i.e. all time periods have been observed for each individual between the study's beginning of time and either the first occurence of the event, or right-censoring). {phang}{cmd:copyleft} {cmd:prsnperd} is free software, licensed under the GPL. The {cmd:copyleft} option displays the copying permission statement for {cmd:prsnperd} which is a part of the {cmd:dthaz} package. The full license can be obtained by typing: {p 12 8 2} {inp: . net describe dthaz, from (http://www.doyenne.com/stata)} {phang} and clicking on the {net "describe dthaz, from (http://www.doyenne.com/stata)":click here to get} link for the ancillary file. {title:Examples} {p 4 8}{inp:. prsnperd id length censored}{p_end} {p 4 8}{inp:. prsnperd id length censored, truncate(8)}{p_end} {p 4 8}{inp:. prsnperd id, tvp(predictor) fev(event)}{p_end} {title:Author} Alexis Dinno Portland State University alexis dot dinno at pdx dot edu Please contact me with any questions, bug reports or suggestions for improvement. My thanks to Dr. Suzanne Graham, Dr. Jim Stiles, and Dr Anna Song. {title:References} {p 0 10} Singer JD and Willett JB. 2003. {it:Applied Longitudinal Data Analysis: Modeling Change and Event Occurence}. Oxford, UK: Oxford University Press. 672 pages. {p 0 10} Willet JB and Singer JD. 1991. "From Whether to When: New Methods for Studying Student Dropout and Teacher Attrition." {it:Review of Educational Research}. 61: 407-450 {p 0 10} Singer JD and Willett JB. 1991. "Modeling the Days of Our Lives: Using Survival Analysis When Designing and Analyzing Longitudinal Studies of Duration and Timing of Events." {it:Psychological Bulletin.} 110: 268-290 {title:Also See} {psee} {space 2}Help: {help dthaz:dthaz}, {help msdthaz:msdthaz}