{smcl}
{* 29Dec2024}{...}


{title:Title}

{p2colset 5 16 17 2}{...}
{p2col :{hi:itsadgp} {hline 2}}Data generating process for interrupted time-series analysis {p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 16 2}
{cmd:itsadgp}{cmd:,}
{cmdab:nt:ime(}{it:integer}{cmd:)}
{cmdab:trp:eriod(}{it:integer}{cmd:)}
{cmdab:int:ercept(}{it:#}{cmd:)}
{cmdab:pre:trend(}{it:#}{cmd:)}
{cmdab:post:trend(}{it:#}{cmd:)}
{cmdab:st:ep(}{it:#}{cmd:)}
[ {cmd:sd(}{it:#}{cmd:)}
{cmd:rho(}{it:{help numlist}}{cmd:)}
{cmd:seed(}{it:#}{cmd:)} ]



{synoptset 25 tabbed}{...}
{synopthdr}
{synoptline}
{p2coldent:* {opt nt:ime}{cmd:(}{it:integer}{cmd:)}}specify the number of time periods in the series{p_end}
{p2coldent:* {opt trp:eriod}{cmd:(}{it:integer}{cmd:)}}specify the time period when the intervention begins {p_end}
{p2coldent:* {opt int:ercept}{cmd:(}{it:#}{cmd:)}}specify the starting value (intercept) of the time series {p_end}
{p2coldent:* {opt pre:trend}{cmd:(}{it:#}{cmd:)}}specify the trend (slope) of the time series prior to the intervention {p_end}
{p2coldent:* {opt st:ep}{cmd:(}{it:#}{cmd:)}}specify the change in the level of the time series immediately following the introduction of the intervention{p_end}
{p2coldent:* {opt post:trend}{cmd:(}{it:#}{cmd:)}}specify the trend (slope) of the time series after introduction of the intervention {p_end}
{synopt:{opt sd}{cmd:(}{it:#}{cmd:)}}specify the standard deviation used to generate the random error term(s); default is {cmd:sd(1)} {p_end}
{synopt:{opt rho}{cmd:(}{it:{help numlist}}{cmd:)}}specify the correlation coefficient(s) between adjacent (autoregressive) error terms {p_end}
{synopt:{opt seed}{cmd:(}{it:#}{cmd:)}}sets the random-number seed to {it:#}{p_end}
{synoptline}
{p 4 6 2}* {opt ntime()}, {opt trperiod()}, {opt intercept()}, {opt pretrend()}, {opt posttrend()}, {opt step()} are required. {p_end}
{p2colreset}{...}



{title:Description}

{pstd}
{cmd:itsadgp} generates artificial interrupted time-series data according to user-supplied specifications, including autoregressive error terms (lags).
In the simplest case, the data generated by {cmd:itsadgp} could be used for simulating a single-group interrupted time series analysis. In a more complex
scenario, {cmd:itsadgp} could be implemented repeatedly using different specifications to generate data for simulating a multiple-group (treatment vs 
control) interrupted time series analysis. While the most straightforward approach to analyzing the data generated by {cmd:itsadgp} is via the {helpb itsa} 
package (Linden 2015), other models, such as {helpb arima}, can also be used for analyzing the generated data. 
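
{pstd}
As a rough sketch of the {helpb arima} alternative, the segmented-regression terms can be built by hand and passed to {cmd:arima} with an AR(1) error. 
The variable names {cmd:post} and {cmd:post_t} are hypothetical, and the sketch assumes that {cmd:itsadgp} leaves the outcome in {cmd:y} and the time 
variable in {cmd:t}, as the examples in this help file suggest.

{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(10) pre(1) post(5) step(30) sd(4) rho(.50) seed(12345)}{p_end}
{pmore2}{cmd:. tsset t}{p_end}
{pmore2}{cmd:. generate post = (t >= 50)}{p_end}
{pmore2}{cmd:. generate post_t = post*(t - 50)}{p_end}
{pmore2}{cmd:. arima y t post post_t, ar(1)}{p_end}

{pstd}
Under this parameterization, the coefficient on {cmd:post_t} is the change in slope, so the post-intervention trend is the sum of the coefficients 
on {cmd:t} and {cmd:post_t}.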

{pstd}
Note: {cmd:itsadgp} replaces the data in memory, so save any unsaved data before running it.
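
{pstd}
One way to protect the data in memory is to wrap the simulation in {helpb preserve} and {helpb restore}. This is only a sketch of one possible 
workflow; saving the dataset to disk beforehand works equally well.

{pmore2}{cmd:. preserve}{p_end}
{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(10) pre(1) post(5) step(30) sd(4) rho(.50) seed(12345)}{p_end}
{pmore2}{cmd:. itsa y, single trperiod(50) posttrend lag(1)}{p_end}
{pmore2}{cmd:. restore}{p_end}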



{title:Options}

{phang}
{cmd:ntime(}{it:integer}{cmd:)} specifies the number of time periods to generate in the series; {cmd:ntime()} is required.

{phang}
{cmd:trperiod(}{it:integer}{cmd:)} specifies the time period when the intervention begins; {cmd:trperiod()} is required.

{phang}
{cmd:intercept(}{it:#}{cmd:)} specifies the starting value of the time series. In an ITSA regression model, this value would
represent the intercept; {cmd:intercept()} is required.

{phang}
{cmd:pretrend(}{it:#}{cmd:)} specifies the pre-intervention trend (slope). In an ITSA regression model, this value would
represent the coefficient for {it:t}; {cmd:pretrend()} is required.
 
{phang}
{cmd:step(}{it:#}{cmd:)} specifies the change in the level of the time series in the period immediately following introduction
of the intervention. In an ITSA regression model, this value would represent the coefficient for {it:x}; {cmd:step()} is required.

{phang}
{cmd:posttrend(}{it:#}{cmd:)} specifies the trend (slope) of the time series following introduction of the intervention. In an ITSA
regression model, this value would be computed post-estimation as the sum of the coefficients for {it:t} and {it:xt}
(where {it:xt} is an interaction between {it:t} and {it:x}); {cmd:posttrend()} is required.

{phang}
{cmd:sd(}{it:#}{cmd:)} specifies the standard deviation used to generate the random error term(s): {it:u_t} is drawn from a normal
distribution with mean 0 and standard deviation {cmd:sd}. A larger {cmd:sd()} induces greater variability in the time series; default is {cmd:sd(1)}.

{phang}
{cmd:rho(}{it:numlist}{cmd:)} specifies the autocorrelation coefficient(s) of the autoregressive error process. For a first-order
autoregressive [AR(1)] process, the random-error term is computed as: {it:epsilon_t} = {it:rho} * {it:epsilon_(t-1)} + {it:u_t} (equation 2 in Linden [2015]).
The number of values specified in {cmd:rho()} sets the number of lags of autocorrelation in the time series, so that one value indicates 1 lag, two
values indicate 2 lags, etc. In practice, an AR(1) process is often sufficient.
 
{phang}
{cmd:seed(}{it:#}{cmd:)} sets the random-number seed.
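
{pstd}
The AR(1) recursion described under {cmd:rho()} can be sketched by hand as follows. This illustrates the formula only and is not necessarily how 
{cmd:itsadgp} builds its errors internally; the variable names {cmd:u} and {cmd:e}, and the choice to start the recursion at 
{it:epsilon_1} = {it:u_1}, are assumptions. Note that {cmd:clear} discards the data in memory.

{pmore2}{cmd:. clear}{p_end}
{pmore2}{cmd:. set seed 12345}{p_end}
{pmore2}{cmd:. set obs 100}{p_end}
{pmore2}{cmd:. generate u = rnormal(0, 4)}{p_end}
{pmore2}{cmd:. generate e = u in 1}{p_end}
{pmore2}{cmd:. replace e = 0.5*e[_n-1] + u in 2/l}{p_end}

{pstd}
Here {cmd:rnormal(0, 4)} plays the role of {cmd:sd(4)} and the multiplier 0.5 plays the role of {cmd:rho(.50)}. Because {cmd:replace} works through 
the observations in order, each {cmd:e} is computed from the already-updated previous value.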
 

 
{title:Examples}

{pstd}
In this example, we generate a time-series of 100 time points, with the intervention introduced 
at the 50th time point. The starting value (intercept) is set to 10, the pre-intervention trend is
set to increase by 1 point every time period until the intervention starts, the level change (step) in
the first time period immediately following introduction of the intervention is set to 30 points, 
and the trend following introduction of the intervention (posttrend) is set to increase by 5 points 
every period. We also build in an autoregressive process with one lag [AR(1)] by setting rho 
(the autocorrelation coefficient) to 0.50. We set the standard deviation used to generate the random 
error term to 4, and we specify a seed to ensure replication.

{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(10) pre(1) post(5) step(30) sd(4) rho(.50) seed(12345)}{p_end}

{pstd}
We now analyze these data using {helpb itsa}. The estimates in the regression table and post-trend table are close
to the values specified in the data generating process. How close they are will depend on how {cmd:sd()} and {cmd:rho()} 
are specified.

{pmore2}{cmd:. itsa y, single trperiod(50) posttrend fig lag(1)}{p_end}
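
{pstd}
The post-intervention trend can also be computed by hand with {helpb lincom} as the sum of the pre-intervention slope and the slope change. 
This sketch assumes that {cmd:itsa} has named the time and interaction variables {cmd:_t} and {cmd:_x_t50} (the names expected for a 
single-group model with {cmd:trperiod(50)} and no prefix); check the regression table for the actual names before running it.

{pmore2}{cmd:. lincom _t + _x_t50}{p_end}

{pstd}
The resulting estimate can be compared with the value supplied in {cmd:posttrend()}.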

{pstd}
Same as before, but we now generate an AR(2) process (2 lags), where the first lag has an autocorrelation
of 0.50 and the second lag has an autocorrelation of 0.30. We then analyze the generated data with {helpb itsa}.

{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(10) pre(1) post(5) step(30) sd(4) rho(.50 .30) seed(12345)}{p_end}
{pmore2}{cmd:. itsa y, single trperiod(50) posttrend fig lag(2)}{p_end}

{pstd}
In this example, we generate a time series based on the results from the single-group ITSA presented in Linden (2015)
(available in the {helpb itsa} help file). We then estimate the treatment effects using {helpb itsa}. The regression table and
post-trend table show that the data generated by {cmd:itsadgp} yield similar results. 

{pmore2}{cmd:. itsadgp, ntime(31) intercept(132) pretrend(-1.78) posttrend(-3.27) step(-20) trperiod(19) sd(1) rho(0.5) seed(12345)}{p_end}
{pmore2}{cmd:. itsa y, single trperiod(19) posttrend fig lag(1)}{p_end}

{pstd}
In this example, we generate data for a multiple-group (treatment vs control) time-series analysis. We do this by generating a time series 
separately for each group (here, one treatment group and one control group), saving the first dataset, and then appending it to the second.
Finally, we estimate a model using {helpb itsa}.

{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(10) pre(1) post(5) step(30) sd(6) rho(.50) seed(12345)}{p_end}
{pmore2}{cmd:. gen tx = 1}{p_end}
{pmore2}{cmd:. tempfile tx1}{p_end}
{pmore2}{cmd:. save `tx1'}{p_end}

{pmore2}{cmd:. itsadgp, nt(100) trp(50) intercept(8) pre(1.3) post(2.5) step(10) sd(6) rho(.50) seed(12345)}{p_end}
{pmore2}{cmd:. gen tx = 0}{p_end}
{pmore2}{cmd:. append using `tx1'}{p_end}
{pmore2}{cmd:. label define tx 1 "Treatment" 0 "Control", replace}{p_end}
{pmore2}{cmd:. label values tx tx}{p_end}

{pmore2}{cmd:. tsset tx t}{p_end}
{pmore2}{cmd:. itsa y, treatid(1) trperiod(50) posttrend fig lag(1)}{p_end}



{title:References}

{phang}
Linden, A. 2015.
{browse "http://www.stata-journal.com/article.html?article=st0389":Conducting interrupted time series analysis for single and multiple group comparisons}.
{it:Stata Journal}
15: 480-500.

{phang}
------. 2017.
{browse "http://www.stata-journal.com/article.html?article=st0389_3":A comprehensive set of postestimation measures to enrich interrupted time-series analysis}.
{it:Stata Journal}
17: 73-88.

{phang}
------. 2022.
{browse "https://journals.sagepub.com/doi/full/10.1177/1536867X221083929":Erratum: A comprehensive set of postestimation measures to enrich interrupted time-series analysis}.
{it:Stata Journal}
22: 231-233. 



{marker citation}{title:Citation of {cmd:itsadgp}}

{p 4 8 2}{cmd:itsadgp} is not an official Stata command. It is a free contribution
to the research community, like a paper. Please cite it as such: {p_end}

{p 4 8 2}
Linden A. (2025). ITSADGP: Stata module to generate artificial data for interrupted time-series analysis.



{title:Author}

{pstd}Ariel Linden{p_end}
{pstd}Linden Consulting Group, LLC{p_end}
{pstd}{browse "mailto:alinden@lindenconsulting.org":alinden@lindenconsulting.org}{p_end}
       
 
{p 7 14 2}Help: {helpb itsa} (if installed) {p_end}