{smcl}
{* *! version 1.2.2  15may2018}{...}
{findalias asfradohelp}{...}
{vieweralsosee "" "--"}{...}
{vieweralsosee "[R] help" "help help"}{...}
{viewerjumpto "Syntax" "csestudy##syntax"}{...}
{viewerjumpto "Description" "csestudy##description"}{...}
{viewerjumpto "Options" "csestudy##options"}{...}
{viewerjumpto "Remarks" "csestudy##remarks"}{...}
{viewerjumpto "Examples" "csestudy##examples"}{...}
{title:Title}

{phang}
{bf:csestudy} {hline 2} Efficient Inference for Cross-Sectional Event Studies


{marker syntax}{...}
{title:Syntax}

{p 8 17 2}
{cmdab:csestudy}
{depvar} [{indepvars}]
[{help if:if}]
{cmd:,} {opth event:startdate(csestudy##eventdate:date)} {opth firstpre:eventdate(csestudy##firstpreeventdate:date)} {opth lastpre:eventdate(csestudy##lastpreeventdate:date)} [{help csestudy##options:options}] {p_end}


{synoptset 27 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Main}
{p2coldent:* {opth event:startdate(csestudy##eventstartdate:date)}}The start date of the event.{p_end}
{p2coldent:* {opth firstpre:eventdate(csestudy##firstpreeventdate:date)}}The first (i.e. earliest) date in the pre-event period. {p_end}
{p2coldent:* {opth lastpre:eventdate(csestudy##lastpreeventdate:date)}}The last (i.e. latest) date in the pre-event period. {p_end}
{synopt:{opt gls}}Calculate GLS estimates.{p_end}
{synopt:{opt npc(integer)}}Number of principal components. Defaults to 100{p_end}
{synopt:{opt woodbury}}Use the Woodbury matrix identity for GLS instead of Cholesky decomposition. Faster but slightly less numerically precise. Requires {opt gls}.{p_end}
{synopt:{opt coefsonly}}Calculates only the coefficients, skipping the significance tests. Programmer option only.{p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}
* {opth event:startdate(csestudy##eventstartdate:date)}}, {opth firstpre:eventdate(csestudy##firstpreeventdate:date)}}, and {opth lastpre:eventdate(csestudy##lastpreeventdate:date)}} are required.{p_end}


{marker description}{...}
{title:Description}

{pstd}
{cmd:csestudy} calculates robust inference for cross-sectional event studies as described in {browse "https://doi.org/10.1016/j.jfineco.2026.104278":Cohn, Johnson, Liu, and Wardlaw (2026) "Past is Prologue: Inference from the Cross Section of Returns Around an Event," {it:Journal of Financial Economics} 180, 104278}.{p_end}

{pstd}
The estimation uses a time-series adjusted portfolio approach to inference about standard errors in which the coefficients are compared against a pre-event window of daily returns and adjusted rejection criteria are computed in the form of a parameterized z-score and a p-value estimated from the empirical distribution (the preferred metric in this approach.)
{p_end}

{marker options}{...}
{title:Options}


{dlgtab:Main}

{phang}
{marker eventstartdate}{...}
{opt eventstartdate(date)} The start date of the event. This date refers to the time variable set by tsset.
{p_end}

{phang}
{marker lastpreeventdate}{...}
{opt lastpreeventdate()} The last date in the pre-event period. This must be earlier than the eventstartdate later than lastpreeventdate.
{p_end}

{phang}
{marker firstpreeventdate}{...}
{opt firstpreeventdate(date)} The first date in the pre-event period. This must be earlier than the eventstartdate and lastpreeventdate.
{p_end}

{phang}
{opt gls} Calculate GLS estimates.
{p_end}

{phang}
{opt npc()} Number of principal components. Defaults to 100.
{p_end}

{phang}
{opt woodbury} Use the Woodbury matrix identity to compute the GLS transformation instead of a
Cholesky decomposition of the full covariance matrix. This inverts only a k x k matrix
(k = npc) rather than factoring the N x N covariance matrix, yielding a ~50-66% speedup
per iteration. The tradeoff is slightly reduced numerical precision due to the wide range
of idiosyncratic variances. Requires {opt gls}.
{p_end}

{marker remarks}{...}
{title:Remarks}

{pstd}
The GLS estimation requires a strongly balanced panel in the pre-period in order to work, so any ids which do not have a full set of available returns in the pre-period will be dropped. This is done for the user, and the observations which satisfy this condition are stored in e(sample). This is usually not a major issue in daily stock market data, but if your sample is significantly cut down by this operation, you may have an unusual set of pre-period observations.
{p_end}

{pstd}
Calculating significance with the estimates also require that there is a sufficiently long window of available data prior to the firstpreeventdate. (Effectively a window equal to {it:eventstartdate} - {it:firstpreeventdate} prior to firstpreeventdate). The user should check that the data is at least {it:mostly} balanced before proceeding.
{p_end}

{dlgtab:Event Date Input}

{pstd}
The date options accept integer values or Stata expressions evaluated at runtime. For example:{p_end}

{phang}{cmd:. csestudy ret lag_LNMV, eventstartdate(100) ...}{p_end}
{phang}{cmd:. csestudy ret lag_LNMV, eventstartdate(bofd("mycal",mdy(9,19,2011))) ...}{p_end}

{pstd}
It is important that the time variable tracks {it:trading days} rather than calendar days.
The simplest way to achieve this is with {help bcal:bcal create}, which constructs a
business calendar from the dates in your data and generates a sequential trading-day
variable:{p_end}

{phang}{cmd:. bcal create trading, from(date) gen(trading_date) center(20081006) replace}{p_end}
{phang}{cmd:. tsset permno trading_date}{p_end}

{pstd}
Alternatively, many users have historically used a simple sequential integer for the time
variable:{p_end}

{phang}{cmd:. bysort permno (date): gen time = _n}{p_end}
{phang}{cmd:. tsset permno time}{p_end}

{pstd}
This works as long as the panel is strongly balanced (i.e. all observations begin at the
same date and exist on all trading days). The {cmd:bcal} approach is more robust to missing
data and non-trading days, while the sequential integer approach can be more convenient if
your data is already structured that way.{p_end}

{marker sampledata}{...}
{dlgtab:Sample Data}

{pstd}
A synthetic dataset with realistic CRSP-like properties is included for testing. It contains
300 firms over 461 S&P 500 trading days (Jan 2007 {hline 1} Nov 2008), with a size-dependent
abnormal return (0.15% per sd of {bf:lag_LNMV}) injected on 2008-10-06. The panel is long
enough to support GLS estimation with a 200-day pre-event window.{p_end}

{pstd}Load directly from GitHub:{p_end}

{phang}{cmd:. use "https://raw.githubusercontent.com/MalcolmWardlaw/csestudy/release/examples/sample_data.dta", clear}{p_end}

{marker examples}{...}
{title:Examples}

{pstd}Load the sample data and set up the panel:{p_end}

{phang}{cmd:. use "https://raw.githubusercontent.com/MalcolmWardlaw/csestudy/release/examples/sample_data.dta", clear}{p_end}
{phang}{cmd:. bcal create trading, from(date) gen(trading_date) center(20081006) replace}{p_end}
{phang}{cmd:. tsset permno trading_date}{p_end}

{pstd}OLS with time-series corrected errors:{p_end}

{phang}{cmd:. csestudy ret lag_LNMV if abs(prc)>5, eventstartdate(0) firstpreeventdate(-200) lastpreeventdate(-1)}{p_end}

{pstd}GLS with 100 principal components (Cholesky, default):{p_end}

{phang}{cmd:. csestudy ret lag_LNMV if abs(prc)>5, eventstartdate(0) firstpreeventdate(-200) lastpreeventdate(-1) gls npc(100)}{p_end}

{pstd}GLS with Woodbury identity (faster, slightly less precise):{p_end}

{phang}{cmd:. csestudy ret lag_LNMV if abs(prc)>5, eventstartdate(0) firstpreeventdate(-200) lastpreeventdate(-1) gls npc(100) woodbury}{p_end}

{pstd}Multi-day event window using cumulative returns:{p_end}

{phang}{cmd:. gen ret5 = ret + f1.ret + f2.ret + f3.ret + f4.ret}{p_end}

{phang}{cmd:. csestudy ret5 lag_LNMV if abs(prc)>5, eventstartdate(0) firstpreeventdate(-204) lastpreeventdate(-5)}{p_end}