{smcl}
{* 16oct2005}{...}
{hline}
help for {hi:mvsumm}
{hline}

{title:Generate moving-window descriptive statistics in time series or panel}

{p 8 17 2}{cmd:mvsumm}
{it:tsvar}
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[{it:weight}] 
{cmd:,} {cmdab:g:enerate(}{it:newvar}{cmd:)}
{cmdab:s:tat(}{it:statistic}{cmd:)}
[
{cmdab:w:indow(}{it:#}{cmd:)}
{cmd:end}
{cmd:force}
]

{p 4 4 2}
{cmd:mvsumm} is for use with time-series data.  You must {cmd:tsset} your
data before using {cmd:mvsumm}; see help {help tsset}.

{p 4 4 2}
{it:varname} may contain time-series operators; see help {help varlist}.

{title:Description}

{p 4 4 2}{cmd:mvsumm} computes a moving-window descriptive statistic for {it:tsvar}
which must be a time series variable under the aegis of {cmd:tsset}.  If a
panel calendar is in effect, the statistic is calculated for each time series
within the panel.  The moving-window statistic is placed in a new variable,
specified with the {cmd:generate()} option.  The statistics available include
minimum, maximum, other key percentiles, mean and standard deviation:  one of
these and/or other statistics returned by {cmd:summarize}, or easily computable
from what it returns, may be specified.  aweights or fweights may be specified.
Although {cmd:mvsumm} works with unbalanced panels (where the start and/or end points 
differ across units), {cmd:mvsumm} does not allow gaps within the observations 
of a time series; that is, the value of an observation for a given period may be 
missing, but the observation itself must be defined. Gaps in time series may be 
dealt with via the {cmd:tsfill} command.


{title:Options}

{p 4 8 2}{cmd:stat(}{it:statistic}{cmd:)} specifies the statistic
desired, from the following list. This is a required option. 

    one of           statistic
    {hline 6}           {hline 9}          
    n N count        number of non-missing observations
    sum              sum 
    sum_w            sum of weight
    mean             mean
    sd SD            standard deviation
    Var var          variance
    se SE semean     standard error of the mean
    skew skewness    skewness
    kurt kurtosis    kurtosis
    min              minimum
    max              maximum
    p1               1st percentile 
    p5               5th percentile
    p10              10th percentile
    p25              25th percentile
    p50 med median   50th percentile (median)
    p75              75th percentile
    p90              90th percentile
    p95              95th percentile
    p99              99th percentile 
    iqr IQR          interquartile range (p75 - p25) 
    range            range (max - min) 
    
{p 4 8 2}{cmd:generate(}{it:newvar}{cmd:)} specifies the name of a new variable
in which the results are to be placed.
This is a required option. 
 
{p 4 8 2}{cmd:window(}{it:#}{cmd:)} specifies the width of the window for
computation of the statistics, which should be an integer at least 2. By
default, results for odd-length windows are placed in the middle of the window
and results for even-length windows are placed at the end of the window. The
defaults can be over-ridden by the {cmd:end} option.
The default is 3. 

{p 4 8}{cmd:end} forces results to be placed at the end of the window in 
the case where the window width is an odd number.

{p 4 8}{cmd:force} forces results to be computed when some of a particular 
window's values are missing.


{title:Remarks} 

{p 4 4 2}Occasionally people want to use {cmd:if} 
and/or {cmd:in} when calculating moving summaries, but 
that raises a complication not usually encountered. 
What would you expect from a moving summary calculated with 
either kind of restriction? Let us identify two possibilities: 

{p 8 8 2}Weak interpretation: I don't want to see any results for 
the excluded observations. 

{p 8 8 2}Strong interpretation: I don't even want you to use the 
values for the excluded observations. 

{p 4 4 2}Here is a concrete example. Suppose as a consequence of 
some restriction, observations 1-42 are included, but not 
observations 43 on. But the moving summary for 42 will depend, 
among other things, on the value for observation 43 if the summary
extends backwards and forwards and is of length at least 3, 
and it will similarly depend on some of the observations 44 
onwards in some circumstances. 

{p 4 4 2}Our guess is that most people would go for the weak 
interpretation, which is employed in {cmd:mvsumm}. If not, 
you should ignore what you don't want or even set unwanted values 
to missing afterwards by using {cmd:replace}. 


{title:Examples}

{p 4 8 2}{stata "webuse grunfeld" :. webuse grunfeld}{p_end}
 
{p 4 8 2}{stata "mvsumm invest, stat(mean) win(3) gen(inv3yavg) end" :. mvsumm invest, stat(mean) win(3) gen(inv3yavg) end}

{p 4 8 2}{stata "mvsumm invest, stat(sd) win(5) gen(inv5ysd) end" :. mvsumm invest, stat(sd) win(5) gen(inv5ysd) end}

{p 4 8 2}{stata "mvsumm D.mvalue, stat(median) win(5) gen(meddmval) end" :. mvsumm D.mvalue, stat(median) win(5) gen(meddmval) end}


{title:Authors}

{p 4 4 2}Christopher F. Baum, Boston College, USA{break} 
       baum@bc.edu
    
{p 4 4 2}Nicholas J. Cox, Durham University, U.K.{break} 
       n.j.cox@durham.ac.uk


{title:Acknowledgements}     

{p 4 4 2}This routine is based on Cox's {cmd:movsumm} and the authors'
{cmd:statsmat}. Its development was inspired by a July 2002 discussion on
Statalist.  Nick Winter and Vince Wiggins provided helpful comments. Ernest Berkhout
helpfully identified some problems with the routine.
 

{title:Also see}

{p 4 13 2}On-line: {help summarize}, {help tsset}, {help tsfill}