------------------------------------------------------------------------------- help for mvsumm -------------------------------------------------------------------------------

Generate moving-window descriptive statistics for time-series or panel data

mvsumm tsvar [if exp] [in range] [weight] , generate(newvar) stat(statistic) [ window(#) end force ]

mvsumm is for use with time-series data. You must tsset your data before using mvsumm; see help tsset.

tsvar may contain time-series operators; see help varlist.
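As a minimal sketch of the setup described above, the lines below declare a panel with tsset and then pass a variable with a time-series operator to mvsumm. The dataset (Stata's grunfeld example data) and the new variable name linv3avg are chosen purely for illustration, and the sketch assumes mvsumm has already been installed (for example with ssc install mvsumm).

```stata
* Assumes mvsumm is installed, e.g. via: ssc install mvsumm
webuse grunfeld, clear

* Declare the panel variable (company) and the time variable (year)
tsset company year

* 3-year moving average of the first lag of invest
* (linv3avg is a hypothetical variable name)
mvsumm L.invest, stat(mean) window(3) generate(linv3avg)
```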

Description

mvsumm computes a moving-window descriptive statistic for tsvar, which must be a time-series variable under the aegis of tsset. If a panel calendar is in effect, the statistic is calculated for each time series within the panel. The moving-window statistic is placed in a new variable, specified with the generate() option.

The statistics available include the minimum, maximum, other key percentiles, mean and standard deviation. Any one of these, or of the other statistics returned by summarize or easily computable from what it returns, may be specified. aweights or fweights may be specified.

Although mvsumm works with unbalanced panels (where the start and/or end points differ across units), it does not allow gaps within the observations of a time series; that is, the value of an observation for a given period may be missing, but the observation itself must be defined. Gaps in time series may be dealt with via the tsfill command.
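The following sketch illustrates one way to handle gaps with tsfill before calling mvsumm. The gap is created artificially by dropping one year from the grunfeld example data, and the variable name inv3avg is hypothetical. The force option is added here only so that windows touching the refilled (missing) year still yield a result; whether that is desirable depends on the application.

```stata
webuse grunfeld, clear
tsset company year

* Hypothetical gap: remove one year for every company
drop if year == 1940

* Restore the dropped periods as observations with missing values
tsfill

* force requests a result even where a window contains a missing value
mvsumm invest, stat(mean) window(3) generate(inv3avg) force
```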

Options

stat(statistic) specifies the statistic desired, from the following list. This is a required option.

    one of            statistic
    ------            ---------
    n N count         number of non-missing observations
    sum               sum
    sum_w             sum of weight
    mean              mean
    sd SD             standard deviation
    Var var           variance
    se SE semean      standard error of the mean
    skew skewness     skewness
    kurt kurtosis     kurtosis
    min               minimum
    max               maximum
    p1                1st percentile
    p5                5th percentile
    p10               10th percentile
    p25               25th percentile
    p50 med median    50th percentile (median)
    p75               75th percentile
    p90               90th percentile
    p95               95th percentile
    p99               99th percentile
    iqr IQR           interquartile range (p75 - p25)
    range             range (max - min)

generate(newvar) specifies the name of a new variable in which the results are to be placed. This is a required option.
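To illustrate how stat() and generate() combine, the sketch below requests three statistics over the same 5-year window, each in its own new variable. The variable names max5, min5 and rng5 are hypothetical. Given the definition of range in the list above, rng5 would be expected to match max5 - min5 up to storage precision.

```stata
webuse grunfeld, clear
tsset company year

* One call per statistic; each call needs its own new variable
mvsumm invest, stat(max)   window(5) generate(max5) end
mvsumm invest, stat(min)   window(5) generate(min5) end
mvsumm invest, stat(range) window(5) generate(rng5) end

* Inspect the results for the first company
list company year invest max5 min5 rng5 if company == 1
```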

window(#) specifies the width of the window for computation of the statistics, which should be an integer of at least 2. The default width is 3. By default, results for odd-length windows are placed in the middle of the window and results for even-length windows are placed at the end of the window. The placement for odd-length windows can be overridden with the end option.

end forces results to be placed at the end of the window in the case where the window width is an odd number.

force forces results to be computed when some of a particular window's values are missing.
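The placement rule for odd-length windows can be seen by running the same statistic with and without end. This is a sketch using the grunfeld example data; m3mid and m3end are hypothetical variable names. If placement works as described above, the end-aligned value for year t should coincide with the centered value for year t-1, since both summarize years t-2 through t.

```stata
webuse grunfeld, clear
tsset company year

* Centered placement (default for an odd window width)
mvsumm invest, stat(mean) window(3) generate(m3mid)

* Same window, result placed at the last period of the window
mvsumm invest, stat(mean) window(3) generate(m3end) end

* Compare the two placements side by side for one company
list company year invest m3mid m3end if company == 1 & year <= 1940
```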

Remarks

Occasionally people want to use if and/or in when calculating moving summaries, but that raises a complication not usually encountered. What would you expect from a moving summary calculated with either kind of restriction? Let us identify two possibilities:

Weak interpretation: I don't want to see any results for the excluded observations.

Strong interpretation: I don't even want you to use the values for the excluded observations.

Here is a concrete example. Suppose as a consequence of some restriction, observations 1-42 are included, but not observations 43 on. But the moving summary for 42 will depend, among other things, on the value for observation 43 if the summary extends backwards and forwards and is of length at least 3, and it will similarly depend on some of the observations 44 onwards in some circumstances.

Our guess is that most people would go for the weak interpretation, which is employed in mvsumm. If not, you should ignore what you don't want or even set unwanted values to missing afterwards by using replace.
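A short sketch of the two routes just described, using the grunfeld example data with an arbitrary cutoff year. The variable names all3 and weak3 are hypothetical. Under the weak interpretation, weak3 is expected to be missing for the excluded years, while its value for 1945 still draws on the 1946 observation.

```stata
webuse grunfeld, clear
tsset company year

* Route 1: restrict with if; results appear only for the included years
mvsumm invest if year <= 1945, stat(mean) window(3) generate(weak3)

* Route 2: compute for all observations, then blank out unwanted results
mvsumm invest, stat(mean) window(3) generate(all3)
replace all3 = . if year > 1945

list company year invest weak3 all3 if company == 1
```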

Examples

. webuse grunfeld

. mvsumm invest, stat(mean) win(3) gen(inv3yavg) end

. mvsumm invest, stat(sd) win(5) gen(inv5ysd) end

. mvsumm D.mvalue, stat(median) win(5) gen(meddmval) end

Authors

Christopher F. Baum, Boston College, USA baum@bc.edu

Nicholas J. Cox, Durham University, U.K. n.j.cox@durham.ac.uk

Acknowledgements

This routine is based on Cox's movsumm and the authors' statsmat. Its development was inspired by a July 2002 discussion on Statalist. Nick Winter and Vince Wiggins provided helpful comments. Ernest Berkhout helpfully identified some problems with the routine.

Also see

On-line: summarize, tsset, tsfill