{smcl}
{* 30may2017/8jun2017}{...}
{cmd:help rangerun}
{hline}
{title:Title}
{phang}
{cmd:rangerun} {hline 2} Run Stata commands on observations within range
{title:Syntax}
{p 8 17 2}
{cmd:rangerun}
{it:program_name}
{ifin}
{cmd:,}
{opt i:nterval(keyvar low high)}
[
{it:{help rangerun##table_options:options}}
]
{synoptset 27 tabbed}{...}
{marker table_options}{...}
{synopthdr}
{synoptline}
{p2coldent :* {opt i:nterval(keyvar low high)}}use observations where {it:keyvar}
is within the bounds indicated by {it:low} and {it:high}
{p_end}
{synopt :{opth by(varlist)}}the set of observations to use is found within {it:by} group
{p_end}
{synopt :{opth u:se(varlist)}}the set of numeric variables visible to {it:program_name}
{p_end}
{synopt :{opth s:prefix(string)}}variable name prefix used to create scalars to hold current observation's values
{p_end}
{synopt :{opt v:erbose}}output while running {it:program_name} is not suppressed
{p_end}
{synoptline}
{phang}* {opt i:nterval(keyvar low high)} is required.
{it:keyvar} is a numeric variable.
The lower and upper bound of the closed interval to use
for each observation can be
specified using a numeric variable, a {it:#}, or a {help missing:system missing value}.
If a {it:#} is used, the bound for each observation is computed by adding {it:#} to {it:keyvar}.
If {it:low} is specified using a {help missing:system missing value}, {it:low} is set to
missing for all observations.
{cmd:rangerun} applies the same rules as {help inrange()} for missing bounds.
{p2colreset}{...}
{marker Description}{...}
{title:Description}
{pstd}
{cmd:rangerun} requires the latest version
of {stata ssc des rangestat:rangestat}.
Click {stata ssc install rangestat:here to install} {cmd:rangestat}
from SSC.
{pstd}
{cmd:rangerun} uses the same mechanics as {cmd:rangestat} to identify,
for each observation in the sample, the set of observations that fall
within the bounds of the specified interval.
Please consult {cmd:rangestat}'s {help rangestat:help file};
it contains detailed instructions on how to set the {opt i:nterval()}
as well as numerous examples of how to control the sample,
including how to implement various types of rolling windows.
{pstd}
{cmd:rangerun} makes a virtual copy of the numeric variables to use
for all observations in the sample
(stored in a Mata matrix)
and then loops over each of these observations.
At each iteration, the data in memory is cleared and replaced
with the set of observations in range for the current observation.
The observations in range are sorted by {it:keyvar} and
follow the order they appear
in the initial data in memory when {it:keyvar} values are the same.
{pstd}
{it:program_name} is then called.
{it:program_name} takes no argument and returns nothing.
{it:program_name} may include as many Stata commands as needed.
Results are picked out from what is left in memory when {it:program_name}
terminates without error.
{cmd:rangerun} identifies all new numeric variables
and stores results using values from the last observation in memory.
{pstd}
If you have multiple observations with the same interval
bounds, you should follow the advice in the
{it:{help rangerun##Controlling_the_sample:Controlling the sample: Median salary of non-teammates}}
example below on how to designate a representative
observation to avoid running {it:program_name} needlessly
over the same subset of observations.
{pstd}
The references give a context of previous discussions of related problems. In essence, the main point is that {cmd:rangerun} and {help rangestat} supersede many of the techniques discussed there.
{marker options}{...}
{title:Options}
{dlgtab:Options}
{phang}{opt i:nterval(keyvar low high)} is required and defines the interval
that selects the set of observations to use to calculate results for the
current observation.
{it:keyvar} is a numeric variable.
Observations whose values for {it:keyvar} fall within the
closed interval bounds are selected.
{it:low} and {it:high} can each be
specified using a numeric variable, a {it:#} (a number in Stata parlance), or a {help missing:system missing value}.
If a {it:#} is used, the bound for each observation is computed by adding {it:#} to {it:keyvar}.
If {it:low} is specified using a {help missing:system missing value}, {it:low} is set to missing
for all observations.
{cmd:rangerun} applies the same rules as {help inrange()} for missing bounds:
if the lower bound is missing, observations will match up to and including
the value of {it:high}.
If both {it:low} and {it:high} are missing, all observations will match.
{phang}{opth by(varlist)} specifies that observations
in range with respect to the current observation are found
only within the same group of the variables named.
{phang}{opth u:se(varlist)} specifies the set of numeric variables
in the data in memory when {it:program_name} is called.
If not specified, all numeric variables are included.
Since the data in memory is constantly being refreshed with
the set of observations in range for the current observation,
fewer variables will result in faster execution times.
{phang}{opth s:prefix(string)} specifies the prefix to use when
creating scalars to hold the value of each variable for the
current observation.
The name of each scalar is the combination of the prefix followed
by the variable name.
This allows your program to condition a task based
on the value of some variable(s) for the current observation.
You can, for example, exclude the current observation when
performing calculations (see the
{help rangerun##Controlling_the_sample:Controlling the sample: Median salary of non-teammates}
example below).
The scalars are created with the correct value even if
the current observation does not fall within the
set of observations in range.
If the option is not specified, no scalar is created.
{phang}{opt v:erbose} indicates that the output generated by
{it:program_name} should appear in the Results window.
This option is useful for testing your program on
a small subsample.
WARNING: this can generate a tremendous
amount of output as {it:program_name} will be called
as many times as there are observations in the overall sample.
{marker Examples}{...}
{title:Examples}
{pstd}
If you are familiar with {help rangestat}, {help rolling},
{help statsby},
or if you have used a loop to generate results based on
subsets of observations,
you may get started quickly by browsing the following examples that
contrast a {cmd:rangerun} solution to each alternative method:
{help rangerun##Compared_with_rangestat:Compared with rangestat}
{help rangerun##Compared_with_rolling:Compared with rolling}
{help rangerun##Compared_with_statsby:Compared with statsby}
{help rangerun##Compared_with_looping:Compared with looping over observations}
{pstd}
Users are encouraged to step through an extended example
of how to perform a weighted regression over a rolling window:
{help rangerun##weighted_regression:Weighted regression over a rolling window}
{pstd}
Additional examples:
{help rangerun##Controlling_the_sample:Controlling the sample: Median salary of non-teammates}
{help rangerun##measures_of_skew:Collation and comparison of various measures of distribution skew}
{title:Examples where rangerun is compared with alternative solutions}
{marker Compared_with_rangestat}{...}
{pstd}{ul:Compared with rangestat}
{pstd}
{cmd:rangerun} is very similar to {cmd:rangestat} and everything
that can be done with {cmd:rangestat} can also be done with {cmd:rangerun}.
With {cmd:rangestat}, however, you are limited to built-in functions
or you must create your own Mata function to get what you want.
{pstd}
The following example creates panel data for 100 companies, each with data
over a 360 month period.
There are missing values and gaps in the data.
The task is to calculate basic statistics for the variable {hi:invest}
over a 12 month rolling window within panels.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - compared2rangestat}{...}
* create data for 100 companies over 360 months
clear all
set seed 31231
set obs 100
gen long company = _n
expand 360
bysort company: gen mdate = ym(1987,1) + _n
format %tm mdate
gen invest = runiform() if runiform() < .95
drop if runiform() < .05
timer on 1
program myprog
sum invest
gen rrun_n = r(N)
gen double rrun_mean = r(mean)
gen double rrun_sd = r(sd)
end
rangerun myprog, interval(mdate -11 0) use(invest) by(company)
timer off 1
timer on 2
rangestat (count) invest (mean) invest (sd) invest, ///
interval(mdate -11 0) by(company)
timer off 2
* confirm that results are the same using both methods
assert rrun_n == invest_count
assert rrun_mean == invest_mean
assert rrun_sd == invest_sd
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run compared2rangestat using rangerun.sthlp:click to run})}
{pstd}
{stata timer list:Click here} to list
the timer results.
{cmd:rangerun} is a bit slower than {cmd:rangestat}
but it is vastly more flexible since you can use the full
complement of Stata statistical commands.
If possible, use the most efficient Stata command to do the job
since {it:program_name} will be called for each observation in
the sample (except if there is no
observation within the interval bounds of the current observation).
{pstd}
Note that execution times for both {cmd:rangerun} and {cmd:rangestat}
increase linearly, in proportion to the number of observations.
So if you double the number of companies in the example above, the
run times will be twice as long.
{marker Compared_with_rolling}{...}
{pstd}{ul:Compared with rolling}
{pstd}
Everything that can be done with {cmd:rolling} can also be done with {cmd:rangerun}.
The following replicates the last example in {cmd:rolling}'s {help rolling:help file}.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - compared2rolling}{...}
webuse lutkepohl2, clear
tsset qtr
rolling ratio=(r(mean)/r(p50)), window(10): summarize inc, detail
list in 1/10
clear all
program myprog
if _N < 10 exit
summarize inc, detail
gen ratio = r(mean)/r(p50)
end
webuse lutkepohl2, clear
rangerun myprog, interval(qtr -9 0) use(inc)
list qtr inc ratio in 1/19, sep(0)
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run compared2rolling using rangerun.sthlp:click to run})}
{pstd}
Note that
execution times with {cmd:rolling} increase exponentially
as the data size increases.
For large problems, {cmd:rangerun} will be orders of magnitude faster than {cmd:rolling}.
{marker Compared_with_statsby}{...}
{pstd}{ul:Compared with statsby}
{pstd}
Everything that can be done with {cmd:statsby} can also be done with {cmd:rangerun}.
The following replicates the last example in {cmd:statsby}'s {help statsby:help file}.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - compared2statsby}{...}
sysuse auto, clear
statsby mean=r(mean) sd=r(sd) size=r(N), by(rep78): summarize mpg
list
clear all
program myprog
if mi(rep78) exit
sum mpg
gen size = r(N)
gen mean = r(mean)
gen sd = r(sd)
end
sysuse auto
bysort rep78 (make): gen high = cond(_n==1, ., -1)
rangerun myprog, interval(price . high) by(rep78)
list rep78 mpg size mean sd if high == . & !mi(rep78)
* if desired, carry forward the results of the first obs within rep78 groups
by rep78: replace size = size[1]
by rep78: replace mean = mean[1]
by rep78: replace sd = sd[1]
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run compared2statsby using rangerun.sthlp:click to run})}
{pstd}
To replicate {cmd:statsby}'s functionality,
you specify a {hi:by(rep78)} option combined with
an interval
that selects all observations within these {hi:by()} groups.
If both {it:low} and {it:high} bounds are missing, all observations
within the {hi:by()} group are selected.
The example uses {hi:price} as {it:keyvar} because
it contains no missing values (missing values would exclude the
observation from the overall sample), but any numeric variable with
no missing values would have worked.
To avoid running {hi:myprog} over and over for each observation
within each {hi:rep78} group, we make the {it:high} bound missing
for the first observation in the group and -1 for all repeats.
Since there is no value for {hi:price} between minus infinity and -1,
there are no observations in range for these repeated observations.
When that happens, {cmd:rangerun} does not run {hi:myprog}
and results are set to missing.
{pstd}
Note that
execution times with {cmd:statsby} increase exponentially
as the data size increases.
For large problems, {cmd:rangerun} will be orders of magnitude faster than {cmd:statsby}.
{marker Compared_with_looping}{...}
{pstd}{ul:Compared with looping over observations}
{pstd}
You can loop over each observation to calculate statistics based on
other observations.
The following calculates a regression on a rolling window of 7 years
(including the current observation) and stores
the constant term.
A minimum of 4 observations is required
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - compared2looping}{...}
clear all
webuse grunfeld
local nobs = _N
gen alpha = .
quietly forvalues i = 1/`nobs' {
capture regress invest kstock if company == company[`i'] & ///
inrange(year, year[`i']-6, year[`i'])
if _rc == 0 & e(N) >= 4 replace alpha = _b[_cons] in `i'
}
program myprog
if _N < 4 exit
regress invest kstock
gen alpha_rr = _b[_cons]
end
rangerun myprog, interval(year -6 0) by(company)
assert alpha == alpha_rr
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run compared2looping using rangerun.sthlp:click to run})}
{pstd}
Note that when looping over all observations,
execution times increase exponentially as the data size increases.
For large problems, {cmd:rangerun} will be orders of magnitude faster than looping.
{marker weighted_regression}{...}
{title:Extended example: weighted regression over a rolling window}
{pstd}
Let's say that you need to perform a weighted regression
using a 5-year rolling window that includes the current observation.
The weights are 1 for the most distant observation and increase
by 1 up to 5 for the current observation.
{pstd}
{ul:Step 1: How to target the observations in range}
{pstd}
With a rolling window problem, the subset of observations in the
current window changes from one observation to the next.
This means that results are specific to each observation
and must be calculated separately.
Let's say that we chose to calculate results for observation 50
in the data.
We could identify the subset of relevant observations
this way:
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step1a}{...}
webuse grunfeld, clear
list in 46/50
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step1a using rangerun.sthlp:click to run})}
{pstd}
But this will not work if there are gaps in the data. Further, it is not
a good approach if the regression needs to be carried out within a panel.
A better way is to use {help subscripting:explicit subscripting}
to construct a condition that ensures that the company is the
same and that the year is within the desired 5-year window.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step1b}{...}
webuse grunfeld, clear
list if company == company[50] & inrange(year, year[50]-4, year[50])
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step1b using rangerun.sthlp:click to run})}
{pstd}
{ul:Step 2: Calculate results using the observations in range for observation 50}
{pstd}
Since we don't need the other observations to do this,
the simplest solution is to reduce the data in memory
to the observations in the subsample for observation 50.
The weights are then easy to generate: these match
the number of each observation (see {help _variables}).
The {help regress} command supports {help weights} so
all we need is to specify the desired regression.
Since {cmd:rangerun} will collect results from new variable(s),
using the values from the last observation in memory, we store
the desired results that way.
Note that we store {hi:_b[mvalue]} in all observations
while we store {hi:_b[_cons]} only in the last observation
in memory (see {help in:help in}).
While it appears wasteful to generate {hi:b_mvalue} this way,
it is faster than how {hi:b_cons} is generated since
Stata must evaluate the {help in} condition.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step2}{...}
webuse grunfeld, clear
keep if company == company[50] & inrange(year, year[50]-4, year[50])
gen long myweight = _n
regress invest mvalue [aw=myweight]
gen b_mvalue = _b[mvalue]
gen b_cons = _b[_cons] in l
list
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step2 using rangerun.sthlp:click to run})}
{pstd}
{ul:Step 3: Construct a program to perform the task}
{pstd}
Now that we have determined the set of commands needed to generate
results for one observation, we enclose these commands in a
Stata program.
This program can then be called on any subsample and will generate
the desired results for that subsample.
There is nothing special about this program: it does nothing but
run the exact same commands we detailed above.
{pstd}
Here is the same example as above, with the commands embedded in
a Stata program.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step3}{...}
clear all
webuse grunfeld
keep if company == company[50] & inrange(year, year[50]-4, year[50])
* define the program and include all desired commands
program my_rw_reg
gen long myweight = _n
regress invest mvalue [aw=myweight]
gen b_mvalue = _b[mvalue]
gen b_cons = _b[_cons] in l
end
* run the program
my_rw_reg
list
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step3 using rangerun.sthlp:click to run})}
{pstd}
{ul:Step 4: Make a practice run using rangerun}
{pstd}
Before trying to perform the task on the whole dataset, it would be
prudent to run a test on a small subset of the data.
The interval to use mimics the one we determined in step 1.
Since these regressions are done within panels, we
specify the {hi:by(company)} option.
As with most Stata commands, you can restrict the sample
used by {cmd:rangerun} using an {help if} condition.
Since we have focused on observation 50 so far, we
limit the sample to company 3 and stop in 1944, the
year in observation 50.
By default, {cmd:rangerun} will suppress all output
generated by the commands in {hi:my_rw_reg}.
Since this is a test run on just a small subset of the
data, we also specify the {opt verbose} option to get a good sense
of what is happening.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step4}{...}
clear all
webuse grunfeld
* define the program and include all desired commands
program my_rw_reg
gen long myweight = _n
regress invest mvalue [aw=myweight]
gen b_mvalue = _b[mvalue]
gen b_cons = _b[_cons]
end
rangerun my_rw_reg if company == 3 & year <= 1944, ///
interval(year -4 0) by(company) verbose
list if company == 3 & year <= 1944
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step4 using rangerun.sthlp:click to run})}
{pstd}
Naturally, the first years of a panel have fewer observations than
the specified 5 year window. This leads to an
error of insufficient observations for the first year.
If there were gaps in the data, this could also occur
at any point in the time series.
It is interesting to note that {cmd:rangerun}
does not stop when there's an error within {cmd:my_rw_reg},
but simply moves on to the next window without recording
any results.
{pstd}
Since {cmd:my_rw_reg} will run as many times as there
are observations in the sample, it makes sense to reduce
the amount of work it does as much as possible.
If we are only interested in results from a window with
a full complement of years, we can simply exit the program
if that's the case.
Similarly, we can restrict which variables to
load before running {cmd:my_rw_reg}, saving the extra
overhead required to populate the dataset with variables
that will not be used.
Finally, we notice that the {hi:myweight} variable
is returned because it is a new variable, created by
the {cmd:my_rw_reg} program.
We don't want it, so we perform some housekeeping at
the end of the program.
{pstd}
So a retooled version of the test run would be:
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step4b}{...}
clear all
webuse grunfeld
* define the program and include all desired commands
program my_rw_reg
if _N < 5 exit
gen long myweight = _n
regress invest mvalue [aw=myweight]
gen b_mvalue = _b[mvalue]
gen b_cons = _b[_cons]
drop myweight
end
rangerun my_rw_reg if company == 3 & year <= 1944, ///
interval(year -4 0) by(company) use(invest mvalue) verbose
list if company == 3 & year <= 1944
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step4b using rangerun.sthlp:click to run})}
{pstd}
If you scroll back up to step 2 and run the example, you'll see
that we match the results we calculated for observation 50.
{pstd}
{ul:Step 5: Run rangerun on the whole sample}
{pstd}
Once you are satisfied that {cmd:my_rw_reg} produces the
results you want and that the interval is correctly specified,
you can go ahead and make a run for the whole sample.
You need to remove the {opt verbose} option to suppress output
and to remove the condition used for the test run.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - rw_step5}{...}
clear all
webuse grunfeld
* define the program and include all desired commands
program my_rw_reg
if _N < 5 exit
gen long myweight = _n
regress invest mvalue [aw=myweight]
gen b_mvalue = _b[mvalue]
gen b_cons = _b[_cons]
drop myweight
end
rangerun my_rw_reg, interval(year -4 0) by(company) use(invest mvalue)
list in 50
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run rw_step5 using rangerun.sthlp:click to run})}
{title:Additional examples}
{marker Controlling_the_sample}{...}
{pstd}
{ul:Controlling the sample: Median salary of non-teammates}
{pstd}
The following example constructs a dataset
of 10 teams, each with 15 years of salary data for
their 20 players.
Then it creates an instrument consisting of
the median annual salary of players from other teams.
{pstd}
The {hi:by(year)} option is specified because the median is to be calculated
using the observations in the same {hi:year}.
{pstd}
The {it:keyvar} for the interval is {hi:teamID}
(chosen here because it has no missing values)
and when both {it:low} and {it:high} bounds are missing,
all observations will be selected.
But since the instrument is constant per team and year,
we can speed up the calculation by designating one player per
{hi:teamID year} group and call the {cmd:median_exclude} program
only for this representative player.
The example sets the {it:high} bound to -1 for other players
and since there are no observations where {hi:teamID} is between
minus infinity and -1, these repeat observations will be ignored.
{pstd}
The {cmd:median_exclude} program needs to know which team the
designated player belongs to.
The {hi:sprefix(rr_)} option tells {cmd:rangerun} to create
scalars using the values for the current observation.
Each scalar is named by combining the prefix string with the
variable name.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - median_salary3}{...}
clear all
set seed 32424
set obs 10
gen teamID = _n
expand 15
bysort teamID: gen year = 1999 + _n
expand 20
bysort teamID year: gen player = _n
gen salary = runiform()
* define the program; scalars with the values for current obs start with rr_
program median_exclude
sum salary if teamID != rr_teamID, detail
gen double med_others = r(p50)
end
* the first player per group is the designated player, -1 for others
by teamID year: gen high = cond(_n==1, ., -1)
rangerun median_exclude, by(year) interval(teamID . high) use(salary teamID) sprefix(rr_)
* carry over the results to non-designated players
by teamID year: gen median_ot = med_others[1]
* spot check for observation 100
sum salary if teamID != teamID[100] & year == year[100], detail
list in 100
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run median_salary3 using rangerun.sthlp:click to run})}
{pstd}
The example above was inspired by
{browse "http://www.statalist.org/forums/forum/general-stata-discussion/general/1384241-generating-a-variable-for-median-salary-of-non-team-mates":this post}
on Statalist.
{marker Controlling_the_sample}{...}
{pstd}
{ul:Collation and comparison of various measures of distribution skew}
{pstd}{cmd:summarize, detail} yields moment-based skewness directly as
the r-class result {cmd:r(skewness)} and allows calculation of some
other measures from its results. 75, 50 and 25% percentiles or
quantiles (upper quartile, median and lower quartile) allow calculation
of
{cmd:[(p75 - p50) - (p50 - p25)] / [p75 - p25]}.
Mean, median and SD appear in {cmd:(mean - p50) / sd}. Both measures
must lie within [-1, 1]. Notation here reflects Stata's notation for
saved results such as {cmd:r(p75)}.
{pstd}A further measure based on L-moments comes from a program
{cmd:lmoments} (click
{stata ssc install lmoments:here}
to install from SSC).
See its {help lmoments:help} for explanation.
{pstd}For a skewness program there is no loss in insisting on a minimum
sample size of 3. There is no information on skewness in samples of size
1, while even if a sample of size 2 includes two distinct values,
skewness measures are either undefined or identically zero.
{space 8}{hline 27} {it:example do-file content} {hline 27}
{cmd}{...}
{* example_start - skewness}{...}
clear all
program myskew
su mvalue, detail
if r(N) < 3 exit
gen skewness = r(skewness)
gen mmskew = (r(mean) - r(p50)) / r(sd)
gen qskew = (r(p75) - 2 * r(p50) + r(p25)) / (r(p75) - r(p25))
lmoments mvalue, short
gen t3skew = r(t_3)
end
webuse grunfeld
rangerun myskew, interval(year -9 0) use(mvalue year) by(company)
list if company == 1
{* example_end}{...}
{txt}{...}
{space 8}{hline 80}
{space 8}{it:({stata rangerun_run skewness using rangerun.sthlp:click to run})}
{title:References}
{pstd}
Cox, N.J. 2007. {browse "http://www.stata-journal.com/sjpdf.html?articlenum=pr0033":Events in intervals.}
{it:Stata Journal} 7: 440{c -}443.
{pstd}
Cox, N.J. 2009.
{browse "http://www.stata-journal.com/sjpdf.html?articlenum=pr0046":Rowwise.}
{it:Stata Journal} 9: 137{c -}157.
{pstd}
Cox, N.J. 2010.
{browse "http://www.stata-journal.com/article.html?article=st0204":The limits of sample skewness and kurtosis.}
{it:Stata Journal} 10: 482{c -}495.
{pstd}
Cox, N.J. 2011.
{browse "http://www.stata-journal.com/sjpdf.html?articlenum=dm0055":Compared with ....}
{it:Stata Journal} 11: 305{c -}314.
{pstd}
Cox, N.J. 2014.
{browse "http://www.stata-journal.com/article.html?article=dm0075":Self and others.}
{it:Stata Journal} 14: 432{c -}444.
{title:Acknowledgements}
{pstd}
Several members of Statalist helped directly and indirectly by posting
challenging problems.
{title:Authors}
{pstd}Robert Picard{p_end}
{pstd}picard@netbox.com{p_end}
{pstd}Nicholas J. Cox, Durham University, U.K.{p_end}
{pstd}n.j.cox@durham.ac.uk{p_end}
{title:Also see}
{psee}
Stata:
{help egen},
{help rolling},
{help statsby},
{help tssmooth},
{help tsvarlist},
{help tsrevar}
{p_end}
{psee}
SSC:
{stata "ssc desc rangejoin":rangejoin},
{stata "ssc desc tsegen":tsegen},
{stata "ssc desc mvsumm":mvsumm},
{stata "ssc desc rollstat":rollstat},
{stata "ssc desc egenmore":egenmore}
{p_end}