.-
help for ^shapley^                                      skolenik@@recep.glasnet.ru
.-

Shapley value decomposition
---------------------------

^shapley^ factor_list , ^result^([^global^] sample_stat) [^perc^ent
^fromto^ ^d^ots ^sav^ing(filename) ^sto^ring(filename) ^replace^
^ti^tle(str) ^noi^sily] : call_to_program ^@@^ , [call_options]

Description
-----------

^shapley^ performs an (exact additive) decomposition of a sample statistic
by the effects specified in the factor list. To perform the Shapley
decomposition, the effects are eliminated one by one, and the marginal
effect of each exclusion is weighted according to the stage of exclusion.
The weights of the marginal effects are assigned in such a way that all
exclusion trajectories have equal weights.

In other words, ^shapley^ effectively creates 2^^(# factors) patterns of
included and excluded factors from the list specified by the user, runs
call_to_program with call_options where ^@@^ is replaced by the current
pattern, saves the results and weights them in some "fair" way.
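
The weighting scheme can be illustrated outside Stata. The following is
a minimal Python sketch, not the code of ^shapley^ itself: it averages
the marginal effect of excluding each factor over all exclusion
trajectories, each with equal weight. The function names and the toy
statistic (additive effects plus an a-b interaction) are hypothetical.

```python
from itertools import permutations

def shapley_by_trajectories(factors, value):
    """Average each factor's marginal effect over all exclusion orders."""
    totals = {f: 0.0 for f in factors}
    orders = list(permutations(factors))
    for order in orders:
        included = set(factors)
        for f in order:                       # exclude factors one by one
            before = value(frozenset(included))
            included.remove(f)
            after = value(frozenset(included))
            totals[f] += before - after       # marginal effect of excluding f
    # every trajectory gets the same weight, 1 / (# factors)!
    return {f: totals[f] / len(orders) for f in factors}

def toy_value(included):
    """Hypothetical statistic: additive effects plus an a-b interaction."""
    effects = {"a": 3.0, "b": 1.0, "c": 2.0}
    v = sum(effects[f] for f in included)
    if "a" in included and "b" in included:
        v += 4.0
    return v

contrib = shapley_by_trajectories(["a", "b", "c"], toy_value)
total = toy_value(frozenset("abc")) - toy_value(frozenset())
print(contrib, total)
```

The contributions add up exactly to the difference between the statistic
with all factors present and with none, and the interaction term is
split equally between the two factors involved. The sketch re-evaluates
the statistic along every trajectory for clarity; evaluating each of the
2^^(# factors) patterns once and reusing the results gives the same
answer.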

It is written in the most general format and saves as much of the results
as possible, so that the end user can fill the Shapley framework with
whatever content is required.

Options
-------

^result^ is the sample statistic to be "shapleyed", in the form
"^r(^something^)^", "^e(^something^)^", or a global macro defined by the
called program. See help for @global@ and @return@. Specify ^global^ if
you are referring to a global macro defined in your program, and strip
the leading \$ from it. For instance, if your program saves \$S_1 for
future reference, then you need to specify ^result(global S_1)^.

^percent^ is used to report the percentages of Shapley value contributions
corresponding to the factors.

^fromto^ reports the values of the statistic when all factors are present
and when no factor is present.

^dots^ entertains the user by displaying a progress indicator.

^noisily^ keeps the output of the program you are calling. Be prepared for
a huge amount of output if you specify this; you won't want it
unless your program returns crazy results. You may want to use
@quietly@ in your program to reduce the amount of output produced.

^saving^ saves the results for each pattern of included and excluded
factors in the call to the external program into the user-defined
file. Use this file for hierarchical Shapley-Owen decomposition, where
a different set of weights is to be used.

^storing^ saves the marginal differences of the factor exclusion into the
user-defined file.

^replace^ indicates to Stata that the files specified in ^saving^ and ^storing^
options may be replaced if necessary.

^title^ adds the name of the thing you are decomposing to the output.

Saved results
-------------

Matrices:
^r(decompos)^ the Shapley decomposition results, by factors

Macros:
^r(factors)^  the list of the factors.
^r(call)^     the call statement with ^@@^ inside.

Global macros:
^\$SHsave^     the location of the file specified in the ^saving^ option.
^\$SHstore^    the location of the file specified in the ^storing^ option.
If the corresponding option is not specified, contains junk.
^\$SHfactor^   factor list
^\$SHres^      saved result
^\$SHdiff^     total difference: the returned value when all factors are
included vs. the case when all factors are excluded.
^\$SHcall^     call to the external program

These names, as well as ^\$SHbase^, ^\$SHdebug^, ^\$SHMD^, ^\$SHfile^, ^\$SHreplc^,
and ^\$SHgl^, should be avoided in the program performing the actual
calculations.

Files:
^saving^ option, or global ^\$SHsave^ -- the complete set of returned values
factors  : present 1/not present 0
__ID     : binary representation of 0/1 pattern
__result : the corresponding result
__round  : number of 1s in the pattern; the round of exclusion equals
# factors minus this number
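
To make the layout of this file concrete, here is a small Python sketch
that enumerates the 0/1 patterns for the four factors of the example
below. It only mirrors the description above: the field names, the
ordering of the rows and the exact encoding of the pattern identifier
are assumptions, not the actual contents written by ^shapley^.

```python
from itertools import product

def pattern_table(factors):
    """List every present (1) / not present (0) pattern of the factors."""
    rows = []
    for bits in product((1, 0), repeat=len(factors)):
        ones = sum(bits)
        rows.append({
            "pattern": dict(zip(factors, bits)),      # one 0/1 flag per factor
            "id": "".join(str(b) for b in bits),      # assumed encoding of the ID
            "ones": ones,                             # number of 1s in the pattern
            "exclusion_round": len(factors) - ones,   # # factors minus number of 1s
        })
    return rows

rows = pattern_table(["weight", "foreign", "mpg", "length"])
for row in rows:
    print(row["id"], row["ones"], row["exclusion_round"])
```

With four factors there are 2^^4 = 16 patterns, from the one with all
factors present (exclusion round 0) to the one with none (round 4).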

^storing^ option, or global ^\$SHstore^ -- the set of marginal differences
__factor : the name of the factor
__fno    : the number of the factor in the list
__from   : the pattern of the parent node
__to     : the pattern of the successor node
__stage  : the stage of exclusion; 0 is nothing excluded; # factors
when all excluded
__weight : the number of trajectories passing through
__diff   : the marginal difference
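
The relation between the marginal differences, the trajectory counts and
the final Shapley values can be sketched in Python as follows. This is
an illustration under stated assumptions, not the code of ^shapley^: the
field names are hypothetical, the stage is taken to be the number of
factors already excluded at the parent node, and the weight of a step is
computed as (stage)! x (factors remaining after the step)!, which is the
number of exclusion orders passing through that step.

```python
from itertools import combinations
from math import factorial

def marginal_table(factors, value):
    """One row per exclusion step: parent pattern -> pattern without one factor."""
    k = len(factors)
    rows = []
    for size in range(k, 0, -1):
        for parent in combinations(factors, size):
            for f in parent:
                child = tuple(g for g in parent if g != f)
                rows.append({
                    "factor": f,
                    "from": parent,
                    "to": child,
                    "stage": k - size,       # factors already excluded
                    "weight": factorial(k - size) * factorial(size - 1),
                    "diff": value(parent) - value(child),
                })
    return rows

def shapley_from_table(rows, factors):
    """Weighted sum of marginal differences, divided by the number of orders."""
    n_orders = factorial(len(factors))
    return {
        f: sum(r["weight"] * r["diff"] for r in rows if r["factor"] == f) / n_orders
        for f in factors
    }

# Hypothetical statistic: squared sum of made-up factor sizes.
sizes = {"x": 1, "y": 2, "z": 3}
value = lambda included: sum(sizes[f] for f in included) ** 2

table = marginal_table(list(sizes), value)
result = shapley_from_table(table, list(sizes))
print(result)
```

For each factor the weights add up to (# factors)!, so dividing by that
number gives every exclusion trajectory the same weight. Replacing the
weights by another scheme is where a different decomposition, such as
the hierarchical one mentioned under ^saving^, would depart from this
sketch.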

Examples
--------

. ^use auto^
(1978 automobile data)

. ^replace price=price/1000^
price was int now float

. ^shapley weight foreign mpg length, result(e(mss)) : regress price @@^

Shapley decomposition

 Factors | 1st round |  Shapley
         |  effects  |   value
---------+-----------+-----------
weight   |  184.234  |  148.168
for      |  1.50738  |  79.6461
mpg      |  139.449  |  54.1969
leng     |  118.426  |  66.6984
---------+-----------+-----------
Resdiual | -94.9077  |
---------+-----------+-----------
   Total |  348.709  |  348.709

This sequence performs the Shapley value decomposition of the explained
(model) sum of squares, ^e(mss)^, from the regression of price on weight,
foreign, mpg and length. Price is rescaled to make the output more
readable. The first round effect of a factor is obtained by eliminating
that factor alone from the saturated model. The first round effects need
not add up to the total; the residual is the total minus their sum.
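
The arithmetic behind the table can be checked by hand. The short Python
sketch below uses only the figures displayed above; the variable names
are arbitrary, and the percentage shares are shown merely as one natural
way to express the contributions, not as the exact output of the
^percent^ option.

```python
# Figures copied from the displayed table (rounded as printed by Stata).
first_round = {"weight": 184.234, "foreign": 1.50738,
               "mpg": 139.449, "length": 118.426}
shapley_value = {"weight": 148.168, "foreign": 79.6461,
                 "mpg": 54.1969, "length": 66.6984}
total = 348.709

# Residual: what the first round effects fail to account for.
residual = total - sum(first_round.values())

# The Shapley values, in contrast, add up to the total with no residual.
shapley_sum = sum(shapley_value.values())

# Each contribution as a share of the total, in percent.
shares = {f: 100 * v / total for f, v in shapley_value.items()}

print(residual, shapley_sum)
print(shares)
```

Both the residual and the sum of the Shapley values agree with the table
up to the rounding of the displayed figures.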

(In fact, the original purpose of the Shapley decomposition was to isolate
the effects of various sources of income on income inequality indices.
Detailed examples with do-files and dta-files are available upon
request.)

Tips
----

1. Since ^shapley^ is computationally intensive, you might think of
writing your own simplified, and thus faster, versions of the base or
STB programs you call, or dropping unnecessary observations prior to
^shapley^.

2. ^call_to_program @@ , call_options^ may in fact be any valid Stata
statement in which ^@@^ is replaced by the current pattern of
factors: do-file, ado-file, built-in command, whatever, and ^@@^
may appear among the options as well. It is the user's responsibility
to use these patterns in the way (s)he likes and to return non-missing
values to ^shapley^. A missing returned value is likely to crash
^shapley^ at some point.

3. No standard errors for factor contributions are provided so far.
A reasonable thing to do would be to bootstrap your data; see help
@bstrap@. Doing so, however, would result in ^awfully^ computationally
intensive calculations...

4. Sometimes a conformability error, @search:rc 503!r(503)@, is returned.
I was unable to reproduce the problem reliably, however. Just rerun
your ^shapley^ statement.
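
The bootstrap idea of tip 3 can be sketched generically. The Python code
below is an illustration only and does not call Stata: it resamples
observations with replacement, recomputes the decomposition on each
resample, and reports the standard deviation of each contribution. The
data, the income source names and the statistic (variance of total
income, in the spirit of decomposing inequality by income source) are
all hypothetical.

```python
import random
from itertools import permutations
from statistics import pvariance, stdev

def shapley(factors, value):
    """Average marginal effect of excluding each factor, over all orders."""
    orders = list(permutations(factors))
    totals = {f: 0.0 for f in factors}
    for order in orders:
        included = list(factors)
        current = value(included)
        for f in order:
            included.remove(f)
            reduced = value(included)
            totals[f] += current - reduced
            current = reduced
    return {f: totals[f] / len(orders) for f in factors}

def variance_of_total(rows, included):
    """Statistic: variance of the sum of the included sources (0 if none)."""
    return pvariance([sum(row[f] for f in included) for row in rows])

def bootstrap_se(rows, factors, reps=200, seed=1):
    """Standard deviation of each Shapley contribution across resamples."""
    rng = random.Random(seed)
    draws = {f: [] for f in factors}
    for _ in range(reps):
        sample = [rng.choice(rows) for _ in rows]   # resample with replacement
        est = shapley(factors, lambda inc: variance_of_total(sample, inc))
        for f in factors:
            draws[f].append(est[f])
    return {f: stdev(draws[f]) for f in factors}

# Made-up data: 50 households with three income sources.
gen = random.Random(0)
sources = ["wage", "capital", "transfers"]
data = [{"wage": gen.gauss(10, 3), "capital": gen.gauss(2, 1),
         "transfers": gen.gauss(1, 0.5)} for _ in range(50)]

point = shapley(sources, lambda inc: variance_of_total(data, inc))
se = bootstrap_se(data, sources)
print(point)
print(se)
```

Every replication repeats the whole decomposition, so the cost is the
number of replications times the cost of a single run, which is why the
approach becomes so expensive as the number of factors grows.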

Authors & references
--------------------

Theoretical foundation: Tony Shorrocks, shora@@essex.ac.uk
Current draft is available at @http://giganda.komkon.org/~tacik/shapley.pdf@

Implementation: Stas Kolenikov, skolenik@@recep.glasnet.ru