{smcl}
{* 25oct2006}{...}
{hline}
help for {hi:jmpierce}
{hline}
{title:Juhn-Murphy-Pierce decomposition}
{p 8 15 2}
{cmd:jmpierce} {it:est1} {it:est2} [ {cmd:,}
{cmdab:r:eference:(}{cmd:0}|{cmd:1}|{cmd:2}|{it:estref}{cmd:)}
{cmdab:s:tatistics:(}{it:statlist}{cmd:)}
{cmdab:b:locks:(}{it:blist}{cmd:)}
{bind:{cmdab:sav:e:(}{it:newvar1} {it:newvar2}|{it:prefix}{cmd:)}}
{cmdab:res:iduals:(}{it:newvar}{cmd:)}
]
{p 4 4 2} where {it:blist} is
{p 15 15 2}
{it:name1} {cmd:=} {it:varlist1} [{cmd:,} {it:name2} {cmd:=} {it:varlist2}
[{cmd:,} {it:...}] ]
{title:Description}
{p 4 4 2} {cmd:jmpierce} computes the Juhn-Murphy-Pierce decomposition of
differences between two outcome distributions (Juhn, Murphy and Pierce
1993; also see Blau and Kahn 1996) from models previously fitted and stored
by {help estimates:estimates store}. Examples are the decomposition of
changes in the income distribution over time, the decomposition of
male-female wage differentials, or the decomposition of wage inequality
differences between countries.
{p 4 4 2} {it:est1} is the name of the estimates related to the first
distribution (e.g. the distribution in country A, among males, or at
time point t), {it:est2} is the name of the estimates related to the second
distribution (e.g. the distribution in country B, among females, or at
time point t-1). Note that the samples underlying {it:est1} and {it:est2}
must be disjoint, that is, no observation may belong to both samples.
{p 4 4 2} The model estimated last may be
indicated by a period (.), even if it has not yet been stored.
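{p 4 4 2}As an illustration of the period shorthand, the following sketch
stores only the first model and refers to the second one as the model
estimated last. The variable names ({cmd:lnwage}, {cmd:educ}, {cmd:exp},
{cmd:exp2}, {cmd:sex}) are assumptions, chosen to match the Examples
section:{p_end}
{p 8 12 2}{com}. regress lnwage educ exp exp2 if sex==1{txt}{p_end}
{p 8 12 2}{com}. estimates store male{txt}{p_end}
{p 8 12 2}{com}. regress lnwage educ exp exp2 if sex==2{txt}{p_end}
{p 8 12 2}{com}. jmpierce male .{txt}{p_end}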
{p 4 4 2} See the {help oaxaca} package and the {help decompose} package
(available from the SSC archive; type
{net "describe http://fmwww.bc.edu/repec/bocode/o/oaxaca":ssc describe oaxaca}
and
{net "describe http://fmwww.bc.edu/repec/bocode/d/decompose":ssc describe decompose})
for programs to compute Oaxaca-Blinder type decompositions. See the
{help jmpierce2} package for the decomposition of changes of differentials over
time
({net "describe http://fmwww.bc.edu/repec/bocode/j/jmpierce2":ssc describe jmpierce2}).
{p 4 4 2}{hi:Warning:} {cmd:jmpierce} is intended for use with models
that have been estimated by the {help regress} command. Use {cmd:jmpierce} with
other models at your own risk.
{title:Options}
{p 4 8 2}{cmd:reference()} specifies the reference or benchmark model. The
default is {cmd:reference(0)}, meaning that the average coefficients from
{it:est1} and {it:est2} are used as the reference prices and the average
residual distribution from {it:est1} and {it:est2} is used as the reference
residual distribution. However, {cmd:reference(1)} uses {it:est1} and
{cmd:reference(2)} uses {it:est2} as the benchmark, i.e. the coefficients
from either {it:est1} or {it:est2} are used as the reference prices and the
residuals from {it:est1} or {it:est2} are used to determine the reference
residual distribution. Alternatively, specify
{cmd:reference(}{it:estref}{cmd:)}, where {it:estref} is the name of the
reference model. In this case, the coefficients from {it:estref} are used
as the reference prices and the average residual distribution from
{it:est1} and {it:est2} is used as the reference residual distribution. See
the "Methods and Formulas" section for more details.
{p 4 8 2} {cmd:statistics(}{it:statlist}{cmd:)} specifies the summary
statistics for which the decomposition is to be displayed. The default is
{cmd:statistics(mean)}. Specify, for example,
{bind:{cmd:statistics(p25 p50 p75)}} to compute the decomposition for the
25th, 50th, and 75th percentiles. Available statistics are
{p 12 26 2}{it:statname} {space 2} definition{p_end}
{hline 50}
{p 12 26 2}{cmdab:me:an} {space 6} mean{p_end}
{p 12 26 2}{cmd:sd} {space 8} standard deviation{p_end}
{p 12 26 2}{cmdab:med:ian} {space 4} median (same as {cmd:p50}){p_end}
{p 12 26 2}{cmd:p5} {space 8} 5th percentile{p_end}
{p 12 26 2}{cmd:p10} {space 7} 10th percentile{p_end}
{p 12 26 2}{cmd:p25} {space 7} 25th percentile{p_end}
{p 12 26 2}{cmd:p50} {space 7} 50th percentile (same as {cmd:median}){p_end}
{p 12 26 2}{cmd:p75} {space 7} 75th percentile{p_end}
{p 12 26 2}{cmd:p90} {space 7} 90th percentile{p_end}
{p 12 26 2}{cmd:p95} {space 7} 95th percentile{p_end}
{p 12 26 2}{cmd:iqr} {space 7} interquartile range (same as {cmd:d7525}){p_end}
{p 12 26 2}{cmd:d9010} {space 5} p90 - p10{p_end}
{p 12 26 2}{cmd:d7525} {space 5} p75 - p25 (same as {cmd:iqr}){p_end}
{p 12 26 2}{cmd:d9050} {space 5} p90 - p50{p_end}
{p 12 26 2}{cmd:d5010} {space 5} p50 - p10{p_end}
{hline 50}
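{p 8 8 2}For instance, a decomposition of the 90-10 gap together with its
upper and lower halves might be requested as sketched below. The stored
estimates {cmd:male} and {cmd:female} are assumed to exist, as in the
Examples section:{p_end}
{p 12 16 2}{com}. jmpierce male female, statistics(d9010 d9050 d5010){txt}{p_end}
{p 8 8 2}Because p90 - p10 = (p90 - p50) + (p50 - p10) by definition, this
combination shows whether a difference in overall dispersion stems from the
upper or the lower half of the distribution.{p_end}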
{p 4 8 2} {cmd:blocks(}{it:blist}{cmd:)} reports the quantity effect (Q) of
specified blocks of variables. Unless the decomposition is conducted at the
mean, the results will most likely depend on the order of the blocks.
{p 4 8 2} {cmd:save(}{it:newvar1} {it:newvar2}|{it:prefix}{cmd:)}
creates a variable reflecting the hypothetical outcome distribution under
the condition of fixed prices and fixed unobservables (called {it:newvar1} or
{it:prefix}{cmd:1}, respectively) and a variable reflecting the
hypothetical outcome distribution under the condition of fixed unobservables
(called {it:newvar2} or {it:prefix}{cmd:2}, respectively).
{p 4 8 2} {cmd:residuals(}{it:newvar}{cmd:)} creates a variable containing
the hypothetical residuals called {it:newvar}.
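{p 8 8 2}A possible use of {cmd:save()} and {cmd:residuals()} is sketched
below. The prefix {cmd:yhyp} and the variable name {cmd:uhyp} are
hypothetical names chosen for illustration, and {cmd:male}, {cmd:female},
and {cmd:sex} are assumed as in the Examples section:{p_end}
{p 12 16 2}{com}. jmpierce male female, reference(1) save(yhyp) residuals(uhyp){txt}{p_end}
{p 12 16 2}{com}. tabstat yhyp1 yhyp2 uhyp, by(sex) statistics(mean sd){txt}{p_end}
{p 8 8 2}With a prefix, the two hypothetical outcome variables are named
{cmd:yhyp1} and {cmd:yhyp2}, so they can be summarized or plotted by group
like any other variable.{p_end}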
{title:Examples}
{p 4 4 2} Decomposition of the gender wage gap using the residuals/prices
of the male model as benchmark:
{com}. regress lnwage educ exp exp2 if sex==1
. estimates store male
. regress lnwage educ exp exp2 if sex==2
. estimates store female
. jmpierce male female, reference(1) statistics(mean median)
{txt}
{p 4 4 2} ... subdividing the quantity effect Q between education and experience:
{com}. jmpierce male female, reference(1) blocks(educ=educ, exp=exp*)
{txt}
{p 4 4 2} ... using the average residual distribution and the prices of
a pooled model as benchmark:
{com}. regress lnwage educ exp exp2 if sex==1 | sex==2
. jmpierce male female, ref(.)
{txt}
{title:Saved Results}
{p 4 4 2}
Matrices:
{p 4 25 2}{cmd:r(D)}{space 17}The components of the decomposition(s){p_end}
{p 4 25 2}{cmd:r(stats1)}, {cmd:r(stats2)}{space 1}The summary statistics for the
hypothetical and raw distributions{p_end}
{p 4 25 2}{cmd:r(Qblocks)}{space 11}Quantity effect by (blocks of) variables{p_end}
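{p 4 4 2}The returned matrices can be copied and inspected as sketched
below. The matrix name {cmd:D} is a hypothetical name chosen for
illustration, and {cmd:male} and {cmd:female} are assumed to be stored
estimates as in the Examples section:{p_end}
{p 8 12 2}{com}. jmpierce male female, reference(1) statistics(mean sd){txt}{p_end}
{p 8 12 2}{com}. matrix D = r(D){txt}{p_end}
{p 8 12 2}{com}. matrix list D{txt}{p_end}
{p 4 4 2}Copying the matrix first is a precaution, since results in
{cmd:r()} are replaced by the next r-class command.{p_end}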
{title:Methods and Formulas}
{p 4 4 2}
Closely following Juhn, Murphy and Pierce (1993):
Given are the models
{p 8 8 2}{bf:y}_1 = {bf:x}_1{bf:b}_1 + {bf:u}_1{p_end}
{p 8 8 2}{bf:y}_2 = {bf:x}_2{bf:b}_2 + {bf:u}_2{p_end}
{p 4 4 2} where {bf:y}_1 and {bf:y}_2 are the vectors of the values of the
dependent variable in two samples, {bf:x}_1 and {bf:x}_2 are the data
matrices (observable quantities), {bf:b}_1 and {bf:b}_2 are the vectors of
estimated coefficients (observable prices) and {bf:u}_1 and {bf:u}_2 are
the residuals (unmeasured prices and quantities).
{p 4 4 2} Let F_1(.) and F_2(.) denote the cumulative distribution functions
of the residuals. For example,
{p 8 8 2} p_i1 = F_1(u_i1|{bf:x}_i1)
{p 4 4 2}
is the percentile of an individual residual in the residual distribution of
model 1. By definition we can write
{p 8 8 2}u_i1 = F_1[-1](p_i1|{bf:x}_i1)
{p 4 4 2}where F_1[-1](.) is the inverse of the cumulative distribution
function.
{p 4 4 2}Next, assume that F(.) is a reference residual distribution
(e.g. the average residual distribution over both samples) and that {bf:b}
is an estimate of benchmark coefficients (e.g. the coefficients
from a pooled model over the whole sample). We can then determine
hypothetical outcomes with varying quantities between the groups but
fixed prices (coefficients) and a fixed residual distribution as
{p 8 8 2}{bf:y1}_i1 = {bf:x}_i1{bf:b} + F[-1]({bf:p}_i1|{bf:x}_i1){p_end}
{p 8 8 2}{bf:y1}_i2 = {bf:x}_i2{bf:b} + F[-1]({bf:p}_i2|{bf:x}_i2){p_end}
{p 4 4 2} Furthermore, the hypothetical outcomes with varying quantities
and varying prices but a fixed residual distribution are given as
{p 8 8 2}{bf:y2}_i1 = {bf:x}_i1{bf:b}_1 + F[-1]({bf:p}_i1|{bf:x}_i1){p_end}
{p 8 8 2}{bf:y2}_i2 = {bf:x}_i2{bf:b}_2 + F[-1]({bf:p}_i2|{bf:x}_i2){p_end}
{p 4 4 2}Finally, the outcomes with varying quantities, varying prices and
a varying residual distribution can be determined as
{p 8 8 2}{bf:y3}_i1 = {bf:x}_i1{bf:b}_1 + F_1[-1]({bf:p}_i1|{bf:x}_i1) {p_end}
{p 8 8 2}{bf:y3}_i2 = {bf:x}_i2{bf:b}_2 + F_2[-1]({bf:p}_i2|{bf:x}_i2) {p_end}
{p 4 4 2}
These last outcomes are simply the originally observed
values, that is:
{p 8 8 2}{bf:y3}_i1 = {bf:y}_i1 = {bf:x}_i1{bf:b}_1 + {bf:u}_i1 {p_end}
{p 8 8 2}{bf:y3}_i2 = {bf:y}_i2 = {bf:x}_i2{bf:b}_2 + {bf:u}_i2 {p_end}
{p 4 4 2}Let a capital letter stand for a summary statistic of the
distribution of the variable denoted by the corresponding lower-case
letter. For instance, Y may be the mean or the interquartile range of the
distribution of y. The differential Y_1-Y_2 can then be decomposed as
{p 8 12 2}T = Y_1-Y_2 = [Y1_1-Y1_2] + {bind:[(Y2_1-Y2_2) - (Y1_1-Y1_2)]} +
{bind:[(Y3_1-Y3_2) - (Y2_1-Y2_2)]} = Q + P + U
{p 4 4 2}The bracketed terms telescope to Y3_1-Y3_2, which equals Y_1-Y_2
because y3 is the observed outcome. That is, the total difference (T) can
be attributed to differences in observable quantities (Q), differences in
observable prices (P), and differences in unobservable quantities and
prices (U).
{p 4 4 2}Technical notes:
{p 8 10 2}- {cmd:jmpierce}'s method to invert the empirical
distribution function uses averages where the function is flat (the same
method is used by {help summarize} and {help pctile}). Also see the
{net "describe http://fmwww.bc.edu/repec/bocode/i/invcdf":invcdf}
package
(available from the SSC archive; type {cmd:ssc d invcdf}). The
choice of the inversion method may have a significant
impact on the decomposition results (especially in small samples).
{p 8 10 2}- {cmd:reference(0)} (the default) causes {cmd:jmpierce} to use
the average coefficients from {it:est1} and {it:est2} as the reference
coefficients. The "average" coefficients are derived by computing a simple
arithmetic mean, that
is {bind:{bf:b} = ({bf:b}_1+{bf:b}_2)/2}.
{p 8 10 2}- {cmd:reference(0)} or {cmd:reference(}{it:estref}{cmd:)},
where {it:estref} is neither {it:est1} nor {it:est2}, causes {cmd:jmpierce}
to use the average residual distribution from {it:est1} and {it:est2}
as the reference residual distribution. The "average" residual distribution
is computed by pooling the residuals from {it:est1}
and {it:est2}.
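{p 4 4 2}The averaging rule mentioned in the first technical note can be
seen in a small sketch using {help summarize}, which the note says follows
the same method. The toy data and the variable name {cmd:x} are made up for
illustration, and the sketch shows {cmd:summarize}'s behavior only, not the
internals of {cmd:jmpierce}:{p_end}
{p 8 12 2}{com}. clear{txt}{p_end}
{p 8 12 2}{com}. set obs 4{txt}{p_end}
{p 8 12 2}{com}. generate x = _n{txt}{p_end}
{p 8 12 2}{com}. summarize x, detail{txt}{p_end}
{p 8 12 2}{com}. display r(p50){txt}{p_end}
{p 4 4 2}With the values 1, 2, 3, 4 the empirical distribution function is
flat at 0.5 between 2 and 3, so the reported median should be 2.5, the
average of the two adjacent values.{p_end}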
{title:References}
{p 4 8 2}
Juhn, Chinhui, Kevin M. Murphy, and Brooks Pierce (1993). Wage
Inequality and the Rise in Returns to Skill. Journal of Political Economy
101(3): 410-442.
{p 4 8 2}
Blau, Francine D., and Lawrence M. Kahn (1996). International
Differences in Male Wage Inequality: Institutions versus Market Forces.
Journal of Political Economy 104(4): 791-837.
{title:Author}
{p 4 4 2}
Ben Jann, ETH Zurich, jann@soz.gess.ethz.ch
{title:Also see}
{p 4 13 2}
Online: help for {help regress}, {help estimates}, {help cumul},
{help pctile}, {help oaxaca} (if installed), {help decompose}
(if installed), {help jmpierce2} (if installed), {help invcdf} (if installed)