help egen vreldif                                        Author: Stas Kolenikov
-------------------------------------------------------------------------------

Title

[D] egen -- Extensions to generate

Syntax

egen [type] newvar = vreldif(varlist1) [if] [in], by(varlist2)

egen ... = vreldif() creates a variable that contains the relative differences (see mreldif) of the variables in varlist1 within the values identified by varlist2. It is useful for comparing results in two appended data sets when some minor numeric discrepancies are expected.
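
For orientation, Stata's built-in reldif(x,y) returns |x-y|/(|y|+1). The quick check below uses made-up numbers, not values from this help file, to show the magnitude to expect when a value near 3000 is perturbed by 0.5:

    . display reldif(3000, 3000.5)

This evaluates 0.5/3001.5, which is roughly 1.7e-4.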

by(varlist2) is required. It is expected that each unique combination of variables in varlist2 identifies at most two observations in the data set.

Comparable functionality can be achieved by the following Stata pseudocode:

    generate newvar = 0
    foreach x of varlist varlist1 {
        bysort varlist2: replace newvar = newvar + reldif(`x'[1], `x'[_N])
    }
    bysort varlist2: replace newvar = . if _N == 1
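
The pseudocode can be tried on a small made-up data set. The sketch below is an illustration under stated assumptions, not the code of vreldif() itself: the variable names id, x, y and d and the data values are hypothetical. The explicit within-group sort order (x y) is an addition, made because reldif(a,b) divides by |b|+1, so the result depends slightly on which observation of a pair comes first.

    clear
    input id x y
    1 10 100
    1 10 101
    2  5  50
    3  7  70
    3  8  70
    end

    * by() assumption: at most two observations per group
    bysort id: assert _N <= 2

    * sum the relative differences over the variables, pair by pair
    generate double d = 0
    foreach v of varlist x y {
        bysort id (x y): replace d = d + reldif(`v'[1], `v'[_N])
    }

    * groups with a single observation have nothing to compare against
    bysort id: replace d = . if _N == 1
    list, sepby(id)

With these values, d is 1/102 in both observations with id 1 (only y differs), missing for id 2, and 1/9 in both observations with id 3 (only x differs).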

Example

. sysuse auto, clear

. set seed 10101

. gen byte replic = ceil( 0.5+1.5*uniform())

. expand replic, gen( datacopy )

. tabulate datacopy

. replace weight = weight + uniform()

. egen check1 = vreldif(mpg price), by(make)

. egen check2 = vreldif(mpg weight), by(make)

The variable check1 should be zero in the observations that were doubled up, and missing in the unique observations:

. assert check1 == 0 if !missing( check1 )

The variable check2 will not be zero in the observations that were doubled up, so this assert should fail:

. assert check2 == 0 if !missing( check2 )

Since the values of the weight variable differ between the two "copies" of the data (identified by the datacopy variable) only in the fourth digit, the non-missing values of check2 should be of the order 1e-4: