-------------------------------------------------------------------------------
Title
simuped2 -- Simulate two-generation families
simuped3 -- Simulate three-generation families
Syntax
simuped2 #Age1 #Std1 #Age2 #Std2 [, options]
simuped3 #Age1 #Std1 #Age2 #Std2 #Age3 #Std3 [, options]
    options              Description
    -------------------------------------------------------------------------
    Main
      reps(#)            specifies the number of families to be simulated
      saving(filename)   specifies the file name of the simulated data
      alle(#)            specifies the allele frequency of a biallelic locus A
      sib(#)             specifies the mean number of siblings in the second generation

    For simuped3 only
      si3(#)             specifies the mean number of siblings in the third generation
    -------------------------------------------------------------------------
Description
simuped2 and simuped3 are immediate commands that generate two- and three-generation family data, respectively. The number of siblings in a family is drawn from a Poisson distribution with mean sib(#) for the second generation or si3(#) for the third generation. The Poisson variate is generated by rndpoix, which must be installed before running simuped2 or simuped3. The gender of each person is drawn from a Bernoulli distribution with mean 0.5. Age is drawn from a normal distribution with mean #Age1, #Age2, or #Age3 and standard deviation #Std1, #Std2, or #Std3 for the first, second, and third generation, respectively.
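The demographic draws described above can be illustrated outside Stata. The following is a minimal Python sketch, not the code used by simuped2 or simuped3: the function names poisson_draw and simulate_sibship are hypothetical, and the Poisson generator shown (Knuth's multiplication method) is an assumption that may differ from the algorithm in rndpoix.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Assumption: this stands in for rndpoix and is adequate for small means.
    """
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_sibship(mean_sibs, age_mean, age_sd, rng):
    """Simulate one sibship: Poisson size, Bernoulli(0.5) gender, normal age."""
    size = poisson_draw(mean_sibs, rng)
    return [
        {"male": rng.random() < 0.5, "age": rng.gauss(age_mean, age_sd)}
        for _ in range(size)
    ]


# Illustrative values only: mean of 3 siblings, age mean 40, age SD 10.
rng = random.Random(2000)
sibships = [simulate_sibship(3, 40, 10, rng) for _ in range(5000)]
people = [person for sibship in sibships for person in sibship]

mean_size = sum(len(s) for s in sibships) / len(sibships)
prop_male = sum(p["male"] for p in people) / len(people)
mean_age = sum(p["age"] for p in people) / len(people)
```

With many replications, mean_size, prop_male, and mean_age should lie close to 3, 0.5, and 40. A Poisson draw can be zero, so some simulated sibships in this sketch are empty.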
Hardy-Weinberg equilibrium is assumed for the genotypic distribution of people in the first generation (Elandt-Johnson 1971). The frequency of allele A at a biallelic locus is specified by option alle(#) and denoted p. The frequencies of genotypes AA, Aa and aa in the first generation are then p^2, 2p(1-p) and (1-p)^2, respectively. Genotypes in the second and third generations are generated according to Mendelian inheritance: a child receives one allele from the father and one from the mother, and each of a parent's two alleles is transmitted with probability 0.5. The simulated family data are saved in the file specified by saving(filename), and the number of families simulated is specified by reps(#).
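The genotype model above can also be sketched in a few lines. This Python fragment is an illustration under stated assumptions, not the implementation in simuped2 or simuped3: all function names are hypothetical, and genotypes are coded as the number of A alleles (2 = AA, 1 = Aa, 0 = aa), a coding chosen here for convenience.

```python
import random


def hwe_frequencies(p):
    """Return the frequencies of (AA, Aa, aa) under Hardy-Weinberg equilibrium."""
    return (p ** 2, 2 * p * (1 - p), (1 - p) ** 2)


def founder_genotype(p, rng):
    """Draw a first-generation genotype as two independent allele draws.

    Each allele is A with probability p, which yields the HWE frequencies.
    """
    return int(rng.random() < p) + int(rng.random() < p)


def transmit(genotype, rng):
    """Return 1 if the parent passes on allele A, 0 otherwise."""
    if genotype == 2:
        return 1          # AA parent always transmits A
    if genotype == 0:
        return 0          # aa parent never transmits A
    return int(rng.random() < 0.5)  # Aa parent transmits A half the time


def child_genotype(father, mother, rng):
    """Combine one transmitted allele from each parent."""
    return transmit(father, rng) + transmit(mother, rng)


# Illustrative check with p = 0.1, the documented default of alle(#).
rng = random.Random(58)
founders = [founder_genotype(0.1, rng) for _ in range(20000)]
prop_aa = founders.count(0) / len(founders)
children = [child_genotype(1, 1, rng) for _ in range(20000)]
prop_het = children.count(1) / len(children)
```

For p = 0.1 the expected frequencies are 0.01, 0.18 and 0.81, so prop_aa should be near 0.81. Two Aa parents produce AA, Aa and aa children in the ratio 1:2:1, so prop_het should be near 0.5.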
Options
        +------+
    ----+ Main +-------------------------------------------------------------
reps(#) specifies the number of families to be simulated. The default value is 100.
saving(filename) specifies the file name of the simulated data. The default file name is temp.dta.
alle(#) specifies the allele frequency of a biallelic locus A. The default value is 0.1.
sib(#) specifies the mean number of siblings in the second generation. The default value is 3.
        +-------------------+
    ----+ For simuped3 only +------------------------------------------------
si3(#) specifies the mean number of siblings in the third generation. The default value is 3.
Examples
clear
simuped2 70 10 40 10, reps(1000) sav(output) alle(0.05) sib(5)
simuped3 80 10 50 10 20 10, reps(2000) alle(0.1) sib(4) si3(3.5)
Also see
STB: STB-58: dm82
References
Cui J. Simulating two- and three-generation families. Stata Technical Bulletin 2000; 58: 2-5.
Elandt-Johnson R. Probability models and statistical methods in genetics. New York: John Wiley & Sons, 1971.
Author
James Cui, Department of Epidemiology and Preventive Medicine, Monash University.
Email: james.cui@med.monash.edu.au
Other Commands I have written:
    genhwcci (if installed)     ssc install genhwcci     (to install this command)
    phenotype (if installed)    ssc install phenotype    (to install this command)
    buckley (if installed)      ssc install buckley      (to install this command)
    qic (if installed)          ssc install qic          (to install this command)