{smcl} {hline} help for {hi:hapblock} {hline} {title:Haplotype Block Edge Identification using hapipf} {p 8 27} {cmdab:hapblock} [{it:varlist}] [{cmd:using}] [, {cmdab:mv} {cmdab:mvdel} {cmdab:hlen}{cmd:(}{it:numlist}{cmd:)} {cmdab:s:tart}{cmd:(}{it:#}{cmd:)} {cmdab:replace} {cmdab:b:lock}{cmd:(}{it:filename}{cmd:)} ] {title:Description} {p 0 0} This command systematically fits a series of {hi:hapipf} log-linear models that attempts to find the edge of areas containing high LD within a set of loci. {p 0 0} The log-linear model is fitted using iterative proportional fitting which is available using {hi ssc} and is called {hi:ipf} (version 1.36 or later). Additionally, the user will also have to install {hi:hapipf} (version 1.44 or later). This algorithm can handle very large contingency tables and converges to maximum likelihood estimates even when the likelihood is badly behaved. {p 0 0} If you are connected to the Web you can install the latest version by clicking here {stata ssc install hapipf}. The latest version of {hi:hapblock} can be installed here {stata ssc install hapblock,replace}. {p 0 0} The {hi:varlist} consists of paired variables representing the alleles at each locus. If phase is known then the paired variables are in fact the genotypes. When phase is unknown the algorithm assumes Hardy Weinberg Equilibrium so that models are based on chromosomal data and not genotypic data. {p 0 0} This algorithm can handle missing alleles at the loci by using the {hi:mv} or {hi:mvdel} option. {title:Options} {p 0 0} {cmdab:mv} specifies that the algorithm should replace missing data (".") with a copy of each of the possible alleles at this locus. This is performed at the same stage as the handling of the missing phase when the dataset is expanded into all possible observations. If this option is not specified but some of the alleles do contain missing data the algorithm sees the symbol "." as another allele. {cmdab:mvdel} specifies that people with missing alleles are deleted. {cmdab:hlen}{cmd:(}{it:numlist}{cmd:)} specifies the width of the sliding window of models. {cmdab:s:tart}{cmd:(}{it:#}{cmd:)} specifies the starting loci in the varlist. This is useful when the algorithm is taking a long time and hence the command can be rerun from the loci that the algorithm ended prematurely. {cmdab:replace} specifies that the results file created can be overwritten. {cmdab:b:lock}{cmd:(}{it:filename}{cmd:)} specifies that the calculated block sizes and p-values are saved to a file and is named {it:filename}.dta {title:Examples} {p 0 0} Take a dataset with 70 loci, the pairs of alleles at locus i are the variables li_1 and li_2. {inp:.hapblock l1_1-l70_2, hlen(6) s(10) mvdel} This will make the following comparisons l10*l11*l12+l13*l14*l15 vs l10*l11*l12*l13*l14*l15 l11*l12*l13+l14*l15*l16 vs l11*l12*l13*l14*l15*l16 l12*l13*l14+l15*l16*l17 vs l12*l13*l14*l15*l16*l17 l13*l14*l15+l16*l17*l18 vs l13*l14*l15*l16*l17*l18 e.t.c. If you specify the {hi:mvdel} missing data option then these models might not be on the same subjects. {title:Author} {p} Adrian Mander, Cambridge, UK. Email {browse "mailto:junk.ade@ntlworld.com":junk.ade@ntlworld.com} {title:Also see} Related commands HELP FILES Installation status SSC installation links {help hapipf} (MUST be installed) ({stata ssc install hapipf}) {help ipf} (MUST be installed) (the above installs {hi:ipf}) {help swblock} (if installed) ({stata ssc install swblock}) {help gipf} (if installed) ({stata ssc install gipf}).