help ipf
-------------------------------------------------------------------------------

Title

Log-linear modelling using Iterative Proportional Fitting

Syntax

ipf [varlist] [weight] [, options]

options                Description
-------------------------------------------------------------------------
Main
  fit(string)          specifies the log-linear model.
  constr(string)       specifies initial values for the expected frequencies.
  confile(filename)    specifies a *.dta file that contains initial values for the expected counts.
  convars(varlist)     specifies the variables in the constraint file.
  save(filename)       saves the expected frequencies and probabilities per cell.
  expect               specifies that the expected frequencies are displayed.
  nolog                suppresses the display of the log-likelihood at each iteration.
-------------------------------------------------------------------------

fweights are allowed; see help weights.

Description

The iterative proportional fitting (IPF) algorithm is a simple method to calculate the expected counts of a hierarchical log-linear model. The algorithm's rate of convergence is first order. The more commonly used Newton-Raphson algorithm is second order; however, each iteration of the IPF algorithm is quicker because Newton-Raphson inverts matrices. This makes the IPF algorithm much quicker for contingency tables with high dimensionality.

The IPF algorithm has the following steps:

1) Initial estimates of the expected frequencies are given. The initial estimates should have associations and interactions that are less complex than the model being fitted. By default the initial frequencies are 1.

2) Successively adjust the estimates of the expected frequencies by scaling factors so they match each marginal table.

3) The scaling continues until the loglikelihood converges.

The algorithm always converges to the correct expected frequencies even when the likelihood is poorly behaved, for example, when there are zero fitted counts.
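The three steps above can be sketched in a few lines of Python for the simplest case, a two-way table fitted with the independence model (row + column). This is only an illustration of the general technique, not the code used by ipf; the function name ipf_two_way and its tol and max_iter arguments are assumptions made for this example.

```python
import math


def ipf_two_way(observed, tol=1e-10, max_iter=100):
    """Fit the independence model to a two-way table of counts by IPF.

    Hypothetical helper for illustration; not part of the ipf command.
    """
    n_rows, n_cols = len(observed), len(observed[0])

    # Step 1: initial expected frequencies are 1 in every cell.
    fitted = [[1.0] * n_cols for _ in range(n_rows)]
    prev_ll = None

    for _ in range(max_iter):
        # Step 2a: scale each row so it matches the observed row total.
        for i in range(n_rows):
            current = sum(fitted[i])
            if current > 0:
                scale = sum(observed[i]) / current
                fitted[i] = [value * scale for value in fitted[i]]

        # Step 2b: scale each column so it matches the observed column total.
        for j in range(n_cols):
            current = sum(fitted[i][j] for i in range(n_rows))
            if current > 0:
                scale = sum(observed[i][j] for i in range(n_rows)) / current
                for i in range(n_rows):
                    fitted[i][j] *= scale

        # Step 3: stop when the Poisson log-likelihood (up to a constant)
        # no longer changes between iterations.
        ll = 0.0
        for obs_row, fit_row in zip(observed, fitted):
            for obs, fit in zip(obs_row, fit_row):
                ll -= fit
                if obs > 0:
                    ll += obs * math.log(fit)
        if prev_ll is not None and abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return fitted


fitted = ipf_two_way([[10, 20], [30, 40]])
print(fitted)
```

For the independence model the fitted count in each cell equals (row total x column total) / N, so the table above gives values of approximately 12, 18, 28 and 42.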

The varlist defines the dimensions of the contingency table that the Poisson likelihood is calculated over. If the varlist is not specified, the variables in the fit() option define the dimensions of the contingency table.

Latest Version

The latest version is always kept on the SSC website. To install the latest version, run the following command:

. ssc install ipf, replace

Options

    +------+
----+ Main +-------------------------------------------------------------

fit(string) specifies the log-linear model. It requires special syntax of the form var1*var2+var3+var4. The term var1*var2 includes all the interactions between the two variables and also the main effects of var1 and var2. The main effects var3 and var4 are also included in the model but no interactions. This syntax is used in most books on log-linear modelling.
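To make the meaning of this model syntax concrete, the following Python sketch splits a model string into its terms and lists the effects each term implies. It only illustrates the hierarchical notation described above; how ipf itself parses fit() is not shown in this help file, and the function names margins_from_fit and implied_effects are hypothetical.

```python
from itertools import combinations


def margins_from_fit(fit):
    """Split a model string such as 'var1*var2+var3+var4' into its terms.

    Each term is returned as a tuple of variable names; in a hierarchical
    model each term corresponds to one marginal table to be matched.
    """
    terms = [term.strip() for term in fit.split("+") if term.strip()]
    return [tuple(name.strip() for name in term.split("*")) for term in terms]


def implied_effects(term):
    """List every main effect and interaction implied by one term."""
    effects = []
    for size in range(1, len(term) + 1):
        effects.extend(combinations(term, size))
    return effects


margins = margins_from_fit("var1*var2+var3+var4")
print(margins)
print(implied_effects(margins[0]))
```

Here var1*var2 yields the main effects of var1 and var2 plus their interaction, while var3 and var4 each contribute a main effect only.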

constr(string) specifies initial values for the expected frequencies. The syntax requires an if condition in square brackets followed by a value for the expected frequency. For example, [sex=="male"]2 sets the initial value to 2 for every cell in which sex is male.

confile(filename) specifies a *.dta file that contains initial values for the expected counts. The variable containing the frequencies must be called Efreqold. This option requires convars() also to be specified.

save(filename) specifies that the expected frequencies and probabilities for every cell are saved in a *.dta file.

convars(varlist) specifies the variables in the file specified in confile(), excluding Efreqold because this variable is always required.

expect specifies that the expected frequencies are displayed.

nolog suppresses the display of the log-likelihood at each iteration.

Examples

For a 3-way contingency table containing the factors sex, age and treatment, the saturated model is given by

. ipf, fit(sex*age*treatment)

If the data are not individual records, the command requires a variable containing the frequency counts, freq say.

. ipf [fw=freq], fit(sex*age*treatment)

Using a file for the initial frequencies:

. ipf [fw=freq], fit(sex+age) convars(sex age) confile(constrain) exp

Author

Adrian Mander, MRC Biostatistics Unit, Cambridge, UK. Email adrian.mander@mrc-bsu.cam.ac.uk

Also see

Related commands

HELP FILES               SSC installation links    Description
gipf (if installed)      (ssc install gipf)        Graphical representation of a log-linear model
hapipf (if installed)    (ssc install hapipf)      Haplotype frequency estimation using log-linear models