-------------------------------------------------------------------------------
help for estparm and estparmtest                                 (Roger Newson)
-------------------------------------------------------------------------------

Save results from a parmest resultsset and test equality

estparm varlist [if] [in] [ , level(#) eform obs(varname) eq(varname) dfcombine(combination_rule) ]

estparmtest varlist [if] [in] [ , level(#) eform obs(varname) eq(varname) dfcombine(combination_rule) ]

where varlist is a list of 2 or 3 variables with the syntax

estimate_varname stderr_varname [ dof_varname ]

and estimate_varname, stderr_varname and dof_varname are the names of existing variables, containing parameter estimates, standard errors, and degrees of freedom, respectively. If the dof_varname is specified, then the test statistics are assumed to have a t-distribution, with degrees of freedom equal to the sum of the values of the variable specified by dof_varname (or to their common value, if dfcombine(constant) is specified). If the dof_varname is not specified, then the test statistics are assumed to have a standard Normal sampling distribution.
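
As a minimal sketch of the varlist syntax, the commands below build a small dataset by hand and pass it to estparm. The variable names est, se and df, and all the numbers, are made up for illustration and do not come from any real model.

    . clear
    . input est se df
      0.52 0.11 18
      0.47 0.09 17
      0.61 0.14 17
      end
    . estparm est se df
    . estparm est se

The first call supplies a degrees of freedom variable, so the results are based on the t-distribution. The second call omits it, so the results are based on the standard Normal distribution.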

A combination_rule is

sum | constant

by and statsby are allowed; see prefix.

Description

estparm is an inverse of parmest. It inputs 2 or 3 variables in the varlist, containing parameter estimates, standard errors, and (optionally) degrees of freedom. It saves a set of estimation results for the parameters, assuming that the parameter estimates are statistically uncorrelated. estparmtest is an extended version of estparm, which also performs a chi-squared or F test of the hypothesis that all these parameters are equal. estparmtest can be used for performing interaction tests, using data from regression models for multiple subsets, stored in a parmest or parmby output dataset (or resultsset).
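
Under the assumption of uncorrelated estimates, a Wald test of equality reduces to a weighted sum of squared deviations from the inverse-variance-weighted mean of the estimates. This identity is standard Wald-test algebra and is not stated in this help file, so the sketch below is only a way of checking the reported chi-squared statistic by hand. The variable names est, se, w and q, and the numbers, are hypothetical.

    . clear
    . input est se
      0.52 0.11
      0.47 0.09
      0.61 0.14
      end
    . gene double w = 1/se^2
    . summarize est [aweight=w], meanonly
    . gene double q = w*(est - r(mean))^2
    . summarize q, meanonly
    . display "hand-computed chi-squared = " r(sum) ", df = " r(N)-1
    . estparmtest est se
    . display "estparmtest chi-squared   = " r(chi2)

If the two displayed values differ by more than rounding error, then the assumption made here about the form of the test should be revisited.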

Options for estparm and estparmtest

level(#) specifies the confidence level, as a percentage, for confidence intervals of the estimates; see level.

eform indicates that the input estimates are exponentiated, and that the input standard errors are multiplied by the exponentiated estimate, as produced by the eform option of parmest and parmby. Note that the output confidence limits are not exponentiated, whether or not eform is specified. (This is because the column names in the estimation output matrices are all set to _cons, and the ereturn display command of Stata will not print parameters with that name if the eform() option is specified.)

obs(varname) specifies an existing numeric variable, whose sum is stored in the estimation results as e(N) and reported in the output as the total number of observations. Such a variable might have been created by parmest or parmby, using the option escal(N).

eq(varname) specifies an existing string variable, whose values are used to specify the equation names used in the estimation results. It is the responsibility of the user to ensure that values of this variable are unique. If eq() is not specified, then the equation names are set to numbers from 1 to the number of parameters, in order of appearance in the dataset. (The column names used in the estimation results are all set to _cons.)
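
A sketch of the eq() option, assuming the auto data and the parmest package from SSC. The string variable grp is a hypothetical name, built here from the numeric by-variable so that each parameter has a unique equation name.

    . sysuse auto, clear
    . gene seqgp4=mod(_n,4)+1
    . preserve
    . parmby "regress mpg weight", by(seqgp4) norestore
    . keep if parm=="weight"
    . gene str6 grp = "group" + string(seqgp4)
    . estparm estimate stderr dof, eq(grp)
    . restore

The output should then label the four slopes by the values of grp, rather than by the default numbers from 1 to 4.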

dfcombine(combination_rule) specifies a rule for combining the degrees of freedom of the input parameters to define the combined residual degrees of freedom for the output estimation results, if a degrees of freedom variable is specified. This combined residual degrees of freedom is used by estparmtest as the denominator degrees of freedom in an F-test. The value of dfcombine() may be sum (the default) or constant. If sum is specified, then the residual degrees of freedom for the output estimation results is the sum of the values of the input degrees of freedom variable. If constant is specified, then estparm and estparmtest check that the degrees of freedom variable is equal to a constant value in all input observations, and set the output residual degrees of freedom to that constant value. The specification dfcombine(constant) is useful if the input parameters are uncorrelated parameters of the same model, and that model was estimated with a single estimation command, using a common degrees of freedom for all parameters. This will be the case if the parameters are group means, and the model is a regression model, whose only parameters are group means, and which was fitted using a single regression command, with a single pooled scalar degrees of freedom. If there is no input degrees of freedom variable, then the dfcombine() option is ignored.
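
A sketch of the group-means case described above, assuming the auto data and the parmest package from SSC. The choice of mpg as the outcome and rep78 as the grouping variable is illustrative only.

    . sysuse auto, clear
    . regress mpg ibn.rep78, noconstant
    . preserve
    . parmest, norestore
    . estparmtest estimate stderr dof, dfcombine(constant)
    . restore

Every parameter here shares the pooled residual degrees of freedom of the single regress command, so dfcombine(constant) uses that common value as the denominator degrees of freedom, instead of summing it over the group means.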

Remarks

estparm and estparmtest are designed for use with output datasets (or resultssets) produced (directly or indirectly) by the parmest package, which can be downloaded from SSC, and contains the programs parmest, parmby, parmcip and metaparm. They are intended mainly to perform tests of the hypothesis that several independently estimated parameters are equal. In particular, if there are multiple subsets of observations in the data, and a regression model is fitted to each subset, then estparmtest may be used to test the hypothesis that all the regression coefficients are equal. This hypothesis is thought by some scientists to be an interesting hypothesis to test. For instance, in the medical sector, if the subsets are defined by genotypes, and the regression model measures the effect of an exposure variable on levels of a disease, then a test of the hypothesis of equality between the regression coefficients in the subsets is known as a test of gene-environment interaction.

Technical note

If the model fitted to each subset is the same linear regression model, fitted using regress with the robust option, then the F-test statistic output by estparmtest should be the same as the corresponding F-test statistic output by test if those models are fitted as a single regression model to all subsets (using regress with the robust option), with a separate parameter set for each subset. If the model fitted to each subset is the same generalized linear model, fitted using glm with the vce(robust) option, then the chi-squared test statistic output by estparmtest should be slightly less than the corresponding chi-squared statistic output by test if those models are fitted as a single regression model to all subsets (using glm with the vce(robust) option), with a separate parameter set for each subset. However, the two chi-squared statistics will be asymptotically equivalent in large samples. This is because the Huber variance formula used by Stata uses a scale factor of N/(N-1) in generalized linear models and a scale factor of N/(N-k) in linear regression models, where N is the number of observations and k is the number of parameters.
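
The comparison described in this note can be tried out along the following lines. This is a sketch under assumptions: it uses the auto data with the grouping variable seqgp4 from the examples, it spells the robust option as vce(robust), and the pooled model specification with factor variables is one possible way of giving each subset its own intercept and slope.

    . sysuse auto, clear
    . gene seqgp4=mod(_n,4)+1
    . regress mpg ibn.seqgp4 ibn.seqgp4#c.weight, noconstant vce(robust)
    . test 1.seqgp4#c.weight=2.seqgp4#c.weight=3.seqgp4#c.weight=4.seqgp4#c.weight
    . preserve
    . parmby "regress mpg weight, vce(robust)", by(seqgp4) norestore
    . estparmtest estimate stderr dof if parm=="weight"
    . restore

The F statistic reported by test for the pooled model can then be set beside the one reported by estparmtest for the per-subset models.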

Examples

The following examples use the auto data, with an added variable seqgp4, which groups the data using groups of 4 successive observations. In each group of 4 successive observations, the first, second, third and fourth observations are assigned to groups 2, 3, 4 and 1, respectively, because seqgp4 is computed as mod(_n,4)+1. In each example, the parmby module of the SSC package parmest is used to create an output dataset (or resultsset), with 1 observation per estimated parameter. In this resultsset, estparmtest is used to test the hypothesis that the regression slopes or odds ratios in the 4 groups are equal.

-------------------------------------------------------------------------------
Setup
    . sysuse auto, clear
    . gene seqgp4=mod(_n,4)+1
    . lab var seqgp4 "Sequential group (of 4)"
    . tab seqgp4

-------------------------------------------------------------------------------
Example with regress
    . preserve
    . parmby "regress mpg weight", by(seqgp4) norestore escal(N) ren(es_1 N)
    . estparmtest estimate stderr dof if parm=="weight", obs(N)
    . ereturn list
    . return list
    . restore

-------------------------------------------------------------------------------
Example with logit
    . preserve
    . parmby "logit foreign mpg", by(seqgp4) norestore escal(N) ren(es_1 N)
    . estparmtest estimate stderr if parm=="mpg", obs(N)
    . ereturn list
    . return list
    . restore

-------------------------------------------------------------------------------
Example with logit and the eform option
    . preserve
    . parmby "logit foreign mpg, or", by(seqgp4) norestore escal(N) ren(es_1 N) eform
    . estparmtest estimate stderr if parm=="mpg", obs(N) eform
    . ereturn list
    . return list
    . restore

The following example illustrates the use of estparmtest with statsby. Note that the basepop() option of statsby is used to prevent the matsize from being exceeded.

. statsby pvalue=r(p), by(snpoly) clear basepop(snpoly==1): estparmtest estimate stderr if parm=="exposure", obs(N) eform

Saved results

estparm and estparmtest save the following in e():

Scalars
    e(N)            total number of observations
    e(df_r)         total degrees of freedom

Macros
    e(cmd)          estparm
    e(cmdline)      command as typed
    e(predict)      program used to implement predict
    e(dfcombine)    dfcombine() option
    e(properties)   b V

Matrices
    e(b)            coefficient vector
    e(V)            variance-covariance matrix of the estimators
    e(df_estparm)   parameter-specific degrees of freedom

The parameter-specific degrees of freedom vector has the dimensions of the coefficient vector, and contains the contents of the degrees of freedom variable.

estparmtest also saves the following in r():

Scalars
    r(p)            two-sided p-value
    r(F)            F statistic
    r(df)           test constraints degrees of freedom
    r(df_r)         residual degrees of freedom
    r(dropped_i)    index of ith constraint dropped
    r(chi2)         chi-squared
    r(ss)           sum of squares (test)
    r(rss)          residual sum of squares
    r(drop)         1 if constraints were dropped, 0 otherwise
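
As a sketch of how these saved results might be used after a call with a degrees of freedom variable, the commands below rebuild the regress example and then display the F-test results and list two of the saved matrices. The layout of the display line is illustrative only.

    . sysuse auto, clear
    . gene seqgp4=mod(_n,4)+1
    . preserve
    . parmby "regress mpg weight", by(seqgp4) norestore escal(N) ren(es_1 N)
    . estparmtest estimate stderr dof if parm=="weight", obs(N)
    . display "F(" r(df) ", " r(df_r) ") = " r(F) ", p = " r(p)
    . matrix list e(b)
    . matrix list e(df_estparm)
    . restore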

Author

Roger Newson, Imperial College London, UK. Email: r.newson@imperial.ac.uk

Also see

Manual: [R] estimates, [R] test, [P] ereturn, [D] statsby

Help: [P] estimates, [R] test, [P] ereturn, [D] statsby;
      parmest, parmby, parmcip, metaparm if installed