.-
help for ^orthog^                                     (statalist: 10 July 1998)
.-

Orthogonalize variables
-----------------------

        ^orthog^ [varlist] [weight] [^if^ exp] [^in^ range] ^,^
                        ^g^enerate^(^newvarlist^)^ [ ^mat^rix^(^matname^)^ ^float^ ]

^aweight^s and ^fweight^s are allowed; see help @weights@.


Description
-----------

^orthog^ orthogonalizes "varlist" and creates a new set of orthogonal variables
"newvarlist" using a modified Gram-Schmidt procedure (Golub and Van Loan 1989).

The order of the variables in "varlist" determines the orthogonalization.
That is, if "varlist" is ^x1 x2 x3^, then the effect of the constant is first
removed from ^x1 x2 x3^, then ^x1^ is removed from ^x2^ and ^x3^, and then ^x2^ is
removed from ^x3^.  If "newvarlist" is ^q1 q2 q3^, we have

        q1 = a10 + a11*x1
        q2 = a20 + a21*x1 + a22*x2
        q3 = a30 + a31*x1 + a32*x2 + a33*x3

where ^q1 q2 q3^ are orthogonal and the aij are constants.


Options
-------

^generate(^newvarlist^)^ is not optional.  It creates new variables containing
    the orthogonalized "varlist".  "newvarlist" must either contain exactly
    the same number of variable names as "varlist" or be abbreviated using
    either "newvar1-newvar#" or "newvar*".  See examples below.

^matrix(^matname^)^ creates an m x m matrix called "matname" containing the
    matrix R defined by X = QR, where X is the n x m matrix representation
    of "varlist" together with the constant and Q is the n x m matrix
    representation of "newvarlist" together with the constant (m = number
    of variables in "varlist" plus the constant; n = number of
    observations).

^float^ specifies that the new variables be of type float.  The default is
    double.


Warning
-------

With many variables, ^orthog^ will be slow.  Time required is proportional to
the square of the number of variables.


Examples
--------

        . ^orthog x1 x2 x3, gen(u1 u2 u3)^
        . ^orthog x1 x2 x3, gen(u1-u3)^
        . ^orthog x1 x2 x3, gen(u*)^
        . ^orthog x1 x2 x3, gen(u*) matrix(r)^
        . ^orthog x*, gen(u*) mat(R) float^

The matrix R created by the ^matrix()^ option can be used to transform
coefficients from a regression:

        . ^orthog x*, gen(u*) mat(R)^
        . ^regress y u*^
        . ^matrix bu = get(_b)^
        . ^matrix invR = inv(R)^
        . ^matrix b1 = bu*invR'^     [note that the transpose of invR is used]
        . ^regress y x*^
        . ^matrix b2 = get(_b)^

Then b1 and b2 will be the same.

The matrix R can also be used to recover X (original "varlist") from Q
(orthogonalized "newvarlist") one variable at a time:

        . ^orthog price weight mpg, gen(upr uwei umpg) mat(R)^
        . ^matrix c = R[.,"price"]^
        . ^matrix c = c'^            [^matrix score^ requires a row vector]
        . ^matrix score double samepr = c^
        . ^compare price samepr^

That is, the variable ^samepr^ is the same as the original ^price^.  This
procedure can be performed as a check of the numerical soundness of the
orthogonalization.


Methods and formulas
--------------------

The X = QR orthogonalization is computed using a modified Gram-Schmidt
procedure (Golub and Van Loan 1989).  The columns of Q are orthogonal and R
is upper triangular (actually, R is a permuted upper triangular matrix with
row/column 1 interchanged with row/column m so that the last row corresponds
to the constant term).

Q is normalized so that Q'WQ = NI, where W = diag(w1, w2,..., wn) with
w1, w2,..., wn the weights (all 1 if weights are not specified), and N is the
sum of the weights.  If the weights are ^aweight^s, they are first normalized
so that N is the number of observations.


Author
------

        Bill Sribney
        Stata Corporation
        702 University Drive East
        College Station, TX 77840
        Phone: 409-696-4600
               800-782-8272
        Fax:   409-696-4601
        email: tech_support@@stata.com


Reference
---------

Golub, G.H. and C.F. Van Loan.  1989.  Matrix Computations, 2nd ed.
    Baltimore: Johns Hopkins University Press, pp. 218-219.


Also see
--------

 Manual:  ^[R] orthpoly^
On-line:  help for @matrix@, @orthpoly@, @regress@
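

Checking orthogonality (illustrative sketch)
--------------------------------------------

The following is a minimal sketch, not one of the original examples, of one
way to inspect the new variables directly.  It assumes that a dataset
containing ^price^, ^weight^, and ^mpg^ is in memory and that no weights are
used; the new variable names are those of the recovery example under
Examples.

        . ^orthog price weight mpg, gen(upr uwei umpg)^
        . ^summarize upr uwei umpg^
        . ^correlate upr uwei umpg^

Given the normalization Q'WQ = NI stated under Methods and formulas, each
new variable should have mean zero (it is orthogonal to the constant) and
the off-diagonal correlations should be zero, up to rounding error.  Because
^summarize^ divides by N-1, the reported standard deviations should be close
to sqrt(N/(N-1)) rather than exactly 1.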