/********************** Introduction ************************/ This program implements the methods discussed in : Steele, B.M. 1996. A modified EM algorithm for estimation in generalized mixed models. Biometrics 52, 1295-1310. The GLMM assumes that the response variable Y follows a generalized linear model conditional on a vector b of random effects. This package handle conditionally binomial and Poisson responses. Extensions to additional distributions are possible. We assume a canonical link k relating the conditional mean m = E[Y | b] and the linear predictor eta such that k(m) = eta = Xa + Zb. b is composed of r subvectors corresponding to r random factors. X and Z are known design matrices, a and b are called the random and fixed effect vectors respectively. b is stacked as b1 | b2 | .... | br and Z is partitioned conformably as Z1~Z2~...~Zr. The bi's are assumed to be composed of iid N(0,si) or iid log-gamma(si) random variables. (See Steele [1996] for a discussion of the log-gamma model and extensions to other error distributions.) The si's are the variance components. The MEMN and MEMLG procs will estimate a, b, and s = s1 | ... | sr. The conditional response variable dist'n is specified by setting respv = "B" (binomial) respv = "P" (Poisson). Additional distributions can be introduced. You need to modify the procs M_B to compute m = E[Y | b], W_B to compute W = Var[Y | b], and Q_B to compute derivative of W wrt to eta or equivalently, the second derivative of m wrt to eta. The random effect dist'ns is specified by setting eff = "N" (normal), eff = "LG" (log-gamma). If eff = "LG" then set sn = 1 for the right-skewed log-gamma and sn = -1 for the left-skewed log-gamma. sn is ignored if eff = "N". Additional distributions can be introduced with some effort. /********************** INSTRUCTIONS ************************/ Program structure: 1. A user-defined data proc initializes or reads a data file and sets up scalars and matrices. The data proc is to return: r = number of variance components Z = the design matrix for the random effects q = an r-vector containing the number of random effects associated with the r random factors Important: q lists, in order, the number of columns in each of the submatrices Zi where Z = Z1~Z2~...~Zr. X = design matrix for the fixed effects. y = response vector name = character vector, names of the fixed effect vectors dname = string used as a header to identify the data nvec is returned if the response is binomial. nvec is a vector of binomial denomiators conformable with y and used if the response variable distribution is chosen to be binomial. nvec is ignored otherwise. Several examples of data procs using published data sets are provided in the program file. 2. Initial estimates of the fixed effects vector a are computed by a Newton-Raphson algorithm called MEMINIT. It is passed X, y, respv, and sn (described above) and it returns an initial estimate of a. 3. Parameter estimates are computed via the modified EM (MEM) algorithm. MEM estimates are computed by calling MEMN or MEMLG to depending on whether the random effect dist'n is normal or log-gamma, respectively. Both procs are passed the arguments a,s,r,z,q,x,y,nvec,respv,eff,s,test,lflag Most of the arguments are discussed above. The remainder are: s is a user-supplied guess are the variance component vector. s has r elements where r is the number of random factors. For the log-gamma, s is the parameter vector. Variances are 1/s so choose large initial guesses (e.g., 10). test is the MEM convergence criteron. 1E-5 is usually adequate. lflag = 1 forces the algorithm to compute and print out the approximate marginal likelihood for each interation. lflag = 0 suppresses the computation. Both procs return the parmeter estimates and a prediction of given by the mode of the complete data likelihood and an approximation of E[b | y]. A sharper approximation is obtained adding the second- order approximation computed by the proc A_Nb, i.e., b = b + A_Nb(qzvin,vec(Z*vin)) for normal random effects and b = b + A_LGb(s,b,qzvin,vec(Z*vin)); for log-gamma random effects. 4. Standard errors are computed by one of two algorithms: IOMEMN - use if normal random effects are assumed IOMEMLG - use if log-gamma random effects are assumed Standard errors are approximated by differentiating the approximate marginal likelihood (Laplace approximation). An alternative approach which seems to work well in some instances is the SEM algorithm of Meng and Rubin. Both procs IOMEMN and IOMEMLG return ion = observed information matrix se = vector of se's conformable with a|s The proc prints out the square root of the estimates and the large sample (Delta method) estimates of the standard errors. Please contact me if you have any questions.