/****** Panel-Corrected Standard-Errors (Franzese,Mar96) ****

"A GAUSS Procedure to Implement Beck-Katz PCSEs in Data Sets with
Non-Rectangular and/or Missing Data (with other Bells and Whistles)"

It is a rare pooled data set whose data start and end in the same time
periods (t-p's) for each cross-section (c-s) and which is devoid of
missing values. Fortunately, this poses no intrinsic problem for the
estimation of the panel-corrected standard-errors (PCSEs) suggested by
Beck and Katz (APSR 1995). Unfortunately, the RATS and GAUSS procedures
written by Beck and Katz to estimate PCSEs do assume rectangularity and
the absence of missing values. In writing a GAUSS procedure to handle
missing values, I decided to automate the process as much as possible in
the hope that this would save me some effort in the future (it already
has) and that it might be of some use to others working with this sort
of data.

The procedure takes as inputs vectors of variable and c-s names, the
dependent and independent variables, the number of c-s's, the first and
last t-p's, and scalar switches for optional (panel) WLS and/or c-s
and/or t-p fixed effects. Organize your data in the usual way (c-s 1,
t-p 1 to T; c-s 2, t-p 1 to T; etc.). Always include a constant as your
first independent variable and the dependent variable _name_ as your
last variable name. (All this is noted in the procedure's comments.) If
the data are non-rectangular, just put missing values in to
rectangularize them. (Try to make your y-vector and x-matrix as compact
as possible by not including any t-p which no c-s uses; this will avoid
procedure crashes.)

The output is similar to the OLS.SRC (Aptech) output in GAUSS, except
that standard errors (and t-stats and p-levels) are from PCSEs and that
the last column (correlation of that X with Y) is replaced with the OLS
standard errors.
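The data layout described above can be illustrated with a minimal sketch.
It is written in Python rather than GAUSS and is not part of the
procedure: the function name rectangularize, the cross-section names, and
the numbers are all made up for illustration, and Python's None stands in
for a GAUSS missing value.

```python
# Hypothetical illustration only: pad an unbalanced panel with missing
# values so that every cross-section covers the same periods, stacked as
# c-s 1 (t-p first..last), then c-s 2 (t-p first..last), and so on.

def rectangularize(series, first, last):
    """series maps a cross-section name to a dict {period: value}.

    Returns (names, stacked), where stacked holds the values for periods
    first..last of each cross-section in turn, with None where a
    cross-section has no observation for that period.
    """
    names = list(series)
    stacked = []
    for name in names:
        for period in range(first, last + 1):
            # dict.get returns None when the period is absent
            stacked.append(series[name].get(period))
    return names, stacked


# Made-up example: the second cross-section starts one period late.
y = {
    "USA": {1960: 1.2, 1961: 1.4, 1962: 1.1},
    "FRG": {1961: 0.9, 1962: 1.0},
}
names, stacked = rectangularize(y, 1960, 1962)
print(names)
print(stacked)
```

Choosing first and last as the earliest and latest periods that any
cross-section actually uses keeps the stacked vector as compact as the
note above recommends.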
The only trick to dealing with missing data is to realize that the
cross-products of residuals, i.e., the sums over t of e(i,t)*e(j,t), do
not contain the same number of valid observations for each i and j. Thus
the denominators of the elements of the variance-covariance matrices are
(potentially) different. The procedure offered here handles this by
creating a vector of ones for valid and zeros for missing observations.
This vector can be reshaped and/or manipulated (summed, multiplied, ...)
as necessary to divide each element of the variance-covariance matrices
by the correct number of valid observations. (The code is heavily
remarked and should be understandable if you know GAUSS and PCSEs.)

There are three additional binary (on/off) options I have included in
the procedure. First, you can choose WLS or OLS. If WLS is requested,
OLS is run first (this first-stage regression is currently set not to be
output to the screen; you may change that by hardwiring the appropriate
_output=0 line to _output=1). The residuals from that regression are
squared and regressed on c-s dummies (in such a way that the F-stat for
the regression is a test of the panel-weight model against
homoskedasticity). (This regression is currently set to be output to the
screen; you may change that by hardwiring the appropriate line.) The
inverses of the square roots of the fitted values from that regression
are your usual panel weights. (An alternative model of residual variance
can be substituted by hardwiring changes to the lines noted in the
procedure.) WLS is then performed by running OLS on the transformed
data, and PCSEs are calculated from that weighted regression. Weighted
and unweighted statistics are reported.

The procedure will also automatically create a set of c-s and/or t-p
dummies for fixed-effect models if requested to do so. For a recent
discussion of when this is appropriate and of interpretation, see Smith
(1995). (Keep the constant in your independent variables; it'll be
deleted for you.
Make sure you do not otherwise create perfect collinearity, or the
procedure will end with an error message from OLS.)

Finally, in writing this procedure, I noticed that OLS.SRC ((c) APTECH)
generates incorrect DW statistics in the presence of missing values (it
"packs" vectors with missing values, making some of the "adjacent"
observations not truly temporally adjacent; APTECH has been notified of
the error). This has been corrected in my procedure, but be warned that
whenever you are using the DW statistic produced by GAUSS OLS.SRC
outside of PCSE.G you are not getting the correct DW. (Incidentally, the
DW is invalid when there are lagged dependent variable(s) in the
equation. An alternative test for first-order serial correlation with a
lag is:

     h = (approx) (1 - DW/2) * sqrt[n / (1 - n*V(b1))]

which is distributed asymptotically standard normal (where n is the
number of observations and V(b1) is the estimated variance of the
coefficient on the lagged dependent variable). Q and LM tests are other
valid alternatives; see Johnston 1984, sections 8-5 through 10-3.)

Any suggestions on improving this procedure would be gratefully
received. The procedure is public domain and you may use it as you wish
(though a citation of the source would be appropriate/appreciated). I am
confident the procedure correctly calculates PCSEs; however, I cannot
accept responsibility for any difficulties, financial or otherwise,
arising from its use. (My lawyer/sister insisted I officially note
that.)
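
Two of the calculations described in these notes can be sketched as
follows. The sketch is in Python rather than GAUSS and is not the code of
the procedure itself: the names pairwise_sigma and durbin_h and the
sample residuals are hypothetical, and None stands in for a GAUSS missing
value. The first function divides each residual cross-product by the
number of periods in which both cross-sections are observed, using a
ones-and-zeros validity indicator. The second computes Durbin's h in its
standard form, which takes the square root of the bracketed term.

```python
import math

# Hypothetical illustration only, not the PCSE.G implementation.


def pairwise_sigma(resid):
    """resid[i][t] is the residual of cross-section i in period t, or None.

    Returns (sigma, counts). counts[i][j] is the number of periods in
    which both i and j are observed; sigma[i][j] is the sum of the
    residual cross-products over those periods divided by counts[i][j]
    (None when the two cross-sections never overlap).
    """
    n = len(resid)
    # 1 for a valid observation, 0 for a missing one
    valid = [[0 if e is None else 1 for e in row] for row in resid]
    # Zero-fill missing residuals so they add nothing to the cross-products
    filled = [[0.0 if e is None else float(e) for e in row] for row in resid]
    sigma = [[None] * n for _ in range(n)]
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            counts[i][j] = sum(vi * vj for vi, vj in zip(valid[i], valid[j]))
            cross = sum(ei * ej for ei, ej in zip(filled[i], filled[j]))
            if counts[i][j] > 0:
                sigma[i][j] = cross / counts[i][j]
    return sigma, counts


def durbin_h(dw, n, var_b1):
    """Durbin's h from a DW statistic, the sample size n, and the estimated
    variance of the coefficient on the lagged dependent variable.

    Returns None when n * var_b1 >= 1, where h cannot be computed.
    """
    if n * var_b1 >= 1.0:
        return None
    return (1.0 - dw / 2.0) * math.sqrt(n / (1.0 - n * var_b1))


# Made-up residuals: two cross-sections, four periods, one gap in each.
resid = [
    [1.0, -1.0, 2.0, None],
    [0.5, None, -1.0, 1.5],
]
sigma, counts = pairwise_sigma(resid)
print(counts)
print(sigma)
print(durbin_h(1.5, 100, 0.0075))
```

In this made-up example each cross-section has three valid residuals but
the pair shares only two periods, so the off-diagonal element is divided
by 2 while the diagonal elements are divided by 3.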