Poisson Regression Methods

In many situations of practical interest, the dependent variable in an experiment or observational study is a count that follows the Poisson distribution. The relation between the count or event rate and the explanatory variables is described by a regression function that is, in general, nonlinear in the unknown parameters. The equivalence of maximum likelihood (ML) and iterative weighted least squares (IWLS) was first shown by Frome and Beauchamp (Biometrics, 1968) for a nonlinear survival curve model that is widely used in radiation biology. The equivalence of IWLS and ML estimation for general nonlinear regression functions for Poisson distributed data was presented by Frome (1972) and by Frome, Kutner and Beauchamp (JASA, 1973), hence the term Poisson regression.
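The IWLS and ML equivalence described above can be illustrated with a minimal sketch. It is not the code from the cited papers: the mean function mu = a * exp(-b * x), the data, and all function names are assumptions chosen for illustration. Each iteration is a weighted least squares step with weights 1/mu, and at convergence the Poisson score equations are satisfied, which is what makes the result an ML estimate.

```python
import math

# Hypothetical data (made up for illustration): counts y at covariate values x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [98, 62, 35, 24, 13, 9]

def mean_and_jacobian(theta, x):
    """Assumed mean function mu_i = a * exp(-b * x_i) and its derivatives."""
    a, b = theta
    mu = [a * math.exp(-b * xi) for xi in x]
    jac = [(math.exp(-b * xi), -a * xi * math.exp(-b * xi)) for xi in x]
    return mu, jac

def score(theta, x, y):
    """Gradient of the Poisson log-likelihood with respect to (a, b)."""
    mu, jac = mean_and_jacobian(theta, x)
    g1 = sum((yi - mi) / mi * j[0] for yi, mi, j in zip(y, mu, jac))
    g2 = sum((yi - mi) / mi * j[1] for yi, mi, j in zip(y, mu, jac))
    return g1, g2

def iwls_poisson(x, y, theta, max_iter=50, tol=1e-10):
    """Repeat weighted least squares steps with weights 1/mu until stable."""
    for _ in range(max_iter):
        mu, jac = mean_and_jacobian(theta, x)
        # Weighted cross-products J'WJ with W = diag(1/mu).
        s11 = s12 = s22 = 0.0
        for mi, (j1, j2) in zip(mu, jac):
            w = 1.0 / mi
            s11 += w * j1 * j1
            s12 += w * j1 * j2
            s22 += w * j2 * j2
        # J'W(y - mu) is exactly the Poisson score vector.
        g1, g2 = score(theta, x, y)
        det = s11 * s22 - s12 * s12
        d1 = (s22 * g1 - s12 * g2) / det
        d2 = (s11 * g2 - s12 * g1) / det
        theta = (theta[0] + d1, theta[1] + d2)
        if abs(d1) + abs(d2) < tol:
            break
    return theta

theta_hat = iwls_poisson(x, y, (100.0, 0.5))
print("estimates:", theta_hat)
print("score at estimates:", score(theta_hat, x, y))
```

The starting values (100.0, 0.5) were picked close to the data. A general-purpose routine would need safeguards such as step halving, which are omitted here.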

More generally, it was shown that IWLS computations yield ML estimates for data from the regular exponential family; see Charnes, Frome, and Yu (JASA, 1976). This result is important because it showed that statistical software supporting weighted least squares could be used to obtain ML estimates for arbitrary regression functions when the dependent variable is in the regular exponential family (e.g., Poisson, Binomial, Gamma, and Gaussian distributions). For a review of Poisson and Binomial regression models, see Regression Methods for Binomial and Poisson Distributed Data (2MB PDF).

Another important area where Poisson regression methods are used is the analysis of data organized in a life-table format, as encountered in epidemiologic follow-up studies; see Frome (Biometrics, 1983), the Nelder-Frome comments (Biometrics, 1986), and Frome and Checkoway (AJE, 1985). Poisson regression yields ML estimates, their asymptotic covariance matrix, hypothesis testing procedures, and regression diagnostics that are now widely used in cohort studies and other epidemiologic applications that involve event rates; see Frome and Morris (1989), Frome et al. (1990), and Frome et al. (1997).
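As a rough illustration of the life-table setting, the sketch below fits a log-linear rate model, log(rate) = b0 + b1 * exposed, to cells of deaths and person-years, with person-years entering as a multiplier on the rate. The cell values, the single exposure indicator, and the function name are assumptions made up for this example and are not taken from the cited studies. For this simple model the ML estimates have a closed form (the pooled rates in each exposure group), which gives a convenient check on the iteration.

```python
import math

# Hypothetical life-table cells: (deaths, person_years, exposed indicator).
cells = [(12, 10000.0, 0), (20, 8000.0, 0), (15, 5000.0, 1), (30, 6000.0, 1)]

def fit_rate_model(cells, max_iter=25, tol=1e-12):
    """IWLS for expected deaths mu = person_years * exp(b0 + b1 * exposed)."""
    total_deaths = sum(d for d, _, _ in cells)
    total_py = sum(py for _, py, _ in cells)
    # Start from the overall rate and no exposure effect.
    b0, b1 = math.log(total_deaths / total_py), 0.0
    for _ in range(max_iter):
        s00 = s01 = s11 = g0 = g1 = 0.0
        for d, py, z in cells:
            mu = py * math.exp(b0 + b1 * z)
            # Information matrix terms: sum of mu * x * x'.
            s00 += mu
            s01 += mu * z
            s11 += mu * z * z
            # Score terms: sum of (d - mu) * x.
            g0 += d - mu
            g1 += (d - mu) * z
        det = s00 * s11 - s01 * s01
        step0 = (s11 * g0 - s01 * g1) / det
        step1 = (s00 * g1 - s01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

b0_hat, b1_hat = fit_rate_model(cells)
print("baseline rate per person-year:", math.exp(b0_hat))
print("rate ratio, exposed vs unexposed:", math.exp(b1_hat))
```

With more covariates (for example age and calendar period strata), the same iteration applies with a larger design matrix. The two-parameter case is used here only to keep the linear algebra explicit.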

A graph traversal algorithm has also been developed for model selection in the large multidimensional tables encountered in epidemiology and other disciplines; see Ostrouchov and Frome (1993).

A third area where Poisson regression methods are important is the analysis of chromosome aberrations. In many situations the number of chromosome aberrations follows the Poisson distribution, and the dose-response relations of biologic interest are intrinsically nonlinear in the unknown parameters. Frome and DuFrain (Biometrics, 1986) show how Poisson regression methods can be used to analyze cytogenetic dose-response data. Their examples include regression functions that are linear or nonlinear in the parameters, as opposed to log-linear.
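To make the contrast with log-linear models concrete, the following sketch fits a dose-response curve on the identity scale: expected aberrations = cells scored * (alpha * dose + beta * dose^2). The linear-quadratic form, the data, and the function name are assumptions for illustration and are not claimed to reproduce the examples in Frome and DuFrain. Because the mean is linear in alpha and beta, each IWLS step is an ordinary weighted least squares fit with weights 1/mu.

```python
# Hypothetical cytogenetic data: dose (Gy), cells scored, aberrations counted.
dose = [0.5, 1.0, 2.0, 3.0, 4.0]
cells_scored = [1000, 1000, 500, 300, 200]
aberrations = [27, 78, 145, 176, 210]

def fit_linear_quadratic(dose, cells_scored, y, max_iter=50, tol=1e-12):
    """IWLS for mu = n * (alpha * d + beta * d**2) with Poisson counts."""
    x1 = [n * d for n, d in zip(cells_scored, dose)]
    x2 = [n * d * d for n, d in zip(cells_scored, dose)]
    # Initial fitted values taken from the data (kept away from zero).
    mu = [max(float(yi), 0.5) for yi in y]
    alpha = beta = 0.0
    for _ in range(max_iter):
        s11 = s12 = s22 = t1 = t2 = 0.0
        for a, b, yi, mi in zip(x1, x2, y, mu):
            w = 1.0 / mi  # Poisson weight: inverse of the current variance
            s11 += w * a * a
            s12 += w * a * b
            s22 += w * b * b
            t1 += w * a * yi
            t2 += w * b * yi
        det = s11 * s22 - s12 * s12
        new_alpha = (s22 * t1 - s12 * t2) / det
        new_beta = (s11 * t2 - s12 * t1) / det
        change = abs(new_alpha - alpha) + abs(new_beta - beta)
        alpha, beta = new_alpha, new_beta
        # Update fitted values, and therefore the weights, for the next pass.
        mu = [alpha * a + beta * b for a, b in zip(x1, x2)]
        if change < tol:
            break
    return alpha, beta

alpha_hat, beta_hat = fit_linear_quadratic(dose, cells_scored, aberrations)
print("alpha (per cell per Gy):", alpha_hat)
print("beta (per cell per Gy^2):", beta_hat)
```

The sketch does not guard against negative fitted values, which can occur with an identity-scale model when the data sit close to the boundary of the parameter space.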

An efficient algorithm based on Fisher's exact test was also developed (Frome, 1982). The test is used to verify the assumption of Poisson variation in situations where the data may be sparse.
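The idea behind an exact test of Poisson variation can be sketched by brute force. This is not the efficient algorithm of Frome (1982), only a naive enumeration under stated assumptions: the counts are taken to be independent with a common mean, so that given their total they follow an equiprobable multinomial distribution, and the test is one-sided against overdispersion. The function names are hypothetical, and the enumeration is feasible only for small totals and few cells.

```python
from math import factorial

def compositions(total, parts):
    """Yield every way to split `total` into `parts` non-negative integers."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def exact_variance_test(counts):
    """Exact conditional p-value for overdispersion relative to Poisson.

    With the total fixed, the variance-to-mean statistic is an increasing
    function of the sum of squared counts, so the sum of squares is used.
    """
    n, k = sum(counts), len(counts)
    observed = sum(c * c for c in counts)
    p_value = 0.0
    for comp in compositions(n, k):
        if sum(c * c for c in comp) >= observed:
            # Multinomial coefficient n! / (c1! * ... * ck!), kept as an integer.
            coef = factorial(n)
            for c in comp:
                coef //= factorial(c)
            p_value += coef / k ** n
    return p_value

print(exact_variance_test([3, 0, 0]))
print(exact_variance_test([1, 1, 1]))
```

A small p-value indicates more variation among the counts than a common Poisson mean would explain. Unequal exposures would require unequal multinomial probabilities, which this sketch does not handle.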

More than 30 publications that make effective use of Poisson regression, together with citations to the references above, are presented here. They demonstrate the value of these methods to researchers in the many areas that routinely deal with discrete data and regression models.

Most of the computations described in the papers on this page can be done using R. R is a language and environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows, and MacOS. R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form and as binaries. Visit the R Homepage and read What is R? (see the menu on the left) and the FAQs under Documentation.