Overdispersion in the Poisson Regression Model
Abstract
This simulation study compares strategies for correcting the underestimation of standard errors in the Poisson regression model when overdispersion is present. It analyses how sample size, the mean of the Poisson distribution, and the dispersion parameter affect the choice of the best index or estimate. Results show that standard error (SE) estimates obtained by resampling (nonparametric bootstrap and jackknife) are the least biased, followed by the direct index based on the χ² statistic and, in third place, the so-called robust indexes. Nevertheless, the inefficiency of the resampling estimates is also evident, especially in small samples.
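To make the comparison concrete, the following minimal sketch computes four of the SE estimates named above on one simulated overdispersed sample. It is an illustration under stated assumptions, not the study's actual design: the data come from a gamma-Poisson mixture with arbitrarily chosen coefficients, the "direct index based on χ²" is read here as scaling the model-based SE by the square root of the Pearson χ² divided by its residual degrees of freedom, the robust estimate is the basic sandwich form without small-sample corrections, and the jackknife is omitted for brevity. The helper names `fit_poisson` and `se_estimates` are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Log-link Poisson regression fitted by iteratively reweighted least squares.
    Assumes the first column of X is an intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)            # start at the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                  # working response
        new_beta = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (mu * z))
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

def se_estimates(X, y, n_boot=200, rng=None):
    """Return model-based, chi-square-scaled, sandwich, and bootstrap SEs."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    beta = fit_poisson(X, y)
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))   # inverse Fisher information
    naive = np.sqrt(np.diag(bread))                  # assumes variance == mean
    phi = np.sum((y - mu) ** 2 / mu) / (n - p)       # Pearson chi-square / df
    scaled = naive * np.sqrt(phi)
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    robust = np.sqrt(np.diag(bread @ meat @ bread))  # basic sandwich estimate
    boot = np.empty((n_boot, p))
    for b in range(n_boot):                          # nonparametric case resampling
        idx = rng.integers(0, n, size=n)
        boot[b] = fit_poisson(X[idx], y[idx])
    return {"phi": phi, "naive": naive, "scaled": scaled,
            "robust": robust, "bootstrap": boot.std(axis=0, ddof=1)}

# Simulated overdispersed counts (assumed settings, chosen only for illustration).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
mu_true = np.exp(0.5 + 0.3 * x)
alpha = 1.0                                          # variance = mu + alpha * mu**2
y = rng.poisson(mu_true * rng.gamma(shape=1 / alpha, scale=alpha, size=n))

res = se_estimates(X, y, rng=rng)
for name in ("naive", "scaled", "robust", "bootstrap"):
    print(f"{name:>9}: {np.round(res[name], 4)}")
print(f"estimated dispersion: {res['phi']:.2f}")
```

When the estimated dispersion exceeds 1, the scaled SE is larger than the model-based SE by construction; how closely the sandwich and bootstrap estimates agree depends on the sample and on the number of bootstrap replicates.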