Economics Letters

Volume 107, Issue 2, May 2010, Pages 310-312
On the existence of the maximum likelihood estimates in Poisson regression

https://doi.org/10.1016/j.econlet.2010.02.020

Abstract

We note that the existence of the maximum likelihood estimates in Poisson regression depends on the data configuration, and propose a strategy to identify the existence of the problem and to single out the regressors causing it.

Introduction

The Poisson regression model is defined by

$$\Pr(y_i = j \mid x_i) = \frac{\exp(-\lambda)\,\lambda^{j}}{j!}, \qquad j = 0, 1, 2, \ldots,$$

where λ is generally specified as λ = exp(xiβ) = exp(β0 + β1x1i + …). With this formulation, β, the vector of parameters of interest, can be estimated by maximizing the log-likelihood function given by

$$\ln L(\beta) = \sum_{i=1}^{n} \left[ -\exp(x_i\beta) + (x_i\beta)\,y_i - \ln(y_i!) \right].$$
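As a minimal sketch of the log-likelihood above, the following Python function evaluates ln L(β) observation by observation. The function name `poisson_loglik` and the toy data are illustrative assumptions, not part of the original letter; `math.lgamma(y + 1)` is used to compute ln(y!).

```python
import math

def poisson_loglik(beta, X, y):
    """Poisson log-likelihood: sum_i [-exp(x_i beta) + (x_i beta) y_i - ln(y_i!)]."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, x_i))  # linear index x_i * beta
        total += -math.exp(xb) + xb * y_i - math.lgamma(y_i + 1)
    return total

# Hypothetical toy data: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 1, 3]

# At beta = 0 every lambda_i equals 1, so ln L = -n - sum_i ln(y_i!).
ll_at_zero = poisson_loglik([0.0, 0.0], X, y)
print(ll_at_zero)
```

With β = 0 the value reduces to −3 − ln(3!), which gives a quick sanity check on the implementation.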

Poisson regression is not only the most widely used model for count data (see Winkelmann, 2008, Cameron and Trivedi, 1998), but it is also becoming increasingly popular to estimate multiplicative models for other kinds of data (see, among others, Manning and Mullahy, 2001, Santos Silva and Tenreyro, 2006).

The reasons that make this estimator popular can be clearly understood by inspecting the corresponding score vector and Hessian matrix, given respectively by

$$s(\beta) = \sum_{i=1}^{n} \left[ y_i - \exp(x_i\beta) \right] x_i',$$

and

$$H(\beta) = -\sum_{i=1}^{n} \exp(x_i\beta)\, x_i' x_i.$$
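The score and Hessian can be sketched in a few lines of Python. The block below is an illustration under stated assumptions: the function names, the toy data, and the evaluation point `beta` are hypothetical. It checks the Hessian against central differences of the score and verifies negative definiteness for the two-parameter case.

```python
import math

def index(beta, x_i):
    """Linear index x_i * beta for one observation (x_i is a row of regressors)."""
    return sum(b * x for b, x in zip(beta, x_i))

def score(beta, X, y):
    """Score vector: sum_i [y_i - exp(x_i beta)] x_i'."""
    s = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        resid = y_i - math.exp(index(beta, x_i))
        for j, x_ij in enumerate(x_i):
            s[j] += resid * x_ij
    return s

def hessian(beta, X):
    """Hessian matrix: -sum_i exp(x_i beta) x_i' x_i."""
    k = len(beta)
    H = [[0.0] * k for _ in range(k)]
    for x_i in X:
        mu = math.exp(index(beta, x_i))
        for j in range(k):
            for l in range(k):
                H[j][l] -= mu * x_i[j] * x_i[l]
    return H

# Hypothetical toy data: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 2, 1, 5]
beta = [0.1, 0.3]

H = hessian(beta, X)

# Central differences of the score should reproduce the Hessian columns.
eps = 1e-6
H_num = [[0.0, 0.0], [0.0, 0.0]]
for l in range(2):
    up = [b + (eps if m == l else 0.0) for m, b in enumerate(beta)]
    dn = [b - (eps if m == l else 0.0) for m, b in enumerate(beta)]
    s_up, s_dn = score(up, X, y), score(dn, X, y)
    for j in range(2):
        H_num[j][l] = (s_up[j] - s_dn[j]) / (2 * eps)

max_gap = max(abs(H[j][l] - H_num[j][l]) for j in range(2) for l in range(2))

# For a 2x2 symmetric matrix, negative definiteness is H[0][0] < 0 and det > 0.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
is_negative_definite = H[0][0] < 0 and det_H > 0
print(max_gap, is_negative_definite)
```

Note that the Hessian does not depend on y, which is why `hessian` takes only `beta` and `X`.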

The form of the score vector makes clear that β will be consistently estimated as long as E(yi|xi) = exp(xiβ), i.e., the only condition required for consistency is the correct specification of the conditional mean. This is the well-known pseudo-maximum likelihood result of Gourieroux et al. (1984).

Besides this robustness property, the estimator also has the advantage of being very well behaved. Indeed, it is easy to see that the Hessian is negative definite for all x and β (provided the regressors are not perfectly collinear), which facilitates the estimation and ensures the uniqueness of the maximum, if it exists. Consequently, estimation of β is relatively simple and the estimation algorithm generally converges in a handful of iterations, even for relatively large problems.
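To illustrate the fast convergence mentioned above, here is a hedged sketch of plain Newton-Raphson for a model with a constant and a single regressor. The function name `newton_poisson`, the starting values, the tolerance, and the data are all assumptions made for this example; the 2x2 linear system is solved in closed form to keep the code dependency-free, and production software typically uses safeguarded variants of this iteration.

```python
import math

def newton_poisson(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for lambda_i = exp(b0 + b1 * x_i); returns (b0, b1, iterations)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the constant-only fit
    for it in range(1, max_iter + 1):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score: sum_i (y_i - mu_i) * (1, x_i)'
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Negative Hessian: sum_i mu_i * (1, x_i)'(1, x_i) = [[a, b], [b, c]]
        a = sum(mu)
        b = sum(mi * xi for mi, xi in zip(mu, x))
        c = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = a * c - b * b
        # Newton step: beta_new = beta - H^{-1} s = beta + (-H)^{-1} s
        d0 = (c * s0 - b * s1) / det
        d1 = (a * s1 - b * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1, it
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 2, 5, 8]

b0_hat, b1_hat, n_iter = newton_poisson(x, y)
print(b0_hat, b1_hat, n_iter)
```

At the returned values both components of the score are numerically zero, which is the first order condition for the maximum.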

In spite of this general result, for certain data configurations, some of the parameters in β are not identified by the (pseudo) maximum likelihood estimator described above. That is, for certain data configurations, the maximum likelihood estimates of β, say β̂, do not exist. Because this type of identification failure has not been widely recognized as a problem in count data models, standard software does not check for its presence and therefore the practitioner may be surprised to find that estimation of the Poisson regression is unusually difficult, even in some apparently simple problems. This letter provides details on when this problem arises, on how it can be detected, and on how it can be overcome.

The problem

To better see the nature of the problem, it is useful to start by considering the case where a regressor, say xi2, is zero when yi is positive, otherwise being non-negative with at least one positive observation. The leading example of a regressor with these characteristics is a dummy variable that is equal to zero for all observations with positive yi, having some positive values for yi = 0. From Eq. (2), the first order condition for a maximum of Eq. (1) corresponding to the parameter
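The configuration described in this section can be illustrated numerically. The sketch below uses hypothetical data in which a dummy `x2` equals zero whenever `y` is positive and equals one for some observations with `y = 0`; all function names, the data, and the fixed intercept `b0` are assumptions for illustration only. For every value of the coefficient on `x2` tried, the corresponding score component stays strictly negative, and the log-likelihood keeps rising as that coefficient decreases, which is consistent with no finite maximizer existing.

```python
import math

def has_problem_configuration(x2, y):
    """True if x2 is non-negative, zero whenever y > 0, and positive for some y == 0."""
    return (all(v >= 0 for v in x2)
            and all(v == 0 for v, yi in zip(x2, y) if yi > 0)
            and any(v > 0 for v, yi in zip(x2, y) if yi == 0))

def loglik(b0, b2, x2, y):
    """Poisson log-likelihood for lambda_i = exp(b0 + b2 * x2_i)."""
    return sum(-math.exp(b0 + b2 * v) + (b0 + b2 * v) * yi - math.lgamma(yi + 1)
               for v, yi in zip(x2, y))

def score_b2(b0, b2, x2, y):
    """Derivative of the log-likelihood with respect to b2."""
    return sum((yi - math.exp(b0 + b2 * v)) * v for v, yi in zip(x2, y))

# Hypothetical data: x2 is a dummy that is zero for every observation with y > 0.
y = [0, 0, 0, 1, 2, 4]
x2 = [1, 1, 0, 0, 0, 0]

flagged = has_problem_configuration(x2, y)

b0 = 0.5  # arbitrary intercept; the sign of the score does not depend on it
grid = [0.0, -2.0, -5.0, -10.0, -20.0]
scores = [score_b2(b0, b2, x2, y) for b2 in grid]
logliks = [loglik(b0, b2, x2, y) for b2 in grid]

for b2, s, ll in zip(grid, scores, logliks):
    print(f"b2 = {b2:6.1f}   score = {s:.3e}   loglik = {ll:.6f}")
```

In this example the score with respect to `b2` is minus the sum of exp(b0 + b2) over the observations with `x2 = 1`, so it approaches zero only as `b2` goes to minus infinity.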

Discussion

The results of the previous section make clear that the non-existence of the (pseudo) maximum likelihood estimates of the Poisson regression models is more likely when the data has a large number of zeros. For example, this problem is likely to arise when modelling the number of crimes committed, the number of instances of substance abuse, or the volume of trade

Acknowledgements

We are grateful to Ines Buono, Virginia Di Nino, Dave Donaldson, Doireann Fitzgerald, Lissandra Flach and Randi Hjalmarsson for noting the problem and for providing examples of data sets where it occurs. We also thank J.M. Andrade e Silva and an anonymous referee for the valuable comments. The usual disclaimer applies. Santos Silva also gratefully acknowledges partial financial support from Fundação para a Ciência e Tecnologia (FEDER/POCI 2010). Tenreyro gratefully acknowledges the support of

References

  • W.G. Manning et al. Estimating log models: to transform or not to transform? Journal of Health Economics (2001)
  • A. Albert et al. On the existence of maximum likelihood estimates in logistic models. Biometrika (1984)
  • A.C. Cameron et al. Regression Analysis of Count Data (1998)
  • C. Gourieroux et al. Pseudo maximum likelihood methods: applications to Poisson models. Econometrica (1984)
There are more references available in the full text version of this article.
