| {{Probability distribution|
| name =Conway–Maxwell–Poisson|
| |
| type = mass|
| |
| pdf_image =|
| |
| cdf_image =|
| |
| parameters =<math>\lambda > 0, \nu \geq 0</math>|
| |
| support =<math>x \in \{0,1,2,\dots\}</math>|
| |
| pdf =<math>\frac{\lambda^x}{(x!)^\nu}\frac{1}{Z(\lambda,\nu)}</math>|
| |
| cdf =<math>\sum_{i=0}^x \mathbb{P}(X = i)</math>|
| |
| mean =<math>\sum_{j=0}^\infty \frac{j\lambda^j}{(j!)^\nu Z(\lambda, \nu)}</math>|
| |
| median =No closed form|
| |
| mode =Not listed|
| |
| variance =<math>\sum_{j=0}^\infty \frac{j^2\lambda^j}{(j!)^\nu Z(\lambda, \nu)} - \mu^2</math>|
| |
| skewness =Not listed|
| |
| kurtosis =Not listed|
| |
| entropy =Not listed|
| |
| mgf =Not listed|
| |
| char =Not listed|
| |
| }}
| |
| | |
| In [[probability theory]] and [[statistics]], the '''Conway–Maxwell–Poisson (CMP or COM-Poisson) distribution''' is a [[discrete probability distribution]] named after [[Richard W. Conway]], [[William L. Maxwell]], and [[Siméon Denis Poisson]] that generalizes the [[Poisson distribution]] by adding a parameter to model [[overdispersion]] and [[underdispersion]]. It is a member of the [[exponential family]], has the Poisson distribution and [[geometric distribution]] as [[special case]]s and the [[Bernoulli distribution]] as a [[limiting case]].
| |
| | |
| The COM-Poisson distribution was originally proposed by Conway and Maxwell in 1962 <ref>{{Citation|last1=Conway| first1=R. W.| last2=Maxwell| first2= W. L.| year=1962| title=A queuing model with state dependent service rates|journal= Journal of Industrial Engineering| volume=12| pages=132–136}}</ref> as a solution to handling [[queueing system]]s with state-dependent service rates. The probabilistic and statistical properties of the distribution were published by Shmueli et al. (2005).<ref>Shmueli G., Minka T., Kadane J.B., Borle S., and Boatwright, P.B. "A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution." Journal of the Royal Statistical Society: Series C (Applied Statistics) 54.1 (2005): 127–142.[http://dx.doi.org/10.1111/j.1467-9876.2005.00474.x]</ref>
| |
| | |
| The COM-Poisson is defined to be the distribution with [[probability mass function]]
| |
| | |
| :<math>
| |
| \Pr(X = x) = f(x; \lambda, \nu) = \frac{\lambda^x}{(x!)^\nu}\frac{1}{Z(\lambda,\nu)}, </math>
| |
| for <math>x = 0, 1, 2, \dots</math>, <math>\lambda > 0</math> and <math>\nu \geq 0</math>,
| |
| where
| |
| | |
| :<math>
| |
| Z(\lambda,\nu) = \sum_{j=0}^\infty \frac{\lambda^j}{(j!)^\nu}.
| |
| </math>
| |
| | |
| The function <math>Z(\lambda,\nu)</math> serves as a [[normalization constant]] so the probability mass function sums to one. Note that <math>Z(\lambda,\nu)</math> does not have a closed form in general.
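Because <math>Z(\lambda,\nu)</math> has no general closed form, it is usually evaluated numerically. The following is a minimal sketch, assuming the series can be truncated after a fixed number of terms; the names <code>cmp_log_z</code>, <code>cmp_pmf</code> and <code>max_terms</code> are illustrative and do not come from any particular package. The sum is accumulated in log space to avoid overflow.

```python
import math


def cmp_log_z(lam, nu, max_terms=1000):
    """Approximate log Z(lam, nu) by truncating the series at max_terms.

    The truncation point is an assumption of this sketch. It is too small
    when nu is close to 0 with lam close to 1, or when lam**(1/nu) is large.
    """
    log_lam = math.log(lam)
    # log of each term lam^j / (j!)^nu, using lgamma(j + 1) = log(j!)
    log_terms = [j * log_lam - nu * math.lgamma(j + 1) for j in range(max_terms)]
    largest = max(log_terms)
    # log-sum-exp: factor out the largest term before exponentiating
    return largest + math.log(sum(math.exp(t - largest) for t in log_terms))


def cmp_pmf(x, lam, nu, max_terms=1000):
    """Probability mass Pr(X = x) for the COM-Poisson distribution."""
    log_p = x * math.log(lam) - nu * math.lgamma(x + 1) - cmp_log_z(lam, nu, max_terms)
    return math.exp(log_p)


if __name__ == "__main__":
    total = sum(cmp_pmf(x, 3.0, 0.7) for x in range(100))
    print("sum of pmf over x = 0..99:", total)
```

A fixed truncation keeps the sketch short; an adaptive stopping rule based on the size of the last term would be a more robust choice.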
| |
| | |
| The additional parameter <math>\nu</math>, which does not appear in the [[Poisson distribution]], allows the rate of decay to be adjusted. This rate of decay is a non-linear decrease in ratios of successive probabilities, specifically
| |
| | |
| :<math>
| |
| \frac{\Pr(X = x-1)}{\Pr(X = x)} = \frac{x^\nu}{\lambda}.
| |
| </math>
| |
| | |
| When <math>\nu = 1</math>, the COM-Poisson distribution becomes the standard [[Poisson distribution]], and as <math>\nu \to \infty</math> the distribution approaches a [[Bernoulli distribution]] with parameter <math>\lambda/(1+\lambda)</math>. When <math>\nu=0</math>, the COM-Poisson distribution reduces to a [[geometric distribution]] with probability of success <math>1-\lambda</math>, provided <math>\lambda<1</math>.
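These special cases can be read off from the normalizing constant, which does have a closed form at these particular parameter values:

:<math>
Z(\lambda, 1) = \sum_{j=0}^\infty \frac{\lambda^j}{j!} = e^{\lambda}, \qquad
Z(\lambda, 0) = \sum_{j=0}^\infty \lambda^j = \frac{1}{1-\lambda} \quad (\lambda < 1), \qquad
\lim_{\nu \to \infty} Z(\lambda, \nu) = 1 + \lambda.
</math>

Substituting into the probability mass function gives <math>e^{-\lambda}\lambda^x/x!</math> for <math>\nu = 1</math>, <math>(1-\lambda)\lambda^x</math> for <math>\nu = 0</math>, and, in the limit <math>\nu \to \infty</math>, mass <math>1/(1+\lambda)</math> at 0 and <math>\lambda/(1+\lambda)</math> at 1.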
| |
| | |
| For the COM-Poisson distribution, moments can be found through the recursive formula
| |
| | |
| :<math>
| |
| \operatorname{E}[X^{r+1}] = \begin{cases}
| |
| \lambda \, \operatorname{E}[(X+1)^{1-\nu}] & \text{ if } r = 0 \\
| |
| \lambda \, \frac{d}{d\lambda}\operatorname{E}[X^r] + \operatorname{E}[X]\operatorname{E}[X^r] & \text{ if } r > 0. \\
| |
| \end{cases}
| |
| </math>
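The recursion can be checked numerically. The sketch below assumes a truncated support and approximates the derivative with respect to <math>\lambda</math> by a central finite difference; the helper name <code>cmp_expect</code> and the parameter values are illustrative only. It compares both sides of the formula for <math>r = 0</math> and <math>r = 1</math>.

```python
import math


def cmp_expect(f, lam, nu, max_terms=500):
    """E[f(X)] under COM-Poisson(lam, nu), with the series truncated at max_terms."""
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    largest = max(log_w)
    w = [math.exp(t - largest) for t in log_w]  # unnormalized weights
    return sum(f(j) * wj for j, wj in enumerate(w)) / sum(w)


lam, nu = 2.5, 1.3  # arbitrary illustrative values

# r = 0: E[X] = lam * E[(X + 1)^(1 - nu)]
mean = cmp_expect(lambda j: j, lam, nu)
rhs_r0 = lam * cmp_expect(lambda j: (j + 1) ** (1 - nu), lam, nu)

# r = 1: E[X^2] = lam * d/dlam E[X] + E[X] * E[X]
h = 1e-5  # finite-difference step (an assumption of this sketch)
d_mean = (cmp_expect(lambda j: j, lam + h, nu)
          - cmp_expect(lambda j: j, lam - h, nu)) / (2 * h)
second = cmp_expect(lambda j: j * j, lam, nu)
rhs_r1 = lam * d_mean + mean * mean

print("r = 0:", mean, rhs_r0)
print("r = 1:", second, rhs_r1)
```

The case <math>r = 1</math> also shows that the variance equals <math>\lambda \, \tfrac{d}{d\lambda}\operatorname{E}[X]</math>.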
| |
| | |
| == Parameter estimation ==
| |
| | |
| Several methods exist for estimating the parameters of the CMP distribution from data. Two are discussed here: weighted least squares and maximum likelihood. The weighted least squares approach is simple and efficient but lacks precision. Maximum likelihood, on the other hand, is precise but more complex and computationally intensive.
| |
| | |
| === Weighted least squares ===
| |
| | |
| The [[weighted least squares]] method provides a simple, efficient way to derive rough estimates of the parameters of the CMP distribution and to determine whether the distribution would be an appropriate model. If the model is deemed appropriate, an alternative method should then be employed to compute more accurate estimates of the parameters.
| |
| | |
| This method uses the relationship of successive probabilities as discussed above. By taking logarithms of both sides of this equation, the following linear relationship arises
| |
| | |
| :<math>
| |
| \log \frac{p_{x-1}}{p_x} = - \log \lambda + \nu \log x
| |
| </math>
| |
| | |
| where <math>p_x</math> denotes <math>\mathbb{P}(X = x)</math>. When estimating the parameters, the probabilities can be replaced by the [[relative frequencies]] of <math>x</math> and <math>x-1</math>. To determine if the CMP distribution is an appropriate model, these values should be plotted against <math>\log x</math> for all ratios without zero counts. If the data appear to be linear, then the model is likely to be a good fit.
| |
| | |
| Once the appropriateness of the model is determined, the parameters can be estimated by fitting a regression of <math>\log (\hat p_{x-1} / \hat p_x)</math> on <math>\log x</math>. However, the basic assumption of [[homoscedasticity]] is violated, so a [[weighted least squares]] regression must be used. The inverse weight matrix will have the variances of each ratio on the diagonal with the one-step covariances on the first off-diagonal, both given below.
| |
| | |
| :<math>
| |
| \mathbb{V}\left[\log \frac{\hat p_{x-1}}{\hat p_x}\right] \approx \frac{1}{np_x} + \frac{1}{np_{x-1}}
| |
| </math>
| |
| :<math>
| |
| \text{cov}\left(\log \frac{\hat p_{x-1}}{\hat p_x}, \log \frac{\hat p_x}{\hat p_{x+1}} \right) \approx - \frac{1}{np_x}
| |
| </math>
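The regression described above can be sketched as follows. This is a simplified illustration, not the full procedure: it weights each ratio by the inverse of its approximate variance and ignores the off-diagonal covariances, so the weight matrix is treated as diagonal. The function name <code>cmp_wls_fit</code> and the input format (a list whose entry <code>x</code> is the number of observations equal to <code>x</code>) are assumptions made for the example.

```python
import math


def cmp_wls_fit(counts):
    """Rough COM-Poisson estimates (lambda_hat, nu_hat) from a frequency table.

    Fits log(p_{x-1} / p_x) = -log(lambda) + nu * log(x) by weighted least
    squares, using diagonal weights only (covariances between neighbouring
    ratios are ignored in this sketch).
    """
    xs, ys, ws = [], [], []
    for x in range(1, len(counts)):
        if counts[x - 1] > 0 and counts[x] > 0:  # skip ratios with zero counts
            xs.append(math.log(x))
            ys.append(math.log(counts[x - 1] / counts[x]))
            # Var ~ 1/(n p_x) + 1/(n p_{x-1}), with n p_x estimated by the count
            variance = 1.0 / counts[x] + 1.0 / counts[x - 1]
            ws.append(1.0 / variance)
    if len(set(xs)) < 2:
        raise ValueError("need at least two usable ratios to fit a line")

    total_w = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_w
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))

    slope = s_xy / s_xx                # estimate of nu
    intercept = y_bar - slope * x_bar  # estimate of -log(lambda)
    return math.exp(-intercept), slope


if __name__ == "__main__":
    observed = [20, 40, 25, 10, 5]  # made-up counts for x = 0..4
    print(cmp_wls_fit(observed))
```

Including the one-step covariances would require solving a generalized least squares problem with a tridiagonal covariance matrix instead of the closed-form weighted line fit used here.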
| |
| | |
| === Maximum likelihood ===
| |
| | |
| The COM-Poisson [[likelihood function]] is
| |
| | |
| :<math>
| |
| \mathcal{L}(\lambda,\nu|x_1,\dots,x_n) = \lambda^{S_1} \exp(-\nu S_2) Z^{-n}(\lambda, \nu)
| |
| </math>
| |
| | |
| where <math>S_1 = \sum_{i=1}^n x_i</math> and <math>S_2 = \sum_{i=1}^n \log x_i!</math>. Maximizing the likelihood yields the following two equations
| |
| | |
| :<math>
| |
| \mathbb{E}[X] = \bar X
| |
| </math>
| |
| :<math>
| |
| \mathbb{E}[\log X!] = \overline{\log X!}
| |
| </math>
| |
| | |
| which do not have an analytic solution.
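The two equations come from setting the partial derivatives of the log-likelihood to zero. With

:<math>
\ell(\lambda, \nu) = S_1 \log \lambda - \nu S_2 - n \log Z(\lambda, \nu), \qquad
\frac{\partial \log Z}{\partial \lambda} = \frac{\mathbb{E}[X]}{\lambda}, \qquad
\frac{\partial \log Z}{\partial \nu} = -\mathbb{E}[\log X!],
</math>

the stationarity conditions are

:<math>
\frac{\partial \ell}{\partial \lambda} = \frac{S_1 - n\,\mathbb{E}[X]}{\lambda} = 0, \qquad
\frac{\partial \ell}{\partial \nu} = -S_2 + n\,\mathbb{E}[\log X!] = 0.
</math>

Dividing by <math>n</math> gives the sample-mean form above. Neither expectation is available in closed form, which is why the equations cannot be solved analytically.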
| |
| | |
| Instead, the [[maximum likelihood]] estimates are approximated numerically by the [[Newton's method|Newton–Raphson method]]. In each iteration, the expectations, variances, and covariance of <math>X</math> and <math>\log X!</math> are approximated by using the estimates for <math>\lambda</math> and <math>\nu</math> from the previous iteration in the expression
| |
| | |
| :<math>
| |
| \mathbb{E}[f(X)] = \sum_{j=0}^\infty f(j) \frac{\lambda^j}{(j!)^\nu Z(\lambda, \nu)}.
| |
| </math>
| |
| | |
| This is continued until convergence of <math>\hat\lambda</math> and <math>\hat\nu</math>.
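The iteration can be sketched as below. Several choices here are assumptions of the example rather than part of the method as described: the update is carried out in terms of <math>(\log\lambda, \nu)</math> so that <math>\lambda</math> stays positive, the infinite sums are truncated at a fixed number of terms, the starting point is the Poisson fit, and a step-halving safeguard is added to the plain Newton–Raphson step. The names <code>cmp_moments</code> and <code>cmp_mle</code> are illustrative.

```python
import math


def cmp_moments(log_lam, nu, max_terms=1000):
    """Return log Z, the means and the (co)variances of T = (X, -log X!)."""
    t1 = [float(j) for j in range(max_terms)]
    t2 = [-math.lgamma(j + 1) for j in range(max_terms)]
    log_w = [a * log_lam + nu * b for a, b in zip(t1, t2)]
    largest = max(log_w)
    w = [math.exp(t - largest) for t in log_w]
    z = sum(w)
    p = [wj / z for wj in w]
    e1 = sum(a * pj for a, pj in zip(t1, p))
    e2 = sum(b * pj for b, pj in zip(t2, p))
    v11 = sum((a - e1) ** 2 * pj for a, pj in zip(t1, p))
    v22 = sum((b - e2) ** 2 * pj for b, pj in zip(t2, p))
    v12 = sum((a - e1) * (b - e2) * pj for a, b, pj in zip(t1, t2, p))
    return largest + math.log(z), (e1, e2), (v11, v12, v22)


def cmp_mle(data, tol=1e-10, max_iter=100):
    """Damped Newton-Raphson estimates (lambda_hat, nu_hat); needs mean(data) > 0."""
    n = len(data)
    t1_bar = sum(data) / n
    t2_bar = -sum(math.lgamma(x + 1) for x in data) / n
    log_lam, nu = math.log(t1_bar), 1.0  # start from the Poisson fit

    def loglik(a, b):  # average log-likelihood per observation
        return a * t1_bar + b * t2_bar - cmp_moments(a, b)[0]

    for _ in range(max_iter):
        _, (e1, e2), (v11, v12, v22) = cmp_moments(log_lam, nu)
        g1, g2 = t1_bar - e1, t2_bar - e2  # gradient
        det = v11 * v22 - v12 * v12        # Hessian is minus the covariance
        d1 = (v22 * g1 - v12 * g2) / det   # Newton direction
        d2 = (v11 * g2 - v12 * g1) / det
        step, current = 1.0, loglik(log_lam, nu)
        # halve the step while it leaves nu >= 0 or lowers the likelihood
        while step > 1e-8 and (
            nu + step * d2 < 0
            or loglik(log_lam + step * d1, nu + step * d2) < current - 1e-12
        ):
            step /= 2
        log_lam += step * d1
        nu = max(nu + step * d2, 0.0)
        if max(abs(step * d1), abs(step * d2)) < tol:
            break
    return math.exp(log_lam), nu


if __name__ == "__main__":
    sample = [0] * 20 + [1] * 40 + [2] * 25 + [3] * 10 + [4] * 5  # made-up data
    print(cmp_mle(sample))
```

At convergence the fitted distribution reproduces the sample means of <math>X</math> and <math>\log X!</math>, which is exactly the pair of likelihood equations above.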
| |
| | |
| == Generalized linear model ==
| |
| | |
| The basic COM-Poisson distribution discussed above has also been used as the basis for a [[generalized linear model]] (GLM) using a Bayesian formulation. A dual-link GLM based on the CMP distribution has been developed,<ref name=GC>Guikema, S.D. and J.P. Coffelt (2008) "A Flexible Count Data Regression Model for Risk Analysis", ''Risk Analysis'', 28 (1), 213–223. {{doi|10.1111/j.1539-6924.2008.01014.x}}</ref>
| |
| and this model has been used to evaluate traffic accident data.<ref name=Lord1>Lord, D., S.D. Guikema, and S.R. Geedipally (2008) "Application of the Conway–Maxwell–Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes," ''Accident Analysis & Prevention'', 40 (3), 1123–1134. {{doi|10.1016/j.aap.2007.12.003}}</ref><ref name=Lord2>Lord, D., S.R. Geedipally, and S.D. Guikema (2010) "Extension of the Application of Conway-Maxwell-Poisson Models: Analyzing Traffic Crash Data Exhibiting Under-Dispersion," ''Risk Analysis'', 30 (8), 1268-1276. {{doi|10.1111/j.1539-6924.2010.01417.x}}</ref> The CMP GLM developed by Guikema and Coffelt (2008) is based on a reformulation of the CMP distribution above, replacing <math>\lambda</math> with <math>\mu=\lambda^{1/\nu}</math>. The integral part of <math>\mu</math> is then the mode of the distribution. A full Bayesian estimation approach has been used with [[Markov chain Monte Carlo|MCMC]] sampling implemented in [[WinBugs]] with [[non-informative prior]]s for the regression parameters.<ref name=GC/><ref name=Lord1/> This approach is computationally expensive, but it yields the full posterior distributions for the regression parameters and allows expert knowledge to be incorporated through the use of informative priors.
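The statement about the mode can be illustrated numerically. The sketch below assumes the original <math>(\lambda, \nu)</math> parameterization, finds the most probable value by direct search, and compares it with the integer part of <math>\mu=\lambda^{1/\nu}</math>. The function name <code>cmp_mode</code> and the parameter values are illustrative. When <math>\mu</math> is itself an integer, <math>\mu</math> and <math>\mu - 1</math> are equally probable, so non-integer values are used here.

```python
import math


def cmp_mode(lam, nu, max_x=1000):
    """Most probable value of COM-Poisson(lam, nu), found by direct search."""
    # The normalizing constant does not affect the location of the maximum.
    log_p = [x * math.log(lam) - nu * math.lgamma(x + 1) for x in range(max_x)]
    return max(range(max_x), key=lambda x: log_p[x])


if __name__ == "__main__":
    for lam, nu in [(6.0, 1.4), (2.5, 0.5)]:
        mu = lam ** (1.0 / nu)
        print(lam, nu, "mu =", mu, "mode =", cmp_mode(lam, nu))
```

The agreement follows from the ratio of successive probabilities: <math>\Pr(X = x)/\Pr(X = x-1) = \lambda/x^\nu = (\mu/x)^\nu</math>, which is at least 1 exactly when <math>x \le \mu</math>.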
| |
| | |
| A classical GLM formulation for a COM-Poisson regression has been developed which generalizes [[Poisson regression]] and [[logistic regression]].<ref name=SS>Sellers, K. S. and Shmueli, G. (2010), [http://projecteuclid.org/euclid.aoas/1280842147 "A Flexible Regression Model for Count Data"], ''Annals of Applied Statistics'', 4 (2), 943–961</ref> This takes advantage of the [[exponential family]] properties of the COM-Poisson distribution to obtain model estimation (via [[maximum likelihood]]), inference, diagnostics, and interpretation. This approach requires substantially less computational time than the Bayesian approach, at the cost of not allowing expert knowledge to be incorporated into the model.<ref name=SS/> In addition, it yields standard errors for the regression parameters (via the Fisher information matrix) rather than the full posterior distributions obtainable via the Bayesian formulation. It also provides a [[statistical hypothesis test|statistical test]] for the level of dispersion compared to a Poisson model. Code for fitting a COM-Poisson regression, testing for dispersion, and evaluating fit is available.<ref>[http://www9.georgetown.edu/faculty/kfs7/research Code for COM_Poisson modelling], Georgetown Univ.</ref>
| |
| | |
| The two GLM frameworks developed for the COM-Poisson distribution significantly extend the usefulness of this distribution for data analysis problems.
| |
| | |
| == References ==
| |
| <references/>
| |
| | |
| == External links ==
| |
| * [http://cran.r-project.org/web/packages/compoisson/index.html Conway–Maxwell–Poisson distribution package for R (compoisson) by Jeffrey Dunn, part of Comprehensive R Archive Network (CRAN)]
| |
| * [http://alumni.media.mit.edu/~tpminka/software/compoisson Conway–Maxwell–Poisson distribution package for R (compoisson) by Tom Minka, third party package]
| |
| {{ProbDistributions|discrete-infinite}}
| |
| | |
| {{DEFAULTSORT:Conway-Maxwell-Poisson distribution}}
| |
| [[Category:Discrete distributions]]
| |
| [[Category:Poisson processes]]
| |
| [[Category:Probability distributions]]
| |