Poisson distribution - Wikipedia, the free encyclopedia

Poisson distribution
From Wikipedia, the free encyclopedia
http://en.wikipedia.org/wiki/Poisson_distribution[11/3/2012 2:35:46 PM]

In probability theory and statistics, the Poisson distribution (pronounced [pwasɔ̃]) is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event.[1] (The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.)

For instance, suppose someone typically gets on the average 4 pieces of mail per day. There will be, however, a certain spread: sometimes a little more, sometimes a little less, once in a while nothing at all.[2] Given only the average rate, for a certain period of observation (pieces of mail per day, phonecalls per hour, etc.), and assuming that the process, or mix of processes, that produce the event flow are essentially random, the Poisson distribution specifies how likely it is that the count will be 3, or 5, or 11, or any other number, during one period of observation. That is, it predicts the degree of spread around a known average rate of occurrence.[2]

The distribution's practical usefulness has been described by the Poisson law of large numbers.[3]

[Figure: Probability mass function. The horizontal axis is the index k, the number of occurrences. The function is only defined at integer values of k. The connecting lines are only guides for the eye.]

[Figure: Cumulative distribution function. The horizontal axis is the index k, the number of occurrences. The CDF is discontinuous at the integers of k and flat everywhere else because a variable that is Poisson distributed only takes on integer values.]

Poisson
Notation: $\operatorname{Pois}(\lambda)$
Parameters: λ > 0 (real)
Support: k ∈ { 0, 1, 2, 3, ... }
PMF: $\frac{\lambda^k e^{-\lambda}}{k!}$
CDF: $\frac{\Gamma(\lfloor k+1 \rfloor, \lambda)}{\lfloor k \rfloor !}$ --or-- $e^{-\lambda} \sum_{i=0}^{\lfloor k \rfloor} \frac{\lambda^i}{i!}$ (for $k \ge 0$, where $\Gamma(x, y)$ is the Incomplete gamma function and $\lfloor k \rfloor$ is the floor function)
Mean: $\lambda$
Median: $\approx \lfloor \lambda + 1/3 - 0.02/\lambda \rfloor$
Mode: $\lceil \lambda \rceil - 1, \lfloor \lambda \rfloor$
Variance: $\lambda$
Skewness: $\lambda^{-1/2}$
Ex. kurtosis: $\lambda^{-1}$
Entropy: $\lambda [1 - \log(\lambda)] + e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k \log(k!)}{k!}$; (for large $\lambda$) $\frac{1}{2} \log(2 \pi e \lambda) - \frac{1}{12 \lambda} - \frac{1}{24 \lambda^2} - \frac{19}{360 \lambda^3} + O\left(\frac{1}{\lambda^4}\right)$
MGF: $\exp(\lambda (e^{t} - 1))$
CF: $\exp(\lambda (e^{it} - 1))$
PGF: $\exp(\lambda (z - 1))$

Contents
1 History
2 Definition
3 Properties
3.1 Mean
3.2 Median
3.3 Higher moments
3.4 Other properties
4 Related distributions
5 Occurrence
5.1 Derivation of Poisson distribution: The law of rare events
5.2 Multi-dimensional Poisson process
5.3 Other applications in science
6 Generating Poisson-distributed random variables
7 Parameter estimation
7.1 Maximum likelihood
7.2 Confidence interval
7.3 Bayesian inference
8 Bivariate Poisson distribution
9 See also
10 Notes
11 References

History
The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published, together with his probability theory, in 1837 in his work Recherches sur la probabilité des jugements en matière criminelle et en matière civile (“Research on the Probability of Judgments in Criminal and Civil Matters”).[4] The work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called “arrivals”) that take place during a time-interval of given length. The result had been given previously by Abraham de Moivre (1711) in De Mensura Sortis seu; de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus in Philosophical Transactions of the Royal Society, p. 219.[5]
A practical application of this distribution was made by Ladislaus Bortkiewicz in 1898 when he was
given the task of investigating the number of soldiers in the Prussian army killed accidentally by
horse kick; this experiment introduced the Poisson distribution to the field of reliability engineering.[6]
Definition
A discrete stochastic variable X is said to have a Poisson distribution with parameter λ > 0, if for k = 0, 1, 2, ... the probability mass function of X is given by:

$$f(k; \lambda) = \Pr(X = k) = \frac{\lambda^k e^{-\lambda}}{k!},$$

where
e is the base of the natural logarithm (e = 2.71828...)
k! is the factorial of k.

The positive real number λ is equal to the expected value of X, but also to the variance:

$$\lambda = \operatorname{E}(X) = \operatorname{Var}(X).$$

The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. The Poisson distribution is sometimes called a Poissonian.
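
As an illustration, the probability mass function λ^k e^(−λ) / k! can be evaluated directly. The sketch below is a minimal Python version that works in log space to avoid overflow for large k; the function names `poisson_pmf` and `poisson_cdf` are chosen here for illustration and do not come from the article.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return Pr(X = k) for X ~ Pois(lam), with lam > 0."""
    if k < 0:
        return 0.0
    # log(lam^k * e^-lam / k!) = k*log(lam) - lam - log(k!)
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_cdf(k: int, lam: float) -> float:
    """Return Pr(X <= k) by summing the mass function."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# Mail example from the introduction: an average of 4 pieces per day.
for count in (0, 3, 5, 11):
    print(count, poisson_pmf(count, 4.0))
```

The loop prints how likely it is to receive 0, 3, 5 or 11 pieces of mail on a day when the average is 4.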
Properties
Mean
The expected value of a Poisson-distributed random variable is equal to λ and so is its variance.
The coefficient of variation is $\lambda^{-1/2}$, while the index of dispersion is 1.[5]
The mean deviation about the mean is[5]

$$\operatorname{E}|X - \lambda| = 2 \exp(-\lambda) \frac{\lambda^{\lfloor \lambda \rfloor + 1}}{\lfloor \lambda \rfloor !}.$$

The mode of a Poisson-distributed random variable with non-integer λ is equal to $\lfloor \lambda \rfloor$, which is the largest integer less than or equal to λ. This is also written as floor(λ). When λ is a positive integer, the modes are λ and λ − 1.
All of the cumulants of the Poisson distribution are equal to the expected value λ. The nth factorial moment of the Poisson distribution is $\lambda^n$.
Median
Bounds for the median (ν) of the distribution are known and are sharp:[7]

$$\lambda - \ln 2 \le \nu < \lambda + \frac{1}{3}.$$
Higher moments
The higher moments $m_k$ of the Poisson distribution about the origin are Touchard polynomials in λ:

$$m_k = \sum_{i=0}^{k} \lambda^i \left\{ {k \atop i} \right\},$$

where the {braces} denote Stirling numbers of the second kind.[8] The coefficients of the polynomials have a combinatorial meaning. In fact, when the expected value of the Poisson distribution is 1, then Dobinski's formula says that the nth moment equals the number of partitions of a set of size n.
Sums of Poisson-distributed random variables:
If $X_i \sim \operatorname{Pois}(\lambda_i)$, $i = 1, \dots, n$, are independent, and $\lambda = \sum_{i=1}^{n} \lambda_i$, then $Y = \sum_{i=1}^{n} X_i \sim \operatorname{Pois}(\lambda)$.[9] A converse is Raikov's theorem, which says that if the sum of two independent random variables is Poisson-distributed, then so is each of those two independent random variables.[10]
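
The Touchard-polynomial form of the moments, m_n = Σ_i S(n, i) λ^i with S(n, i) the Stirling numbers of the second kind, can be checked numerically. The sketch below is one possible Python version; the helper names are hypothetical, and the direct summation is truncated at 200 terms, which is an assumption that only suits moderate λ.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def poisson_moment(n: int, lam: float) -> float:
    """n-th moment about the origin as a Touchard polynomial in lam."""
    return sum(stirling2(n, i) * lam ** i for i in range(n + 1))


def poisson_moment_direct(n: int, lam: float, terms: int = 200) -> float:
    """Same moment by summing k^n * Pr(X = k) over a truncated support."""
    total = 0.0
    for k in range(terms):
        log_p = k * math.log(lam) - lam - math.lgamma(k + 1)
        total += (k ** n) * math.exp(log_p)
    return total


# With lam = 1 the moments are the Bell numbers (set-partition counts).
print([poisson_moment(n, 1.0) for n in range(1, 6)])
```

With λ = 1 the printed values are the numbers of partitions of sets of size 1 to 5, as Dobinski's formula states.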
Other properties
The Poisson distributions are infinitely divisible probability distributions.[11][12]
The directed Kullback-Leibler divergence between Pois(λ) and Pois(λ0) is given by

$$D_{\mathrm{KL}}(\lambda \,\|\, \lambda_0) = \lambda_0 - \lambda + \lambda \log \frac{\lambda}{\lambda_0}.$$

Bounds for the tail probabilities of a Poisson random variable $X \sim \operatorname{Pois}(\lambda)$ can be derived using a Chernoff bound argument.[13]
Related distributions
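If $X_1 \sim \operatorname{Pois}(\lambda_1)$ and $X_2 \sim \operatorname{Pois}(\lambda_2)$ are independent, then the difference $Y = X_1 - X_2$ follows a Skellam distribution.
If $X_1 \sim \operatorname{Pois}(\lambda_1)$ and $X_2 \sim \operatorname{Pois}(\lambda_2)$ are independent, then the distribution of $X_1$ conditional on $X_1 + X_2$ is a binomial distribution. Specifically, given $X_1 + X_2 = k$, $X_1 \sim \operatorname{Binom}\left(k, \frac{\lambda_1}{\lambda_1 + \lambda_2}\right)$. More generally, if X1, X2,..., Xn are independent Poisson random variables with parameters λ1, λ2,..., λn then, given $\sum_{j=1}^{n} X_j = k$, $X_i \sim \operatorname{Binom}\left(k, \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}\right)$. In fact, $\{X_i\} \sim \operatorname{Multinom}\left(k, \left\{ \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j} \right\}\right)$ given $\sum_{j=1}^{n} X_j = k$.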
The Poisson distribution can be derived as a limiting case of the binomial distribution as the number of trials goes to infinity and the expected number of successes remains fixed (see law of rare events below). Therefore it can be used as an approximation of the binomial distribution if n is sufficiently large and p is sufficiently small. There is a rule of thumb stating that the Poisson distribution is a good approximation of the binomial distribution if n is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if n ≥ 100 and np ≤ 10.[14]
The Poisson distribution is a special case of the generalized stuttering Poisson distribution (or stuttering Poisson distribution) with only one parameter.[15] The stuttering Poisson distribution can be deduced from the limiting distribution of the multinomial distribution.
For sufficiently large values of λ (say λ > 1000), the normal distribution with mean λ and variance λ (standard deviation $\sqrt{\lambda}$) is an excellent approximation to the Poisson distribution. If λ is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., P(X ≤ x), where (lower-case) x is a non-negative integer, is replaced by P(X ≤ x + 0.5).
Variance-stabilizing transformation: When a variable is Poisson distributed, its square root is approximately normally distributed with expected value of about $\sqrt{\lambda}$ and variance of about 1/4.[16][17] Under this transformation, the convergence to normality (as λ increases) is far faster than for the untransformed variable.[citation needed] Other, slightly more complicated, variance stabilizing transformations are available,[17] one of which is the Anscombe transform. See Data transformation (statistics) for more general uses of transformations.
If for every t > 0 the number of arrivals in the time interval [0,t] follows the Poisson distribution with mean λt, then the inter-arrival times are independent and identically distributed exponential random variables having mean 1/λ.[18]
The cumulative distribution functions of the Poisson and chi-squared distributions are related in the following ways:[19]

$$F_{\text{Poisson}}(k; \lambda) = 1 - F_{\chi^2}(2\lambda;\, 2(k+1)) \quad \text{for integer } k,$$

and[20]

$$\Pr(X = k) = F_{\chi^2}(2\lambda;\, 2k) - F_{\chi^2}(2\lambda;\, 2(k+1)).$$
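
The effect of the continuity correction in the normal approximation can be seen numerically. The following is a minimal sketch using only the Python standard library; the function names and the choice λ = 20, x = 20 are assumptions made for illustration.

```python
import math


def poisson_cdf(x: int, lam: float) -> float:
    """Exact Pr(X <= x) for X ~ Pois(lam), summing the mass function."""
    term = math.exp(-lam)  # Pr(X = 0)
    total = term
    for i in range(1, x + 1):
        term *= lam / i  # Pr(X = i) from Pr(X = i - 1)
        total += term
    return total


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """CDF of the normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def normal_approx_cdf(x: int, lam: float, continuity: bool = True) -> float:
    """Normal approximation to Pr(X <= x) with mean lam and variance lam."""
    shift = 0.5 if continuity else 0.0
    return normal_cdf(x + shift, lam, math.sqrt(lam))


lam, x = 20.0, 20
print("exact      ", poisson_cdf(x, lam))
print("corrected  ", normal_approx_cdf(x, lam))
print("uncorrected", normal_approx_cdf(x, lam, continuity=False))
```

For this choice of λ and x the corrected value lies closer to the exact probability than the uncorrected one.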
Occurrence
Applications of the Poisson distribution can be found in many fields related to counting:[21]
Electrical system example: telephone calls arriving in a system.
Astronomy example: photons arriving at a telescope.
Biology example: the number of mutations on a strand of DNA per unit length.
Management example: customers arriving at a counter or call centre.
Civil Engineering example: cars arriving at a traffic light.
Finance and Insurance example: the number of losses/claims occurring in a given period of time.
Earthquake Seismology example: An asymptotic Poisson model of seismic risk for large
earthquakes. (Lomnitz, 1994).
The Poisson distribution arises in connection with Poisson processes. It applies to various
phenomena of discrete properties (that is, those that may happen 0, 1, 2, 3, ... times during a given
period of time or in a given area) whenever the probability of the phenomenon happening is constant
in time or space. Examples of events that may be modelled as a Poisson distribution include:
The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry. This
example was made famous by a book of Ladislaus Josephovich Bortkiewicz (1868–1931).
The number of yeast cells used when brewing Guinness beer. This example was made famous
by William Sealy Gosset (1876–1937). [22]
The number of phone calls arriving at a call centre per minute.
The number of goals in sports involving two competing teams. [23]
The number of deaths per year in a given age group.
The number of jumps in a stock price in a given time interval.
Under an assumption of homogeneity, the number of times a web server is accessed per minute.
The number of mutations in a given stretch of DNA after a certain amount of radiation.
The proportion of cells that will be infected at a given multiplicity of infection.
Derivation of Poisson distribution: The law of rare events
See also: Poisson limit theorem
In several of the above examples, such as the number of mutations in a given sequence of DNA, the events being counted are actually the outcomes of discrete trials, and would more precisely be modelled using the binomial distribution, that is

$$X \sim \operatorname{B}(n, p).$$

In such cases n is very large and p is very small (and so the expectation np is of intermediate magnitude). Then the distribution may be approximated by the less cumbersome Poisson distribution[citation needed]

$$X \sim \operatorname{Pois}(np).$$

This is sometimes known as the law of rare events,[citation needed] since each of the n individual Bernoulli events rarely occurs. The name may be misleading because the total count of success events in a Poisson process need not be rare if the parameter np is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population who is very unlikely to make a call to that switchboard in that hour.[citation needed]

[Figure: Comparison of the Poisson distribution (black lines) and the binomial distribution with n=10 (red circles), n=20 (blue circles), n=1000 (green circles). All distributions have a mean of 5. The horizontal axis shows the number of events k. Notice that as n gets larger, the Poisson distribution becomes an increasingly better approximation for the binomial distribution with the same mean.]
The word law is sometimes used as a synonym of probability distribution, and convergence in law
means convergence in distribution. Accordingly, the Poisson distribution is sometimes called the law
of small numbers because it is the probability distribution of the number of occurrences of an event
that happens rarely but has very many opportunities to happen. The Law of Small Numbers is a book
by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898. Some have suggested
that the Poisson distribution should have been called the Bortkiewicz distribution. [24]
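
The law of rare events can be illustrated by comparing binomial and Poisson probabilities with the same mean. The sketch below uses mean 5 and n = 10, 20 and 1000; the function names are hypothetical, and both mass functions are computed in log space, which assumes 0 < p < 1.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ B(n, p), computed in log space (needs 0 < p < 1)."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_comb + k * math.log(p) + (n - k) * math.log(1.0 - p))


def poisson_pmf(k: int, lam: float) -> float:
    """Pr(X = k) for X ~ Pois(lam)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def max_abs_diff(n: int, mean: float) -> float:
    """Largest pointwise gap between B(n, mean/n) and Pois(mean) on 0..n."""
    p = mean / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean)) for k in range(n + 1)
    )


for n in (10, 20, 1000):
    print(n, max_abs_diff(n, 5.0))
```

The printed gap shrinks as n grows with the mean held fixed.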
Multi-dimensional Poisson process
Main article: Poisson process
The Poisson distribution arises as the distribution of counts of occurrences of events in (multidimensional) intervals in multidimensional Poisson processes in a directly equivalent way to the result for unidimensional processes. Thus, if D is any region of the multidimensional space for which |D|, the area or volume of the region, is finite, and if N(D) is the count of the number of events in D, then

$$P(N(D) = k) = \frac{(\lambda |D|)^k e^{-\lambda |D|}}{k!}.$$
Other applications in science
In a Poisson process, the number of observed occurrences fluctuates about its mean λ with a standard deviation $\sigma_k = \sqrt{\lambda}$. These fluctuations are denoted as Poisson noise or (particularly in electronics) as shot noise.[citation needed]
The correlation of the mean and standard deviation in counting independent discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, even if that contribution is too small to be detected directly. For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If N electrons pass a point in a given time t on the average, the mean current is $I = eN/t$; since the current fluctuations should be of the order $\sigma_I = e\sqrt{N}/t$ (i.e., the standard deviation of the Poisson process), the charge e can be estimated from the ratio $t\sigma_I^2/I$.[citation needed]
An everyday example is the graininess that appears as photographs are enlarged; the graininess is
due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains
themselves. By correlating the graininess with the degree of enlargement, one can estimate the
contribution of an individual grain (which is otherwise too small to be seen unaided). [citation needed]
Many other molecular applications of Poisson noise have been developed, e.g., estimating the
number density of receptor molecules in a cell membrane.
Generating Poisson-distributed random variables
A simple algorithm to generate random Poisson-distributed numbers (pseudo-random number sampling) has been given by Knuth (see References below):

    algorithm poisson random number (Knuth):
        init:
            Let L ← e^(−λ), k ← 0 and p ← 1.
        do:
            k ← k + 1.
            Generate uniform random number u in [0,1] and let p ← p × u.
        while p > L.
        return k − 1.

While simple, the complexity is linear in λ. There are many other algorithms to overcome this. Some are given in Ahrens & Dieter, see References below. Also, for large values of λ, there may be numerical stability issues because of the term e^(−λ). One solution for large values of λ is rejection sampling, another is to use a Gaussian approximation to the Poisson.
Inverse transform sampling is simple and efficient for small values of λ, and requires only one uniform
random number u per sample. Cumulative probabilities are examined in turn until one exceeds u.
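
Both approaches described in this section can be sketched in a few lines of Python. The code below is an illustrative transcription, not a reference implementation: the function names are made up here, and the `pmf > 0.0` guard in the inverse-transform loop is an added safeguard against rounding when u is extremely close to 1. Both samplers are intended for small to moderate λ only.

```python
import math
import random


def poisson_knuth(lam: float, rng=random) -> int:
    """Knuth's multiplication method; expected running time is linear in lam."""
    limit = math.exp(-lam)  # L in the pseudocode above
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def poisson_inverse_transform(lam: float, rng=random) -> int:
    """Inverse transform sampling with a single uniform draw per sample."""
    u = rng.random()
    k = 0
    pmf = math.exp(-lam)  # Pr(X = 0)
    cdf = pmf
    # Walk up the cumulative probabilities until one exceeds u.
    while u >= cdf and pmf > 0.0:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


rng = random.Random(12345)  # fixed seed so the run is repeatable
draws = [poisson_knuth(4.0, rng) for _ in range(10000)]
print(sum(draws) / len(draws))  # sample mean, which should be near 4
```

Passing a `random.Random` instance as `rng` keeps runs repeatable; by default the module-level generator is used.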
Parameter estimation
Maximum likelihood
Given a sample of n measured values $k_i$ we wish to estimate the value of the parameter λ of the Poisson population from which the sample was drawn. The maximum likelihood estimate is[25]

$$\hat{\lambda}_{\mathrm{MLE}} = \frac{1}{n} \sum_{i=1}^{n} k_i.$$

Since each observation has expectation λ so does this sample mean. Therefore the maximum likelihood estimate is an unbiased estimator of λ. It is also an efficient estimator, i.e. its estimation variance achieves the Cramér–Rao lower bound (CRLB).[citation needed] Hence it is the minimum-variance unbiased estimator (MVUE). Also it can be proved that the sample mean is a complete and sufficient statistic for λ.[citation needed]
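
For readers who want the intermediate steps, the estimate can be obtained by maximizing the log-likelihood of n independent observations k_1, ..., k_n. The derivation below is a standard sketch; the symbols L, ℓ and I_n are notation introduced here, not taken from the article.

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \frac{\lambda^{k_i} e^{-\lambda}}{k_i!}, \\
\ell(\lambda) = \log L(\lambda)
  &= -n\lambda + \Big(\sum_{i=1}^{n} k_i\Big) \log\lambda
     - \sum_{i=1}^{n} \log(k_i!), \\
\frac{d\ell}{d\lambda} &= -n + \frac{1}{\lambda}\sum_{i=1}^{n} k_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda}_{\mathrm{MLE}} = \frac{1}{n}\sum_{i=1}^{n} k_i, \\
\frac{d^2\ell}{d\lambda^2} &= -\frac{1}{\lambda^2}\sum_{i=1}^{n} k_i < 0
  \quad\text{(when } \textstyle\sum_i k_i > 0\text{)}, \\
\operatorname{Var}\big(\hat{\lambda}_{\mathrm{MLE}}\big)
  &= \frac{\lambda}{n} = \frac{1}{I_n(\lambda)},
  \qquad I_n(\lambda) = \frac{n}{\lambda}.
\end{align*}
```

The last line shows the variance of the sample mean equal to the reciprocal of the Fisher information, which is the Cramér–Rao bound referred to above.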
Confidence interval
The confidence interval for a Poisson mean is calculated using the relationship between the Poisson and chi-square distributions, and can be written as:

$$\tfrac{1}{2} \chi^2(\alpha/2;\, 2k) \le \mu \le \tfrac{1}{2} \chi^2(1 - \alpha/2;\, 2k + 2),$$

where k is the number of event occurrences in a given interval and $\chi^2(p; n)$ is the chi-square deviate with lower tail area p and degrees of freedom n.[19][26] This interval is 'exact' in the sense that its coverage probability is never less than the nominal 1 – α.
When quantiles of the chi-square distribution are not available, an accurate approximation to this exact interval was proposed by DP Byar (based on the Wilson–Hilferty transformation):[27]

$$k \left(1 - \frac{1}{9k} - \frac{z_{\alpha/2}}{3\sqrt{k}}\right)^3 \le \mu \le (k+1) \left(1 - \frac{1}{9(k+1)} + \frac{z_{\alpha/2}}{3\sqrt{k+1}}\right)^3,$$

where $z_{\alpha/2}$ denotes the standard normal deviate with upper tail area α/2.
For application of these formulae in the same context as above (given a sample of n measured values $k_i$), one would set

$$k = \sum_{i=1}^{n} k_i,$$

calculate an interval for μ = nλ, and then derive the interval for λ.
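
Byar's approximation needs only a standard normal quantile, so it can be sketched with the Python standard library. The function names below are hypothetical. Two choices are assumptions made here rather than part of the cited method: the lower limit is set to 0 when k = 0, and it is clamped at 0 if the bracket turns negative.

```python
import math
from statistics import NormalDist


def byar_interval(k: int, alpha: float = 0.05) -> tuple[float, float]:
    """Approximate (1 - alpha) confidence limits for a Poisson mean mu."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper-tail area alpha/2
    if k == 0:
        lower = 0.0  # assumption: no events gives a lower limit of 0
    else:
        base = 1.0 - 1.0 / (9.0 * k) - z / (3.0 * math.sqrt(k))
        lower = k * max(0.0, base) ** 3  # assumption: clamp at 0
    k1 = k + 1
    upper = k1 * (1.0 - 1.0 / (9.0 * k1) + z / (3.0 * math.sqrt(k1))) ** 3
    return lower, upper


def rate_interval(counts: list[int], alpha: float = 0.05) -> tuple[float, float]:
    """Interval for lambda from n observations: interval for n*lambda over n."""
    n = len(counts)
    lower, upper = byar_interval(sum(counts), alpha)
    return lower / n, upper / n


print(byar_interval(10))
print(rate_interval([2, 3, 5]))
```

The second call follows the procedure described above: sum the counts, compute limits for μ = nλ, then divide by n.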
Bayesian inference
In Bayesian inference, the conjugate prior for the rate parameter λ of the Poisson distribution is the gamma distribution.[citation needed] Let

$$\lambda \sim \operatorname{Gamma}(\alpha, \beta)$$

denote that λ is distributed according to the gamma density g parameterized in terms of a shape parameter α and an inverse scale parameter β:

$$g(\lambda \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta \lambda} \quad \text{for } \lambda > 0.$$

Then, given the same sample of n measured values $k_i$ as before, and a prior of Gamma(α, β), the posterior distribution is

$$\lambda \sim \operatorname{Gamma}\left(\alpha + \sum_{i=1}^{n} k_i,\; \beta + n\right).$$

The posterior mean E[λ] approaches the maximum likelihood estimate $\hat{\lambda}_{\mathrm{MLE}}$ in the limit as $\alpha \to 0,\ \beta \to 0$.[citation needed]
The posterior predictive distribution for a single additional observation is a negative binomial distribution,[citation needed] sometimes called a Gamma-Poisson distribution.
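
The conjugate update amounts to adding the observed total to the shape parameter and the sample size to the inverse scale. A minimal Python sketch follows; the function names and the example numbers are illustrative assumptions.

```python
def gamma_posterior(alpha: float, beta: float, counts: list[int]) -> tuple[float, float]:
    """Posterior (shape, inverse scale) after observing Poisson counts."""
    return alpha + sum(counts), beta + len(counts)


def posterior_mean(alpha: float, beta: float, counts: list[int]) -> float:
    """Mean of the Gamma posterior, shape divided by inverse scale."""
    shape, rate = gamma_posterior(alpha, beta, counts)
    return shape / rate


data = [3, 4, 5]  # made-up counts with sample mean 4
print(gamma_posterior(2.0, 1.0, data))
print(posterior_mean(2.0, 1.0, data))
print(posterior_mean(1e-9, 1e-9, data))  # nearly flat prior
```

With the nearly flat prior the posterior mean is very close to the sample mean, in line with the limit stated above.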
Bivariate Poisson distribution
This distribution has been extended to the bivariate case.[28] The generating function for this distribution is

$$g(u, v) = \exp[(\theta_1 - \theta_{12})(u - 1) + (\theta_2 - \theta_{12})(v - 1) + \theta_{12}(uv - 1)]$$

with

$$\theta_1, \theta_2 > \theta_{12} > 0.$$

The marginal distributions are Poisson(θ1) and Poisson(θ2) and the correlation coefficient is limited to the range

$$0 \le \rho \le \min\left\{ \sqrt{\frac{\theta_1}{\theta_2}}, \sqrt{\frac{\theta_2}{\theta_1}} \right\}.$$

The Skellam distribution is a particular case of this distribution.[citation needed]
See also
Compound Poisson distribution
Poisson Process
Conway–Maxwell–Poisson distribution
Poisson sampling
Erlang distribution
Queueing theory
Index of dispersion
Renewal theory
Negative binomial distribution
Robbins lemma
Poisson regression
Tweedie distributions
Notes
1. ^ Frank A. Haight (1967). Handbook of the Poisson Distribution. New York: John Wiley & Sons.
2. ^ a b "Statistics | The Poisson Distribution". Umass.edu. 2007-08-24. Retrieved 2012-04-05.
3. ^ Gullberg, Jan (1997). Mathematics from the birth of numbers. New York: W. W. Norton. pp. 963–965.
ISBN 0-393-04002-X.
4. ^ S.D. Poisson, Probabilité des jugements en matière criminelle et en matière civile, précédées des
règles générales du calcul des probabilités (Paris, France: Bachelier, 1837), page 206.
5. ^ a b c Johnson, N.L., Kotz, S., Kemp, A.W. (1993) Univariate Discrete distributions (2nd edition). Wiley.
ISBN 0-471-54897-9, p157
6. ^ Ladislaus von Bortkiewicz, Das Gesetz der kleinen Zahlen [The law of small numbers] (Leipzig,
Germany: B.G. Teubner, 1898). On page 1, Bortkiewicz presents the Poisson distribution. On pages
23-25, Bortkiewicz presents his famous analysis of "4. Beispiel: Die durch Schlag eines Pferdes im
preussischen Heere Getöteten." (4. Example: Those killed in the Prussian army by a horse's kick.).
7. ^ Choi KP (1994) On the medians of Gamma distributions and an equation of Ramanujan. Proc Amer
Math Soc 121 (1) 245–251
8. ^ Riordan, John (1937). "Moment recurrence relations for binomial, Poisson and hypergeometric
frequency distributions". Annals of Mathematical Statistics 8: 103–111. Also see Haight (1967), p. 6.
9. ^ E. L. Lehmann (1986). Testing Statistical Hypotheses (second ed.). New York: Springer Verlag.
ISBN 0-387-94919-4. page 65.
10. ^ Raikov, D. (1937). On the decomposition of Poisson laws. Comptes Rendus (Doklady) de l' Academie
des Sciences de l'URSS, 14, 9–11. (The proof is also given in von Mises, Richard (1964). Mathematical
Theory of Probability and Statistics. New York: Academic Press.)
11. ^ Laha, R. G. and Rohatgi, V. K. Probability Theory. New York: John Wiley & Sons. p. 233. ISBN 0-471-03262-X.
12. ^ Johnson, N.L., Kotz, S., Kemp, A.W. (1993) Univariate Discrete distributions (2nd edition). Wiley. ISBN
0-471-54897-9, p159
13. ^ Massimo Franceschetti and Olivier Dousse and David N. C. Tse and Patrick Thiran (2007). "Closing
the Gap in the Capacity of Wireless Networks Via Percolation Theory". IEEE Transactions on
Information Theory 53 (3): 1009–1018.
14. ^ NIST/SEMATECH, '6.3.3.1. Counts Control Charts', e-Handbook of Statistical Methods, accessed
25 October 2006
15. ^ Huiming, Zhang; Lili Chu, Yu Diao (2012). "Some Properties of the Generalized Stuttering Poisson
Distribution and its Applications". Studies in Mathematical Sciences 5 (1): 11–26.
doi:10.3968/j.sms.1923845220120501.Z0697.
16. ^ McCullagh, Peter; Nelder, John (1989). Generalized Linear Models. London: Chapman and Hall.
ISBN 0-412-31760-5. page 196 gives the approximation and higher order terms.
17. ^ a b Johnson, N.L., Kotz, S., Kemp, A.W. (1993) Univariate Discrete distributions (2nd edition). Wiley.
ISBN 0-471-54897-9, p163
18. ^ S. M. Ross (2007). Introduction to Probability Models (ninth ed.). Boston: Academic Press. ISBN 978-0-12-598062-3. pp. 307–308.
19. ^ a b Johnson, N.L., Kotz, S., Kemp, A.W. (1993) Univariate Discrete distributions (2nd edition). Wiley.
ISBN 0-471-54897-9, p171
20. ^ Johnson, N.L., Kotz, S., Kemp, A.W. (1993) Univariate Discrete distributions (2nd edition). Wiley. ISBN
0-471-54897-9, p153
21. ^ "The Poisson Process as a Model for a Diversity of Behavioural Phenomena"
22. ^ Philip J. Boland. "A Biographical Glimpse of William Sealy Gosset". The American Statistician, Vol.
38, No. 3. (Aug., 1984), pp. 179-183. Retrieved 2011-06-22. "At the turn of the 19th century, Arthur
Guinness, Son & Co. became interested in hiring scientists to analyze data concerned with various
aspects of its brewing process. Gosset was to be one of the first of these scientists, and so it was that in
1899 he moved to Dublin to take up a job as a brewer at St. James' Gate... Student published 22
papers, the first of which was entitled "On the Error of Counting With a Haemacytometer" (Biometrika,
1907). In it, Student illustrated the practical use of the Poisson distribution in counting the number of
yeast cells on a square of a haemacytometer. Up until just before World War II, Guinness would not
allow its employees to publish under their own names, and hence Gosset chose to write under the
pseudonym of "Student.""
23. ^ "Using Poisson Distribution for Soccer Betting"
24. ^ Good, I. J. (1986). "Some statistical applications of Poisson's work". Statistical Science 1 (2): 157–180.
doi:10.1214/ss/1177013690. JSTOR 2245435.
25. ^ Paszek, Ewa. "Maximum Likelihood Estimation - Examples".
26. ^ Garwood, F. (1936). "Fiducial Limits for the Poisson Distribution". Biometrika 28 (3/4): 437–442.
doi:10.1093/biomet/28.3-4.437.
27. ^ Breslow, NE; Day, NE (1987). Statistical Methods in Cancer Research: Volume 2—The Design and
Analysis of Cohort Studies. Paris: International Agency for Research on Cancer. ISBN 978-92-832-0182-3.
28. ^ Loukas S, Kemp CD (1986) The index of dispersion test for the bivariate Poisson distribution.
Biometrics 42(4) 941-948
References
Joachim H. Ahrens, Ulrich Dieter (1974). "Computer Methods for Sampling from Gamma, Beta,
Poisson and Binomial Distributions". Computing 12 (3): 223–246. doi:10.1007/BF02293108.
Joachim H. Ahrens, Ulrich Dieter (1982). "Computer Generation of Poisson Deviates". ACM
Transactions on Mathematical Software 8 (2): 163–179. doi:10.1145/355993.355997.
Ronald J. Evans, J. Boersma, N. M. Blachman, A. A. Jagers (1988). "The Entropy of a Poisson
Distribution: Problem 87-6". SIAM Review 30 (2): 314–317. doi:10.1137/1030059.
Donald E. Knuth (1969). Seminumerical Algorithms. The Art of Computer Programming, Volume
2. Addison Wesley.