
Continuous Probability Distribution

Introduction
The previous chapter discussed probability distributions for discrete random
variables. Now we extend that discussion to continuous random variables. In
experiments where continuous variables are of interest (such as length
measurements), it is reasonable to model the range of possible values of the
random variable by an interval (finite or infinite) of real numbers. Since the
range is any value in the interval, the number of possible values of the random
variable X is uncountably infinite, so X has a different kind of distribution from
the discrete random variables previously discussed. In this chapter, the
distributions, probability computations, means, and variances for continuous
random variables will be discussed.
Intended Learning Outcomes
At the end of this module, it is expected that the students will be able to:
1. Determine the probabilities from probability density functions
2. Determine the probabilities from cumulative distribution functions
3. Calculate means and variances for continuous random variables
4. Standardize normal random variables
5. Use the table for the cumulative distribution function of a standard normal
distribution to calculate probabilities
6. Approximate probabilities for some binomial and Poisson distributions
7. Use continuity corrections to improve the normal approximations to those
binomial and Poisson distributions.
4.1 Continuous Random Variables and their Probability Distribution
A continuous random variable has a probability of zero of assuming exactly
any of its values. Consequently, its probability distribution cannot be given in
tabular form. At first this may seem startling, but it becomes more plausible
when we consider a particular example. Let us discuss a random variable
whose values are the heights of all people over 21 years of age. Between any
two values, say 163.5 and 164.5 centimeters, or even 163.99 and 164.01
centimeters, there are an infinite number of heights, one of which is
164 centimeters. The probability of selecting a person at random who is
exactly 164 centimeters tall and not one of the infinitely large set of heights so
close to 164 centimeters that you cannot humanly measure the difference is
remote, and thus we assign a probability of zero to the event. This is not the
case, however, if we talk about the probability of selecting a person who is at
least 163 centimeters but not more than 165 centimeters tall. Now we are
dealing with an interval rather than a point value of our
random variable. We shall concern ourselves with computing probabilities for
various intervals of continuous random variables such as
P(a < X < b), P(W > c), and so forth. Note that when X is continuous,
P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).
That is, it does not matter whether we include an endpoint of the interval or not.
This is not true, though, when X is discrete. Although the probability
distribution of a continuous random variable cannot be presented in tabular
form, it can be stated as a formula. Such a formula would necessarily be a
function of the numerical values of the continuous random variable X and as
such will be represented by the functional notation
f(x). In dealing with continuous variables, f(x) is usually called the probability
density function, or simply the density function, of X. Since X is defined over a continuous sample
space, it is possible for f(x) to have a finite number of discontinuities. However,
most density functions that have practical applications in the analysis of
statistical data are continuous and their graphs may take any of several forms,
some of which are shown in Figure 4.1.
Because areas will be used to represent probabilities and probabilities are
positive numerical values, the density function must lie entirely above the x
axis. A probability density function is constructed so that the area under its
curve bounded by the x axis is equal to 1 when computed over the range of X
for which f(x) is defined. Should this range of X be a finite interval, it is always
possible to extend the interval to include the entire set of real numbers by
defining f(x) to be zero at all points in the extended portions of the
interval. In Figure 4.2, the probability that X assumes a value between a and b
is equal to the shaded area under the density function between the ordinates
at x = a and x = b, and from integral calculus is given by

P(a < X < b) = ∫_a^b f(x) dx.
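To make this computation concrete, the short Python sketch below numerically integrates a density over an interval. The density f(x) = x²/3 on (−1, 2) is an assumed illustration, not one taken from this module; the same pattern applies to any density f(x).

from scipy.integrate import quad

# Assumed illustrative density: f(x) = x**2 / 3 on (-1, 2), and 0 elsewhere.
def f(x):
    return x ** 2 / 3 if -1 < x < 2 else 0.0

# The total area under the curve over the range of X must equal 1.
total_area, _ = quad(f, -1, 2)
print(round(total_area, 4))   # 1.0

# P(0 < X < 1) is the area under f between x = 0 and x = 1.
prob, _ = quad(f, 0, 1)
print(round(prob, 4))         # 0.1111 (= 1/9)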
The Normal distribution is important in the analysis of data because the
distributions of several important sample statistics tend
towards a Normal distribution as the sample size increases.
Empirical studies have indicated that the Normal distribution provides an
adequate approximation to the distributions of many physical variables.
Specific examples include meteorological data, such as temperature and
rainfall, measurements on living organisms, scores on aptitude tests, physical
measurements of manufactured parts, weights of contents of food packages,
volumes of liquids in bottles/cans, instrumentation errors and other deviations
from established norms, and so on. The graphical appearance of the Normal
distribution is a symmetrical bell-shaped curve that extends without bound in
both positive and negative directions. The probability density function is given
by

f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)),   −∞ < x < ∞,

where μ is the mean and σ is the standard deviation of the distribution.
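As a minimal sketch of how probabilities under this density are evaluated in practice (the values μ = 10 and σ = 2 are assumed for illustration and do not come from this module), the Python code below standardizes X and uses the standard normal cumulative distribution function, which is exactly the quantity a Z-table provides.

from scipy.stats import norm

mu, sigma = 10.0, 2.0                      # assumed illustrative parameters

# P(X < 13) by standardizing: Z = (X - mu) / sigma
z = (13 - mu) / sigma                      # z = 1.5
print(round(norm.cdf(z), 4))               # 0.9332, the value a Z-table gives for z = 1.5

# P(8 < X < 13) as a difference of two cumulative probabilities
p = norm.cdf((13 - mu) / sigma) - norm.cdf((8 - mu) / sigma)
print(round(p, 4))                         # 0.7745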
The graphs below illustrate the effect of changing the values of μ and σ on the
shape of the probability density function. Low variability (σ = 0.71) with respect
to the mean gives a pointed bell-shaped curve with little spread. Variability of σ
= 1.41 produces a flatter bell-shaped curve with a greater spread.
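The effect can also be checked numerically. The sketch below evaluates the density at the mean for the two spreads quoted above, taking μ = 0 for illustration (an assumption, since no mean is specified for these curves): the smaller σ gives the taller, more pointed peak.

import math

def normal_pdf(x, mu, sigma):
    # f(x) = (1 / (sigma * sqrt(2*pi))) * exp(-(x - mu)**2 / (2 * sigma**2))
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

for sigma in (0.71, 1.41):
    print(sigma, round(normal_pdf(0.0, 0.0, sigma), 3))
# 0.71 -> 0.562  (pointed curve, little spread)
# 1.41 -> 0.283  (flatter curve, greater spread)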
4.5 Normal Approximation to the Binomial and Poisson Distributions
Binomial Approximation
The normal distribution can be used as an approximation to the binomial
distribution: if X is a binomial random variable with parameters n and p, then

Z = (X − np) / √(np(1 − p))

is approximately a standard normal random variable. The approximation is good
when np > 5 and n(1 − p) > 5.
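A brief sketch of the approximation is given below, using assumed values n = 50 and p = 0.1 (not from this module). It compares the exact binomial probability P(X ≤ 2) with the normal approximation, applying the continuity correction mentioned in the learning outcomes by treating the discrete value 2 as extending to 2.5.

import math
from scipy.stats import binom, norm

n, p = 50, 0.1                       # assumed illustrative parameters
mu = n * p                           # binomial mean, np
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation

exact = binom.cdf(2, n, p)           # exact P(X <= 2)

# Continuity correction: P(X <= 2) is approximated by P(normal <= 2.5)
approx = norm.cdf((2.5 - mu) / sigma)

print(round(exact, 4))               # 0.1117
print(round(approx, 4))              # 0.1193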
In the previous example, the probability that there are no log-ons in a 6-minute
interval is 0.082 regardless of the starting time of the interval. A Poisson
process assumes that events occur uniformly throughout the interval of
observation; that is, there is no clustering of events. If the log-ons are well
modeled by a Poisson process, the probability that the first log-on after noon
occurs after 12:06 P.M. is the same as the probability that the first log-on after
3:00 P.M. occurs after 3:06 P.M. And if someone logs on at 2:22 P.M., the
probability the next log-on occurs after 2:28 P.M. is still 0.082. Our starting
point for observing the system does not matter. However, if there are
high-use periods during the day, such as right after 8:00 A.M., followed by a
period of low use, a Poisson process is not an appropriate model for log-ons
and the distribution is not appropriate for computing probabilities. It might be
reasonable to model each of the high and low-use periods by a separate
Poisson process, employing a larger value of λ during the high-use periods and
a smaller value otherwise. Then, an exponential distribution with
the corresponding value of λ can be used to calculate log-on probabilities for the
high- and low-use periods.
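The 0.082 figure quoted above is the probability that an exponential waiting time exceeds 6 minutes. The sketch below reproduces it under the assumption of a mean rate of 25 log-ons per hour (a rate chosen to be consistent with the quoted value, since e^(−2.5) ≈ 0.082; it is not stated in this excerpt), and shows the memoryless property described in the paragraph.

import math

rate_per_min = 25 / 60        # assumed rate: 25 log-ons per hour

# P(no log-on in the next 6 minutes) = P(waiting time > 6) = exp(-lambda * 6)
p_no_logon = math.exp(-rate_per_min * 6)
print(round(p_no_logon, 3))   # 0.082

# Memoryless property: given no log-on up to time t, the chance of waiting at
# least 6 more minutes is the same 0.082, whatever the starting point t is.
t = 142                       # e.g., minutes after noon (2:22 P.M.)
p_cond = math.exp(-rate_per_min * (t + 6)) / math.exp(-rate_per_min * t)
print(round(p_cond, 3))       # 0.082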