Statistics Lecture Notes
Probability Density Functions
A continuous random variable is one that can assume an uncountable number of values.
We cannot list the possible values because there are infinitely many of them.
Because there are infinitely many values, the probability of each individual value is 0.
Thus, we can determine the probability of a range of values only.
E.g., with a discrete random variable such as the outcome of a die toss, it is meaningful to talk about P(X = 5). In a continuous setting (e.g., with time as a random variable), the probability that the random variable of interest, say task length, is exactly 5 minutes is zero: P(X = 5) = 0.
Statistics Lecture Notes – Chapter 07
Probability Density Functions
A function f(x) is called a probability density function (over the range $a \le x \le b$) if it meets the following requirements:
1. $f(x) \ge 0$ for all x between a and b, and
2. The total area under the curve between a and b is 1.
Cumulative Distribution Function:
The cumulative distribution function, F(x), for a continuous random variable X expresses the probability that X does not exceed the value x:
$F(x) = P(X \le x)$
[Figure: density curve f(x) with the area between a and b shaded]
$P(a \le X \le b) = P(a < X < b)$
(Note that the probability of any individual value is zero.)
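A minimal Python sketch of these ideas, under stated assumptions: the density f(x) = 2x on [0, 1] and the helper `area` are hypothetical choices made for illustration and do not come from the notes. The sketch approximates areas under the curve numerically to suggest that the total area is 1, that an interval probability is an area, and that a single point has zero area.

```python
def f(x: float) -> float:
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(pdf, lo: float, hi: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the area under pdf between lo and hi."""
    width = (hi - lo) / n
    return sum(pdf(lo + (i + 0.5) * width) for i in range(n)) * width


total = area(f, 0.0, 1.0)        # requirement 2: total area under the curve
p_interval = area(f, 0.25, 0.5)  # P(0.25 <= X <= 0.5), an area
p_point = area(f, 0.5, 0.5)      # zero-width interval, i.e. P(X = 0.5)

print(round(total, 6), round(p_interval, 6), p_point)
```

For this particular density the exact interval probability is F(0.5) - F(0.25) = 0.25 - 0.0625 = 0.1875, which the numerical area should match closely.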
Uniform Distribution
The uniform distribution is a probability distribution that assigns equal probability to all intervals of equal length within the range of the random variable.
[Figure: uniform density f(x), constant between x_min and x_max; the total area under the uniform probability density function is 1.0]
Uniform Probability Density Function:
$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
The mean of a uniform distribution is
$\mu = \frac{a + b}{2}$
The variance is
$\sigma^2 = \frac{(b - a)^2}{12}$
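The mean and variance formulas can be sketched in a few lines of Python. The function names and the endpoints a = 2, b = 8 are assumptions chosen only for illustration; they are not part of the notes.

```python
import math


def uniform_mean(a: float, b: float) -> float:
    """Mean of a uniform distribution on [a, b]: (a + b) / 2."""
    return (a + b) / 2


def uniform_variance(a: float, b: float) -> float:
    """Variance of a uniform distribution on [a, b]: (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


a, b = 2.0, 8.0                 # hypothetical endpoints
mean = uniform_mean(a, b)       # (2 + 8) / 2
var = uniform_variance(a, b)    # (8 - 2)^2 / 12
sd = math.sqrt(var)             # standard deviation is the square root of the variance

print(mean, var, round(sd, 4))
```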
To calculate the probability of any interval, simply find the area under the curve.
For example, to find the probability that X falls between $x_1$ and $x_2$ (with $a \le x_1 < x_2 \le b$), we use the following formula:
$P(x_1 < X < x_2) = P(x_1 \le X \le x_2) = \frac{x_2 - x_1}{b - a}$
Ex7.1 – The amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2000 gallons and a maximum of 5000 gallons.
a. Find the probability that daily sales will fall between 2500 and 3000 gallons.
b. What is the probability that the station will sell at least 4000 gallons?
c. What is the probability that the station will sell exactly 2500 gallons?
A7.1a – $P(2500 < X < 3000) = \frac{x_2 - x_1}{b - a} = \frac{3000 - 2500}{5000 - 2000} = 0.1667$
A7.1b – $P(X \ge 4000) = \frac{x_2 - x_1}{b - a} = \frac{5000 - 4000}{5000 - 2000} = 0.3333$
A7.1c – $P(X = 2500) = 0$
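The three answers to Ex7.1 can be checked with a short Python sketch. The helper name `uniform_interval_prob` is hypothetical, and clipping the interval to [a, b] is an added convenience rather than something stated in the notes.

```python
def uniform_interval_prob(x1: float, x2: float, a: float, b: float) -> float:
    """P(x1 <= X <= x2) for X uniform on [a, b]; the interval is clipped to [a, b]."""
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)


a, b = 2000.0, 5000.0                            # minimum and maximum daily sales
p_a = uniform_interval_prob(2500, 3000, a, b)    # part a: between 2500 and 3000
p_b = uniform_interval_prob(4000, b, a, b)       # part b: at least 4000
p_c = uniform_interval_prob(2500, 2500, a, b)    # part c: exactly 2500

print(round(p_a, 4), round(p_b, 4), p_c)  # 0.1667 0.3333 0.0
```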
Normal Distribution
The normal distribution is the most important of all probability distributions. The probability density function of a normal random variable is given by:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
where
e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
x = any value of the continuous variable, $-\infty < x < \infty$
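As a rough check of the density formula, the sketch below evaluates it directly and compares the result with `statistics.NormalDist.pdf` from the Python standard library. The function name `normal_pdf` and the values μ = 100, σ = 50 are assumptions used for illustration only.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density: exp(-0.5 * ((x - mu) / sigma)^2) / (sigma * sqrt(2 * pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 100.0, 50.0                      # hypothetical parameters
peak = normal_pdf(mu, mu, sigma)             # the density is highest at the mean
reference = NormalDist(mu, sigma).pdf(mu)    # standard-library value for comparison

# The curve is symmetric about the mean: equal heights 30 units either side.
symmetric = math.isclose(normal_pdf(mu - 30, mu, sigma), normal_pdf(mu + 30, mu, sigma))

print(round(peak, 6), round(reference, 6), symmetric)
```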
‘Bell Shaped’
Symmetrical
Mean, Median and Mode are Equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: $-\infty$ to $+\infty$
The normal distribution closely approximates the probability distributions of a wide range of random variables
Distributions of sample means approach a normal distribution given a “large” sample size
Computations of probabilities are direct and elegant
The normal probability distribution has led to good business decisions for a number of applications
For a normal random variable X with mean μ and variance σ², i.e., $X \sim N(\mu, \sigma^2)$, the cumulative distribution function is
$F(x_0) = P(X \le x_0)$
[Figure: normal density f(x) with the area to the left of $x_0$ shaded, representing $P(X \le x_0)$]
Finding Normal Probabilities
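$F(b) = P(X \le b)$
$F(a) = P(X \le a)$
$P(a < X < b) = F(b) - F(a)$
[Figure: three normal curves marking a, μ and b, shading the areas F(b), F(a), and their difference]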
Any normal distribution (with any mean and variance combination) can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:
$Z \sim N(0, 1)$
[Figure: standard normal density f(Z), centered at 0 with standard deviation 1]
We need to transform X units into Z units by subtracting the mean of X and dividing by its standard deviation:
$Z = \frac{X - \mu}{\sigma}$
If X is distributed normally with a mean of 100 and a standard deviation of 50, the Z value for X = 200 is
$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$
This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
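The standardization step can be sketched in Python as follows. The helper name `z_score` is hypothetical; `statistics.NormalDist` from the standard library is used only to suggest that a probability is the same whether it is computed in X units or in Z units.

```python
import math
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mu) / sigma


z = z_score(200, 100, 50)              # (200 - 100) / 50

p_x = NormalDist(100, 50).cdf(200)     # P(X <= 200) in original units
p_z = NormalDist().cdf(z)              # P(Z <= 2.0) in standardized units

print(z, math.isclose(p_x, p_z))
```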
Note that the distribution is the same; only the scale has changed. We can express the problem in original units (X) or in standardized units (Z).
[Figure: one normal curve with two horizontal scales: X (μ = 100, σ = 50), marking 100 and 200, and Z (μ = 0, σ = 1), marking 0 and 2.0]
Using The Normal Tables
What is P(Z > 1.6)?
From the table, P(0 < Z < 1.6) = .4452
P(Z > 1.6) = .5 - P(0 < Z < 1.6)
= .5 - .4452
= .0548
What is P(Z < -2.23)?
By symmetry, the area below -2.23 equals the area above 2.23:
P(Z < -2.23) = P(Z > 2.23)
= .5 - P(0 < Z < 2.23)
= .5 - .4871
= .0129
What is P(Z < 1.52)?
Since P(Z < 0) = .5:
P(Z < 1.52) = .5 + P(0 < Z < 1.52)
= .5 + .4357
= .9357
What is P(0.9 < Z < 1.9)?
P(0.9 < Z < 1.9) = P(0 < Z < 1.9) - P(0 < Z < 0.9)
= .4713 - .3159
= .1554
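The four table lookups above can be reproduced approximately in Python. This is a sketch that assumes the table entry P(0 < Z < z) can be computed as Φ(z) - 0.5 using `statistics.NormalDist` from the standard library; the helper name `table_value` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def table_value(z: float) -> float:
    """P(0 < Z < z) for z >= 0, the quantity tabulated in the normal table."""
    return Z.cdf(z) - 0.5


p_upper = 0.5 - table_value(1.6)                # P(Z > 1.6)
p_lower = 0.5 - table_value(2.23)               # P(Z < -2.23), using symmetry
p_below = 0.5 + table_value(1.52)               # P(Z < 1.52)
p_between = table_value(1.9) - table_value(0.9) # P(0.9 < Z < 1.9)

print(round(p_upper, 4), round(p_lower, 4), round(p_below, 4), round(p_between, 4))
```

The last value may differ from the table-based .1554 in the fourth decimal place, because the table entries are themselves rounded before they are subtracted.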
Ex7.2 The time required to build a computer is normally distributed with a mean of 50 minutes and a standard deviation of 10 minutes. What is the probability that a computer is assembled in a time between 45 and 60 minutes?
A7.2 Algebraically speaking, what is $P(45 < X < 60)$?
With a mean of 50 minutes and a standard deviation of 10 minutes, standardizing gives
$P(45 < X < 60) = P\left(\frac{45 - 50}{10} < \frac{X - \mu}{\sigma} < \frac{60 - 50}{10}\right) = P(-0.5 < Z < 1)$
We can break up $P(-0.5 < Z < 1)$ into:
$P(-0.5 < Z < 1) = P(-0.5 < Z < 0) + P(0 < Z < 1)$
The distribution is symmetric around zero, so we have:
$P(-0.5 < Z < 0) = P(0 < Z < 0.5)$
Hence:
$P(-0.5 < Z < 1) = P(0 < Z < 0.5) + P(0 < Z < 1)$
This table gives probabilities P(0 < Z < z).
First column = integer + first decimal
Top row = second decimal place
P(0 < Z < 0.5) = .1915
P(0 < Z < 1) = .3413
Hence P(45 < X < 60) = P(-0.5 < Z < 1) = .1915 + .3413 = .5328
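Ex7.2 can also be checked numerically. The sketch below assumes `statistics.NormalDist` from the Python standard library as a stand-in for the printed table; the variable names are illustrative only.

```python
from statistics import NormalDist

build_time = NormalDist(mu=50, sigma=10)   # assembly time in minutes

# Direct route: F(60) - F(45) in original units.
p_direct = build_time.cdf(60) - build_time.cdf(45)

# Table route: P(0 < Z < 0.5) + P(0 < Z < 1) in standardized units.
Z = NormalDist()
p_table_route = (Z.cdf(0.5) - 0.5) + (Z.cdf(1.0) - 0.5)

print(round(p_direct, 4), round(p_table_route, 4))
```

Both routes should give roughly 0.5328, so a little over half of all computers are assembled in between 45 and 60 minutes.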
Finding the X value for a Known Probability:
Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula: $X = \mu + Z\sigma$
Ex7.3 Suppose X is normal with mean 8.0 and standard deviation 5.0. Now find the X value so that only 20% of all values are below this X.
[Figure: normal curve with the lower-tail area .2000 shaded; the unknown X lies below the mean 8.0, and the corresponding unknown Z lies below 0]
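A7.3 –
1. Find the Z value for the known probability. An area of 20% in the lower tail corresponds to a Z value of -0.84:
Standardized Normal Probability Table (Portion)
z      F(z)
.82    .7939
.83    .7967
.84    .7995
.85    .8023
Since F(.84) = .7995 ≈ .80, symmetry gives F(-0.84) ≈ .20, so Z = -0.84.
[Figure: normal curve with lower-tail area .20 and remaining area .80; Z = -0.84 on the Z scale, and the unknown X below the mean 8.0 on the X scale]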
2. Convert to X units using the formula:
$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$
So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80.
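The two steps of Ex7.3 can be sketched with the inverse cumulative distribution function. This assumes `statistics.NormalDist.inv_cdf` from the Python standard library in place of a reverse table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

# Step 1: find the Z value with 20% of the area below it.
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma.
x = mu + z * sigma

# Same result in one call on the original distribution.
x_direct = NormalDist(mu, sigma).inv_cdf(0.20)

print(round(z, 4), round(x, 2), round(x_direct, 2))
```

The unrounded Z is closer to -0.8416 than to the table value -0.84, so the computed X comes out near 3.79 rather than the table-based 3.80.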
Exponential Distribution
Used to model the length of time between two occurrences of an event (the time between arrivals)
Examples:
Time between trucks arriving at an unloading dock
Time between transactions at an ATM
Time between phone calls to the main operator
The exponential random variable X has a probability density function
$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$
where e = 2.71828 (approximately) and λ > 0 is the parameter of the distribution.
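Defined by a single parameter, λ (lambda), with
$\mu = \sigma = \frac{1}{\lambda}$
If X is an exponential random variable,
$P(X > x) = e^{-\lambda x}$
$P(X < x) = 1 - e^{-\lambda x}$
$P(x_1 < X < x_2) = P(X < x_2) - P(X < x_1) = e^{-\lambda x_1} - e^{-\lambda x_2}$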
Ex7.4 – The lifetime of an alkaline battery (measured in hours) is exponentially distributed with λ = 0.05.
a. What are the mean and standard deviation of the battery’s lifetime?
b. Find the probability that a battery will last between 10 and 15 hours.
c. What is the probability that a battery will last for more than 20 hours?
A7.4a – The mean and standard deviation are the same and equal to 1/λ:
$\mu = \sigma = \frac{1}{\lambda} = \frac{1}{0.05} = 20$ hours
A7.4b – Let X denote the lifetime of a battery. The required probability is
$P(10 \le X \le 15) = e^{-0.05(10)} - e^{-0.05(15)} = e^{-0.5} - e^{-0.75} = 0.6065 - 0.4724 = 0.1341$
A7.4c – $P(X > 20) = e^{-0.05(20)} = e^{-1} = 0.3679$
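The answers to Ex7.4 can be checked with a short Python sketch built on P(X > x) = e^(-λx). The helper names `exp_survival` and `exp_interval` are hypothetical and chosen only for this illustration.

```python
import math


def exp_survival(x: float, lam: float) -> float:
    """P(X > x) = e^(-lam * x) for an exponential random variable."""
    return math.exp(-lam * x)


def exp_interval(x1: float, x2: float, lam: float) -> float:
    """P(x1 < X < x2) = e^(-lam * x1) - e^(-lam * x2)."""
    return exp_survival(x1, lam) - exp_survival(x2, lam)


lam = 0.05                       # rate parameter from Ex7.4
mean = sd = 1 / lam              # part a: mean and standard deviation, in hours
p_b = exp_interval(10, 15, lam)  # part b: lasts between 10 and 15 hours
p_c = exp_survival(20, lam)      # part c: lasts more than 20 hours

print(mean, round(p_b, 4), round(p_c, 4))
```

Part b may print 0.1342 rather than 0.1341, because the hand calculation rounds each exponential to four decimals before subtracting.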
Exercises
Q7.1 – Delta Airlines quotes a flight time of 2 hours, 5 minutes for its flights from Cincinnati to Tampa. Suppose we believe that actual flight times are uniformly distributed between 2 hours and 2 hours, 20 minutes.
a. Show the graph of the probability density function for flight time.
b. What is the probability that the flight will be no more than 5 minutes late?
c. What is the probability that the flight will be more than 10 minutes late?
d. What is the expected flight time?
Q7.2 – The driving distance for the top 100 golfers on the PGA tour is between 284.7 and 310.6 yards (Golfweek, March 29, 2003). Assume that the driving distance for these golfers is uniformly distributed over this interval.
a. Give a mathematical expression for the probability density function of driving distance.
b. What is the probability the driving distance for one of these golfers is less than 290 yards?
c. What is the probability the driving distance for one of these golfers is at least 300 yards?
d. What is the probability the driving distance for one of these golfers is between 290 and 305 yards?
e. How many of these golfers drive the ball at least 290 yards?
Q7.3 – A muffler company advertises that you will receive a rebate if it takes longer than 30 minutes to replace your muffler. Experience has shown that the time taken to replace a muffler is approximately normally distributed with a mean of 25 minutes and a standard deviation of 2.5 minutes.
a. What proportion of customers receive a rebate?
b. What proportion of mufflers take between 22 and 26 minutes to replace?
c. What should the rebate-determining time of 30 minutes be changed to if the company wishes to provide only 1% of customers with a rebate?
Q7.4 – Records show that the playing time of major league baseball games is approximately normally distributed with a mean of 156 minutes and a standard deviation of 34 minutes. If one game is selected at random, find the probability that it will last:
a. more than 3 hours?
b. between 2 and 3 hours?
c. less than 1.5 hours?
Q7.5 – The average stock price for companies making up the S&P 500 is $30, and the standard deviation is $8.20 (BusinessWeek, Special Annual Issue, Spring 2003). Assume the stock prices are normally distributed.
a. What is the probability a company will have a stock price of at least $40?
b. What is the probability a company will have a stock price no higher than $20?
c. How high does a stock price have to be to put a company in the top 10%?
Q7.6 – The length of life of a certain brand of light bulb is exponentially distributed with a mean of 5,000 hours.
a. Find the probability that a bulb will burn out within the first 1,000 hours.
b. Find the probability that a bulb will last more than 7,000 hours.
c. Find the probability that the lifetime of a bulb will be between 2,000 and 8,000 hours.
Q7.7 – The time between arrivals of vehicles at a particular intersection follows an exponential probability distribution with a mean of 12 seconds.
a. Sketch this exponential probability distribution.
b. What is the probability that the arrival time between vehicles is 12 seconds or less?
c. What is the probability that the arrival time between vehicles is 6 seconds or less?
d. What is the probability of 30 or more seconds between vehicle arrivals?