Probability distributions:
part 1
BSAD 30
Dave Novak
Source: Anderson et al., 2015
Quantitative Methods for Business 12th
edition – some slides are directly from
J. Loucks © 2013 Cengage Learning
Covered so far…

Chapter 1: Introduction
What is modeling
 Types of models
 Basic problem formulation
 Review of basic linear (algebraic) problems


Chapter 2: Introduction to probability

Review of probability concepts (complement,
union, intersection, conditional probability,
joint probability table, independence,
mutually exclusive)
Overview
Random Variables
 Discrete Probability Distributions

Uniform Probability Distribution
 Binomial Probability Distribution
 Poisson Probability Distribution


Link to examples of types of discrete distributions
• http://www.epixanalytics.com/modelassist/AtRisk/Model_Assist.htm#Distributions/Discrete_distributions/Discrete_distributions.htm
Overview

In general, what is a probability distribution?

A table, equation, or graphical
representation that links the possible
outcomes of an experiment to their likelihood
(probability) of occurrence
Overview

We will briefly look at three “common”
discrete probability examples
Uniform
 Binomial
 Poisson


In business applications, we often find
instances of random variables that follow a
discrete uniform, binomial, or Poisson
probability distribution
What is a random variable?
A random variable (RV) is a numerical
description of the outcome of an experiment
 Keep in mind that there is a difference
between numeric variables and categorical
variables

Numeric: temperature, speed, age,
monetized data, etc.
 Categorical: state of residence, gender,
blood type, etc.

Two types of numeric random variables:

Discrete

Continuous
Random variables

Question                      Random variable x                                   Type
Family size                   x = number of dependents in family                  Discrete
                              reported on tax return
Distance from home to store   x = distance in miles from home to the store site   Continuous
Own dog or cat                x = 1 if own no pet;                                Discrete
                                = 2 if own dog(s) only;
                                = 3 if own cat(s) only;
                                = 4 if own dog(s) and cat(s)
Example

Discrete random variable (RV) with a finite
number of possible values
Let x = number of TVs sold at the store in one day,
where x can take on 5 values (0, 1, 2, 3, 4)
There is a readily identifiable upper bound to
the number of TVs sold on any given day
 In this case, no more than 4 TVs sold

Example

Discrete random variable (RV) with an
infinite number of possible values
Let x = number of customers arriving in one day,
where x can take on the values 0, 1, 2, . . .
There is no readily identifiable upper bound
on the number of customers coming into the
store on any given day
 There cannot be an infinite # of customers,
but we are not setting an upper bound (could
be 75, 500, or 2,000)

Discrete probability
distributions
The probability distribution for a random
variable describes how probabilities
associated with each value are distributed
(or allocated) over all possible values
 We can describe a discrete probability
distribution with a table, graph, or equation


In the TV sales example, we would want a
mathematical and/or visual representation of
the probability of selling 0, 1, 2, 3, or 4 TVs
on any given day
Discrete probability
distributions

The probability distribution is defined by a
probability function, denoted by f(x), which
provides the probability for each value of the
random variable
The function f(x) is a mathematical
representation of the probability distribution
 The following conditions are required:

$f(x) \geq 0$ for every value of x
$\sum f(x) = 1$
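As a minimal sketch of the two required conditions, the check below tests that every probability is non-negative and that the probabilities sum to 1. The function name `is_valid_pmf` and the fair-coin values are illustrative assumptions, not part of the course material.

```python
import math


def is_valid_pmf(f, tol=1e-9):
    """Check the two required conditions for a discrete probability function.

    f maps each value x of the random variable to its probability f(x).
    """
    probabilities = list(f.values())
    all_non_negative = all(p >= 0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return all_non_negative and sums_to_one


# Hypothetical fair-coin example: x = 1 for heads, x = 0 for tails.
fair_coin = {0: 0.5, 1: 0.5}
print(is_valid_pmf(fair_coin))          # both conditions hold
print(is_valid_pmf({0: 0.7, 1: 0.5}))   # fails: probabilities sum to 1.2
```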
Discrete distribution:
DiCarlo motors example

Using historical data on car sales, a tabular
representation of sales is created
Units sold (x)   Number of days   f(x)
0                      54          .18   (.18 = 54/300)
1                     117          .39
2                      72          .24
3                      42          .14
4                      12          .04   (.04 = 12/300)
5                       3          .01
Total                 300         1.00
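The tabular representation can be reproduced by dividing each day count by the total number of days. This is a small sketch using the DiCarlo counts from the slide; the variable names are chosen here for illustration.

```python
# Number of days on which 0, 1, ..., 5 cars were sold (from the DiCarlo table).
days_by_units_sold = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(days_by_units_sold.values())

# Relative frequency of each sales level, used as the probability f(x).
f = {x: days / total_days for x, days in days_by_units_sold.items()}

for x, probability in f.items():
    print(f"f({x}) = {probability:.2f}")
```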
Discrete distribution:
DiCarlo motors example

Graphical representation
[Bar chart: probability f(x) on the vertical axis (0 to .50) against values of random variable x (car sales, 0 to 5) on the horizontal axis]
Discrete distribution:
DiCarlo motors example

The probability distribution provides the
following information
There is a 0.18 probability that no cars will be sold during a day: f(0) = 18%
The most probable sales volume is 1, with f(1) = 0.39 = 39%
There is a 0.05 probability of either four or five cars being sold: f(4) + f(5) = 5%

Summary
Up to this point, we have not discussed
the specific TYPE of discrete probability
distribution (i.e. uniform, binomial, Poisson,
etc.)
 We have only discussed probability
distributions in terms of being discrete as
opposed to continuous
 A review of basic statistical concepts is next

Expected value and
variance

The expected value, or mean, of a random
variable is a measure of its central location

Mean, median, and mode are measures of
central tendency because they identify a
single value as “typical” or representative of
all values in a probability distribution
$E(x) = \mu = \sum x f(x)$
Expected value and
variance
The variance, $\sigma^2$, summarizes the variability in the values of a random variable
The standard deviation, $\sigma$, is defined as the positive square root of the variance

$Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
$StdDev(x) = \sigma = \sqrt{\sigma^2}$
Expected value and
variance
Both the StdDev and variance provide a measure of how much the values in the probability distribution differ from the mean
The higher the standard deviation, the more the observations differ from one another and from the mean
When a probability distribution has a high standard deviation relative to its mean, the mean alone may not be a good measure of central tendency
Expected value and
variance
Scores = 1, 4, 3, 4, 2, 7, 18, 3, 7, 2, 4, 3
Mean = 4.83 (about 5)
Median = 3.5
Standard Deviation = 4.53

The standard deviation indicates that the typical difference between each score and the mean is around 4.5 points. However, only one score (18) is 4.5 or more points different from the mean. The one extreme score (18) overly influences the mean. The median (3.5) is a better measure of central tendency in this case because extreme scores do not influence the median
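The summary statistics for the scores can be checked with Python's standard `statistics` module. This sketch assumes the slide reports the sample standard deviation (dividing by n - 1), which is the version that matches the 4.53 shown.

```python
import statistics

scores = [1, 4, 3, 4, 2, 7, 18, 3, 7, 2, 4, 3]

mean_score = statistics.mean(scores)       # sum of scores divided by 12
median_score = statistics.median(scores)   # middle of the sorted scores
sample_std_dev = statistics.stdev(scores)  # sample standard deviation (n - 1)

print(f"Mean = {mean_score:.2f}")
print(f"Median = {median_score}")
print(f"Standard deviation = {sample_std_dev:.2f}")

# Dropping the extreme score (18) shows how strongly it pulls the mean.
without_extreme = [s for s in scores if s != 18]
print(f"Mean without 18 = {statistics.mean(without_extreme):.2f}")
```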
DiCarlo motors example

Calculate expected value of discrete RV
x       f(x)     x f(x)
0       .18      .00     (0 x 0.18 = 0)
1       .39      .39     (1 x 0.39 = 0.39)
2       .24      .48
3       .14      .42
4       .04      .16
5       .01      .05
        E(x) =  1.50

E(x) = expected number of cars sold in a day
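The expected value calculation above amounts to a weighted sum of the values of x, with f(x) as the weights. A short sketch using the DiCarlo probabilities (variable names are illustrative):

```python
# DiCarlo Motors probability distribution: x = cars sold in a day.
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

# E(x) = sum of x * f(x) over all values of x.
expected_value = sum(x * probability for x, probability in f.items())

print(f"E(x) = {expected_value:.2f} cars per day")
```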
DiCarlo motors example

Calculate variance and StdDev
x     $x - \mu$    $(x - \mu)^2$    f(x)    $(x - \mu)^2 f(x)$
0       -1.5          2.25          .18        .4050
1       -0.5          0.25          .39        .0975
2        0.5          0.25          .24        .0600
3        1.5          2.25          .14        .3150
4        2.5          6.25          .04        .2500
5        3.5         12.25          .01        .1225

Variance of daily sales = $\sigma^2$ = 1.2500 (cars squared)
Standard deviation of daily sales = $\sigma = \sqrt{1.2500}$ = 1.118 cars
DiCarlo motors example

Calculate variance and StdDev
$Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x) = 0.4050 + 0.0975 + 0.0600 + 0.3150 + 0.2500 + 0.1225$
$Var(x) = \sigma^2 = 1.25$
Standard deviation of daily sales: $\sigma = \sqrt{1.2500} = 1.118$ cars
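The variance and standard deviation follow the same pattern: weight each squared deviation from the mean by f(x), sum, then take the square root. A sketch with the DiCarlo probabilities (variable names are illustrative):

```python
import math

# DiCarlo Motors probability distribution: x = cars sold in a day.
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

mu = sum(x * probability for x, probability in f.items())

# Var(x) = sum of (x - mu)^2 * f(x); the units are cars squared.
variance = sum((x - mu) ** 2 * probability for x, probability in f.items())

# The standard deviation is the positive square root of the variance (in cars).
std_dev = math.sqrt(variance)

print(f"Variance = {variance:.4f}")
print(f"Standard deviation = {std_dev:.3f}")
```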
Expected value and
variance

From a decision-making or analyst
perspective what are some of the practical
implications of this discussion?
If the data you are analyzing have a high
variance, making decisions based on the
mean, or even stressing the importance of
the average, is likely to be misleading
 The median might be a better measure of
central tendency

Expected value and
variance

What should you do?
Generate a visual representation of the data!
 You need to better characterize the data to
see if they fit into any well-known families of
probability distributions – this would be the
first step in analysis
 Are data skewed or symmetrical?
 Knowing what the data “aren’t” is also useful

Expected value and
variance

What should you do?
Knowing that data do not follow a particular
distribution is important in terms of analysis
 There are particular characteristics
associated with different types of
distributions that can guide you in your
analysis

Discrete Distributions we
will examine

1) Uniform

2) Binomial (a sequence of Bernoulli trials)

3) Poisson
Discrete uniform probability
distribution

The discrete uniform probability distribution
is the simplest example of a discrete
probability distribution given by a formula
$f(x) = 1/n$ (the values of the random variable are equally likely)
where:
n = the number of values the random variable may assume

Example: getting a 1, 2, 3, 4, 5, or 6 when rolling a single die: $f(x) = 1/6$
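The die example can be sketched with exact fractions so that each f(x) is literally 1/6. The expected value line is an extra illustration using the E(x) formula from earlier in the deck; variable names are chosen here.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
n = len(faces)  # number of values the random variable may assume

# Discrete uniform probability function: every value has probability 1/n.
f = {x: Fraction(1, n) for x in faces}

total = sum(f.values())
expected_value = sum(x * probability for x, probability in f.items())

print(f"f(x) = {f[1]} for every face")
print(f"Sum of probabilities = {total}")
print(f"E(x) = {expected_value}")
```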
Binomial probability
distribution
Describes the number of successes in a sequence of Bernoulli trials (the Bernoulli distribution is the special case n = 1)
Has four properties:

1) Experiment consists of a sequence of n identical trials
2) Only TWO outcomes are possible for each trial (success/failure, good/bad, on/off, yes/no, etc.)
3) The probability of success, p, stays the same for all trials
4) All trials are independent
Binomial probability
distribution

We are interested in the number of
successes, or positive outcomes occurring
in the n trials

x denotes the number of successes, or
positive outcomes occurring in the n trials
$f(x) = \frac{n!}{x!(n - x)!} p^x (1 - p)^{(n - x)}$
where:
f(x) = the probability of x successes in n trials
n = the number of trials
p = the probability of success on any one trial
Binomial probability
distribution
$f(x) = \frac{n!}{x!(n - x)!} p^x (1 - p)^{(n - x)}$
$\frac{n!}{x!(n - x)!}$: number of experimental outcomes providing exactly x successes in n trials
$p^x (1 - p)^{(n - x)}$: probability of a particular sequence of trial outcomes with x successes in n trials
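The two parts of the binomial formula map directly onto code. The sketch below uses `math.comb` for the counting term; the function name `binomial_pmf` and the coin-toss usage are illustrative assumptions.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    # Number of experimental outcomes with exactly x successes in n trials.
    number_of_sequences = math.comb(n, x)
    # Probability of any one particular sequence with x successes.
    sequence_probability = p ** x * (1 - p) ** (n - x)
    return number_of_sequences * sequence_probability


# Hypothetical usage: exactly 1 head in 2 tosses of a fair coin.
print(binomial_pmf(1, 2, 0.5))
```

Summing `binomial_pmf(x, n, p)` over x = 0, 1, ..., n should give 1, which is a quick sanity check on any n and p.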
Binomial probability
distribution
Assume the probability that any customer
who comes into a store and actually makes
a purchase is 0.3 (30% chance of success)
 What is the probability that 2 of the next 3
customers who enter the store make a
purchase?


Identify: n, x, p
Binomial probability
distribution (decision tree)
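1st customer              2nd customer              3rd customer    x    Prob.
Purchases (.3)            Purchases (.3)            P (.3)          3    .027
Purchases (.3)            Purchases (.3)            DNP (.7)        2    .063
Purchases (.3)            Does Not Purchase (.7)    P (.3)          2    .063
Purchases (.3)            Does Not Purchase (.7)    DNP (.7)        1    .147
Does Not Purchase (.7)    Purchases (.3)            P (.3)          2    .063
Does Not Purchase (.7)    Purchases (.3)            DNP (.7)        1    .147
Does Not Purchase (.7)    Does Not Purchase (.7)    P (.3)          1    .147
Does Not Purchase (.7)    Does Not Purchase (.7)    DNP (.7)        0    .343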
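The decision tree can be reproduced by enumerating all eight purchase/no-purchase paths and adding up the path probabilities for each value of x. This is a sketch with names chosen for illustration; it uses p = 0.3 and n = 3 from the store example.

```python
from collections import defaultdict
from itertools import product

p = 0.3  # probability that a customer makes a purchase
n = 3    # number of customers

# distribution[x] accumulates the probability of exactly x purchases.
distribution = defaultdict(float)

for path in product([True, False], repeat=n):
    path_probability = 1.0
    for purchased in path:
        path_probability *= p if purchased else (1 - p)
    distribution[sum(path)] += path_probability

for x in sorted(distribution):
    print(f"P(x = {x}) = {distribution[x]:.3f}")
```

Three paths end with x = 2, each with probability .063, so the probability that 2 of the next 3 customers purchase is 3 x .063 = .189.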
Binomial probability
distribution

If a six-sided die is rolled three times, what
is the probability that the number 5 comes
up twice?

Identify: n, x, p
Binomial probability
distribution
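1st roll                         2nd roll         3rd roll    x    Prob.
Success "5" (.17)                Success (.17)    S (.17)     3    .005
Success "5" (.17)                Success (.17)    F (.83)     2    .024
Success "5" (.17)                Failure (.83)    S (.17)     2    .024
Success "5" (.17)                Failure (.83)    F (.83)     1    .117
Failure (1, 2, 3, 4, 6) (.83)    Success (.17)    S (.17)     2    .024
Failure (1, 2, 3, 4, 6) (.83)    Success (.17)    F (.83)     1    .117
Failure (1, 2, 3, 4, 6) (.83)    Failure (.83)    S (.17)     1    .117
Failure (1, 2, 3, 4, 6) (.83)    Failure (.83)    F (.83)     0    .572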
Binomial probability
distribution

What’s the probability if I roll a die 10 times,
the number 5 comes up four times?

Identify: n, x, p
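Both die questions can be worked with the binomial formula, taking a success to be rolling a 5, so p = 1/6. The sketch below uses the exact value of p rather than the rounded .17 shown in the tree, so the first answer (about .069) differs slightly from 3 x .024 = .072. The helper name `binomial_pmf` is an illustrative choice.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


p_five = 1 / 6  # probability of rolling a 5 on one roll of a fair die

# Three rolls, the number 5 comes up twice: n = 3, x = 2.
prob_two_of_three = binomial_pmf(2, 3, p_five)

# Ten rolls, the number 5 comes up four times: n = 10, x = 4.
prob_four_of_ten = binomial_pmf(4, 10, p_five)

print(f"P(two 5s in 3 rolls)   = {prob_two_of_three:.4f}")
print(f"P(four 5s in 10 rolls) = {prob_four_of_ten:.4f}")
```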
Binomial probability
distribution

Expected value
$E(x) = \mu = np$

Variance
$Var(x) = \sigma^2 = np(1 - p)$

Standard deviation
$\sigma = \sqrt{np(1 - p)}$
Binomial probability
distribution

In the clothing store example, calculate:

Expected value

Variance

Standard deviation
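A sketch of the three calculations, assuming the clothing store example refers to the earlier store scenario with n = 3 customers and p = 0.3 (that link is an assumption; the slide does not restate the numbers):

```python
import math

n = 3    # number of customers (assumed from the earlier store example)
p = 0.3  # probability that any one customer makes a purchase

expected_value = n * p          # E(x) = np
variance = n * p * (1 - p)      # Var(x) = np(1 - p)
std_dev = math.sqrt(variance)   # sigma = sqrt(np(1 - p))

print(f"E(x) = {expected_value:.2f}")
print(f"Var(x) = {variance:.2f}")
print(f"StdDev(x) = {std_dev:.3f}")
```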
Poisson probability
distribution

A Poisson distributed random variable is
often useful in estimating the number of
occurrences over a specified interval of time
or space which can be counted in whole
numbers


Very useful in RISK analysis
It is a discrete random variable that may
assume an infinite sequence of values (x =
0, 1, 2, . . . ∞)
Poisson versus Binomial

How is an RV that follows a Poisson
distribution different from an RV that follows
a binomial distribution?
With a Poisson RV, it is possible to count how many events have occurred, but meaningless to ask how many events have NOT occurred
In the binomial situation, we know the probabilities of two mutually exclusive outcomes (p and q = 1 - p); in the Poisson situation, there is no q. The distribution has only one parameter: the average frequency with which an event occurs
Poisson versus Binomial
Binomial Distribution                       Poisson Distribution
Fixed Number of Trials (n)                  Infinite Number of Trials
[10 pie throws]
Only 2 Possible Outcomes                    Unlimited Number of Outcomes Possible
[hit or miss]
Probability of Success is Constant (p)      Mean of the Distribution is the Same
[0.4 success rate]                          for All Intervals (mu)
Each Trial is Independent                   Number of Occurrences in Any Given
[throw 1 has no effect on throw 2]          Interval Independent of Others
Predicts Number of Successes within a       Predicts Number of Occurrences per
Set Number of Trials                        Unit Time, Space, ...
Source: http://www2.cedarcrest.edu/academic/bio/hale/biostat/session12links/BvsP.html
Poisson probability
distribution

Examples
Number of customers arriving at a
supermarket checkout between 5 PM and 6
PM
 Number of text messages you receive over
the course of a week
 Number of car accidents over the course of
a year

Poisson probability
distribution

Two properties of Poisson distributions
1) The probability of occurrence is the same
over any two time intervals of equal length
 2) The occurrence or nonoccurrence in any
time interval is independent of occurrence or
nonoccurrence in any other time interval

Poisson probability
distribution
$f(x) = \frac{\mu^x e^{-\mu}}{x!}$
where:
f(x) = probability of x occurrences in an interval
$\mu$ = mean number of occurrences in an interval
e = 2.71828
For more info: https://en.wikipedia.org/wiki/E_(mathematical_constant)
Drive-up teller window
example
Suppose that we are interested in the number of cars arriving at the drive-up teller window of a bank during a 15-minute period on weekday mornings
We assume that the probability of a car arriving is the same for any two time periods of equal length (i.e. prob of a car arriving in the first minute is exactly the same as the prob of a car arriving in the last minute), and the arrival or non-arrival of a car in any time period is independent of the arrival or non-arrival in any other time period
An analysis of historical data shows that the average number of cars arriving during a 15-minute interval of time is 10, so the Poisson probability function with $\mu = 10$ applies
Drive-up teller window
example
We want to know the probability that exactly 5 cars
will arrive over the 15 minute time interval
Identify: x and $\mu$
$\mu$ = 10 arrivals / 15 minutes, x = 5
We are given that there are 10 arrivals every 15 minutes, so the average # of arrivals over the time period is 10
Drive-up teller window
example
$\mu$ = 10 arrivals / 15 minutes, x = 5
$f(5) = \frac{10^5 (2.71828)^{-10}}{5!} = .0378$
So, there is a 3.78% chance that exactly 5 cars will arrive over the 15 minute time period
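The teller-window calculation can be checked with a short function that mirrors the Poisson formula. The function name `poisson_pmf` is an illustrative choice; `math.exp` supplies the constant e.

```python
import math


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the interval mean is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


mu = 10  # mean number of cars arriving in a 15-minute interval
x = 5    # number of arrivals of interest

prob_five_cars = poisson_pmf(x, mu)
print(f"f(5) = {prob_five_cars:.4f}")
```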
Highway defect example
• Suppose that we are concerned with the
occurrence of major defects in a section of highway
one month after that section was resurfaced
• We assume that the probability of a defect is the
same for any two highway intervals of equal length
(i.e. the probability of a defect between mile
markers 1 and 2 is the same as the probability of a
defect between mile markers 4 and 5, etc.) and
that the occurrence of a defect in any one mile
interval is independent of the occurrence or
nonoccurrence of a defect in any other interval
• Thus, the Poisson probability distribution applies
Highway defect example

Find the probability that no major defects occur in a
specific 3-mile stretch of highway assuming that major
defects occur at the average rate of two defects per
mile
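One way to set this up, sketched under the assumption that the mean scales with the length of the interval: at two defects per mile, a 3-mile stretch has mean 2 x 3 = 6, and the probability of no defects is f(0). The function name `poisson_pmf` is an illustrative choice.

```python
import math


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the interval mean is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


defects_per_mile = 2
miles = 3

# Mean number of major defects over the whole 3-mile stretch.
mu = defects_per_mile * miles

prob_no_defects = poisson_pmf(0, mu)
print(f"mu = {mu}, f(0) = {prob_no_defects:.4f}")
```

With x = 0 the formula reduces to e raised to the power of minus mu, so the answer here is roughly a quarter of one percent.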
Poisson probability
distribution

Expected value
$E(x) = \mu$ (the rate or frequency of an event)

Variance
$Var(x) = \sigma^2 = \mu$

Standard deviation
$\sigma = \sqrt{\mu}$
Highway defect example

In the highway defect example, calculate:

Expected value

Variance

Standard deviation
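A sketch of these three values, assuming the question refers to the same 3-mile stretch with two defects per mile (so the mean is 6); for a Poisson distribution the variance equals the mean.

```python
import math

defects_per_mile = 2
miles = 3  # assumed: the same 3-mile stretch as in the earlier question

expected_value = defects_per_mile * miles  # E(x) = mu
variance = expected_value                  # Var(x) = mu for a Poisson RV
std_dev = math.sqrt(variance)              # sigma = sqrt(mu)

print(f"E(x) = {expected_value}")
print(f"Var(x) = {variance}")
print(f"StdDev(x) = {std_dev:.3f}")
```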
Summary

Discussion of random variables
Discrete
 Continuous


Examples of discrete probability
distributions
Uniform
 Binomial
 Poisson
