SCInput

INPUT MODELING
What distribution to pick
How to estimate parameters
BASICS
• Many phenomena in the modeled system
have unknown parametric properties
(stochastic)
– Life length of a bulb, miss distance of a bullet,
time until the next arrival, number of patrols in a
given week
• Must make two decisions:
– Distribution (based on physics of the phenomena)
– Parameters (based on data)
THINGS TO THINK ABOUT
• Is the value bounded (above or below)?
• Is there a most likely value?
• Is the value positive?
• Is the value a function of several sub-values?
  – Is it the minimum?
  – Is it the sum?
• Is it discrete (one of a small set of values) or continuous (all numbers [in an interval] are possible)?
Bernoulli Trials
• Context: random events with two possible values
• Two events: Yes/No, True/False, Success/Failure
• Two possible values: 1 for success, 0 for failure
• Examples: tossing a coin, packet transmission status
• An experiment comprises n trials
• PMF for one trial:
  p(x_j) = \begin{cases} p, & x_j = 1 \\ q = 1 - p, & x_j = 0 \\ 0, & \text{otherwise} \end{cases} \qquad j = 1, 2, \ldots, n
• Expected value: E[X_j] = p
• Variance: V[X_j] = \sigma^2 = p(1 - p)
• Bernoulli process (n independent trials):
  p(X_1, X_2, \ldots, X_n) = p(X_1)\, p(X_2) \cdots p(X_n)
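A minimal simulation sketch in Python (assuming NumPy is available) that checks the Bernoulli mean and variance against p and p(1-p); the parameter values are illustrative, not from the slides:

import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 100_000            # illustrative success probability and trial count

# n independent Bernoulli(p) trials: 1 = success, 0 = failure
x = rng.binomial(1, p, size=n)

print(x.mean())                # close to E[X_j] = p = 0.3
print(x.var())                 # close to V[X_j] = p*(1-p) = 0.21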
Geometric Distribution
• Context: the number of Bernoulli trials until achieving the first success
• Used to represent the random time until a first transition occurs
• PMF:
  p(X = k) = \begin{cases} q^{k-1} p, & k = 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}
• CDF:
  F(k) = P(X \le k) = 1 - (1 - p)^k
• Expected value: E[X] = \dfrac{1}{p}
• Variance: V[X] = \sigma^2 = \dfrac{q}{p^2} = \dfrac{1 - p}{p^2}
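A small Python check (NumPy assumed; p = 0.2 is an illustrative value) that the simulated geometric matches E[X] = 1/p, V[X] = (1-p)/p², and the CDF:

import numpy as np

rng = np.random.default_rng(0)
p = 0.2

# number of Bernoulli(p) trials up to and including the first success
k = rng.geometric(p, size=100_000)

print(k.mean())          # close to E[X] = 1/p = 5
print(k.var())           # close to V[X] = (1-p)/p^2 = 20
print((k <= 3).mean())   # close to F(3) = 1 - (1-p)^3 ≈ 0.488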
Discrete example: roll of a die
Discrete Uniform
p(x) = 1/6 for x = 1, 2, 3, 4, 5, 6, and \sum_{\text{all } x} p(x) = 1.

[Figure: bar chart of the die PMF, a bar of height 1/6 at each of x = 1 through 6.]
BINOMIAL DISTN
• Let X_1, X_2, \ldots, X_n be Bernoulli trials (yes/no outcomes), each with success probability p
• Let B = their sum
• B is Binomial(n, p)
• E[B] = np
• VAR[B] = np(1 - p); the standard deviation is \sqrt{np(1-p)}
• PMF:
  P(B = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}, \qquad k = 0, 1, \ldots, n
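A sketch (assuming NumPy and SciPy) showing B as a sum of Bernoullis and evaluating the binomial PMF; n = 10 and p = 0.3 are illustrative:

import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
n, p = 10, 0.3

# B as the sum of n Bernoulli(p) variables, repeated many times
B = rng.binomial(1, p, size=(100_000, n)).sum(axis=1)

print(B.mean(), n * p)            # E[B] = np = 3
print(B.var(), n * p * (1 - p))   # VAR[B] = np(1-p) = 2.1
print(binom.pmf(4, n, p))         # P(B = 4) = C(10,4) p^4 (1-p)^6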
Probability Distribution
Let X be a continuous rv. Then a probability distribution or probability density
function (pdf) of X is a function f (x) such that for any two numbers a and b,
P(a \le X \le b) = \int_a^b f(x)\, dx
The graph of f is the density curve.
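A quick numerical check of P(a ≤ X ≤ b) = ∫_a^b f(x) dx, using an exponential density as a stand-in for f (SciPy assumed; a and b are arbitrary):

from scipy import integrate
from scipy.stats import expon

a, b = 0.5, 2.0
f = expon.pdf                         # stand-in density: exponential with mean 1

area, _ = integrate.quad(f, a, b)     # integral of f from a to b
print(area)                           # equals P(a <= X <= b)
print(expon.cdf(b) - expon.cdf(a))    # same probability via the CDF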
Probability Density Function
For f(x) to be a pdf:
1. f(x) \ge 0 for all x.
2. The area of the region between the graph of f and the x-axis is equal to 1:
   \int_{-\infty}^{\infty} f(x)\, dx = 1
[Figure: density curve y = f(x) with total area 1 beneath it.]
Probability Density Function
P(a \le X \le b) is given by the area of the shaded region under the density curve y = f(x) between x = a and x = b.
[Figure: density curve with the region between a and b shaded.]
UNIFORM RANDOM VARIABLE
[Figure: flat density of height 1/(b-a) between a and b.]
• Hard boundaries at a and b
• Every value in between has the same density, 1/(b-a)
• Special case: a = 0, b = 1 (the standard uniform)
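Sampling sketch (NumPy assumed; a = 2 and b = 5 are illustrative) for the uniform on [a, b] and the special case a = 0, b = 1:

import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0

u = rng.uniform(a, b, size=100_000)   # constant density 1/(b-a) on [a, b]
print(u.min(), u.max())               # stays inside the hard boundaries a and b
print(u.mean())                       # close to (a+b)/2 = 3.5

u01 = rng.uniform(0, 1, size=5)       # special case: standard uniform on [0, 1]
print(u01)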
TRIANGULAR
[Figure: triangular density on [a, b] with its peak at the mode c.]
• Sum two uniforms to get a triangular (see the sketch below)
• If the uniforms are identical, c bisects [a, b]
• If you know the min, max, and most likely value, use the triangular distribution
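A sketch (NumPy assumed) of both routes to a triangular: summing two identical uniforms, and sampling directly from a known min, mode, and max; the values a = 2, c = 3, b = 7 are illustrative:

import numpy as np

rng = np.random.default_rng(0)

# Sum of two identical U(0, 1)'s: triangular on [0, 2] with peak at c = 1
t_sum = rng.uniform(0, 1, 100_000) + rng.uniform(0, 1, 100_000)
print(t_sum.min(), t_sum.max())       # close to 0 and 2

# Direct sampling when min, most likely, and max are known
a, c, b = 2.0, 3.0, 7.0
t = rng.triangular(a, c, b, size=100_000)
print(t.mean())                       # close to (a + b + c) / 3 = 4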
Exponential Distribution
• The exponential distribution is usually used to describe the time or distance until some event happens.
• Its pdf has the form
  f(x) = \frac{1}{\mu} e^{-x/\mu}
  where x \ge 0 and \mu > 0; \mu is the mean (expected value).
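Sampling sketch (NumPy assumed) for the exponential with mean μ = 20, the value used on the next slide:

import numpy as np

rng = np.random.default_rng(0)
mu = 20.0                                    # mean time until the event

x = rng.exponential(scale=mu, size=100_000)  # f(x) = (1/mu) * exp(-x/mu)
print(x.mean())                              # close to mu = 20
print((x <= 20).mean())                      # close to F(20) = 1 - exp(-1) ≈ 0.632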
What should the exponential distribution look like?
[Figure: plot of R(x), decaying from 1 toward 0 as x increases.]
Exponential Distribution
Example with \mu = 20:

f(x) = \begin{cases} \dfrac{1}{20} \exp\!\left(-\dfrac{x}{20}\right), & x \ge 0 \\ 0, & \text{otherwise} \end{cases}

F(x) = \begin{cases} 0, & x < 0 \\ 1 - \exp\!\left(-\dfrac{x}{20}\right), & x \ge 0 \end{cases}
DISTINCTIVE PROPERTIES OF
EXPONENTIALS(1)
P[X > t + s \mid X > s] = P[X > t]
• Memoryless (very weird): the time already waited tells you nothing about the time remaining (see the sketch below)
• X is always > 0
• The distribution can be shifted anywhere along the real number line
• The minimum of a set of exponential values is itself exponential (next slide)
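A simulation sketch (NumPy assumed; μ = 20, s = 10, t = 15 are illustrative) showing that P[X > t + s | X > s] matches P[X > t]:

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=20.0, size=1_000_000)
s, t = 10.0, 15.0

lhs = (x > t + s).sum() / (x > s).sum()   # P[X > t + s | X > s]
rhs = (x > t).mean()                      # P[X > t]
print(lhs, rhs)                           # both ≈ exp(-15/20) ≈ 0.472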
DISTINCTIVE PROPERTIES OF
EXPONENTIALS (2)
\min(X_1, \ldots, X_n) \sim \text{Expon}(n\lambda)
• Let X_1, \ldots, X_n \sim \text{Expon}(\lambda)
• \lambda is the rate of failure, arrival, or transition
  – e.g., plants die at a rate of 20/year
• E[X] = 1/\lambda
  – the expected lifetime of a plant is 1/20 of a year
• The minimum of the X's is also exponential and has rate n\lambda (see the sketch below)
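A check (NumPy assumed; λ = 20 and n = 5 are illustrative) that the minimum of n Expon(λ) lifetimes behaves like Expon(nλ), i.e., has mean 1/(nλ):

import numpy as np

rng = np.random.default_rng(0)
lam, n = 20.0, 5                       # rate lam, n independent lifetimes

# NumPy's exponential takes scale = 1/rate; each row is one set of n lifetimes
x = rng.exponential(scale=1 / lam, size=(100_000, n))
m = x.min(axis=1)                      # minimum of the n lifetimes

print(x[:, 0].mean())                  # close to 1/lam = 0.05
print(m.mean())                        # close to 1/(n*lam) = 0.01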
Standard Normal Cumulative Areas
[Figure: standard normal curve; the shaded area to the left of z equals \Phi(z).]
NORMALS
• The most overused distribution in input modeling
• Support on (-\infty, \infty)
• Two intuitive parameters to estimate:
  – \mu and \sigma
• Available in Excel using NORMDIST and NORMINV (a Python equivalent is sketched below)
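Outside Excel, the same two calculations are available in Python via SciPy (a sketch; μ = 100 and σ = 15 are illustrative):

from scipy.stats import norm

mu, sigma = 100.0, 15.0

print(norm.cdf(120, loc=mu, scale=sigma))    # like NORMDIST(120, 100, 15, TRUE)
print(norm.ppf(0.95, loc=mu, scale=sigma))   # like NORMINV(0.95, 100, 15)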
CENTRAL LIMIT THEOREM

\sum_{i=1}^{n} X_i \;\approx\; N(n\mu,\, n\sigma^2)

• So \bar{X} \approx N(\mu, \sigma^2/n)
• The approximation depends on the size of n and
the underlying distribution of the X’s
• The X’s do not become Normal if there are a bunch
of them
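A sketch (NumPy assumed) of the approximation: sums of n i.i.d. exponential(1) variables, which are individually far from normal, have mean nμ and variance nσ², and X-bar has standard deviation σ/√n:

import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 100_000

# Sums of n i.i.d. exponential(mean 1) variables (mu = 1, sigma^2 = 1)
s = rng.exponential(1.0, size=(reps, n)).sum(axis=1)

print(s.mean(), n * 1.0)               # sum mean ≈ n*mu = 50
print(s.var(), n * 1.0)                # sum variance ≈ n*sigma^2 = 50
xbar = s / n
print(xbar.std(), np.sqrt(1.0 / n))    # X-bar std ≈ sigma / sqrt(n)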
SOME CHALLENGES
• Number of minutes we wait before a machine fails
• Grab every 100th component off an assembly line and inspect it for a specific defect
  – A single outcome
  – Number of samples before a defect is detected
  – Number of defects in 10 samples
  – Number of defects in 300 samples
• Height of a random student on a basketball team
  – If you know the tallest and shortest
  – If you know the tallest, the shortest, and the most common height
  – If you know all of the heights
THE SECOND STEP
• Distribution choice
– Based on the physical/operational properties of
the quantity
• Parameters
– Based on data we observe
– Use the Method of Moments (see the sketch after this list)
• Measure things from the data like X-bar, VAR, max, min
• Use the properties of the distribution to estimate the
parameters
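A minimal sketch (NumPy assumed) of this second step: measure X-bar, the sample variance, min, and max from data, then match a distribution's properties to them; the exponential fit here, with simulated "observed" waits, is only an illustration:

import numpy as np

def sample_moments(data):
    # summary statistics used by the method of moments
    data = np.asarray(data, dtype=float)
    return {"mean": data.mean(), "var": data.var(ddof=1),
            "min": data.min(), "max": data.max()}

rng = np.random.default_rng(0)
waits = rng.exponential(scale=20.0, size=500)   # stand-in for observed data
m = sample_moments(waits)
mu_hat = m["mean"]        # exponential: E[X] = mu, so mu-hat = X-bar
print(m, mu_hat)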
EXAMPLE: BINOMIAL
• Recall that \mu = np and \sigma^2 = np(1 - p)
• Binomials (B) are sums of Bernoullis (n X's)
• We observe M samples of B
• Easy case: n known

\bar{B} = \frac{1}{M} \sum_{j=1}^{M} B_j = \frac{1}{M} \sum_{j=1}^{M} \sum_{i=1}^{n} X_{ij} = n\hat{p}
\qquad\Rightarrow\qquad
\hat{p} = \frac{\bar{B}}{n}
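Sketch of the easy case (NumPy assumed; n = 10, p = 0.3, M = 200 are illustrative): estimate p from M observed binomial counts when n is known:

import numpy as np

rng = np.random.default_rng(0)
n, p_true, M = 10, 0.3, 200

B = rng.binomial(n, p_true, size=M)   # M observed binomial counts, n known
p_hat = B.mean() / n                  # method of moments: p-hat = B-bar / n
print(p_hat)                          # close to 0.3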
EXAMPLE: BINOMIAL
• Hard case: neither n nor p known

\begin{aligned}
s^2 &= \hat{n}\,\hat{p}\,(1 - \hat{p}) \\
s^2 &= \bar{B}\left(1 - \frac{\bar{B}}{\hat{n}}\right) \\
\frac{s^2}{\bar{B}} &= 1 - \frac{\bar{B}}{\hat{n}} \\
\frac{\bar{B}}{\hat{n}} &= 1 - \frac{s^2}{\bar{B}} \\
\hat{n} &= \frac{\bar{B}}{1 - s^2/\bar{B}}
\end{aligned}

and then \hat{p} = \bar{B}/\hat{n}.
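Sketch of the hard case (NumPy assumed; the true n = 12, p = 0.25, M = 5000 are illustrative): solve the two moment equations for n-hat and p-hat:

import numpy as np

rng = np.random.default_rng(0)
n_true, p_true, M = 12, 0.25, 5_000

B = rng.binomial(n_true, p_true, size=M)   # neither n nor p known to the analyst
B_bar, s2 = B.mean(), B.var(ddof=1)

n_hat = B_bar / (1 - s2 / B_bar)    # from B-bar = n p and s^2 = n p (1 - p)
p_hat = B_bar / n_hat
print(n_hat, p_hat)                 # roughly 12 and 0.25 (noisy estimates)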