Ali Ghodsi 2G1503 1

2G1503 Simulation and Modeling Exercise #4
Ali Ghodsi aligh@imit.kth.se
Random Variates: normal 1/5
• Generating random variates for the
standard normal distribution N(0,1):
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
• However, the inverse transform technique (ITT) cannot be applied, since the
CDF cannot be written in closed form:
$$F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$$
Random Variates: normal 2/5
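• Generating random variates for the
standard normal distribution N(0,1):
– Assume two independent random
variables X and Y
from N(0,1)
– We know:
$X = Z\cos(\theta)$
$Y = Z\sin(\theta)$
$Z^2 = X^2 + Y^2$
[Figure: the point (X, Y) in the plane, at distance Z from the origin and at angle θ from the X-axis]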
Random Variates: normal 3/5
• We know:
$X = Z\cos(\theta)$
$Y = Z\sin(\theta)$
$Z^2 = X^2 + Y^2$
• It turns out that:
– $Z^2$ is exponentially
distributed with λ = 1/2
– θ is uniformly distributed
on [0, 2π)
Random Variates: normal 4/5
• To generate two values x and y that are
normally distributed N(0, 1):
– Generate an exponentially distributed $z^2$ with
λ = 1/2 and a uniformly distributed θ on [0, 2π)
$z^2 = -2\ln(R_1)$
where $R_1$ is uniformly distributed on (0, 1] (0 is excluded because ln(0) is undefined)
$\theta = 2\pi R_2$
where $R_2$ is uniformly distributed on [0, 1)
• Formula:
$$x = \sqrt{-2\ln(R_1)}\,\cos(2\pi R_2)$$
$$y = \sqrt{-2\ln(R_1)}\,\sin(2\pi R_2)$$
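The formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `box_muller` and the choice to pass the two uniform numbers in as arguments are assumptions made here, not part of the exercise.

```python
import math


def box_muller(r1, r2):
    """Map two uniform numbers to two independent N(0, 1) values.

    r1 must lie in (0, 1] so that ln(r1) is defined; r2 lies in [0, 1).
    """
    z = math.sqrt(-2.0 * math.log(r1))  # z^2 = -2 ln(R1) is exponential with rate 1/2
    theta = 2.0 * math.pi * r2          # angle, uniform on [0, 2*pi)
    return z * math.cos(theta), z * math.sin(theta)


# Deterministic check: r1 = e^(-1/2) gives z = 1, and r2 = 0.25 gives theta = pi/2,
# so the point lies (up to rounding) at x = 0, y = 1.
x, y = box_muller(math.exp(-0.5), 0.25)
print(x, y)
```

Passing the uniforms in explicitly keeps the mapping easy to check by hand; in a simulation they would come from a random number generator.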
Random Variates: normal 5/5
• To transform x and y from N(0, 1) into
v1 and v2 from N(µ, σ²):
– $v_1 = \mu + \sigma x$
– $v_2 = \mu + \sigma y$
– Both v1 and v2 will be normally distributed,
N(µ, σ²)
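As a rough sanity check of the scaling step, the sketch below draws 20000 values and compares the sample mean and standard deviation with µ and σ. The function name `normal_pair`, the seed, and the values µ = 10 and σ = 2 are arbitrary choices for illustration.

```python
import math
import random


def normal_pair(mu, sigma, rng):
    """Return two N(mu, sigma^2) variates built from one pair of uniforms."""
    r1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so the logarithm is defined
    r2 = rng.random()
    z = math.sqrt(-2.0 * math.log(r1))
    x = z * math.cos(2.0 * math.pi * r2)  # x and y are N(0, 1)
    y = z * math.sin(2.0 * math.pi * r2)
    return mu + sigma * x, mu + sigma * y  # shift and scale to N(mu, sigma^2)


rng = random.Random(2024)  # fixed seed, chosen arbitrarily for repeatability
sample = []
for _ in range(10000):
    sample.extend(normal_pair(10.0, 2.0, rng))

mean = sum(sample) / len(sample)
std = math.sqrt(sum((v - mean) ** 2 for v in sample) / (len(sample) - 1))
print(round(mean, 2), round(std, 2))  # should land near 10 and 2
```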
Develop an input model
• Develop an input model for the following historical
data, n=50:
79.919
3.081
0.062
1.961
5.845
3.027
6.505
0.021
0.013
0.123
6.769
59.899
1.192
34.760
5.009
18.387
0.141
43.565
24.420
0.433
144.695
2.663
17.967
0.091
9.003
0.941
0.878
3.371
2.157
7.579
0.624
5.380
3.148
7.078
23.960
0.590
1.928
0.300
0.002
0.543
7.004
31.764
1.005
1.147
0.219
3.217
14.382
1.008
2.336
4.562
Input Modeling
• Making an input model:
1. Collect data (already given)
2. Identify PDF
3. Estimate parameters
4. Perform goodness-of-fit test
Step 2: Identifying a PDF
• Determine the number of intervals:
– √n = √50 ≈ 7.1 ≈ 7 intervals
• Determine interval widths:
– The data seems to have 2 extreme high values:
144.695 and 79.919
– Disregarding those, the data varies within
[0.002, 59.899]
– (59.899 − 0.002) / 7 ≈ 8.6, so try an interval width of 8
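The arithmetic behind the interval count and width can be checked with a short sketch. The variable names are chosen here for illustration; the numbers are the ones quoted on the slide.

```python
import math

n = 50
k = round(math.sqrt(n))    # sqrt(50) is about 7.07, so 7 intervals
low, high = 0.002, 59.899  # range after setting aside 144.695 and 79.919
width = (high - low) / k   # about 8.56
print(k, round(width, 2))
```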
Histogram try #1 (coarse)
[Figure: frequency histogram with interval width 8; bin labels 8, 16, ..., 64, More; frequency axis 0 to 40]
• Many small values!
• Coarse! Try smaller intervals
Histogram try #2
[Figure: frequency histogram with interval width 3; bin labels 3, 9, ..., 57, More; frequency axis 0 to 30]
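The two histogram attempts can be reproduced approximately with the sketch below. The helper `histogram` is hypothetical, and it assumes half-open bins [0, w), [w, 2w), ... with a final "More" slot, which may differ slightly from the binning convention of the tool that drew the original charts.

```python
data = [
    79.919, 3.081, 0.062, 1.961, 5.845, 3.027, 6.505, 0.021, 0.013, 0.123,
    6.769, 59.899, 1.192, 34.760, 5.009, 18.387, 0.141, 43.565, 24.420, 0.433,
    144.695, 2.663, 17.967, 0.091, 9.003, 0.941, 0.878, 3.371, 2.157, 7.579,
    0.624, 5.380, 3.148, 7.078, 23.960, 0.590, 1.928, 0.300, 0.002, 0.543,
    7.004, 31.764, 1.005, 1.147, 0.219, 3.217, 14.382, 1.008, 2.336, 4.562,
]


def histogram(values, width, n_bins):
    """Count values in [0, width), [width, 2*width), ...; the last slot is 'More'."""
    counts = [0] * (n_bins + 1)
    for v in values:
        counts[min(int(v // width), n_bins)] += 1
    return counts


coarse = histogram(data, 8, 8)  # width 8: bins up to 64, then "More"
fine = histogram(data, 3, 20)   # width 3: bins up to 60, then "More"
print(coarse)
print(fine)
```

With width 8, most of the sample falls into the first bin, which is why the coarse histogram hides the shape of the distribution near zero.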
Step 2: PDF is exponential
• Looks exponential!
[Figure: the same frequency histogram with interval width 3; bin labels 3, 9, ..., 57, More; frequency axis 0 to 30]
Step 3: Estimate parameter(s)
• The exponential distribution has only one
parameter, the inverse mean λ = 1/E[X]
• Estimate E[X] with the sample mean:
(3.027 + 6.505 + ... + 4.562) / 50 = 11.894
• λ = 1/11.894 = 0.084
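The estimate can be recomputed from the 50 observations as a check. This is a minimal sketch; the variable names are chosen here, and the result should agree with the slide's 11.894 to within rounding in the last digit.

```python
data = [
    79.919, 3.081, 0.062, 1.961, 5.845, 3.027, 6.505, 0.021, 0.013, 0.123,
    6.769, 59.899, 1.192, 34.760, 5.009, 18.387, 0.141, 43.565, 24.420, 0.433,
    144.695, 2.663, 17.967, 0.091, 9.003, 0.941, 0.878, 3.371, 2.157, 7.579,
    0.624, 5.380, 3.148, 7.078, 23.960, 0.590, 1.928, 0.300, 0.002, 0.543,
    7.004, 31.764, 1.005, 1.147, 0.219, 3.217, 14.382, 1.008, 2.336, 4.562,
]

sample_mean = sum(data) / len(data)  # estimate of E[X]
lam = 1.0 / sample_mean              # exponential rate, the inverse mean
print(round(sample_mean, 3), round(lam, 3))
```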
Step 4: Goodness-of-fit Test
• Test the null hypothesis:
H0: the sample has an exponential distribution
with λ = 0.084, at significance level α = 0.05
• We choose to use the Chi-Square Test
Chi Square Test (1/2)
• The intervals of the Chi Square test can be
either equi-probable or of equal width.
– Use equi-probable intervals for continuous distributions
• Exception: Normal Distribution
– Use equal-width intervals for discrete distributions
Chi Square Test Intervals (2/2)
The number of intervals should be:
Sample size, n    Number of intervals, k
20                Don't use Chi Square!
50                5 to 10
100               10 to 20
>100              √n to n/5
• If equi-probable intervals
are chosen, then the number
of intervals, k, should be ≤ n/5
Chi Square Test: exponential, equi.prob.
• Null hypothesis:
H0: the sample has an exponential distribution with
λ = 0.084, at significance level α = 0.05
• Continuous distribution, use equi-probable intervals!
• Remember the inverse CDF of the exponential distribution:
$$F^{-1}(r) = -\frac{1}{\lambda}\ln(1 - r), \quad 0.0 \le r < 1.0$$
Chi Square Test: exponential, equi.prob.
• If our n=50, the number of classes, k, has to
be ≤ n/5 = 10
• Assume k=8, then each interval should have
the probability 1.0 / 8 = 0.125
• The interval endpoints on the probability scale, a0, ..., a8,
are:
a0=0.0, a1=0.125, a2=0.25, a3=0.375, a4=0.5,
a5=0.625, a6=0.75, a7=0.875, a8=1.0
Chi Square Test: exponential, equi.prob.
• The endpoints on the probability scale are
a0=0.0, a1=0.125, ..., a7=0.875, a8=1.0
• To get the real interval endpoints i0, ..., i8, apply the inverse CDF:
– $i_j = F^{-1}(a_j)$, so i0 = 0 and i8 = $F^{-1}(1.0)$ = ∞
– i0=0, i1=1.590, i2=3.425, i3=5.595, i4=8.252,
– i5=11.677, i6=16.503, i7=24.755, i8=∞
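The endpoints can be recomputed from the inverse CDF with a short sketch. The function name `exponential_inverse_cdf` is an assumption made here for illustration; λ = 0.084 and k = 8 are the values used on the slides.

```python
import math


def exponential_inverse_cdf(r, lam):
    """F^-1(r) = -(1/lam) * ln(1 - r) for 0 <= r < 1; r = 1 maps to infinity."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return math.inf
    return -math.log(1.0 - r) / lam


lam = 0.084  # estimated rate from Step 3
k = 8        # number of equi-probable intervals
endpoints = [exponential_inverse_cdf(j / k, lam) for j in range(k + 1)]
print([round(e, 3) for e in endpoints])
```

The first interior endpoint comes out at about 1.590, which matches the first interval used in the chi-square table.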
Chi Square Test: exponential, equi.prob.
Interval        Observed, Oi   Expected, Ei   (Oi-Ei)^2 / Ei
[0, 1.59)       19             6.25           26.01
[1.59, 3.4)     10             6.25           2.25
[3.4, 5.6)      3              6.25           1.69
[5.6, 8.3)      6              6.25           0.01
[8.3, 11.7)     1              6.25           4.41
[11.7, 16.5)    1              6.25           4.41
[16.5, 24.8)    4              6.25           0.81
[24.8, ∞)       6              6.25           0.01
Sum             50             50             39.6
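The test statistic can be recomputed from the observed counts as a check on the table. This is a minimal sketch with variable names chosen here; each term is recalculated from the counts rather than copied.

```python
observed = [19, 10, 3, 6, 1, 1, 4, 6]  # counts per interval, from the table
n = sum(observed)
k = len(observed)
expected = n / k  # equi-probable intervals: 50 / 8 = 6.25 in every class

terms = [(o - expected) ** 2 / expected for o in observed]
chi2 = sum(terms)
print([round(t, 2) for t in terms], round(chi2, 1))
```

The first interval alone contributes about 26 of the total, so the misfit is driven mainly by the excess of very small observations.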
Chi Square Test: exponential, equi.prob.
• The χ² value is 39.6
• For a 5% chance that the data is rejected
even though exponentially distributed (α = 0.05), with
k-s-1 = 8-1-1 = 6 degrees of freedom (s = 1 estimated parameter),
$\chi^2_{0.05,6}$ is 12.6.
• $\chi^2 > \chi^2_{0.05,6}$, so the null hypothesis is rejected!
Chi Square Test: exponential, equi.prob.
• Even $\chi^2_{0.01,6}$ = 16.8 would reject the null hypothesis (only 1% of the samples that really follow the
hypothesized distribution give a χ² value with 6 degrees of freedom
greater than 16.8)
• I.e. the exponential model with λ = 0.084 is rejected even at the 1% significance level!
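The decision rule can be written out as a small sketch. The critical values 12.59 and 16.81 are the standard chi-square table entries for 6 degrees of freedom at α = 0.05 and α = 0.01; they are hard-coded here as an assumption so that no statistics library is needed.

```python
chi2_statistic = 39.6  # test statistic for the 8 equi-probable intervals
degrees_of_freedom = 8 - 1 - 1  # k - s - 1 with one estimated parameter (lambda)

# Upper-tail critical values for 6 degrees of freedom, from a chi-square table.
critical_values = {0.05: 12.59, 0.01: 16.81}

decisions = {}
for alpha, threshold in critical_values.items():
    decisions[alpha] = chi2_statistic > threshold  # True means reject H0
    print(alpha, "reject H0" if decisions[alpha] else "do not reject H0")
```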