Week 9-25-06 and some preparation for exam 2.

NORMAL DISTRIBUTION
BERNOULLI TRIALS
BINOMIAL DISTRIBUTION
POISSON DISTRIBUTION
[Figure: normal curve. Note the point of inflexion. Note the balance point.]
[Figure: normal curve with MEAN = 100 and SD = 15; the point of inflexion is marked.]
[Figures: two more normal curves, one labeled 5 and 50, the other labeled 6.3 and 39.7.]
Illustrated for the standard normal (mean = 0, SD = 1):
~68% of the area lies within 1 SD of the mean.
Illustrated for the standard normal (mean = 0, SD = 1):
~95% of the area lies within 2 SD of the mean.
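The ~68% and ~95% figures can be checked numerically. This is a minimal sketch, assuming the standard normal CDF is computed from Python's math.erf; the helper name phi is only illustrative.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Area within 1 SD and within 2 SD of the mean.
within_1_sd = phi(1.0) - phi(-1.0)
within_2_sd = phi(2.0) - phi(-2.0)

print(round(within_1_sd, 4))
print(round(within_2_sd, 4))
```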
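[Figure: IQ curve with mean 100 and SD 15. About 68/2 = 34% of the area lies between 85 and 100; about 95/2 = 47.5% lies between 100 and 130.]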
[Figure: the IQ scale (mean 100, SD 15) set against the standard normal Z scale (mean 0, SD 1).]
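Moving from the IQ scale to the Z scale is one subtraction and one division. The sketch below assumes the usual standardization z = (x - mean)/SD; the names z_score and phi are illustrative, not from the slides.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_score(x, mean=100.0, sd=15.0):
    """Convert an IQ score to the standard normal (Z) scale."""
    return (x - mean) / sd

# IQ 85 is 1 SD below the mean; IQ 130 is 2 SD above it.
p_85_to_100 = phi(z_score(100)) - phi(z_score(85))
p_100_to_130 = phi(z_score(130)) - phi(z_score(100))

print(z_score(85), z_score(130))
print(round(p_85_to_100, 3), round(p_100_to_130, 3))
```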
P(Z > 0) = P(Z < 0 ) = 0.5
P(Z > 2.66) = 0.5 - P(0 < Z < 2.66)
= 0.5 - 0.4961 = 0.0039
P(Z < 1.92) = 0.5 + P(0 < Z < 1.92)
= 0.5 + 0.4726 = 0.9726
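The table lookups above can be reproduced without a printed table. A minimal sketch, assuming the CDF is built from math.erf (the helper phi is illustrative); the table entry P(0 < Z < z) is phi(z) - 0.5.

```python
import math

def phi(z):
    """Standard normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def table_entry(z):
    """Area between 0 and z, as printed in a 'zero to z' normal table."""
    return phi(z) - 0.5

p_upper_tail = 0.5 - table_entry(2.66)   # P(Z > 2.66)
p_below = 0.5 + table_entry(1.92)        # P(Z < 1.92)

print(round(p_upper_tail, 4))
print(round(p_below, 4))
```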
Bernoulli trial (1 denotes “success”, 0 denotes “failure”)
x       p(x)
1       p
0       q
total   1
0 < p < 1, q = 1 - p
P(success) = P(X = 1) = p
P(failure) = P(X = 0) = q
e.g. X = “sample voter is Democrat”
Population has 48% Dem.
p = 0.48, q = 0.52
P(X = 1) = 0.48
P(S1 S2 F3 F4 F5 F6 S7) = p^3 q^4
just write P(SSFFFFS) = p^3 q^4
“the answer only depends upon how
many of each, not their order.”
e.g. 48% Dem, 5 sampled, with-repl:
P(Dem Rep Dem Dem Rep) = 0.48^3 0.52^2
e.g. P(exactly 2 Dems out of sample of 4)
= P(DDRR) + P(DRDR) + P(DRRD)
+ P(RDDR) + P(RDRD) + P(RRDD)
= 6 × 0.48^2 × 0.52^2 ~ 0.374.
There are 6 ways to arrange 2D 2R.
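The count of 6 and the total of about 0.374 can be confirmed by brute force. This sketch lists every ordered sample of 4 voters and keeps those with exactly 2 Democrats; the variable names are illustrative.

```python
from itertools import product

p, q = 0.48, 0.52  # P(Dem) and P(Rep) from the example

# All 2^4 ordered samples, filtered to those with exactly 2 D's.
strings = ["".join(s) for s in product("DR", repeat=4) if s.count("D") == 2]

# Every such string has the same probability p^2 q^2; add them up.
total = sum(p ** s.count("D") * q ** s.count("R") for s in strings)

print(strings)
print(len(strings), round(total, 3))
```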
e.g. P(exactly 3 Dems out of sample of 5)
= P(DDDRR) + P(DDRDR) + P(DDRRD)
+ P(DRDDR) + P(DRDRD) + P(DRRDD)
+ P(RDDDR) + P(RDDRD) + P(RDRDD)
+ P(RRDDD) = 10 × 0.48^3 × 0.52^2 ~ 0.299.
There are 10 ways to arrange 3D 2R.
Same as the number of ways to select 3 from 5.
There are 5! ways to arrange 5 things in a line.
Build each arrangement in steps (this matches arrangements one-to-one):
select 3 of the 5 to go first in line,
arrange those 3 at the head of the line,
then arrange the remaining 2 after them.
5! = (ways to select 3 from 5) × 3! × 2!
So the number of ways must be 5!/(3! 2!) = 10.
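The counting argument can be checked three ways: by the factorial formula, by Python's built-in math.comb, and by listing the selections outright. A small sketch with illustrative names:

```python
import math
from itertools import combinations

# Formula from the counting argument: 5! / (3! 2!).
by_formula = math.factorial(5) // (math.factorial(3) * math.factorial(2))

# Built-in "n choose k".
by_comb = math.comb(5, 3)

# Explicit list of which 3 of the 5 positions are selected.
selections = list(combinations(range(5), 3))

print(by_formula, by_comb, len(selections))
```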
Let random variable X denote the number of
“S” in n independent Bernoulli p-Trials.
By definition, X has a Binomial Distribution
and for each of x = 0, 1, 2, …, n
P(X = x) = (n!/(x! (n-x)!)) p^x q^(n-x)
e.g. P(44 Dems in sample of 100 voters) =
(100!/(44! 56!)) 0.48^44 0.52^(100-44) = 0.05812.
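The binomial formula translates directly into code. A minimal sketch, with binomial_pmf as an illustrative name, that reproduces the 44-of-100 example:

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): (n choose x) p^x q^(n-x)."""
    q = 1.0 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

prob_44 = binomial_pmf(44, 100, 0.48)

# The probabilities over x = 0..n should add to 1.
total = sum(binomial_pmf(x, 100, 0.48) for x in range(101))

print(round(prob_44, 5))
print(round(total, 6))
```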
n!/(x! (n-x)!) is the count of how many
arrangements there are of a string of x letters
“S” and n-x letters “F.”
p^x q^(n-x) is the shared probability of each string
of x letters “S” and n-x letters “F.”
(define 0! = 1, p^0 = q^0 = 1 and the formula goes
through for every one of x = 0 through n)
The binomial coefficient symbol \binom{n}{x} is short for the arrangement count:
\binom{n}{x} = n!/(x! (n-x)!)
n = 10, p = 0.4
mean = n p = 4
sd = root(n p q)
~ 1.55
n = 30, p = 0.4
mean = n p = 12
sd = root(n p q)
~ 2.683
n = 100, p = 0.4
mean = n p = 40
sd = root(n p q)
~ 4.89898
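The three cases use the same two formulas, mean = n p and sd = root(n p q). A short sketch (the function name is illustrative) that recomputes them:

```python
import math

def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*q) of a Binomial(n, p)."""
    q = 1.0 - p
    return n * p, math.sqrt(n * p * q)

results = {n: binomial_mean_sd(n, 0.4) for n in (10, 30, 100)}

for n, (mean, sd) in results.items():
    print(n, round(mean, 3), round(sd, 5))
```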
Poisson: p(x) = e^(-mean) mean^x / x!
for x = 0, 1, 2, ... ad infinitum
e.g. X = number of times the ace
of spades turns up in 104 tries
(104 tries × 1/52 per try = 2 expected)
X ~ Poisson with mean 2
p(x) = e^(-mean) mean^x / x!
e.g. p(3) = e^(-2) 2^3 / 3! ~ 0.18
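The Poisson formula is a one-liner in code. A minimal sketch, with poisson_pmf as an illustrative name, evaluated for the ace-of-spades example:

```python
import math

def poisson_pmf(x, mean):
    """P(X = x) for X ~ Poisson(mean): e^(-mean) mean^x / x!."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

p3 = poisson_pmf(3, 2.0)

# Probabilities for x = 0, 1, ..., 5 when the mean is 2.
first_few = [round(poisson_pmf(x, 2.0), 3) for x in range(6)]

print(round(p3, 2))
print(first_few)
```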
e.g. X = number of raisins in MY
cookie. Batter has 400 raisins
and makes 144 cookies.
E X = 400/144 ~ 2.78 per cookie
p(x) = e^(-mean) mean^x / x!
e.g. p(2) = e^(-2.78) 2.78^2 / 2! ~ 0.24
(around 24% of cookies have 2 raisins)
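One way to see why the Poisson fits the raisin example is to compare it with a binomial count. The sketch below assumes, purely for illustration, that each of the 400 raisins lands in MY cookie independently with chance 1/144; that modeling assumption is not stated on the slide.

```python
import math

raisins, cookies = 400, 144
mean = raisins / cookies  # about 2.78 raisins per cookie

# Binomial: 400 independent raisins, each in my cookie with chance 1/144.
chance = 1 / cookies
binom_p2 = math.comb(raisins, 2) * chance ** 2 * (1 - chance) ** (raisins - 2)

# Poisson with the same mean.
poisson_p2 = math.exp(-mean) * mean ** 2 / math.factorial(2)

print(round(binom_p2, 4), round(poisson_p2, 4))
```

Under that assumption the two answers agree to about two decimal places, both near 0.24.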
THE FIRST BEST THING
ABOUT THE POISSON IS
THAT THE MEAN ALONE
TELLS US THE ENTIRE
DISTRIBUTION!
note: Poisson sd = root(mean)
E X = 400/144 ~ 2.78 raisins per cookie
sd = root(mean) = 1.67
(for Poisson)
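The relation sd = root(mean) can be checked straight from the Poisson probabilities. This sketch sums x p(x) and (x - mean)^2 p(x) over x = 0 to 59, a cutoff chosen here only because the remaining tail is negligible for a mean near 2.78.

```python
import math

mean = 400 / 144

# Build p(0), p(1), ... with the recurrence p(x) = p(x-1) * mean / x.
probs = [math.exp(-mean)]
for x in range(1, 60):
    probs.append(probs[-1] * mean / x)

expected = sum(x * p for x, p in enumerate(probs))
variance = sum((x - expected) ** 2 * p for x, p in enumerate(probs))
sd = math.sqrt(variance)

print(round(expected, 2), round(sd, 2), round(math.sqrt(mean), 2))
```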
THE SECOND BEST THING ABOUT
THE POISSON IS THAT FOR A MEAN
AS SMALL AS 3 THE NORMAL
APPROXIMATION WORKS WELL.
[Figure: Poisson with mean 2.78; sd = root(mean) = 1.67, a relation special to the Poisson.]
E X = 127.8 accidents
If Poisson then sd = root(127.8) =
11.3049 and the approx dist is:
[Figure: approximating normal curve with mean 127.8 accidents and sd = root(mean) = 11.3 (special to Poisson).]
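How close the normal approximation comes can be checked for the accident example. In this sketch the range 117 to 139 (roughly the mean plus or minus one sd) and the half-unit continuity correction are choices made here for illustration; neither is from the slides.

```python
import math

mean = 127.8
sd = math.sqrt(mean)

def phi(z):
    """Standard normal CDF, P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def poisson_cdf(k, mean):
    """P(X <= k), summing p(0..k) with p(x) = p(x-1) * mean / x."""
    term = math.exp(-mean)
    total = term
    for x in range(1, k + 1):
        term *= mean / x
        total += term
    return total

low, high = 117, 139  # illustrative range of accident counts
exact = poisson_cdf(high, mean) - poisson_cdf(low - 1, mean)
approx = phi((high + 0.5 - mean) / sd) - phi((low - 0.5 - mean) / sd)

print(round(sd, 4))
print(round(exact, 3), round(approx, 3))
```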
The overwhelming majority of
samples of n from a population of
N can stand-in for the population.
Caveats: sample size n must be “large,”
and this works for only a few characteristics at a
time, such as profit, sales, dividend.
SPECTACULAR FAILURES MAY OCCUR!
With-replacement
vs without replacement.
This sample is obviously
“not representative.”
Rule of thumb: With and without
replacement are about the same if
root [(N-n) /(N-1)] ~ 1.
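The rule of thumb is easy to try on numbers. In this sketch the function name is illustrative, and the case N = 100, n = 50 is a made-up contrast; the case N = 500 million, n = 600 comes from a later slide.

```python
import math

def replacement_factor(N, n):
    """The factor root[(N - n)/(N - 1)] from the rule of thumb."""
    return math.sqrt((N - n) / (N - 1))

# Tiny sample from a huge population: factor is essentially 1.
huge = replacement_factor(500_000_000, 600)

# Hypothetical case where half the population is sampled: factor well below 1.
half = replacement_factor(100, 50)

print(round(huge, 7), round(half, 4))
```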
They would have you believe
the population is {8, 9, 12, 42}
and the sample is {42}.
A SET is a collection of distinct entities.
IF THE OVERWHELMING
MAJORITY OF SAMPLES
ARE “GOOD SAMPLES”
THEN WE CAN OBTAIN A
“GOOD” SAMPLE BY
RANDOM SELECTION.
Digits are made to correspond to letters.
a = 00-02 b = 03-05 …. z = 75-77
Random digits then give random letters.
1559 9068 … (Table 14, pg. 809)
15 59 90 68 etc… (split into pairs)
f t * w etc… (take chosen letters; * marks a pair, here 90, that matches no letter)
For samples without replacement just
pass over any duplicates.
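The digit-to-letter scheme can be sketched in a few lines. The function name and its without_replacement flag are illustrative; the mapping assumes each letter gets three consecutive two-digit values (00-02 for a through 75-77 for z), so pairs 78-99 give no letter.

```python
import string

def pairs_to_letters(digits, without_replacement=False):
    """Map two-digit pairs to letters: 00-02 -> a, 03-05 -> b, ..., 75-77 -> z."""
    digits = digits.replace(" ", "")
    letters = []
    for i in range(0, len(digits) - 1, 2):
        value = int(digits[i:i + 2])
        if value > 77:
            continue  # no letter for 78-99 (the "*" case)
        letter = string.ascii_lowercase[value // 3]
        if without_replacement and letter in letters:
            continue  # pass over duplicates
        letters.append(letter)
    return letters

sample = pairs_to_letters("1559 9068")
print(sample)
```

Real use would feed in digits from a random number table or generator; the four-pair string here is just the one quoted from the table.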
The Great Trick is far more powerful than we
have seen.
A typical sample closely estimates such things
as a population mean or the shape of a
population density.
But it goes beyond this to reveal how much
variation there is among sample means and
sample densities.
A typical sample not only estimates
population quantities.
It estimates the sample-to-sample
variations of its own estimates.
The average account balance is $421.34 for a
random with-replacement sample of 50
accounts.
We estimate from this sample that the average
balance is $421.34 for all accounts.
From this sample we also estimate
and display a “margin of error”:
$421.34 +/- $65.22.
NOTE: Sample standard deviation s
may be calculated in several equivalent ways,
some sensitive to rounding errors, even for n = 2.
The following margin of error calculation for
n = 4 is only an illustration. A sample of four
would not be regarded as large enough.
Profits per sale = {12.2, 15.3, 16.2, 12.8}.
Mean = 14.125, s = 1.92765, root(4) = 2.
Margin of error = +/- 1.96 (1.92765 / 2)
Report: 14.125 +/- 1.8891.
A precise interpretation of margin of error will be given later in the course,
including the role of 1.96. The interval 14.125 +/- 1.8891 is called a “95%
confidence interval for the population mean.”
We used: (12.2-14.125)^2 + (15.3-14.125)^2
+ (16.2-14.125)^2 + (12.8-14.125)^2 = 11.1475,
so s = root(11.1475 / (4-1)) = 1.92765.
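The n = 4 illustration can be recomputed step by step. A minimal sketch using Python's statistics module, whose stdev divides by n - 1 as in the calculation above; the variable names are illustrative.

```python
import math
import statistics

profits = [12.2, 15.3, 16.2, 12.8]
n = len(profits)

mean = statistics.mean(profits)
sum_sq = sum((x - mean) ** 2 for x in profits)   # sum of squared deviations
s = statistics.stdev(profits)                    # sample sd, divides by n - 1
margin = 1.96 * s / math.sqrt(n)

print(round(mean, 3), round(sum_sq, 4), round(s, 5))
print(round(mean - margin, 4), round(mean + margin, 4))
```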
A random with-replacement sample of 50
stores participated in a test marketing. In 39
of these 50 stores (i.e. 78%) the new package
design outsold the old package design.
We estimate from this sample that 78% of all
stores will sell more of new vs old.
We also estimate a “margin of error” of +/- 11.5%.
Figured:
1.96 root(pHAT qHAT)/root(n)
= 1.96 root(0.78 × 0.22)/root(50)
= 0.114823 in Binomial setup
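The same margin of error for a proportion, as a short sketch with illustrative variable names:

```python
import math

n = 50
successes = 39  # stores where the new package outsold the old

p_hat = successes / n
q_hat = 1.0 - p_hat
margin = 1.96 * math.sqrt(p_hat * q_hat) / math.sqrt(n)

print(round(p_hat, 2), round(margin, 6))
print(round(p_hat - margin, 3), round(p_hat + margin, 3))
```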
A sample of only n = 600 from a
population of N = 500 million.
(FINE resolution)