Stat 402B (Spring 2016): Slide Set 1
Stat 402B (Spring 2016)
Notes Set #1
Last update: January 18, 2016

Preliminaries
• Sir Ronald Fisher pioneered the use of mathematical statistics in DOE at
the Rothamsted Experimental Station, 1920-1940.
• Major ideas of DOE were developed at that time. Naturally, common
terms in DOE came from this background.
• Terms like treatment, plot, and block are some examples. When appropriate,
these continue to be used today.
• This does not really matter, as statisticians think in terms of linear
mathematical models, which apply whatever the field of application is.
• A field experiment to compare varieties of soybean or a lab experiment
run to compare operating temperatures in a catalytic cracker may use
the same model.

Preliminaries (continued)
Some differences exist between agricultural and industrial experimentation:
– Time span of an experiment: usually long for agricultural experiments,
requiring designs that allow more control, e.g., blocked designs.
– Cost of experimentation: industrial experiments are more expensive, so
they are run as smaller experiments in sequence, compared with the
complete experiments run in agriculture.
– Type of inference: estimation of effects is important in industry, rather
than hypothesis tests as in the use of ANOVA in agriculture.
The textbook takes these differences into consideration:
– distributional assumptions made about data from agricultural experiments
may not hold for industrial data
– it explains how concepts in DOE enable us to make inference from
industrial data even if the assumptions do not hold

Choices of Sample Size
Example: Single sample case
• A $(1-\alpha)100\%$ CI for $\mu$ is given by $\bar{y} \pm t_{\alpha/2,n-1}(s/\sqrt{n})$.
Thus the width of this CI is $W = 2(s/\sqrt{n})\,t_{\alpha/2,n-1}$.
• Could we use this relationship to choose a sufficiently large sample size
to have the required precision as specified by $L$? We require $W \le L$
for a fixed $\alpha$, i.e. $2(s/\sqrt{n})\,t_{\alpha/2,n-1} \le L$,
which gives us the inequality
$$n \ge \frac{4 s^2 t^2_{\alpha/2,n-1}}{L^2}$$
• We must use an estimate $s^2$ from a preliminary experiment, and then
collect additional observations if the sample size is not enough.
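The width formula and the sample-size inequality can be checked numerically. The following is a minimal Python sketch, not part of the original notes. The function names are hypothetical, and the t critical values are assumed to be copied from a standard t-table (they are not listed in the notes). The numbers $s^2 = .070987$, $s^2 \approx .08$ and $L = .30$ are those of the worked example in these notes.

```python
import math

# Upper 2.5% points of Student's t, copied from a standard t-table
# (assumption: not given in the notes). Key = degrees of freedom.
T_025 = {9: 2.262, 14: 2.145, 15: 2.131, 16: 2.120}

def ci_width(s2, n, t_crit):
    """Width W = 2 (s / sqrt(n)) t of the CI for the mean."""
    return 2.0 * math.sqrt(s2 / n) * t_crit

def rhs(s2, L, n):
    """Right-hand side 4 s^2 t^2 / L^2 of the sample-size inequality."""
    t_crit = T_025[n - 1]  # the critical value itself depends on n
    return 4.0 * s2 * t_crit ** 2 / L ** 2

def smallest_n(s2, L, candidates):
    """First candidate n with n >= RHS, or None if no candidate qualifies."""
    for n in sorted(candidates):
        if n >= rhs(s2, L, n):
            return n
    return None

# Width from a preliminary sample of 10 with s^2 = .070987
print(round(ci_width(0.070987, 10, T_025[9]), 3))

# Trial-and-error search with s^2 = .08 and L = .30
print(smallest_n(0.08, 0.30, [10, 15, 16, 17]))
```

Because the t quantile changes with $n$, the inequality cannot be solved in closed form, which is why the sketch searches over candidate values of $n$ rather than evaluating a single expression.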
Example
Suppose we want the mean of a certain measurement to be estimated within
±0.15 of the sample mean with 95% confidence. That is, we are specifying
that $L = 0.30$ when $\alpha = 0.05$ in the above formula.
• Draw a sample of size 10:
42.71 42.65 42.73 42.78 42.48 42.09 42.12 42.49 42.48 42.88
• $\bar{y} = 42.541$ and $s^2 = .070987$ give a 95% CI of $42.541 \pm .1905$,
i.e. $W = .381$. Thus $W > L$: the precision of our estimate does not meet
the specification.
• Compute a new sample size $n$ using the inequality
$$n \ge \frac{4 s^2 t^2_{\alpha/2,n-1}}{L^2} = \frac{4(.070987)(2.262)^2}{.3^2} = 16.14,$$
so take $n = 17$.
• Draw 17 − 10 = 7 more samples:
42.80 42.82 42.66 42.24 42.84 42.15 42.88
• From the total sample of 17 observations we have $\bar{y} = 42.576$ and
$s^2 = .075959$, which gives a 95% CI of $42.576 \pm .146$, i.e. $W = .292$,
so $W \le L$ is now satisfied.

If an estimate $s^2$ was available from a previous experiment, we could
use trial and error to arrive at a suitable value for $n$. Suppose $s^2 \approx .08$.
Then the inequality becomes
$$n \ge \frac{4 \times .08\, t^2_{.025,n-1}}{.3^2}$$
Now substituting values for $n$ in both sides of the inequality:
n     RHS
10    18.2
15    16.4
16    16.14
17    15.98
The inequality ($n$ must exceed the RHS) is satisfied when $n \ge 17$.

Choices of Sample Size (Continued)
Hypothesis Testing
We want to select a sample size that will allow us to control the type II
error rate (or equivalently the power of the test) for a specified type I error
rate, $\alpha$.
Recall:
$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
and that $1 - \beta$ is the power of the test for a specified alternative $H_a$. We
use the operating characteristic (OC) curve of a test for determining
the sample size required to have a certain power.
Example: Two-sample Problem
$H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \ne \mu_2$ ($\sigma^2$ unknown)
Define $\delta = \mu_1 - \mu_2$ and $d = |\mu_1 - \mu_2|/2\sigma = |\delta|/2\sigma$. The OC curve for the
two-sided test is below:
[Figure: OC curves for the two-sided test]
Note that the sample size value shown on the curve in this case is actually
$n^* = 2n - 1$.

Some properties of the test can be illustrated using the OC curve:
1 For fixed $\alpha$ and $n$, $\beta$ decreases as $d$ increases. Thus, the larger the
difference in the means, the more likely the test will detect it, a useful
property for a test to have.
2 For fixed $d$ and $\alpha$, $\beta$ decreases as the sample size $n$ increases. Thus, the
larger the sample size, the higher the power of the test $(1 - \beta)$ to detect
a given difference in the means.
Suppose that in the Portland cement mortar problem (see Table 2.1), if the
difference in the means is as much as 0.5 kgf/cm², we want to detect it with
type II error probability of 0.05 or less.
Compute
$$d = \frac{|\mu_1 - \mu_2|}{2\sigma} = \frac{0.5}{2\sigma} = \frac{0.25}{\sigma}$$
So if $\sigma = .25$ (from past data or the experimenter's educated guess),
$d = 1$ and from the curve the sample size needed to meet the criterion is
approx. 8.

Power of a test is defined as $1 - \beta$.
Requiring a low type II error probability is equivalent to the test having high
power. Example: we want to compare mean egg production for Diets A and B.
Say we want to test whether B is better:
$H_0: \mu_A = \mu_B$ vs $H_a: \mu_B > \mu_A$
• Set $\alpha = .05$. We want to have a power of .9 to detect a difference in mean
egg production of 12. Assume $\sigma^2 = 100$.
• Compute $d = |\mu_A - \mu_B|/2\sigma = 12/20 = .6$.
• From the chart for a one-sided test (see below), for $d = .6$ and $\beta = .1$,
we get $n^*$ approximately equal to 25.
• Hence $n^* = 2n - 1 = 25$, giving the required sample size of $n = 13$.
[Figure: OC curves for the one-sided test]
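As a rough cross-check on the chart-based answer for the egg production example, the following Python sketch uses a normal approximation that treats $\sigma$ as known. This is a swapped-in technique, not the OC-curve method of these notes, and the function names are hypothetical. It assumes equal group sizes and a one-sided test.

```python
import math
from statistics import NormalDist

def effect_size_d(delta, sigma):
    """d = |delta| / (2 sigma), the quantity read off the OC chart axis."""
    return abs(delta) / (2.0 * sigma)

def n_per_group_normal(delta, sigma, alpha, beta):
    """Per-group n for a one-sided two-sample test, sigma treated as known.

    Normal approximation (NOT the OC-chart method of the notes):
    n = 2 sigma^2 (z_alpha + z_beta)^2 / delta^2, rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)  # upper alpha point of N(0, 1)
    z_beta = z.inv_cdf(1.0 - beta)    # upper beta point of N(0, 1)
    return math.ceil(2.0 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2)

# Egg production example: difference 12, sigma^2 = 100, alpha = .05, power = .9
d = effect_size_d(12, 10)
n = n_per_group_normal(12, 10, alpha=0.05, beta=0.10)
print(d, n)
```

Under these assumptions the approximation gives a per-group size one smaller than the $n = 13$ read from the chart. A somewhat smaller value is expected, since the chart is based on the t-test with $\sigma$ estimated from the data.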
The Paired Comparison Design (Section 2.5)
Design 1: Completely Randomized Design
Take 20 specimens (say, same metal but scrap pieces cut from different
rods). Assign Tip 1 to 10 specimens and Tip 2 to the other 10 completely
randomly. Obtain $y_{ij}$, $i = 1, 2$; $j = 1, \ldots, 10$.
Two-sample t-statistic:
$$t = \frac{\bar{y}_1 - \bar{y}_2}{\mathrm{S.E.}(\bar{y}_1 - \bar{y}_2)} = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
The estimate of error $S_p^2$ measures the variability of the specimens. Thus
any difference due to the two tips may be harder to detect if $\mathrm{S.E.}(\bar{y}_1 - \bar{y}_2)$
is inflated because this variability is large.
Design 2: Paired Design (Section 2.5)
Take 10 metal specimens. Use each specimen to test both tips. Decide
which (Tip 1 or Tip 2) is used with each specimen first by tossing a coin.
The two observations taken from each specimen are paired. Since
the variability among specimens does not affect the difference in these pairs
of observations, the sample variance computed from the differences gives a
more accurate estimate of $\mathrm{S.E.}(\bar{y}_1 - \bar{y}_2)$.
• The two measurements made on the same specimen of metal form
a block. Hence the variability among specimens does not affect the
difference.
• Blocks are formed so that experimental units within blocks are relatively
more homogeneous than among blocks with respect to the response being
measured.
• Blocking represents a restriction on randomization.
• In the CRD case, the t-statistic has 18 d.f.; in the paired design it has
only 9 d.f. We have “lost” 9 d.f. in estimating the variance of the
difference in sample means. (Question: Is it better or worse to lose d.f.?)
• A “blocked experiment” is not always better. Have to compare designs
after experiments.
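The contrast between the two designs can be illustrated with a small Python sketch. The function names are hypothetical and the four pairs of hardness readings are made-up numbers chosen only for illustration, not data from these notes. The paired statistic used here is the standard one, $t = \bar{d}/(S_d/\sqrt{n})$ with $n - 1$ d.f., where $d_j = y_{1j} - y_{2j}$.

```python
import math
from statistics import mean, variance

def pooled_t(y1, y2):
    """Two-sample t statistic with pooled variance (CRD analysis)."""
    n1, n2 = len(y1), len(y2)
    sp2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return (mean(y1) - mean(y2)) / se, n1 + n2 - 2

def paired_t(y1, y2):
    """Paired t statistic computed from within-specimen differences."""
    d = [a - b for a, b in zip(y1, y2)]
    n = len(d)
    se = math.sqrt(variance(d) / n)
    return mean(d) / se, n - 1

# Made-up readings (NOT from the notes): specimens differ a lot from one
# another, but the two tips differ by a nearly constant amount on each.
tip1 = [7.0, 3.0, 5.0, 9.0]
tip2 = [6.0, 2.0, 4.0, 8.5]

t_crd, df_crd = pooled_t(tip1, tip2)
t_pair, df_pair = paired_t(tip1, tip2)
print(t_crd, df_crd)
print(t_pair, df_pair)
```

With these illustrative numbers the pooled standard error is dominated by specimen-to-specimen variability, so the pooled statistic is small even though it has more d.f., while the paired statistic is large. If specimens were nearly homogeneous, the extra d.f. of the CRD analysis could instead make it the better choice.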