Stat 402B (Spring 2016): Notes Set #2
Introduction to the Design of Experiments
Slide set 2
Last update: January 10, 2016

Planning of an Experiment

I The Experiment
(a) Statement of the problem to be solved.
(b) The response or dependent variable to be studied. How will it be
measured?
(c) Factors to be varied and the levels of each factor. How are they chosen?

II The Statistical Design
(a) How many observations should be taken (the size of the experiment)?
(b) Order of experimentation.
(c) How should the randomization be carried out?
(e) What mathematical model or models are most meaningful for the
experiment? What are the assumptions involved?

III The Analysis
(a) Data collection process.
(b) Computation of various descriptive statistics and of various test
statistics.
(c) Computation of diagnostic statistics and other methods, such as
graphical plots of residuals, predicted values, and normal plots, to
determine the adequacy of the model.
(d) Interpretation of results.

Example

1. The Experiment
(a) A chemical engineer wants to investigate several catalysts in the hope
of improving the yield of a petrochemical in an oil refinery.
(b) Crude oil is fed into the plant, which is charged with the catalyst. The
product is extracted from the liquid that comes out, and the response
measured is the percentage of 'feedstock' converted into the product.
(c) Several plant runs are made using each of 4 catalysts.

2. The Design
(a) Make 5 runs using each catalyst.
(b) A barrel of crude oil is divided into 4 portions, and one portion is
used with each catalyst. The portion used with each catalyst is
chosen randomly, and the runs are made in random order.
(c) The above procedure is repeated for 5 barrels of crude. The design
is a randomized complete block design with barrels as blocks. Thus a
total of 20 runs are made in the experiment.
(d) The model:
$$y_{ij} = \mu_{ij} + \epsilon_{ij}; \quad i = 1, \ldots, 4; \quad j = 1, \ldots, 5;$$
where
$y_{ij}$ = observed yield from the run using the $i$-th catalyst
on the crude from the $j$-th barrel,
$\mu_{ij}$ = mean (expected) yield from this run,
$\epsilon_{ij}$ = random error or noise with mean 0 and variance $\sigma^2$.
The $\mu_{ij}$ may be partitioned as $\mu_{ij} = \mu + \alpha_i + \beta_j$ for further analysis,
where $\alpha_i$ is the effect of the $i$-th catalyst and $\beta_j$ is the $j$-th block effect.

3. The Analysis
(a) Compute the analysis of variance and estimate all parameters (the
$\mu_{ij}$'s and $\sigma^2$).
(b) Test hypotheses about pre-planned comparisons among the yield means,
or obtain confidence intervals.
(c) Use multiple comparisons to compare means if necessary.
(d) Was blocking necessary? Were there missing data, and how were these
dealt with?

Some Terminology, Definitions and Basic Concepts

Treatments: Things that are being compared. These may be fertilizer
levels, varieties, machines, methods, etc.
Experimental Units: The thing to which a treatment is applied in a single
trial of the experiment.
Observation: The measurement made on the experimental unit, also called
the response.
Replications: Independent applications of a treatment to experimental units.
Experimental Error: Variation among replicated observations.

Linear Model
Suppose $y_1, y_2, \ldots, y_n$ are the $n$ observations obtained from $n$
replications of a treatment.
$$y_i = \mu + \epsilon_i, \quad i = 1, \ldots, n$$
where $E(y_i) = \mu$ is the expected mean (fixed or constant), and
$\epsilon_i$ is random with $E(\epsilon_i) = 0$, $Var(\epsilon_i) = \sigma^2$.
$\sigma^2$ measures the experimental error and is called the "Error Variance". It is
estimated by the sample variance
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}$$
if the data are a random sample.

Example: Consider testing equality of means in a two-sample experiment:
$$H_0: \mu_1 = \mu_2 \quad \text{vs} \quad H_a: \mu_1 \neq \mu_2$$
$$t_c = \frac{\bar{y}_{1.} - \bar{y}_{2.}}{\mathrm{S.E.}(\bar{y}_{1.} - \bar{y}_{2.})} = \frac{\bar{y}_{1.} - \bar{y}_{2.}}{s\sqrt{2/n}}$$
If $n$ is fixed, a smaller $s^2$ leads to a larger $|t_c|$, which makes it more
likely that $H_0$ is rejected.

Suppose instead that we had a paired design. Let
$$d_j = y_{1j} - y_{2j}, \quad j = 1, 2, \ldots, n.$$
For the paired design, the differences satisfy
$$d_1, d_2, \ldots, d_n \sim N(\mu_d, \sigma_d^2), \quad \bar{d} = \bar{y}_{1.} - \bar{y}_{2.}, \quad \mathrm{S.E.}(\bar{d}) = \frac{s_d}{\sqrt{n}}.$$
If the experimental units in each pair are homogeneous, then one would
expect $\mathrm{S.E.}(\bar{d}) = s_d/\sqrt{n}$ to be much less than
$\mathrm{S.E.}(\bar{y}_{1.} - \bar{y}_{2.}) = s\sqrt{2/n}$ obtained from the unpaired experiment.
Recall that for the paired design
$$t_c = \frac{\bar{d}}{s_d/\sqrt{n}},$$
and thus $H_0$ is more likely to be rejected.

Experimental Error
(i) Variation among experimental units
(ii) Lack of control of experimental procedure
    (a) Application of treatments
    (b) Measurement of observations
    (c) Experimental technique

Experimental error is important because we compare differences in treatment
means to the standard error of the difference (which depends on the estimate
of experimental error) to determine whether the difference is significant.
Suppose we wish to compare the expected means of 2 treatments:
$$y_{11}, y_{12}, \ldots, y_{1n} \sim N(\mu_1, \sigma^2)$$
$$y_{21}, y_{22}, \ldots, y_{2n} \sim N(\mu_2, \sigma^2)$$
When we construct CIs for $\mu_1 - \mu_2$ or test $H_0: \mu_1 = \mu_2$, we use
$$\mathrm{S.E.}(\bar{y}_{1.} - \bar{y}_{2.}) = \hat{\sigma}\sqrt{\frac{2}{n}}.$$
When this is smaller, it is easier to detect a difference between $\mu_1$ and $\mu_2$.

Replications
• An independent application of a treatment to an experimental unit; an
experimental run repeated under the same experimental conditions.
• Why is replication required in experiments? To determine whether
treatment differences are significant, we need to compare the differences
with the experimental error variance.
• Experimental error variance is estimated from the sample variance of
observations obtained from experimental runs repeated under the same
treatment, i.e., replications.
• It is usually assumed that observations of experimental runs are random
samples from normal distributions.

Randomization
• The allocation of treatments to the experimental units randomly also
ensures that any inherent sources of variation that the experimenter is not
aware of do not systematically bias the response to the treatment.

Example
(1) Assign a number 1 through n to each experimental unit.
(2) Randomly select a permutation of the numbers 1 through n.
(3) Assign treatment 1 to the n1 experimental units corresponding to the
first n1 numbers in (2), and treatment 2 to the units corresponding to the
remaining n − n1 numbers.
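The three-step randomization in the Example for two treatments (number the units, permute the numbers, split the permutation) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `randomize_two_treatments`, the `seed` argument, and the values n = 10 and n1 = 5 are assumptions made here and do not come from the notes.

```python
import random


def randomize_two_treatments(n, n1, seed=None):
    """Randomly allocate units 1..n: n1 units to treatment 1, the rest to treatment 2."""
    if not 0 <= n1 <= n:
        raise ValueError("n1 must be between 0 and n")
    rng = random.Random(seed)          # seed is optional, for reproducibility only
    units = list(range(1, n + 1))      # step (1): number the experimental units
    rng.shuffle(units)                 # step (2): random permutation of 1..n
    return {                           # step (3): split the permutation
        "treatment 1": sorted(units[:n1]),
        "treatment 2": sorted(units[n1:]),
    }


# Hypothetical illustration: 10 units, 5 per treatment.
assignment = randomize_two_treatments(n=10, n1=5, seed=402)
print(assignment)
```

Every unit appears in exactly one treatment group, and each possible split into groups of size n1 and n − n1 is equally likely under the shuffle.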
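For the catalyst example, randomization is restricted to each block: within every barrel, the 4 portions are assigned to catalysts at random and the runs are made in random order. A minimal sketch of that within-block randomization follows, assuming hypothetical catalyst labels "A" to "D" and a hypothetical helper name `randomize_rcbd`; neither appears in the notes.

```python
import random


def randomize_rcbd(catalysts, n_barrels, seed=None):
    """Build a run plan for a randomized complete block design (barrels as blocks).

    Returns a list of (barrel, run, portion, catalyst) tuples.
    """
    rng = random.Random(seed)
    k = len(catalysts)
    plan = []
    for barrel in range(1, n_barrels + 1):
        # Randomly decide which catalyst is used with each portion of this barrel.
        portion_to_catalyst = list(catalysts)
        rng.shuffle(portion_to_catalyst)
        # Independently randomize the order in which the portions are run.
        run_order = list(range(1, k + 1))
        rng.shuffle(run_order)
        for run, portion in enumerate(run_order, start=1):
            plan.append((barrel, run, portion, portion_to_catalyst[portion - 1]))
    return plan


# Hypothetical labels for the 4 catalysts; 5 barrels give 20 runs in total.
plan = randomize_rcbd(["A", "B", "C", "D"], n_barrels=5, seed=2016)
for barrel, run, portion, catalyst in plan:
    print(f"barrel {barrel}, run {run}: portion {portion} with catalyst {catalyst}")
```

Because each barrel is shuffled separately, every catalyst is used exactly once per barrel, which is what makes the barrels complete blocks.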