Topic (14) – COMPARING TWO POPULATIONS (OR TREATMENTS) EXAMPLE

A) Two Population Proportions Using
Independent Samples
EXAMPLE: The article “Foraging Behavior in the
Indian False Vampire Bat” reported that 36 of 193
female bats in flight spent more than 5 minutes in the
air before locating food. For male bats, 64 of 168
exceeded 5 minutes when locating food. Is there
sufficient evidence to indicate that the proportion of
flights taking longer than 5 minutes differs for the
two sexes?
Note: we have two independent samples, and the
interest is in comparing the proportions for the two
sexes.
Notation:
Population   Population proportion   Sample size   Sample proportion
1            π1                      n1            p1
2            π2                      n2            p2
To compare 2 population proportions based on 2
independent samples we shall consider the size of the
difference π 1 − π 2 :
π1 − π 2 = 0 ⇒ π1 = π 2
π1 − π 2 > 0 ⇒ π1 > π 2
π1 − π 2 < 0 ⇒ π1 < π 2
Our sampling estimator of this difference is the
difference in the sample proportions p1 − p 2 when
the two samples are independent of one another.
Sampling Distribution of p1 − p 2 when the two
samples are independently and randomly taken:
1) the mean of the distribution is
$$\mu_{p_1 - p_2} = \pi_1 - \pi_2$$
(that is, p1 − p2 is unbiased)
2) the standard deviation of the distribution is
$$\sigma_{p_1 - p_2} = \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}$$
3) the shape of the distribution is approximately
normal (a bell curve) if both n1 and n2 are large.
The sample sizes are large enough to invoke the CLT
if
both 1) n1p1 ≥ 10 and n1(1 − p1 ) ≥ 10
and 2) n 2 p2 ≥ 10 and n 2 (1 − p 2 ) ≥ 10 .
So, if p1 − p2 is at least approximately normally
distributed, we get that
$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\text{estimate of } \sigma_{p_1 - p_2}}$$
has an approximate standard normal distribution, i.e.
z is approximately N(0,1).
As we’ll see, the estimator of
$$\sigma_{p_1 - p_2} = \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}$$
depends on whether we are constructing a confidence interval or
performing a test of the difference in the two
population proportions.
Let’s look at hypothesis testing first.
LARGE SAMPLE HYPOTHESIS TEST OF THE
DIFFERENCE OF TWO POPULATION
PROPORTIONS BASED ON TWO
INDEPENDENT SAMPLES:
Null hypothesis:
H0: π1 − π2 = 0
Alternative Hypothesis is one of three:
a) HA: π1 − π2 > 0
b) HA: π1 − π2 < 0
c) HA: π1 − π2 ≠ 0
Test Statistic:
$$z = \frac{p_1 - p_2}{\sqrt{p_C(1 - p_C)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$p_C = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{\text{total \# successes in both samples}}{\text{total sample size}}$$
P-value: depends on the alternative hypothesis:
a) P-value = Pr( Z > z)
b) P-value = Pr( Z < z)
c) P-value = 2 Pr( Z < −|z| )
Decision Rule: reject Ho if P-value ≤ α
Assumptions:
1. n1 and n2 are large enough for the sample
proportions to be approximately normally distributed
2. the sampling was random and not more than 5%
of the population.
3. the two samples are independent
EXAMPLE Bats:
Sample Statistics:
Population    Sample size   # Successes   Sample proportion
1 = female    n1 = 193      36            p1 = 36/193 = .1865
2 = male      n2 = 168      64            p2 = 64/168 = .3810
Hypotheses:
Ho: π1 − π2 = 0
HA: π1 − π2 ≠ 0
Significance level:
let’s use α = 0.05
Assumptions: the conditions
n1 p1 ≥ 10, n1 (1 − p1) ≥ 10
n2 p2 ≥ 10, n2 (1 − p2) ≥ 10
have been met (the smallest of the four counts is n1 p1 = 36). And we have 2 random samples.
Test Statistic: first we need to calculate the common
proportion
$$p_C = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{36 + 64}{193 + 168} = .277$$
Then,
$$z = \frac{p_1 - p_2}{\sqrt{p_C(1 - p_C)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.1865 - .3810}{\sqrt{.277(1 - .277)\left(\frac{1}{193} + \frac{1}{168}\right)}} = -4.12$$
P-value: = 2 Pr(Z< -|z|) = 2 Pr(Z<-4.12) <0.0001 ≈ 0+
Alternative Decision Rule: Reject H0 if the test
statistic meets the condition |z| > z*(1-α/2)
|z| = |-4.12| = 4.12 >>> z*(1-α/2) = z*(0.975) = 1.96
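The hand calculation for the bats can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `two_proportion_z_test` and its `alternative` argument are made up for this illustration and do not come from any statistics package.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Large-sample z test of H0: pi1 - pi2 = 0 using the combined proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                     # combined proportion pC
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))  # estimated sd of p1 - p2 under H0
    z = (p1 - p2) / se
    std_normal = NormalDist()                       # N(0, 1)
    if alternative == "greater":                    # HA: pi1 - pi2 > 0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":                     # HA: pi1 - pi2 < 0
        p_value = std_normal.cdf(z)
    else:                                           # HA: pi1 - pi2 != 0
        p_value = 2 * std_normal.cdf(-abs(z))
    return z, p_value


# Bats: 36 of 193 female flights and 64 of 168 male flights exceeded 5 minutes.
z, p_value = two_proportion_z_test(36, 193, 64, 168)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.6f}")
```

The printed z should agree with the −4.12 found by hand, and the P-value should fall below 0.0001.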
Conclusions: We reject the null hypothesis since the P-value < 0.0001 <<<< α = 0.05. There is strong
evidence, based on these samples, that the population
proportion of female false vampire bats taking longer
than 5 minutes before locating food is different from
the proportion for male bats doing the same.
Now, we would like to estimate the size of the
difference between the two proportions. That’s done
with a confidence interval.
LARGE SAMPLE CONFIDENCE INTERVAL
ESTIMATION OF THE DIFFERENCE OF TWO
PROPORTIONS BASED ON INDEPENDENT
SAMPLES:
Interval Estimator:
$$(p_1 - p_2) \pm (z \text{ critical value}) \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
where the z critical value is based on the confidence
level desired
Assumptions:
1. n1 and n2 are large enough for the sample
proportions to be approximately normally distributed
2. the sampling was random and not more than 5%
of the population.
3. the two samples are independently taken
Note that the estimator of the standard deviation of
p1 − p 2 is different than the one used in hypothesis
testing!
EXAMPLE For the bats, let’s use a 90% C.I. to
estimate the difference between females and males in the
proportion of flights taking longer than 5 minutes to locate food.
From topic 11, the z critical value for 90% is 1.645.
So, a 90% C.I. is
$$\begin{aligned}
(p_1 - p_2) &\pm 1.645 \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} \\
&= (.187 - .381) \pm 1.645 \sqrt{\frac{.187(1 - .187)}{193} + \frac{.381(1 - .381)}{168}} \\
&= -.194 \pm 1.645(.0468) \\
&= -.194 \pm .077 = (-.271,\ -.117)
\end{aligned}$$
Hence, with 90% confidence, the population
proportion of female false vampire bats that spend
more than 5 minutes locating food is between 11.7
and 27.1 percentage points lower than the population
proportion of male bats that spend more than 5
minutes locating food.
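As a check on the interval for the bats, here is a minimal Python sketch of the large-sample interval estimator. The helper name `two_proportion_ci` is hypothetical; note that it uses the unpooled standard deviation estimate, not the pC version used for testing.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.90):
    """Large-sample confidence interval for pi1 - pi2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z critical value, e.g. about 1.645 for 90% confidence
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Bats: population 1 = female (36 of 193), population 2 = male (64 of 168).
lower, upper = two_proportion_ci(36, 193, 64, 168, confidence=0.90)
print(f"90% C.I.: ({lower:.3f}, {upper:.3f})")
```

Both endpoints are negative, which matches the earlier rejection of π1 − π2 = 0.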
EXAMPLE Old Faithful, the geyser at Yellowstone
National Park, is known to have two distinct types of
eruptions: long-duration (> 3 minutes) and short
duration (< 3 min). If the types of eruptions are
equally likely at all times of the day, then the
proportion of long duration eruptions occurring
during the day should be the same as the proportion
at night. A geologist hypothesized that the length of
duration was affected by solar heating during the day
and hence, the proportion of daytime long duration
eruptions should be higher than the nighttime
proportion. Two samples were taken in August using
randomly selected dates. The geologist observed 53%
long duration eruptions during the day (out of 35
eruptions) and 49% (out of 41 eruptions) at night. Is
there sufficient evidence to support the scientist’s
claim? Use a significance level of 0.025. Let day
eruptions be population #1 and night, #2.
Hypotheses:
Ho: π1 − π2 = 0
HA: π1 − π2 > 0
Significance level:
α = 0.025
Assumptions: the conditions
n1 p1 ≥ 10, n1 (1 − p1) ≥ 10
n2 p2 ≥ 10, n2 (1 − p2) ≥ 10
have been met. And we have 2 random samples.
Test Statistic: first we need the common proportion
$$p_C = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{35(.53) + 41(.49)}{35 + 41} = 0.508$$
Then,
$$z = \frac{p_1 - p_2}{\sqrt{p_C(1 - p_C)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.53 - .49}{\sqrt{.508(1 - .508)\left(\frac{1}{35} + \frac{1}{41}\right)}} = 0.35$$
P-value: = Pr(Z > z) = Pr(Z > 0.35) = 1 − Pr(Z < 0.35)
= 1 − 0.6368 = 0.3632
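The one-sided calculation for Old Faithful can be reproduced directly from the reported percentages. The sketch below uses only the Python standard library; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n1, p1 = 35, 0.53   # population 1: daytime eruptions, proportion long-duration
n2, p2 = 41, 0.49   # population 2: nighttime eruptions, proportion long-duration

p_c = (n1 * p1 + n2 * p2) / (n1 + n2)           # common proportion under H0
se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))  # estimated sd of p1 - p2 under H0
z = (p1 - p2) / se

# Upper-tailed test, HA: pi1 - pi2 > 0, so the P-value is Pr(Z > z).
p_value = 1 - NormalDist().cdf(z)
print(f"pC = {p_c:.3f}, z = {z:.2f}, one-sided P-value = {p_value:.3f}")
```

Because z is carried without rounding here, the P-value differs from the table-based 0.3632 in the third decimal place; the decision at α = 0.025 is the same.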
Conclusions: We fail to reject the null hypothesis
since p-value =0.36>>>> α=0.025. There is
insufficient evidence based on these samples, to
support the geologist’s contention that the proportion
of long duration geyser eruptions is higher during the
day than the proportion at night.
Do we need a CI here?
B) Two Population Means Using Independent
Samples
EXAMPLE A scientist is interested in determining
which of two butterfly species has a larger wingspan.
Species 1 is found on forest understory plants and
tends to feed on its nursery plants. Thus it doesn’t
travel far. The other species is found on open field
flowers and migrates seasonally. She hypothesizes
that the migrating species has larger average
wingspans than the forest species and plans to take
two samples to test her hypothesis.
Notation:
Population   Population mean   Population standard deviation   Sample size   Sample mean   Sample standard deviation
1            µ1                σ1                              n1            x̄1            s1
2            µ2                σ2                              n2            x̄2            s2
To compare 2 population means we shall consider the
size of the difference µ1 − µ2:
µ1 − µ2 = 0 ⇒ µ1 = µ2
µ1 − µ2 > 0 ⇒ µ1 > µ2
µ1 − µ2 < 0 ⇒ µ1 < µ2
Our sampling estimator of this population difference
is the sample mean difference x̄1 − x̄2 when the two
samples are independent of one another.
Sampling Distribution of x̄1 − x̄2 when the two
samples are independently and randomly taken:
1) the mean of the distribution is
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$$
(that is, x̄1 − x̄2 is unbiased)
2) the standard deviation of the distribution is
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
3) the shape of the distribution is approximately
normal (a bell curve) if
a) both n1 and n2 are large, or
b) both of the populations being sampled are
approximately normally distributed
The estimator of µ1 − µ2 is x̄1 − x̄2, and
the estimator of
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
depends on whether σ1 ≠ σ2 (unequal variance case)
or σ1 = σ2 (equal variance case).
When σ1 ≠ σ2, the estimator of $\sigma_{\bar{X}_1 - \bar{X}_2}$ is given by
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$
What are the degrees of freedom for $s_{\bar{X}_1 - \bar{X}_2}$?
Satterthwaite showed that the appropriate degrees of
freedom for this estimator are
$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$$
where
$$V_1 = \frac{s_1^2}{n_1} \quad \text{and} \quad V_2 = \frac{s_2^2}{n_2}.$$
When σ1 = σ2, the estimator of $\sigma_{\bar{X}_1 - \bar{X}_2}$ is given by
$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_c^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where the estimator of the common variance is
$$s_c^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}.$$
The degrees of freedom for this estimator are
n1 + n2 − 2.
So, if x̄1 − x̄2 is at least approximately normally
distributed, we get that
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
or
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_c^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
have approximate T-distributions.
HYPOTHESIS TEST OF THE DIFFERENCE IN
TWO POPULATION MEANS BASED ON TWO
INDEPENDENT SAMPLES:
Null hypothesis:
H0: µ1 − µ 2 = D0
where D0 is the hypothesized difference in the means
Alternative Hypothesis is one of three:
a) HA: µ1 − µ 2 > D0
b) HA: µ1 − µ 2 < D0
c) HA: µ1 − µ 2 ≠ D 0
Test Statistic: either
$$(1)\quad t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
or
$$(2)\quad t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_c^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
The df are
for (1):
$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$$
where
$$V_1 = \frac{s_1^2}{n_1} \quad \text{and} \quad V_2 = \frac{s_2^2}{n_2}$$
and for (2): n1 + n2 − 2.
P-value: depends on the alternative hypothesis:
a) P-value = Pr( T > t)
b) P-value = Pr( T < t)
c) P-value = 2 Pr( T > |t|)
Decision Rule: reject Ho if P-value ≤ α
Assumptions:
1. n1 and n2 are large enough for the sample means
to be approximately normally distributed
2. the sampling was random and not more than 5%
of the population.
3. the two samples are independently taken
EXAMPLE Nitrogen is the most common nutrient
applied to soils. In tropical areas with warm
temperatures and heavy rainfall, only part of the
applied nitrogen is used by crops and the rest is lost.
Information about the mean nitrogen loss (N-loss) is
important for research on optimal growth of plants.
To that end, two nitrogen fertilizer treatments are to
be compared for their average N-loss: Urea alone (1)
and Urea+N-Serve (2).
A sugarcane field was divided into equal size plots
and plots were randomly assigned to one of the two
treatments. There were sufficient numbers of plots so
that no treated plots were adjacent on any side.
Important Point about Experimental Design: when
planning an experiment to compare two or more
treatments:
1) experimental units (plants, field plots, people,
etc) should be randomly selected from the
larger group from which they could be selected
(the population of potential experimental units)
2) treatments should be randomly assigned to the
experimental units
3) extraneous or confounding factors should be
considered and minimized when assigning and
running the experiment (e.g. all units should be
the same size, have the same weather
conditions, etc)
The following data represent Nitrogen loss (% of
total N applied) over a 16 week period:
Fertilizer   Percentage N-loss
UN           10.8, 10.5, 14.0, 13.5, 8.0, 9.5, 11.8, 10.0,
             8.7, 9.0, 9.8, 13.8, 14.7, 10.3, 12.8
U            8.0, 7.3, 14.1, 9.8, 7.1, 6.3, 10.0, 7.1, 7.9,
             6.1, 6.9, 11.0, 10.0
[Figure: back-to-back histograms of NLOSS (axis from 6 to 15) by TREATM (U and UN), with Count axes from 0 to 8]
Group   N    Mean     SD      S²
U       13   8.585    2.288   5.235
UN      15   11.147   2.140   4.580
Question: Is there sufficient evidence to support the
hypothesis that the two treatments differ in their
mean percentage N-loss?
Hypotheses:
Ho: µ1 − µ 2 = 0
HA: µ1 − µ 2 ≠ 0
Significance level:
α = 0.05
Test Statistic (assuming unequal variances)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(8.58 - 11.15) - 0}{\sqrt{\dfrac{(2.29)^2}{13} + \dfrac{(2.14)^2}{15}}} = -3.045$$
Degrees of Freedom:
$$V_1 = \frac{s_1^2}{n_1} = \frac{(2.29)^2}{13} = 0.4034$$
$$V_2 = \frac{s_2^2}{n_2} = \frac{(2.14)^2}{15} = 0.3053$$
$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}} = \frac{(.4034 + .3053)^2}{\dfrac{(.4034)^2}{13 - 1} + \dfrac{(.3053)^2}{15 - 1}} = 24.8$$
round down to 24 df.
P-value: 2 Pr( T > |t| ) = 2 Pr(T > 3.0) = 2(.003) = .006
Test Statistic (assuming equal variances)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_c^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(8.58 - 11.15) - 0}{\sqrt{4.882\left(\dfrac{1}{13} + \dfrac{1}{15}\right)}} = -3.06$$
where
$$s_c^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2} = \frac{5.235(12) + 4.580(14)}{13 + 15 - 2} = 4.882$$
Degrees of Freedom: n1 + n2 − 2 = 26.
P-value: 2 Pr( T > |t| ) = 2 Pr(T > 3.0) = 2(.003) = .006
Conclusion: Regardless of the choice of test, the P-value = .006 << α = 0.05, so reject the null hypothesis.
There is sufficient evidence to indicate that the two
nitrogen treatments differ in their average percentage
nitrogen loss at α = 0.05.
How do we identify which test statistic is
appropriate? Well, we can either use the rule of
thumb that the larger sample variance should be no
more than 3 times the smaller one (here 5.235/4.580
is about 1.14) OR do a test of equality of the two
variances OR assume they are unequal.
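Both versions of the test statistic can be recomputed from the raw N-loss data. The sketch below uses only Python's standard library; because the standard library has no t distribution, it stops at t and df and leaves the P-value to a t table or statistical software. The variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import mean, variance

# Percentage N-loss; population 1 = Urea alone (U), population 2 = Urea+N-Serve (UN)
urea = [8.0, 7.3, 14.1, 9.8, 7.1, 6.3, 10.0, 7.1, 7.9, 6.1, 6.9, 11.0, 10.0]
urea_nserve = [10.8, 10.5, 14.0, 13.5, 8.0, 9.5, 11.8, 10.0,
               8.7, 9.0, 9.8, 13.8, 14.7, 10.3, 12.8]

n1, n2 = len(urea), len(urea_nserve)
diff = mean(urea) - mean(urea_nserve)                # x1-bar minus x2-bar
s1_sq, s2_sq = variance(urea), variance(urea_nserve)  # sample variances (n - 1 divisor)

# Unequal variance case: Satterthwaite degrees of freedom
v1, v2 = s1_sq / n1, s2_sq / n2
t_unequal = (diff - 0) / sqrt(v1 + v2)
df_unequal = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Equal variance case: common variance estimate
sc_sq = (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)
t_equal = (diff - 0) / sqrt(sc_sq * (1 / n1 + 1 / n2))
df_equal = n1 + n2 - 2

# Rule of thumb: ratio of the larger to the smaller sample variance
variance_ratio = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

print(f"unequal variances: t = {t_unequal:.3f}, df = {df_unequal:.1f}")
print(f"equal variances:   t = {t_equal:.3f}, df = {df_equal}")
print(f"variance ratio = {variance_ratio:.2f}")
```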
Oneway Analysis of N-Loss By Treatment
[Figure: N-Loss (axis from 6 to 16) plotted by Treatment (U, UN)]
t Test
Assuming equal variances
            Difference   t Test   DF        Prob > |t|
Estimate    -2.5621      -3.060   26        0.0051
Std Error   0.8373
Lower 95%   -4.2831
Upper 95%   -0.8410
Assuming UnEqual Variances
            Difference   t Test   DF        Prob > |t|
Estimate    -2.5621      -3.045   24.8474   0.0054
Std Error   0.8414
Lower 95%   -4.2870
Upper 95%   -0.8371
Since the two treatments do differ in their mean N-losses, I’d like to estimate the size of that difference.
CONFIDENCE INTERVAL ESTIMATION OF
THE DIFFERENCE OF TWO MEANS BASED
ON INDEPENDENT SAMPLES:
Interval Estimator:
$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value}) \times (\text{estimator of } \sigma_{\bar{X}_1 - \bar{X}_2})$$
where the t critical value is based on the confidence
level desired and the degrees of freedom are
calculated according to which estimator you use
(equal or unequal variance).
Assumptions:
1. n1 and n2 are large enough for the sample means
to be approximately normally distributed
2. the sampling was random and not more than 5%
of the population.
3. the two samples are independently taken
So, to go back to our example:
A 95% confidence interval based on two independent
samples is given by either
$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value}) \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
or
$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value}) \sqrt{s_c^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
From earlier:
Group   N    Mean     SD
U       13   8.585    2.288
UN      15   11.147   2.140
And the df = 24 for the unequal case and 26 for the
equal case.
T critical value for 95% and 24 df = 2.06. So,
$$(8.58 - 11.15) \pm 2.06 \sqrt{\frac{(2.29)^2}{13} + \frac{(2.14)^2}{15}} = -2.57 \pm 1.73 = (-4.30,\ -0.84),$$
which agrees closely with the software interval (−4.2870, −0.8371).
Similarly, the T critical value for 95% and 26 df =
2.05, so we obtain
$$(8.58 - 11.15) \pm 2.05 \sqrt{4.882\left(\frac{1}{13} + \frac{1}{15}\right)} = -2.57 \pm 1.72 = (-4.29,\ -0.85),$$
which agrees closely with the software interval (−4.2831, −0.8410).
Thus, with 95% confidence, the mean nitrogen loss
(%) from Urea alone is between 0.8 and 4.3 percentage
points below the mean loss of the Urea+N-Serve combination.
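The unequal variance interval can be sketched in Python from the summary statistics. The helper name `unequal_variance_ci` is made up for this illustration, and the t critical value is passed in by hand (2.06 for 95% and 24 df, from a t table) because the standard library has no t distribution.

```python
from math import sqrt


def unequal_variance_ci(mean1, sd1, n1, mean2, sd2, n2, t_critical):
    """C.I. for mu1 - mu2 using the unequal variance standard error."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # estimator of the sd of x1-bar - x2-bar
    diff = mean1 - mean2
    return diff - t_critical * se, diff + t_critical * se


# Population 1 = Urea alone (U), population 2 = Urea+N-Serve (UN)
lower, upper = unequal_variance_ci(8.585, 2.288, 13, 11.147, 2.140, 15, t_critical=2.06)
print(f"95% C.I.: ({lower:.2f}, {upper:.2f})")
```

Small differences from the software interval come from rounding in the summary statistics and in the critical value.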
EXAMPLE Discharge of industrial waste into rivers
affects water quality. To assess the effect of a power
plant on water quality, 24 samples were taken 16 km
upstream of the plant and another 24 were taken at 4
km downstream. Alkalinity (mg/l) was measured on
each water sample. Do the data suggest that the true
mean alkalinity below the plant is more than 50 mg/l
higher than the true mean alkalinity upstream of the
plant?
Output from a statistical software program:
Pop’ln   Group        N    Mean    SD
2        upstream     24   75.9    1.83
1        downstream   24   183.6   1.70
t-score = 113.2
df = 45
2-sided P-value = 0+
Hypotheses:
Ho: µ1 − µ2 = 50
HA: µ1 − µ2 > 50
Check the t-score:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(183.6 - 75.9) - 50}{\sqrt{\dfrac{(1.70)^2}{24} + \dfrac{(1.83)^2}{24}}} = 113.17$$
Assumptions:
1) sample sizes large enough?
2) samples independent and random?
Conclusion: There is strong evidence to suggest that
the average alkalinity of the water below the power
plant is more than 50 mg/l higher than the mean
alkalinity of the water above the power plant.
For a 95% confidence interval estimate of the
difference we have:
T critical value for 95% and 45 df ≈ 2.02. So,
$$(183.6 - 75.9) \pm 2.02 \sqrt{\frac{(1.70)^2}{24} + \frac{(1.83)^2}{24}} = 107.7 \pm 1.03 = (106.67,\ 108.73)$$
We conclude that the mean alkalinity below the
power plant is between 106.7 and 108.7 mg/l higher
than the mean alkalinity of the water above the power
plant!
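The alkalinity example is a case where the hypothesized difference D0 is not zero. A short Python sketch of the t-score and the 95% interval from the summary statistics follows; the variable names are chosen for this illustration and the critical value 2.02 is taken from the notes.

```python
from math import sqrt

# Population 1 = downstream, population 2 = upstream (summary statistics from the output)
mean_down, sd_down, n_down = 183.6, 1.70, 24
mean_up, sd_up, n_up = 75.9, 1.83, 24
d0 = 50.0                                    # hypothesized difference under H0

diff = mean_down - mean_up
se = sqrt(sd_down ** 2 / n_down + sd_up ** 2 / n_up)

t_score = (diff - d0) / se                   # test statistic for H0: mu1 - mu2 = 50

t_critical = 2.02                            # 95% confidence, about 45 df
lower, upper = diff - t_critical * se, diff + t_critical * se

print(f"difference = {diff:.1f}, t = {t_score:.2f}")
print(f"95% C.I.: ({lower:.2f}, {upper:.2f})")
```

The interval is centered at the observed difference 183.6 − 75.9 = 107.7, and its lower end sits far above 50, in line with the test.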
C) Comparing Two Population Means Using
Paired Samples
Consider the following experiments:
1. In order to determine if two IQ tests yield similar
results (means and standard deviations), the
researcher selected 50 college students at random to
take both tests. The order in which any given student
took the tests was randomized and the tests were
taken 1 month apart to minimize crossover effects.
The hypothesis is that test # 1 is biased in that it
yields a higher average score than test #2 which has
been in use for many years.
Note the experimental design here as well as the
hypotheses being tested. We can’t use the
independent samples test for this case.
Hypotheses:
Ho: µ1 − µ 2 = 0
HA: µ1 − µ 2 > 0
2. A swine nutritionist wished to compare a
nitrogen poor + enzyme diet (#1) to a nitrogen rich
diet (#2) for pigs. Rather than take one piglet from
each new litter and assign it a diet at random, he
chose instead to take 2 piglets from each litter and
randomly assign one pig to one diet and the other to
the other diet. The hypothesis is that the nitrogen
rich diet results in a lower average weight gain than
the nitrogen poor + enzyme diet.
Hypotheses:
Ho: µ1 − µ2 = 0
HA: µ1 − µ2 > 0
3. A researcher is interested in the effect of oxygen
exposure on cell fluidity in pulmonary artery cells in
dogs. She intends to collect cells from ten dogs for
the experiment. For each dog, two agar plates of
artery cells are prepared and each plate is randomly
assigned to either receive O2 or not receive O2
treatment. She wishes to test the hypothesis that the
mean fluidity for oxygen treated cells (2) differs from
the mean for untreated cells (1).
Hypotheses:
Ho: µ1 − µ 2 = 0
HA: µ1 − µ 2 ≠ 0
In all three cases, the samples are NOT independent
of each other. In fact, they are deliberately dependent.
One reason for this is that the estimator of the
difference between two means based on 2
independent samples has a large standard deviation
(recall that it is the square root of the SUM of two
variances).
When samples are paired as is done here, the
standard deviation of the estimator of µ1 − µ 2 used
for a paired experiment is often smaller.
Defn: A PAIRED or “BLOCKED” experiment is
one in which for each randomly selected
experimental unit in the first sample there is a
deliberately selected unit in the second sample. The
units in the second sample are chosen so that they
have characteristics similar to the unit in the first
sample to which they have been paired.
The characteristics used for pairing are usually those
that likely have an effect on the response variable
being studied in the experiment.
It is this last statement that often leads to the standard
deviation being smaller in paired experiments.
Example #1. Perfect pairing since each experimental
unit in sample 1 is also used in sample 2.
Intuitively, comparing how several people react to
each test is more informative and accurate than
comparing results for independently chosen people
for each test.
⇒ Look at the individual differences in scores, one
for each person.
Example #2. Genetics has a relatively large
influence on adult size and growth in most animals.
Hence it would not be surprising that two pigs from
the same litter would respond to each of the two diets
similarly in the sense that one would respond as the
other would had it been on the first one’s diet as well.
Hence, the two littermates are paired in this
experiment.
⇒ Look at the individual differences in growth
between littermates.
Example #3. Although the cells in each of the 2
treatments are not exactly the same, they are as close
as possible, being from the same animal. Hence any
effect due to animal variability is controlled
somewhat by using the same dogs for both
treatments.
⇒ Look at the difference in cell fluidity for each dog.
For paired samples, the estimator of the difference
µ1 − µ2 is the average of the sample differences, D̄.
To obtain this difference: for each pair, calculate the
difference in X under the two treatments. Call this
difference D.
EXAMPLE: Cell fluidity
Dog       Without O2 (X1)   With O2 (X2)   Difference D = (X1 − X2)
1         0.308             0.308           0.000
2         0.304             0.309          -0.005
3         0.305             0.305           0.000
4         0.304             0.311          -0.007
5         0.301             0.303          -0.002
6         0.278             0.293          -0.015
7         0.296             0.302          -0.006
8         0.301             0.300           0.001
9         0.302             0.308          -0.006
10        0.237             0.250          -0.013
Mean      0.294             0.299          -0.0053
Std Dev   0.022             0.018           0.00542
Then, the data consist of the n differences. The
average of the sample differences is
$$\bar{D} = \frac{1}{n} \sum_{i=1}^{n} D_i$$
and the standard deviation is
$$s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}.$$
Now, the differences can be regarded as a random
sample from a population of differences if the
experimental units (e.g. the ten dogs) can be regarded
as a random selection from among all experimental
units. In that case, we have
SAMPLING DISTRIBUTION of D̄:
1) the mean of the distribution is µD = µ1 − µ2
2) the standard deviation of the distribution is
$$\sigma_{\bar{D}} = \frac{\sigma_D}{\sqrt{n}}$$
where σD is the standard deviation of the
population of differences from which we sampled n
differences.
3) the shape of the distribution is approximately
normal (a bell curve) if n is large or the population
being sampled is approximately normally distributed.
The estimator of µD is D̄, the sample mean
difference, and
the estimator of σD is sD, the sample standard
deviation of the differences.
In that case, the problem reverts to a test of the mean
µD based on a single sample.
HYPOTHESIS TEST OF THE DIFFERENCE IN
TWO POPULATION MEANS USING PAIRED
SAMPLES:
Null hypothesis: H0: µ1 − µ 2 = D 0 ( µ D = D 0 )
where D0 is the hypothesized difference in the means
Alternative Hypothesis is one of three:
a) HA: µ1 − µ 2 > D 0 ( µ D > D 0 )
b) HA: µ1 − µ 2 < D 0 ( µ D < D 0 )
c) HA: µ1 − µ 2 ≠ D 0 ( µ D ≠ D 0 )
Test Statistic:
$$t = \frac{\bar{D} - D_0}{s_D / \sqrt{n}}$$
P-value: depends on the alternative hypothesis:
a) P-value = Pr( T > t)
b) P-value = Pr( T < t)
c) P-value = 2 Pr( T > |t|)
Decision Rule: reject Ho if P-value ≤ α
Assumptions:
1. D is approximately normally distributed
2. the sampling was random and not more than 5%
of the population.
EXAMPLE So, let’s return to the dog fluidity study
Hypotheses:
Ho: µD = 0
HA: µD ≠ 0
Significance Level: we’ll choose α = 0.025.
Now, the numbers we need are: D̄ = −0.0053,
sD = 0.00542, n = 10
Test Statistic:
$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{-0.00530}{0.00542 / \sqrt{10}} \approx -3.09$$
df = n − 1 = 9.
P-value: 2 Pr(T > |t|) = 2 Pr(T > 3.09) = 2(0.0064) ≈ 0.013
Conclusion: P-value ≈ 0.013 < α = 0.025. Hence we
reject Ho and conclude that the data provide sufficient
evidence at α = 0.025 to indicate that oxygen
treatment changes the mean fluidity of pulmonary
artery cells in dogs.
Assumptions: The sample size is small but it is
likely that the population of differences is not too
skewed.
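The paired calculation can be reproduced from the raw fluidity measurements. This is a sketch using only Python's standard library; it computes the differences, their mean and standard deviation, and the t statistic, and leaves the P-value to a t table since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Cell fluidity for the ten dogs: X1 = without O2, X2 = with O2
without_o2 = [0.308, 0.304, 0.305, 0.304, 0.301, 0.278, 0.296, 0.301, 0.302, 0.237]
with_o2 = [0.308, 0.309, 0.305, 0.311, 0.303, 0.293, 0.302, 0.300, 0.308, 0.250]

# One difference per dog, D = X1 - X2
differences = [x1 - x2 for x1, x2 in zip(without_o2, with_o2)]

n = len(differences)
d_bar = mean(differences)        # sample mean difference
s_d = stdev(differences)         # sample sd of the differences (n - 1 divisor)
d0 = 0.0                         # hypothesized mean difference under H0

t_stat = (d_bar - d0) / (s_d / sqrt(n))
df = n - 1

print(f"D-bar = {d_bar:.4f}, sD = {s_d:.5f}, t = {t_stat:.2f}, df = {df}")
```

Note that only the ten differences enter the test; the two columns' own standard deviations (0.022 and 0.018) play no role.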
CONFIDENCE INTERVAL FOR THE
DIFFERENCE OF TWO MEANS BASED ON A
PAIRED SAMPLE:
$$\bar{D} \pm (t \text{ critical value}) \left(\frac{s_D}{\sqrt{n}}\right)$$
where the t critical value is based on n-1 df and the
desired confidence level.
Assumptions:
1) sampling is random and
2) either the sample size is large so we can use the
CLT or the original population has a frequency
distribution that is bell-curve shaped.
In our dog EXAMPLE: For a 95% confidence
interval of the difference of two means we need
the t critical value for 95% and 9 df. So, t = 2.26.
Hence, the 95% C.I. of the difference between mean
fluidity in cells with and without oxygen is
$$-0.0053 \pm 2.26\left(\frac{.00542}{\sqrt{10}}\right) = -0.0053 \pm 0.0039 = (-0.0092,\ -0.0014)$$
which implies that the mean fluidity in the cells
without oxygen is below the mean fluidity for those
that receive oxygen.
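A minimal Python sketch of the paired interval, using the summary numbers for the dogs. The helper name `paired_ci` is hypothetical, and the t critical value (2.26 for 95% and 9 df) is supplied from a t table.

```python
from math import sqrt


def paired_ci(d_bar, s_d, n, t_critical):
    """C.I. for mu_D from the mean and sd of n paired differences."""
    margin = t_critical * s_d / sqrt(n)
    return d_bar - margin, d_bar + margin


# Dogs: D = (without O2) - (with O2), so a negative interval means lower fluidity without O2.
lower, upper = paired_ci(d_bar=-0.0053, s_d=0.00542, n=10, t_critical=2.26)
print(f"95% C.I.: ({lower:.4f}, {upper:.4f})")
```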