Lecture 9 - Wharton Statistics Department

Lecture 9
• Inference about the ratio of two variances
(Chapter 13.5)
• Inference about the difference between
population proportions (Chapter 13.6)
13.5 Inference about the ratio
of two variances
• In this section we draw inference about the ratio of
two population variances.
• This question is interesting because:
– Variances can be used to evaluate the consistency of
production processes.
– It may help us decide which of the equal- or unequal-variances t-tests of the difference between means to use.
– Because variance can measure risk, it allows us to
compare risk, for example, between two portfolios.
Parameter and Statistic
• Parameter to be tested is $\sigma_1^2/\sigma_2^2$
• Statistic used is
$$F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$$
• Sampling distribution of $s_1^2/s_2^2$
– The statistic $[s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ follows the F distribution
with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom.
F Distribution
• Tables 6(a)-(c) only give the right tail
quantiles (critical values) of the F
distribution
• To find quantiles for the left tail of the F
distribution, use the fact
$$F_{1-A,\nu_1,\nu_2} = \frac{1}{F_{A,\nu_2,\nu_1}}$$
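The reciprocal relation can be justified in a few lines. This is a short sketch, assuming $F_{A,\nu_1,\nu_2}$ denotes the value with right-tail area $A$ (the convention of the tables) and using the fact that the reciprocal of an F variable is again F with the degrees of freedom swapped.

```latex
\begin{align*}
X \sim F_{\nu_1,\nu_2} &\;\Longrightarrow\; \frac{1}{X} \sim F_{\nu_2,\nu_1} \\
A = P\left(X \le F_{1-A,\nu_1,\nu_2}\right)
  &= P\left(\frac{1}{X} \ge \frac{1}{F_{1-A,\nu_1,\nu_2}}\right) \\
\Longrightarrow\; \frac{1}{F_{1-A,\nu_1,\nu_2}} &= F_{A,\nu_2,\nu_1}
\end{align*}
```

Note that the two degrees of freedom trade places, so the right-tail table must be entered with $\nu_2$ as the numerator degrees of freedom.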
Parameter and Statistic
– Our null hypothesis is always
$H_0: \sigma_1^2/\sigma_2^2 = 1$
– Under this null hypothesis the F statistic $F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$
becomes
$$F = \frac{s_1^2}{s_2^2}$$
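As an illustration, the statistic under the null hypothesis can be computed directly from two samples. The sketch below uses only the Python standard library; the function name `variance_ratio_f` and the data are hypothetical (they are not the Xm13-01 file).

```python
from statistics import variance

def variance_ratio_f(sample1, sample2):
    """Return (F, nu1, nu2) for H0: sigma1^2 / sigma2^2 = 1."""
    s1_sq = variance(sample1)  # sample variance, divisor n1 - 1
    s2_sq = variance(sample2)  # sample variance, divisor n2 - 1
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical data, for illustration only.
sample_a = [568, 646, 607, 555, 530, 714, 593, 647]
sample_b = [705, 819, 706, 509, 613, 582, 601, 608, 787, 573]

f_stat, nu1, nu2 = variance_ratio_f(sample_a, sample_b)
print(f"F = {f_stat:.3f} with nu1 = {nu1} and nu2 = {nu2}")
```

For a two-tail test, the computed F would then be compared with the right-tail critical value from the tables and with the left-tail value obtained from the reciprocal relation.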
Testing the ratio of two population variances
Example 13.6 (revisiting Example 13.1)
(see Xm13-01)
Before testing whether including high-fiber cereal
at breakfast affects the average number of
calories consumed at lunch, the ratio of the
two population variances has to be tested first,
to decide which t-test to use.
The hypotheses are:
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_1: \sigma_1^2/\sigma_2^2 \neq 1$
Estimating the Ratio of Two
Population Variances
• From the statistic $F = [s_1^2/\sigma_1^2] / [s_2^2/\sigma_2^2]$ we
can isolate $\sigma_1^2/\sigma_2^2$ and build the following
confidence interval:
$$\left(\frac{s_1^2}{s_2^2}\right)\frac{1}{F_{\alpha/2,\nu_1,\nu_2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \left(\frac{s_1^2}{s_2^2}\right)F_{\alpha/2,\nu_2,\nu_1}$$
where $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$
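The isolation step can be written out as follows. This is a sketch of the standard argument, assuming the right-tail quantile notation used for the F tables; it starts from the central probability statement for F and solves for the ratio of population variances.

```latex
\begin{align*}
1-\alpha &= P\left(F_{1-\alpha/2,\nu_1,\nu_2} \le
  \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \le F_{\alpha/2,\nu_1,\nu_2}\right) \\
&= P\left(\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2,\nu_1,\nu_2}} \le
  \frac{\sigma_1^2}{\sigma_2^2} \le
  \frac{s_1^2}{s_2^2}\,\frac{1}{F_{1-\alpha/2,\nu_1,\nu_2}}\right) \\
&= P\left(\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2,\nu_1,\nu_2}} \le
  \frac{\sigma_1^2}{\sigma_2^2} \le
  \frac{s_1^2}{s_2^2}\,F_{\alpha/2,\nu_2,\nu_1}\right)
\end{align*}
```

The last line uses the left-tail relation $F_{1-A,\nu_1,\nu_2} = 1/F_{A,\nu_2,\nu_1}$ with $A = \alpha/2$, which is why the degrees of freedom are reversed in the upper limit.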
Estimating the Ratio of
Two Population Variances
• Example 13.7
– Determine the 95% confidence interval
estimate of the ratio of the two population
variances in Example 13.1
Inference about the difference
between two population proportions
• In this section we deal with two populations
whose data are nominal.
• For nominal data we compare the population
proportions of the occurrence of a certain event.
• Examples
– Comparing the effectiveness of new drug versus older
one
– Comparing market share before and after advertising
campaign
– Comparing defective rates between two machines
Parameter and Statistic
• Parameter
– When the data are nominal, we can only count
the occurrences of a certain event in the two
populations, and calculate proportions.
– The parameter is therefore p1 – p2.
• Statistic
– An unbiased estimator of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$
(the difference between the sample proportions).
Sampling Distribution of p̂1 – p̂2
• Two random samples are drawn from two
populations.
• The number of successes in each sample is recorded.
• The sample proportions are computed.
Sample 1
Sample size $n_1$
Number of successes $x_1$
Sample proportion $\hat{p}_1 = x_1/n_1$
Sample 2
Sample size $n_2$
Number of successes $x_2$
Sample proportion $\hat{p}_2 = x_2/n_2$
Sampling distribution of p̂1 – p̂2
• The statistic $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed
if $n_1p_1$, $n_1(1-p_1)$, $n_2p_2$, $n_2(1-p_2)$ are all greater than or
equal to 5.
• The mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.
• The variance of $\hat{p}_1 - \hat{p}_2$ is $p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2$.
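The three facts above translate into a few lines of arithmetic. A minimal sketch, with a hypothetical helper name and hypothetical proportions and sample sizes:

```python
from math import sqrt

def diff_sampling_distribution(p1, n1, p2, n2):
    """Mean, variance and normal-approximation check for p1_hat - p2_hat."""
    mean = p1 - p2
    var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    # All four expected counts must be at least 5.
    normal_ok = min(n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)) >= 5
    return mean, var, normal_ok

# Hypothetical population proportions and sample sizes.
mean, var, ok = diff_sampling_distribution(0.20, 100, 0.15, 120)
print(f"mean = {mean:.4f}, standard error = {sqrt(var):.4f}, normal OK: {ok}")
```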
The z-statistic
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}}$$
Because $p_1$ and $p_2$ are unknown, the standard error
must be estimated using the sample proportions.
The method depends on the null hypothesis.
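Testing p1 – p2
• There are two cases to consider:
Case 1:
$H_0: p_1 - p_2 = 0$
Calculate the pooled proportion
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $p_1 - p_2 = 0$ under $H_0$.
Case 2:
$H_0: p_1 - p_2 = D$ ($D$ is not equal to 0)
Do not pool the data
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$$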
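The two cases differ only in how the standard error is estimated. The sketch below implements both z statistics with the Python standard library; the function names, the right-tail p-value helper, and the counts are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_pooled(x1, n1, x2, n2):
    """Case 1: z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

def z_unpooled(x1, n1, x2, n2, d):
    """Case 2: z statistic for H0: p1 - p2 = d (d != 0), without pooling."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat - d) / se

def right_tail_p_value(z):
    """P(Z > z) for a standard normal Z (assumes a right-tail alternative)."""
    return 1 - NormalDist().cdf(z)

# Hypothetical counts: 60 of 200 successes versus 45 of 200.
z1 = z_pooled(60, 200, 45, 200)
z2 = z_unpooled(60, 200, 45, 200, 0.02)
print(f"Case 1: z = {z1:.3f}, p-value = {right_tail_p_value(z1):.4f}")
print(f"Case 2: z = {z2:.3f}, p-value = {right_tail_p_value(z2):.4f}")
```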
Testing p1 – p2
• Example 13.8
– The marketing manager needs to decide which of two
new packaging designs to adopt, to help improve sales
of his company’s soap.
– A study is performed in two supermarkets:
• Brightly-colored packaging is distributed in supermarket 1.
• Simple packaging is distributed in supermarket 2.
– The first design is more expensive; therefore, to be
financially viable it has to outsell the second design.
Testing p1 – p2
• Summary of the experiment results
– Supermarket 1 - 180 purchasers of Johnson Brothers
soap out of a total of 904
– Supermarket 2 - 155 purchasers of Johnson Brothers
soap out of a total of 1,038
– Use 5% significance level and perform a test to find
which type of packaging to use.
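The calculation for this example can be sketched as follows. It assumes the alternative hypothesis is $H_1: p_1 - p_2 > 0$ (the brightly-colored design must outsell the simple one), so the null hypothesis is $p_1 - p_2 = 0$ and the pooled proportion is used; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 180, 904    # supermarket 1: brightly-colored packaging
x2, n2 = 155, 1038   # supermarket 2: simple packaging
alpha = 0.05

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 - p2 = 0
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
p_value = 1 - NormalDist().cdf(z)
reject_h0 = z > z_crit

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, pooled = {p_pool:.4f}")
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these counts z is about 2.90, well beyond the 5% critical value of about 1.645, so under the stated assumptions the data support the brightly-colored design.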
Testing p1 – p2
• Example 13.9 (Revisit Example 13.8)
– Management needs to decide which of two new
packaging designs to adopt, to help improve sales
of a certain soap.
– A study is performed in two supermarkets, as in Example 13.8.
– For the brightly-colored design to be financially
viable it has to outsell the simple design by at
least 3%.
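A sketch of this Case 2 test, reusing the counts from Example 13.8. It assumes "at least 3%" means three percentage points, so that $H_0: p_1 - p_2 = 0.03$ and $H_1: p_1 - p_2 > 0.03$, and it assumes the 5% significance level of Example 13.8 carries over.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 180, 904    # supermarket 1: brightly-colored packaging
x2, n2 = 155, 1038   # supermarket 2: simple packaging
d = 0.03             # hypothesized difference under H0
alpha = 0.05         # assumed, as in Example 13.8

p1_hat, p2_hat = x1 / n1, x2 / n2
# D is not 0, so the data are not pooled.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = (p1_hat - p2_hat - d) / se

p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Here z is roughly 1.15, which is not significant at the 5% level, so under these assumptions there is not enough evidence that the brightly-colored design outsells the simple one by more than 3%.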
Estimating p1 – p2
• Estimating the cost per life saved
– Two drugs are used to treat heart attack victims:
• Streptokinase (available since 1959, costs $460)
• t-PA (genetically engineered, costs $2900).
– The maker of t-PA claims that its drug outperforms
Streptokinase.
– An experiment was conducted in 15 countries.
• 20,500 patients were given t-PA
• 20,500 patients were given Streptokinase
• The number of deaths by heart attacks was recorded.
Estimating p1 – p2
• Experiment results
– A total of 1497 patients treated with
Streptokinase died.
– A total of 1292 patients treated with t-PA died.
• Estimate the cost per life saved by using t-PA
instead of Streptokinase.
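One way to carry out this estimate is sketched below. It assumes a 95% confidence level, the standard interval $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$, and that the quoted drug costs are per patient; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 20500                        # patients per treatment group
deaths_sk, deaths_tpa = 1497, 1292
cost_sk, cost_tpa = 460, 2900    # assumed to be cost per patient, in dollars

p_sk, p_tpa = deaths_sk / n, deaths_tpa / n
diff = p_sk - p_tpa              # estimated reduction in death rate with t-PA

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval (assumed level)
se = sqrt(p_sk * (1 - p_sk) / n + p_tpa * (1 - p_tpa) / n)
lcl, ucl = diff - z * se, diff + z * se

extra_cost = cost_tpa - cost_sk  # additional cost per patient treated with t-PA
cost_point = extra_cost / diff   # cost per life saved at the point estimate
cost_low, cost_high = extra_cost / ucl, extra_cost / lcl

print(f"p_sk - p_tpa = {diff:.4f}, 95% CI = ({lcl:.4f}, {ucl:.4f})")
print(f"Cost per life saved: about ${cost_point:,.0f} "
      f"(between ${cost_low:,.0f} and ${cost_high:,.0f})")
```

The point estimate is a reduction of about one death per 100 patients, so each life saved costs roughly 100 times the extra $2,440 per patient; the interval for the cost is wide because the interval for the difference in proportions is wide relative to its size.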
Practice Problems
• 13.88, 13.92, 13.102, 13.104, 13.106
• Next Class: Chapters 15.1-15.3
15.1 Introduction to ANOVA
• Analysis of variance compares two or more
populations of interval data.
• Specifically, we are interested in determining
whether differences exist between the
population means.
• The procedure works by analyzing the sample
variance.
One Way Analysis of Variance
• The analysis of variance is a procedure that
tests whether differences exist
between two or more population means.
• To do this, the technique analyzes the
sample variances.
One Way Analysis of Variance
• Example 15.1 - continued
– An experiment was conducted as follows:
• In three cities an advertising campaign was launched.
• In each city only one of the three characteristics (convenience,
quality, and price) was emphasized.
• The weekly sales were recorded for twenty weeks following
the beginning of the campaigns.
One Way Analysis of Variance
Weekly sales
Convenience   Quality   Price
529           804       672
658           630       531
793           774       443
514           717       596
663           679       602
719           604       502
711           620       659
606           697       689
461           706       675
529           615       512
498           492       691
663           719       733
604           787       698
495           699       776
485           572       561
557           523       572
353           584       469
557           634       581
542           580       679
614           624       532
See file Xm15-01
One Way Analysis of Variance
• Solution
– The data are interval
– The problem objective is to compare sales in
three cities.
– We hypothesize that the three population means
are equal
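To make "analyzing the sample variances" concrete, the sketch below computes the usual one-way ANOVA F statistic (variation between the city means relative to variation within cities) for the weekly sales data of Example 15.1. The formula is the standard one and is not spelled out in these slides; the function and variable names are illustrative.

```python
from statistics import mean

sales = {
    "convenience": [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
                    498, 663, 604, 495, 485, 557, 353, 557, 542, 614],
    "quality": [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
                492, 719, 787, 699, 572, 523, 584, 634, 580, 624],
    "price": [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532],
}

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of observations around their own group mean
    for g in groups:
        g_mean = mean(g)
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        ss_within += sum((x - g_mean) ** 2 for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

for city, values in sales.items():
    print(f"{city}: mean weekly sales = {mean(values):.2f}")

f_stat, df1, df2 = one_way_anova_f(list(sales.values()))
print(f"F = {f_stat:.2f} with {df1} and {df2} degrees of freedom")
```

A large F means the sample means differ by more than the within-city variation would explain, which is evidence against the hypothesis that the three population means are equal.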