Uploaded by Muhammad Wasi


2023/6/27
Chapter 8. Comparison of Two Populations
 Difference Between Two Means
 \(\bar{x}_1 - \bar{x}_2\) is (approximately) normally distributed when the sample sizes are
large.
 \(E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2\)
 \(V(\bar{x}_1 - \bar{x}_2) = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\)
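A quick simulation can make the mean and variance of the difference concrete. The sketch below is only an illustration: the population parameters (mu1, sigma1, n1, mu2, sigma2, n2) are made-up assumptions, not values from this chapter, and normal populations are assumed for simplicity.

```r
# Simulation sketch: sampling distribution of the difference of two sample means.
# All population parameters below are hypothetical, chosen only for illustration.
set.seed(1)
mu1 <- 10; sigma1 <- 3; n1 <- 40
mu2 <- 8;  sigma2 <- 5; n2 <- 60

# Draw 20,000 pairs of samples and record xbar1 - xbar2 each time
diffs <- replicate(20000,
                   mean(rnorm(n1, mu1, sigma1)) - mean(rnorm(n2, mu2, sigma2)))

theory_mean <- mu1 - mu2                          # E(xbar1 - xbar2)
theory_var  <- sigma1^2 / n1 + sigma2^2 / n2      # V(xbar1 - xbar2)

print(c(simulated_mean = mean(diffs), theoretical_mean = theory_mean))
print(c(simulated_var  = var(diffs),  theoretical_var  = theory_var))
```

The simulated mean and variance should land close to the theoretical values, and a histogram of diffs (hist(diffs)) should look roughly bell-shaped.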
 There are two cases for the test statistic for \(\mu_1 - \mu_2\): when \(\sigma_1^2 = \sigma_2^2\)
and when \(\sigma_1^2 \neq \sigma_2^2\).
 When \(\sigma_1^2 = \sigma_2^2\), we use the t test:
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad \nu = n_1 + n_2 - 2 \]
where
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \]
– \(s_p^2\) is called the pooled variance estimator. It is the weighted
average of the two sample variances with the number of
degrees of freedom used as weights.
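To see how the pooled variance feeds into the t statistic, the following sketch computes both by hand and cross-checks them against R's built-in t.test with var.equal = TRUE. The two samples are simulated placeholders (hypothetical data, not the chapter's salary file), and the null hypothesis is taken to be a zero difference in means.

```r
# Hypothetical samples (NOT the chapter's data), used only to illustrate the formulas
set.seed(42)
x1 <- rnorm(25, mean = 65000, sd = 18000)
x2 <- rnorm(25, mean = 60000, sd = 17000)

n1 <- length(x1); n2 <- length(x2)

# Pooled variance: sample variances weighted by their degrees of freedom
sp2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t statistic under H0: mu1 - mu2 = 0, with nu = n1 + n2 - 2 degrees of freedom
t_stat  <- (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
nu      <- n1 + n2 - 2
p_value <- 2 * pt(abs(t_stat), df = nu, lower.tail = FALSE)  # two-sided p-value

# Cross-check against the built-in equal-variances test
builtin <- t.test(x1, x2, var.equal = TRUE)
print(c(manual_t = t_stat, builtin_t = unname(builtin$statistic)))
print(c(manual_p = p_value, builtin_p = builtin$p.value))
```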
 The confidence interval
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]
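The equal-variances interval can be wrapped in a small helper. This is a sketch under stated assumptions: pooled_ci is a hypothetical function name introduced here for illustration, and the inputs are simulated placeholder samples rather than data from the chapter.

```r
# pooled_ci() is a hypothetical helper written for this illustration.
# It returns the equal-variances confidence interval for mu1 - mu2.
pooled_ci <- function(x1, x2, conf_level = 0.95) {
  n1 <- length(x1); n2 <- length(x2)
  sp2    <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
  alpha  <- 1 - conf_level
  t_crit <- qt(1 - alpha / 2, df = n1 + n2 - 2)    # t_{alpha/2}
  half_width <- t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
  (mean(x1) - mean(x2)) + c(-1, 1) * half_width
}

# Hypothetical samples with a common population standard deviation
set.seed(7)
x1 <- rnorm(30, mean = 50, sd = 8)
x2 <- rnorm(20, mean = 45, sd = 8)

ci      <- pooled_ci(x1, x2)
builtin <- t.test(x1, x2, var.equal = TRUE)
print(ci)
print(as.numeric(builtin$conf.int))   # should match the manual interval
```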
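 When the population variances are different, the two sample variances are used
separately instead of \(s_p^2\). The confidence interval is:
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]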
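The text does not state the degrees of freedom for the unequal-variances case. The sketch below assumes the Welch-Satterthwaite approximation, which is what R's t.test uses by default (var.equal = FALSE); welch_ci is a hypothetical helper name and the samples are simulated placeholders.

```r
# welch_ci() is a hypothetical helper written for this illustration.
# Assumption: degrees of freedom follow the Welch-Satterthwaite approximation.
welch_ci <- function(x1, x2, conf_level = 0.95) {
  v1 <- var(x1) / length(x1)          # s1^2 / n1
  v2 <- var(x2) / length(x2)          # s2^2 / n2
  se <- sqrt(v1 + v2)
  nu <- (v1 + v2)^2 / (v1^2 / (length(x1) - 1) + v2^2 / (length(x2) - 1))
  t_crit <- qt(1 - (1 - conf_level) / 2, df = nu)
  list(df = nu, ci = (mean(x1) - mean(x2)) + c(-1, 1) * t_crit * se)
}

# Hypothetical samples with clearly different spreads and sizes
set.seed(8)
x1 <- rnorm(42, mean = 0, sd = 2)
x2 <- rnorm(98, mean = 1, sd = 3)

manual  <- welch_ci(x1, x2)
builtin <- t.test(x1, x2)             # Welch test is the default
print(manual)
print(builtin$parameter)              # built-in (non-integer) degrees of freedom
```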
 Testing for Population Variance
– H0: \(\sigma_1^2 / \sigma_2^2 = 1\)
– HA: \(\sigma_1^2 / \sigma_2^2 \neq 1\)
– F-test with n1-1, n2-1 degrees of freedom.
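As a sketch of what the F-test computes, the code below forms the ratio of the two sample variances and its two-sided p-value by hand, then compares the result with var.test. The samples are simulated placeholders (hypothetical data), and the test assumes normal populations.

```r
# Hypothetical samples (not from the chapter)
set.seed(3)
x1 <- rnorm(25, mean = 0, sd = 5)
x2 <- rnorm(25, mean = 0, sd = 4)

# F statistic: ratio of sample variances, with n1 - 1 and n2 - 1 degrees of freedom
F_stat <- var(x1) / var(x2)
df1 <- length(x1) - 1
df2 <- length(x2) - 1

# Two-sided p-value: twice the smaller tail probability
p_lower <- pf(F_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

builtin <- var.test(x1, x2)
print(c(manual_F = F_stat, builtin_F = unname(builtin$statistic)))
print(c(manual_p = p_value, builtin_p = builtin$p.value))
```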
 Ex8-1) Comparing salaries for finance and marketing majors. Here
are the salary records of 50 randomly sampled recent graduates,
25 from each major. Can we infer that finance majors
obtain higher salaries than marketing majors?
> library(readxl)   # provides read_excel()
> Ex8_1 <- read_excel("NaverCloud/R/data/Ex8-1.xlsx")
> View(Ex8_1)
> attach(Ex8_1)
> var.test(Finance,Marketing)
F test to compare two variances
data: Finance and Marketing
F = 1.3745, num df = 24, denom df = 24, p-value = 0.4416
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.6056997 3.1191228
sample estimates:
ratio of variances
1.374501
 The F-statistic (ratio of the sample variances) is 1.3745. The p-value is 0.4416 >
0.05, so there is not enough evidence to infer that the population variances
differ, and we proceed with the equal-variances t-test.
> t.test(Finance, Marketing, var.equal=TRUE)
Two Sample t-test
data: Finance and Marketing
t = 1.0422, df = 48, p-value = 0.3026
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4833.352 15235.352
sample estimates:
mean of x mean of y
  65623.8   60422.8
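 The difference between the sample means is $65,623 - $60,422 =
$5,201 (quite large). But the standard error of the difference, computed
from the pooled variance, is also large ($4,991). The t-statistic is small
(1.04, below the 5% critical value of about 2.01 for 48 degrees of freedom)
and the two-sided p-value is 0.3026, so we cannot infer from these data
that finance majors attract higher salaries.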
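The question in Ex8-1 is one-sided (do finance majors earn more?), while the t.test output reports a two-sided p-value. As a rough check, the sketch below recomputes both p-values using only the t statistic and degrees of freedom printed in the output. With the data attached, the one-sided test could also be requested directly via t.test(Finance, Marketing, var.equal = TRUE, alternative = "greater").

```r
# Values copied from the t.test output for Ex8-1
t_stat <- 1.0422
df     <- 48

# Two-sided p-value (what the output reports) and one-sided p-value (H1: mu1 > mu2)
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

# 5% critical values for the two-sided and one-sided tests
t_crit_two <- qt(0.975, df)
t_crit_one <- qt(0.95, df)

print(round(c(p_two_sided = p_two_sided, p_one_sided = p_one_sided,
              t_crit_two = t_crit_two, t_crit_one = t_crit_one), 4))
```

Either way the t statistic stays below the critical value, so the conclusion does not change.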

 Ex 8-2)
Does a business do better after a change of leadership when the new
boss is the offspring of the owner, or does it do better
when an outsider is made chief executive officer? In pursuit of an
answer, researchers randomly selected 140 firms between 1994
and 2002, 30% of which passed ownership to an offspring and
70% of which appointed an outsider as CEO. The change in the
operating income as a proportion of assets before and after the
change was recorded. Do these data allow us to infer that the
effect of making an offspring CEO is different from the effect of
hiring an outsider as CEO?

> Ex8_2 <- read_excel("NaverCloud/R/data/Ex8-2.xlsx")
> View(Ex8_2)
> attach(Ex8_2)
> var.test(Offspring,Outsider)
F test to compare two variances
data: Offspring and Outsider
F = 0.47138, num df = 41, denom df = 97, p-value = 0.008095
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.2875692 0.8170022
sample estimates:
ratio of variances
0.4713825
 There is enough evidence to infer that the population variances differ
(p-value = 0.008095 < 0.05), so we use the unequal-variances t-test.
> t.test(Offspring, Outsider)
Welch Two Sample t-test
data: Offspring and Outsider
t = -3.2196, df = 110.75, p-value = 0.001685
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.1581458 -0.5136909
sample estimates:
mean of x mean of y
-0.100000 1.235918
 The t-statistic is -3.22 and the p-value is 0.0017. Accordingly, we
conclude there is sufficient evidence to infer that the mean
changes in operating income differ.
 We estimate that the mean change in operating income for
outsiders exceeds the mean change in operating income for
offspring by between 0.51 and 2.16 percentage points.
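The interval estimate for Ex 8-2 can be rebuilt from the summary numbers printed in the Welch t-test output, which is a useful sanity check on how the pieces fit together. This sketch uses only the printed sample means, t statistic and degrees of freedom. Because those are rounded, the result agrees with the printed interval only to within rounding.

```r
# Values copied from the Welch t-test output for Ex 8-2
mean_offspring <- -0.100000
mean_outsider  <-  1.235918
t_stat <- -3.2196
df     <- 110.75

diff <- mean_offspring - mean_outsider   # xbar1 - xbar2
se   <- diff / t_stat                    # standard error implied by t = diff / se

# 95% confidence interval: diff +/- t_{0.025, df} * se
ci <- diff + c(-1, 1) * qt(0.975, df) * se
print(round(ci, 3))
```

The sign is negative because the difference is taken as offspring minus outsider. Flipping the sign gives the statement that outsiders exceed offspring by roughly 0.51 to 2.16.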


 Matched Pairs Experiments
 Ex 8-3) We redo the experiment by grouping students according
to their GPA: from Group 1 (highest-GPA students) down to
Group 25 (lowest-GPA students). Matching a finance major with a
marketing major in each group, we can calculate the difference in salaries for each pair.
> Ex8_3 <- read_excel("NaverCloud/R/data/Ex8-3.xlsx")
> View(Ex8_3)
> attach(Ex8_3)
> var.test(Finance,Marketing)
F test to compare two variances
data: Finance and Marketing
F = 0.9479, num df = 24, denom df = 24, p-value = 0.8968
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.4177082 2.1510380
sample estimates:
ratio of variances
0.9478956
> t.test(Finance,Marketing,var.equal=TRUE,paired=TRUE)
Paired t-test
data: Finance and Marketing
t = 3.8097, df = 24, p-value = 0.0008511
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
2320.816 7808.224
sample estimates:
mean of the differences
5064.52
> detach(Ex8_3)
 The t-statistic is t = 3.81 with a p-value of 0.0009. There is now
overwhelming evidence to infer that finance majors obtain higher
salaries.
 We estimate that the mean salary offer to finance majors exceeds
the mean salary offer to marketing majors by between $2,321
and $7,808.
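A paired t-test is a one-sample t-test on the within-pair differences, with n - 1 degrees of freedom where n is the number of pairs. The sketch below illustrates this on simulated placeholder data (hypothetical, not the Ex8-3 file): each pair shares a common "ability" component, standing in for the GPA grouping, and that component cancels when the difference is taken.

```r
# Hypothetical matched-pairs data: 25 pairs sharing a pair-level effect
set.seed(11)
ability   <- rnorm(25, mean = 60000, sd = 15000)    # shared within each pair
finance   <- ability + rnorm(25, mean = 5000, sd = 6000)
marketing <- ability + rnorm(25, mean = 0,    sd = 6000)

# Paired test by hand: one-sample t on the differences, df = n - 1
d      <- finance - marketing
n      <- length(d)
t_stat <- mean(d) / (sd(d) / sqrt(n))

paired     <- t.test(finance, marketing, paired = TRUE)
one_sample <- t.test(d)                              # same test, stated on d
unpaired   <- t.test(finance, marketing, var.equal = TRUE)

print(c(manual_t = t_stat,
        paired_t = unname(paired$statistic),
        unpaired_t = unname(unpaired$statistic)))
```

Comparing paired_t with unpaired_t shows how much of the variation between students the matching removes in this simulated setting.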