The χ² (chi-squared) test
Hypothesis testing involves making a conjecture
(assumption) about some facet of our world, collecting
data from a sample, performing a specific
mathematical test, and then deciding whether or not
the data support the conjecture.
A conjecture must be stated in two parts:

› The null hypothesis (H0) states that there is no significant
difference between the two parameters being tested
(they are “not related to” each other, i.e. independent)
› The alternative hypothesis (H1) states that there is a significant
difference
(they are “related” in some way, i.e. dependent)

The only hypothesis test covered by the Studies SL
course is the Chi Squared test.

The chi-squared test itself is quite straightforward. Your GDC can
do it in two steps, but you must also know the formula and be
able to do it by hand.

The hypothesis test which uses Chi-square determines whether or
not two variables are related. It follows a general pattern:
(1) Make a conjecture
(2) Write the null hypothesis using “is not related to” or
“independent”,
and write the alternative hypothesis using “is related to” or
“dependent”
(3) Calculate the chi-squared statistic
(4) Determine the reference (critical) value
(5) Compare the two and either accept or reject the null
hypothesis

You can find chi-squared on your GDC by using the
statistics mode:
› Press Menu, 6: Statistics
› Press 7: Stat Tests
› Select 8: χ² 2-way Test
› Enter the name of the observed matrix

Note: You must first enter the data in Matrix mode
and name the matrix.
The table shows the results of a sample of 400 randomly selected
adults classified according to gender and regular exercise.
The observed table, called a 2 by 2 contingency table, is given as:

          Regular exercise   No regular exercise   Sum
Male                   110                   106   216
Female                  98                    86   184
Sum                    208                   192   400
The χ² test is used when we deal with two categorical variables
and we wish to determine whether the variables are dependent (for
example, females may tend to exercise more regularly) or
independent, where there is no evidence that a person's gender
has an effect on whether they exercise regularly.
Enter the observed table as a matrix and save it as a.
Then run the chi-squared test. After you press enter, the results will be shown.
How can we obtain the expected frequency table by hand?
For each cell, we multiply the row sum by the column sum and
divide by the total.
expected frequency = (row sum × column sum) / total
          Regular exercise   No regular exercise   Sum
Male      216 × 208 / 400    216 × 192 / 400       216
Female    184 × 208 / 400    184 × 192 / 400       184
Sum       208                192                   400
          Regular exercise   No regular exercise   Sum
Male                112.32                103.68   216
Female               95.68                 88.32   184
Sum                 208                   192      400
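The expected table above can also be checked with a few lines of Python. This is only a sketch: the function name expected_frequencies is made up for illustration and is not part of the course or the GDC. It applies the rule "row sum times column sum, divided by the total" to each cell.

```python
def expected_frequencies(observed):
    """Return the expected table: row sum * column sum / total for each cell."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


# Observed table: rows are Male, Female; columns are regular / no regular exercise
observed = [[110, 106],
            [98, 86]]

expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
```

The two printed rows should match the Male and Female rows of the expected table.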
How do we calculate chi-squared?
The chi-squared test examines the difference
between the observed values we obtained
from the original sample and the expected
values we have calculated. This value will be
obtained from your GDC.
In this case χ² = 0.217
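As a cross-check on the GDC value, the statistic can be computed directly from the observed table. The sketch below assumes the standard formula (sum over all cells of (fo - fe)² / fe); the function name chi_squared is hypothetical.

```python
def chi_squared(observed):
    """Sum (fo - fe)^2 / fe over every cell of a contingency table."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_sums[i] * col_sums[j] / total  # expected frequency
            statistic += (fo - fe) ** 2 / fe
    return statistic


chi2 = chi_squared([[110, 106], [98, 86]])
print(round(chi2, 3))  # 0.217
```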
The critical value of chi squared depends on the significance
level and degrees of freedom (size of the table).
For a contingency table which has r rows and c columns,
degrees of freedom df are:
df = (r - 1)(c - 1)
df = (r - 1)(c - 1)
= (2 - 1)(2 - 1) = 1
As 0.217 is less than the critical value 3.841 (5% significance level, df = 1),
we accept the null hypothesis and conclude
that gender and regular exercise are independent.
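The degrees-of-freedom rule and the comparison with the critical value can be sketched as follows. The function names are illustrative only, and the critical values are simply the ones quoted in these notes, typed in by hand rather than looked up by the code.

```python
def degrees_of_freedom(table):
    """df = (r - 1)(c - 1) for a table with r rows and c columns."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


def decide(chi2, critical_value):
    """Reject H0 only when the statistic exceeds the critical value."""
    return "reject H0" if chi2 > critical_value else "accept H0"


observed = [[110, 106],
            [98, 86]]
df = degrees_of_freedom(observed)
print(df, decide(0.217, 3.841))
```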
Example: favourite colour and gender.
Use a critical chi-squared value of 7.815.

          BLACK   WHITE   RED   BLUE   Total
Male         51      22    33     24     130
Female       45      36    22     27     130
Total        96      58    55     51     260
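Expected frequencies (row sum × column sum / total):

          BLACK            WHITE            RED              BLUE             Total
Male      130 × 96 / 260   130 × 58 / 260   130 × 55 / 260   130 × 51 / 260   130
Female    130 × 96 / 260   130 × 58 / 260   130 × 55 / 260   130 × 51 / 260   130
Total     96               58               55               51               260

Expected matrix from the GDC:
[48, 29, 27.5, 25.5]
[48, 29, 27.5, 25.5]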
As 6.13 is less than 7.815 we accept the null hypothesis.
Conclusion: The favourite colour is independent of gender.
Calculate degrees of freedom:
Number of rows r = 2
Number of columns c = 4
df = (2 - 1)(4 - 1) = 3
Hand calculation of χ²
We use the formula, as given in your formula booklet:
χ² = Σ (fo - fe)² / fe
Hand calculation of chi-squared. We need to construct the table:

fo    fe     fo - fe   (fo - fe)²   (fo - fe)² / fe
51    48       3          9          0.1875
22    29      -7         49          1.6897
33    27.5     5.5       30.25       1.1
24    25.5    -1.5        2.25       0.08824
45    48      -3          9          0.1875
36    29       7         49          1.68966
22    27.5    -5.5       30.25       1.1
27    25.5     1.5        2.25       0.08824
Total                                6.13084
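The same table can be built step by step in Python, which is a handy way to check hand working. This is a sketch only, and the variable names are chosen for illustration. Because the code keeps full precision in every row, its total can differ from the hand total in the last decimal places, but it agrees to two decimal places.

```python
# Observed favourite-colour table: rows are Male, Female
observed = [[51, 22, 33, 24],
            [45, 36, 22, 27]]

row_sums = [sum(row) for row in observed]
col_sums = [sum(col) for col in zip(*observed)]
total = sum(row_sums)

# One entry per cell: fo, fe, fo - fe, (fo - fe)^2, (fo - fe)^2 / fe
table_rows = []
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_sums[i] * col_sums[j] / total
        diff = fo - fe
        table_rows.append((fo, fe, diff, diff ** 2, diff ** 2 / fe))

for fo, fe, diff, squared, contribution in table_rows:
    print(fo, fe, diff, squared, round(contribution, 5))

chi2 = sum(entry[4] for entry in table_rows)
print("Total:", round(chi2, 2))
```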


From what Lauren observed, she believes that the
number of hours exercised per week is dependent on
gender. She collected data randomly and organised the
results in the table shown.
Determine whether there is enough evidence to
accept or reject the null hypothesis:
› a) for α = 0.01
› b) for α = 0.05
› c) for α = 0.10

Hours exercised per week
Male      5   10   12
Female    9    8    4

Write the null and alternative hypotheses:
› H0: The number of hours exercised each week is independent of gender
› H1: The number of hours exercised each week is dependent on gender

Calculate chi-squared and the p-value (χ² test on the GDC):
χ² = 4.69 (3 s.f.)
p = 0.0959 (3 s.f.)
df = 2

Whilst it is not technically correct
to say “accept H0”, it is still
accepted in the IB.
Compare the p-value to each
significance level:
a) 0.0959 > 0.01, hence accept the
null hypothesis
b) 0.0959 > 0.05, hence accept the
null hypothesis
c) 0.0959 < 0.10, hence reject
the null hypothesis
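The GDC output for this example can be reproduced with the Python standard library. This sketch relies on one fact that holds only when df = 2: the p-value of a chi-squared statistic x is exp(-x / 2). The function names are illustrative, and for other degrees of freedom a statistics package or the GDC is needed instead.

```python
import math


def chi_squared(observed):
    """Sum (fo - fe)^2 / fe over every cell of a contingency table."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return sum(
        (fo - row_sums[i] * col_sums[j] / total) ** 2
        / (row_sums[i] * col_sums[j] / total)
        for i, row in enumerate(observed)
        for j, fo in enumerate(row)
    )


def p_value_df2(chi2):
    """Upper-tail probability for a chi-squared variable with df = 2 only."""
    return math.exp(-chi2 / 2)


# Rows are Male, Female; columns are the three hours-per-week categories
observed = [[5, 10, 12],
            [9, 8, 4]]
chi2 = chi_squared(observed)
p = p_value_df2(chi2)
print(round(chi2, 2), round(p, 4))

decisions = []
for alpha in (0.01, 0.05, 0.10):
    decisions.append("reject H0" if p < alpha else "accept H0")
    print(alpha, decisions[-1])
```

H0 is rejected only at α = 0.10, in line with parts a) to c).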

This formula is on the IB formula sheet:

χ²calc = Σ (fo - fe)² / fe

› fo is the observed frequency
(i.e. the raw data)
› fe is the expected frequency

It is easiest to perform this sum calculation
using a table one step at a time.

If you are comparing the p-value with the α-level, then if:
› p > α → accept the null hypothesis
› p < α → reject the null hypothesis

If you are comparing χ² with the critical value (CV), then if:
› χ² < CV → accept the null hypothesis
› χ² > CV → reject the null hypothesis