IB Studies 1/28/2014 Name Pearson's correlation coefficient, r

IB Studies
1/28/2014
Name ____________________________
Pearson’s correlation coefficient, r
Coefficient of determination, r²: If there is a causal relationship between two variables, then r² indicates
the degree to which change in the independent variable explains the change in the dependent variable.
Example
Father’s height (cm)   175  183  170  167  179  180  183  185  170
Son’s height (cm)      167  178  158  158  171  167  180  177  152
1) Find r:
2) Find r²:
r² = ____, so ____% of the variation in the son’s height can be explained by the variation in the father’s height.
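To check a calculator answer for questions 1 and 2, r and r² can also be computed directly from the definition. The sketch below is only an illustration. The function name `pearson_r` is made up for this example, and the pairing of fathers with sons is an assumption read off the flattened table (first father with first son, and so on).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Assumed pairing of the example data (heights in cm)
fathers = [175, 183, 170, 167, 179, 180, 183, 185, 170]
sons = [167, 178, 158, 158, 171, 167, 180, 177, 152]

r = pearson_r(fathers, sons)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Multiplying r² by 100 gives the percentage for the blank in the sentence above.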
The χ² test determines whether one category is related to another one or not. Or you might say that χ² tests
the difference between the observed and expected values we obtained from one sample. The calculated χ²
has to be compared with the critical value of χ² to conclude that the variables are not independent.
The critical value of χ² depends on 2 things:
a) the size of the table, measured by the degrees of freedom, df = (# rows − 1)(# columns − 1)
and
b) the significance level (usually 10%, 5%, or 1%), the largest probability we are willing to accept of
concluding that the variables are not independent when they actually are independent.
The table for the df and significance level gives the critical value of χ², above which we conclude the
variables are not independent.
Table of the chi square distribution

                          Level of Significance
DF   0.200   0.100   0.075   0.050   0.025   0.010   0.005   0.001  0.0005
1    1.642   2.706   3.170   3.841   5.024   6.635   7.879  10.828  12.116
2    3.219   4.605   5.181   5.991   7.378   9.210  10.597  13.816  15.202
3    4.642   6.251   6.905   7.815   9.348  11.345  12.838  16.266  17.731
4    5.989   7.779   8.496   9.488  11.143  13.277  14.860  18.467  19.998
5    7.289   9.236  10.008  11.070  12.833  15.086  16.750  20.516  22.106
6    8.558  10.645  11.466  12.592  14.449  16.812  18.548  22.458  24.104
7    9.803  12.017  12.883  14.067  16.013  18.475  20.278  24.322  26.019
8   11.030  13.362  14.270  15.507  17.535  20.090  21.955  26.125  27.869
9   12.242  14.684  15.631  16.919  19.023  21.666  23.589  27.878  29.667
10  13.442  15.987  16.971  18.307  20.483  23.209  25.188  29.589  31.421
For example, at a 5% significance level with df = 1, the critical value is 3.84. This means that at a
5% significance level, the departure of the observed values from the expected values is too great if χ²
> 3.84.
In order for χ² to be distributed appropriately, the sample size must be sufficiently large.
Generally it is sufficiently large if no value in the expected value table is less than 5. If an expected
value is less than 5, we can combine its column with an adjacent column.
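The table lookup described above can be sketched in a few lines. This is an illustration only: the function names are made up, the dictionary copies just the 0.050 column of the table, and the χ² value of 4.2 is a hypothetical number chosen to show the comparison.

```python
# 0.050 column of the chi-squared table, keyed by degrees of freedom (df 1 to 10)
CRITICAL_VALUE_5_PERCENT = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}

def degrees_of_freedom(n_rows, n_cols):
    """df = (rows - 1)(columns - 1); do not count any 'sum' row or column."""
    return (n_rows - 1) * (n_cols - 1)

def exceeds_critical_value(chi_squared, n_rows, n_cols):
    """True when chi_squared is above the 5% critical value for the table size."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi_squared > CRITICAL_VALUE_5_PERCENT[df]

print(degrees_of_freedom(2, 2))           # a 2 x 2 table has df = 1
print(exceeds_critical_value(4.2, 2, 2))  # hypothetical chi-squared of 4.2 vs 3.841
```

Note that rows and columns of totals are left out when counting the size of the table.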
3) How many degrees of freedom does the table shown below have?

       C    D    E    F   sum
A     23   17   43    7    90
B      7    3   17   13    40
sum   30   20   60   20   130
4) Enter the expected values into an appropriate table. If any expected values are insufficiently large,
combine an appropriate set of columns.
        Rarely   A few times   More than usual   Many
Large     12         13               8           10
Small     22         21               1           11
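One way to check the expected values in question 4 is sketched below. It assumes the table is read as two rows (Large, Small) of four counts each, with the first entry taken as 12. The function names are made up for this illustration, and each expected value is computed as (row total × column total) / grand total.

```python
def expected_table(observed):
    """Expected frequencies under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def small_expected_cells(observed, minimum=5):
    """(row, column) positions whose expected frequency is below `minimum`."""
    expected = expected_table(observed)
    return [(i, j)
            for i, row in enumerate(expected)
            for j, value in enumerate(row)
            if value < minimum]

def merge_adjacent_columns(observed, j):
    """New table in which columns j and j + 1 are added together."""
    return [row[:j] + [row[j] + row[j + 1]] + row[j + 2:] for row in observed]

# Assumed reading of the question 4 table: rows Large and Small
observed = [[12, 13, 8, 10],
            [22, 21, 1, 11]]

print(small_expected_cells(observed))          # cells that are too small
merged = merge_adjacent_columns(observed, 2)   # combine the 3rd and 4th columns
print(small_expected_cells(merged))            # empty list once no cell is below 5
```

Which adjacent column to merge with is a judgment call. The sketch merges the third column with the fourth, but merging it with the second would work the same way.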
When using your calculator to find χ², a p-value is also provided. This can be used, together with the χ²
value and the critical value, to decide whether or not to reject the hypothesis that the variables are independent.
For a given contingency table, the p-value is the probability of obtaining observed values as far or
further from the expected values, assuming the variables are independent.
If the p-value is smaller than the significance level, then it is unlikely that we would have obtained the
observed results if the variables had been independent. We therefore conclude that the variables are
not independent.
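The calculator reports the p-value directly, but the number can be reproduced by hand. The sketch below is not the calculator's method. It is one way to get the upper-tail probability of the chi-squared distribution for a whole-number df using only Python's standard library. The function name is made up, and the recurrence used is the standard one for the upper incomplete gamma function.

```python
from math import erfc, exp, gamma, sqrt

def chi_squared_p_value(chi_squared, df):
    """P(X >= chi_squared) for a chi-squared distribution with integer df >= 1."""
    if df < 1 or chi_squared < 0:
        raise ValueError("need df >= 1 and chi_squared >= 0")
    y = chi_squared / 2
    # Starting values: Q(1/2, y) = erfc(sqrt(y)) for odd df, Q(1, y) = exp(-y) for even df
    if df % 2 == 1:
        a, q = 0.5, erfc(sqrt(y))
    else:
        a, q = 1.0, exp(-y)
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1), repeated up to a = df / 2
    while a < df / 2:
        q += y ** a * exp(-y) / gamma(a + 1)
        a += 1
    return q

# The critical values in the 0.050 column should give p-values close to 0.05
for df, critical in [(1, 3.841), (2, 5.991), (3, 7.815)]:
    print(df, round(chi_squared_p_value(critical, df), 3))
```

A χ² value larger than the critical value gives a p-value smaller than the significance level, so the two decision rules agree.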
The Formal Test For Independence:
Step 1: State H0, called the null hypothesis. This is a statement that the two variables being considered
are independent.
State H1, called the alternative hypothesis. This is a statement that the two variables being
considered are not independent.
Step 2: State the rejection inequality χ² > k, where k is the critical value of χ².
Step 3: Construct the expected frequency table.
Step 4: Use technology to find χ².
Step 5: We either reject H0 or do not reject H0, depending on the result of the rejection inequality.
Step 6: We can also use the p-value to help us with our decision making.
For example, at a 5% significance level: If p < 0.05 we reject H0.
If p > 0.05 we do not reject H0 (don’t use “we accept”).
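Steps 2 to 5 can be sketched in code as a check on the calculator. The example uses the diabetes table from Problem 1 later in this worksheet, where the critical value is given as 5.99 and χ² is stated to be about 6.61. The function names are made up for this illustration, and χ² is computed as the sum of (observed − expected)² / expected over all cells.

```python
def expected_frequencies(observed):
    """Step 3: expected table, row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared(observed):
    """Step 4: sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_frequencies(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

def decide(observed, critical_value):
    """Steps 2 and 5: apply the rejection inequality chi-squared > k."""
    statistic = chi_squared(observed)
    if statistic > critical_value:
        return statistic, "reject H0"
    return statistic, "do not reject H0"

# Problem 1 data: rows Diabetic / Non-diabetic, columns Light / Medium / Heavy
diabetes_table = [[11, 19, 26],
                  [79, 68, 69]]
statistic, decision = decide(diabetes_table, critical_value=5.99)
print(f"chi-squared = {statistic:.2f}: {decision}")
```

The wording of the result follows the note above: the code never reports "accept H0", only "reject H0" or "do not reject H0".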
5) A survey was given to randomly chosen high school students from years 9 to 12 on possible changes
to the school’s canteen. The contingency table shows the results.
Year group     9    10    11    12
Change         7     9    13    14
No change     14    12     9     7
At a 5% significance level, test whether the students’ canteen preference depends on the year group.
Find:
a) Null hypothesis:
b) Alternative hypothesis:
c) Degrees of freedom:
d) Critical value at a 5% significance level (i.e., we reject H0 if χ² > ____?____):
e) What is the expected frequency table?
f) χ² = ____?____
g) Is χ² > critical value?
h) What conclusion do you make with this information?
i) What does the calculator say about the p-value? Is it > 0.05, which says not to reject H0, or is it < 0.05,
which means to reject H0?
We conclude that at a 5% level of significance, the variables year group and canteen preference
________ (are / are not) independent.
Problems:
1) This contingency table shows the responses of a randomly chosen sample of adults regarding their
weight and whether they are diabetic. At a 5% significance level, the critical value of χ² is
5.99. Test at a 5% level whether there is a link between weight and suffering from diabetes.
State H0.
Weight          Light   Medium   Heavy
Diabetic          11      19       26
Non-diabetic      79      68       69
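H0: The weight of a person is independent of whether they are diabetic.
χ² ≈ 6.61, df = 2, p ≈ 0.0368
We reject H0 if χ² > 5.99 or, equivalently, if p < 0.05. Here 6.61 > 5.99 and 0.0368 < 0.05, so we reject H0:
at a 5% significance level, weight and diabetes are not independent.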
2) Guests staying at a hotel are asked to provide their reason for traveling and to rate the hotel on a
scale from Poor to Excellent. The results are shown below.
State H0.
            Poor   Fair   Good   Excellent
Business     27     25     20        8
Holiday       9     17     23       30
Show that, at a 5% significance level, the variables reason for traveling and rating are not independent.
3) The hair and eye colors of 150 randomly selected individuals are shown in the table below.
State H0.
         Blonde   Black   Brunette   Red
Blue       14      10        21       5
Brown      11      32        20      12
Green       5       2        14       4
Find the critical value for χ². Determine whether there is an association between hair color and eye
color.
4) A study followed a random sample of 8474 people with normal blood pressure for about four years.
All the individuals were free of heart disease at the beginning of the study. Each person took the
Spielberger Trait Anger Scale test, which measures how prone a person is to sudden anger. Researchers
also recorded whether each individual developed coronary heart disease (CHD). This includes people
who had heart attacks and those who needed medical treatment for heart disease. Here is a two-way
table that summarizes the data.
          Low anger   Moderate anger   High anger
CHD           53           100              27
No CHD      3057          4621             606
a) State H0.
b) Find the expected values.
c) Find χ².
d) Find the p-value.
e) What conclusion do you make?
5) A great folk story is that dog owners and their dogs tend to look alike. Here is a two-way study that
investigates that theory.
                    Resembles owner   Doesn’t resemble
Purebred dogs              16                 9
Mixed-breed dogs            7                13
H0: There is no association between dog breed and resemblance to the owner in the population.
a) Determine the expected values.
b) What is the χ² value?
c) What is the p-value?
d) What is your conclusion?