Uploaded by yeruzanne

Testing for an association between two species using the
Chi-squared test

1 First step in statistics is ALWAYS to define the hypotheses

Null hypothesis (H0): There is no significant difference between the
distribution of two species (i.e. distribution is random)
Alternative hypothesis (H1): There is a significant difference between the
distribution of species (i.e. species are associated)

2 Complete the contingency table of observed frequencies using the
data provided:

Observed values
                          Marsh bedstraw
                          present   absent   total
Bottle sedge   present       12       29       41
               absent         3       56       59
               total         15       85      100

Data from: https://www.geography-fieldwork.org/ecology/hydrosere/4-data-analysis.aspx
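The tallying in step 2 can be sketched in Python. This is only an illustration: the original quadrat records are not reproduced here, so the `records` list below is a hypothetical reconstruction built from the four cell counts in the table, and the names `tally` and `totals` are assumptions made for this example.

```python
from collections import Counter


def tally(records):
    """Count quadrats in each presence/absence combination.

    records: iterable of (sedge_present, bedstraw_present) booleans.
    Returns a 2x2 list: rows are bottle sedge present/absent,
    columns are marsh bedstraw present/absent.
    """
    counts = Counter(records)
    return [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]


def totals(table):
    """Return (row totals, column totals, grand total) for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


# Hypothetical records rebuilt from the cell counts (12, 29, 3, 56);
# these are NOT the original field data.
records = (
    [(True, True)] * 12
    + [(True, False)] * 29
    + [(False, True)] * 3
    + [(False, False)] * 56
)

observed = tally(records)
row_totals, col_totals, grand_total = totals(observed)
print(observed)      # [[12, 29], [3, 56]]
print(row_totals)    # [41, 59]
print(col_totals)    # [15, 85]
print(grand_total)   # 100
```

Checking that the row totals and the column totals both add up to the grand total is a quick way to catch a tallying mistake before moving on.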
3 Calculate expected values using the formula:

Expected value = (row total x column total) / grand total

e.g. for both species present: (41 x 15) / 100 = 6.15

Expected values
                          Marsh bedstraw
                          present   absent   total
Bottle sedge   present      6.15    34.85      41
               absent       8.85    50.15      59
               total       15       85        100

n.b. Expected values are what you would expect to find if there is no
association between the species.
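As a cross-check on step 3, the expected-value formula can be applied to every cell at once. The following is a minimal sketch, assuming the observed table is held as a list of rows without totals; the function name `expected_values` is made up for this example.

```python
def expected_values(observed):
    """Apply (row total x column total) / grand total to each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]


# Observed counts from the contingency table (totals excluded).
observed = [[12, 29], [3, 56]]
expected = expected_values(observed)

for row in expected:
    print([round(value, 2) for value in row])
```

The expected values keep the same row and column totals as the observed table, which is a useful check on the arithmetic.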
4 Calculate the Chi-squared value:

χ² = Σ (O – E)² / E

where O is the observed value and E is the expected value in each cell.

χ² = (12 – 6.15)² / 6.15 + … + (56 – 50.15)² / 50.15
   = 5.56 + 3.87 + 0.98 + 0.68
   = 11.10 (to 2 d.p., summing the unrounded terms)
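The sum in step 4 can be verified with a short Python sketch. It assumes the standard Pearson formula, the sum over all cells of (O – E)² / E, with no continuity correction, since none is mentioned above; the function name `chi_squared_terms` is an assumption for this example.

```python
def chi_squared_terms(observed, expected):
    """Return the (O - E)^2 / E contribution of each cell, row by row."""
    return [
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    ]


observed = [[12, 29], [3, 56]]
expected = [[6.15, 34.85], [8.85, 50.15]]

terms = chi_squared_terms(observed, expected)
chi2 = sum(terms)

# Cell order: sedge present (bedstraw present, absent),
# then sedge absent (bedstraw present, absent).
print([round(t, 2) for t in terms])
print(round(chi2, 2))
```

Summing the unrounded terms and rounding only at the end avoids the small drift that comes from adding terms already rounded to two decimal places.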
5 Determine the degrees of freedom:

Degrees of freedom (df)
= (rows* – 1) x (columns* – 1)
= (2 – 1) x (2 – 1)
= 1

*not including totals

n.b. for an association between two species df ALWAYS = 1
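The degrees-of-freedom rule is simple enough to do by hand, but a small sketch shows why the totals must be left out. The function name `degrees_of_freedom` is an assumption for this example.

```python
def degrees_of_freedom(table):
    """(rows - 1) x (columns - 1) for a table given WITHOUT totals."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)


# Correct: the 2x2 table of observed counts only.
observed = [[12, 29], [3, 56]]
df = degrees_of_freedom(observed)
print(df)  # 1

# Mistake to avoid: including the totals row and column makes the
# table look 3x3 and gives the wrong answer.
observed_with_totals = [[12, 29, 41], [3, 56, 59], [15, 85, 100]]
wrong_df = degrees_of_freedom(observed_with_totals)
print(wrong_df)  # 4
```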
6 Compare the χ² value with the critical values
and validate the hypotheses:

Critical values for the χ² distribution

        p (% certainty)
df      0.5       0.1       0.05      0.01      0.001
        (50%)     (90%)     (95%)     (99%)     (99.9%)
1       0.455     2.706     3.841     6.635     10.827
2       1.386     4.605     5.991     9.210     13.815
3       2.366     6.251     7.815     11.345    16.268
4       3.357     7.779     9.488     13.277    18.465
5       4.351     9.236     11.070    15.086    20.517

• It is usual to consider a result statistically significant at the 95%
certainty (p < 0.05) level.
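The comparison in step 6 can be sketched as a table lookup. The critical values below are copied from the table above; the function name `smallest_significant_p` and the choice of a strict "greater than" comparison are assumptions for this example, not something the worksheet specifies.

```python
# Critical values from the table above: CRITICAL_VALUES[df][p].
CRITICAL_VALUES = {
    1: {0.5: 0.455, 0.1: 2.706, 0.05: 3.841, 0.01: 6.635, 0.001: 10.827},
    2: {0.5: 1.386, 0.1: 4.605, 0.05: 5.991, 0.01: 9.210, 0.001: 13.815},
    3: {0.5: 2.366, 0.1: 6.251, 0.05: 7.815, 0.01: 11.345, 0.001: 16.268},
    4: {0.5: 3.357, 0.1: 7.779, 0.05: 9.488, 0.01: 13.277, 0.001: 18.465},
    5: {0.5: 4.351, 0.1: 9.236, 0.05: 11.070, 0.01: 15.086, 0.001: 20.517},
}


def smallest_significant_p(chi2, df):
    """Smallest tabulated p whose critical value chi2 exceeds, else None."""
    passed = [p for p, critical in CRITICAL_VALUES[df].items() if chi2 > critical]
    return min(passed) if passed else None


chi2, df = 11.10, 1  # values from steps 4 and 5

# Usual decision rule: reject the null hypothesis at p < 0.05.
reject_null = chi2 > CRITICAL_VALUES[df][0.05]
print(reject_null)                       # True
print(smallest_significant_p(chi2, df))  # 0.001
```

With χ² = 11.10 and df = 1, the value exceeds the 0.05 critical value of 3.841 and also the 0.001 critical value of 10.827, so under this table the null hypothesis would be rejected and the two species treated as associated.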