Chapter 17: Statistical Analysis
CONTENTS
• The statistics approach
• Statistical tests
– Types of data and appropriate tests
– Chi-square
– Comparing two means: the t-test
– A number of means: one-way analysis of variance
– A table of means: factorial analysis of variance
– Correlation
– Linear regression
– Multiple regression
– Factor and cluster analysis
The statistics approach
• Probabilistic statements
• The normal distribution
• Probabilistic statement formats
• Significance
• The null hypothesis
• Dependent and independent variables
A. J. Veal & S. Darcy (2014) Research Methods for Sport Studies and Sport Management: A practical guide. London: Routledge
Probabilistic statements
• descriptive: e.g. 10% of adults play tennis
• comparative: e.g. 10% play tennis, but 12% play golf
• relational: e.g. 15% of people with high incomes play tennis but only 7% of people with low incomes do so: there is a positive relationship between tennis-playing and income.
• However: when based on a sample, the above must be made using a probabilistic format
Probabilistic statements contd
• We can be 95% confident that the proportion of
adults that plays tennis is between 9% and 11%
• The proportion of golf players is significantly higher
than the proportion of tennis players (at the 95%
level of probability)
• There is a positive relationship between level of
income and level of tennis playing (at the 95% level)
• (See discussion of Confidence intervals: Chapt 13).
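The first statement above can be illustrated with a minimal sketch of a 95% confidence interval for a sample proportion, using the usual normal approximation. The sample size of 3457 is a hypothetical figure chosen only so that the interval comes out close to 9% to 11%; the slides do not state the sample size.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Approximate confidence interval (low, high) for a sample proportion p
    from a sample of size n; z = 1.96 corresponds to 95% confidence."""
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error

# Assumption: n = 3457 is illustrative, not taken from the source.
low, high = proportion_ci(0.10, 3457)
print(f"We can be 95% confident the proportion is between {low:.1%} and {high:.1%}")
```

With a smaller sample the same 10% finding would give a wider interval.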
Probabilistic statement formats
• 95% probability
– sometimes expressed as 5%
– sometimes as 0.05
• 99% probability is also used
– also expressed 1% or 0.01
Normal distribution (Fig. 17.1)
a. Drawing repeated samples (theory)
[Figure: repeated samples plotted by sample value, from -4 to +4 either side of the population value]
b. Normal distribution/curve
[Figure: the same samples forming a normal curve centred on the population value; horizontal axis: sample values, -4 to +4]
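Normal curve (Fig. 13.1)
[Figure: normal curve centred on the population value. Vertical axis: number of samples. Horizontal axis: standard errors, -4 to +4. 95% of samples fall in the central area, with 2.5% in each tail.]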
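The 95% / 2.5% / 2.5% split on the normal curve can be checked numerically. This is a small sketch using Python's standard-library `statistics.NormalDist`; it is not part of the original slides, which use SPSS.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1 (units: standard errors)

# Number of standard errors that leaves 2.5% in the upper tail
z = std_normal.inv_cdf(0.975)

# Area between -z and +z, and area in one tail
central_area = std_normal.cdf(z) - std_normal.cdf(-z)
one_tail = 1 - std_normal.cdf(z)

print(f"z = {z:.2f}, central area = {central_area:.3f}, each tail = {one_tail:.3f}")
```

The cut-off is roughly 1.96 standard errors either side of the population value, which is why 1.96 appears in 95% confidence interval calculations.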
Significance
• Statistically significant: unlikely to have happened by chance (i.e. highly probable that the finding is real)
• Level of significance is affected by sample size (not by population size)
• The probability of a finding happening by chance is related to the normal curve and similar theoretical distributions.
• But NB: small differences or weak relationships may not be socially or managerially significant, even when they are statistically significant
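The effect of sample size on significance can be sketched as follows. The example assumes, purely for illustration, that the 10% and 12% participation figures come from two independent samples of equal size; the function name and the sample sizes are hypothetical, and the test used is the standard normal-approximation test for two proportions.

```python
import math

def two_proportion_p(p1, p2, n):
    """Two-sided significance of the difference between two proportions,
    each from an independent sample of size n (normal approximation)."""
    pooled = (p1 + p2) / 2  # equal sample sizes, so the pooled value is a simple average
    standard_error = math.sqrt(pooled * (1 - pooled) * 2 / n)
    z = abs(p1 - p2) / standard_error
    return math.erfc(z / math.sqrt(2))  # two-tailed area under the normal curve

# Same 2-point difference, two hypothetical sample sizes
for n in (500, 5000):
    print(n, round(two_proportion_p(0.10, 0.12, n), 4))
```

The same two-point difference is not significant at the 5% level with the smaller samples but is with the larger ones, which is also why a statistically significant result may still be too small to matter in practice.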
Null hypothesis
• H0 – Null hypothesis: there is no significant difference or relationship
• H1 – Alternative hypothesis: there is a significant difference or relationship
• eg.
– H0: tennis and golf participation levels are the same;
– H1: tennis and golf participation levels are significantly different.
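The decision rule that follows from these hypotheses can be written as a short sketch. The function name `decide` is hypothetical; the significance values are the ones reported in the test slides of this chapter, and the wording of the outcomes follows the slides.

```python
def decide(significance, alpha=0.05):
    """Reject H0 when the reported significance is below the chosen level (default 5%)."""
    return "H0 rejected" if significance < alpha else "H0 accepted"

# Significance values as reported in the chi-square, t-test and one-way ANOVA slides
for test_name, significance in [("chi-square", 0.011), ("t-test", 0.219), ("one-way ANOVA", 0.072)]:
    print(f"{test_name}: Sig. = {significance} -> {decide(significance)}")
```

Passing `alpha=0.01` applies the stricter 99% level instead.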
Dependent and independent variables
[Diagram: Independent variable 1, Independent variable 2 and Independent variable 3, each linked to the Dependent variable]
Statistical tests
| Task | Format of data | No. of variables | Types of variable | Test |
| --- | --- | --- | --- | --- |
| Relationship between 2 variables | Crosstabulation of frequencies | 2 | Nominal | Chi-square |
| Difference between 2 means - paired | Means: for a whole sample | 2 | Two scale/ordinal | t-test - paired |
| Difference between 2 means - independent samples | Means: for 2 subgroups | 2 | 1. scale/ordinal (means); 2. nominal (2 groups only) | t-test - independent samples |
| Relationship between 2 variables | Means: for 3+ subgroups | 2 | 1. scale/ordinal (means); 2. nominal (3+ groups) | One-way analysis of variance |
| Relationship between 3 or more variables | Means: crosstabulated | 3+ | 1. scale/ordinal (means); 2. two or more nominal | Factorial analysis of variance |
| Relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Correlation |
| Linear relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Linear regression |
| Linear relationship between 3+ variables | Individual measures | 3+ | Three or more scale/ordinal | Multiple regression |
| Relationships between large nos of variables | Individual measures | Many | Large nos of scale/ordinal | Factor analysis; Cluster analysis |
Data
• Extended version of Campus Sporting Life
survey with
– additional variables
– additional cases
• See Appendix 17.2
• SPSS used, as in Chapt. 16
Chi-square (χ²)
• Testing the relationship between two variables presented in a frequency crosstabulation.
• Null/alternative hypotheses:
– H0 - there is no relationship between student status and gender in the population
– H1 - there is a relationship between status and gender in the population
• Findings (Fig. 17.5):
– Value of Chi-square: 6.522
– Significance: 0.011
– Less than 0.05 (5%)
– Conclusion: H0 rejected, H1 accepted: there is a relationship
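As a rough illustration of what the package is doing, the sketch below computes a Pearson chi-square statistic from a crosstabulation and converts a chi-square value to a significance level. Two things are assumptions: the counts in `example_counts` are made up (the survey crosstabulation is not reproduced in the slides), and the significance function is valid only for 1 degree of freedom, i.e. a 2 x 2 table.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square for a crosstabulation given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # count expected if H0 is true
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def significance_df1(chi2):
    """Upper-tail probability of a chi-square value with 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi2 / 2))

example_counts = [[20, 10], [10, 20]]  # hypothetical 2 x 2 table, not survey data
print(round(chi_square_statistic(example_counts), 3))

# Assuming 1 degree of freedom, the reported value of 6.522 gives about 0.011
print(round(significance_df1(6.522), 3))
```

Under the 1-degree-of-freedom assumption the reported chi-square value and significance are consistent with each other.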
Comparing two means: t-test
• Paired samples: whole sample: comparing
means for two variables
• Independent samples: sample divided into
two groups (eg. males and females) and
comparing means for one variable
Comparing 2 means: t-test : Paired samples (Fig. 17.9)
• Example 1: Compare average times played sport in
last 3 months (12.2) with average times visited
national parks (9.8)
• Difference is 2.4
• value of t is 1.245
• Significance is 0.219, which is larger than 0.05
• Null hypothesis is accepted: difference is not
significant
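A paired-samples t-test can be sketched with the standard library alone. The function names are hypothetical, the significance is obtained by numerically integrating the t distribution (a statistics package would normally supply this), and the 49 degrees of freedom used in the last line is an assumption: the slide does not state the number of cases.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """t statistic and degrees of freedom for two variables measured on the same cases."""
    differences = [a - b for a, b in zip(x, y)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / math.sqrt(n))
    return t, n - 1

def t_two_sided_p(t, df, steps=2000):
    """Two-sided significance of t, using Simpson's rule on the t density (steps must be even)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def density(v):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(v * v / df))

    upper = abs(t)
    h = upper / steps
    area = density(0) + density(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    area *= h / 3                    # area under the curve between 0 and |t|
    return 1 - 2 * area              # what is left in the two tails

# Assumption: 49 degrees of freedom (50 cases), which is not stated in the source
print(round(t_two_sided_p(1.245, 49), 3))
```

Under that assumption the result is close to the 0.219 shown on the slide.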
Comparing 2 means: t-test: Independent samples (Fig. 17.10)
• Compare course costs for males ($110.00 pa) and
females ($136.60)
• Difference is $28.60
• value of t is -1.245
• significance is 0.219
• Null hypothesis is accepted: difference is not
significant
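When the two means come from separate groups, such as males and females, the t statistic is calculated differently from the paired case. The sketch below uses the pooled-variance (equal variances assumed) form; the function name and the example figures are hypothetical and are not survey data.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two separate groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical costs for two small groups
t, df = independent_t([100, 120, 110], [130, 150, 125])
print(round(t, 3), df)
```

The sign of t simply reflects which group is listed first: a negative value means the first group has the lower mean.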
One-way analysis of variance (ANOVA)(Fig. 17.11, 13)
• Means of one variable for groups defined by another variable
• F-test rather than t-test
• eg. Means of times played sport by student status:
– F/T student/no paid work: mean = 9.7 times in 3 months
– F/T student/paid work: 9.6 times
– P/T student – F/T job: 19.1 times
– P/T student – Other: 12.2 times
• Value of F: 2.485, Significance 0.072, which is greater than 0.05
• Null hypothesis accepted: no relationship between status and sport
• But for ‘going out for a meal’: F = 6.64 and Sig. = 0.001, which is less than 0.05,
so null hypothesis rejected: there is a significant relationship
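The F value in a one-way analysis of variance compares the variation between group means with the variation within groups. A minimal sketch follows; the function name and the three small groups are hypothetical, and converting F to a significance level needs the F distribution, which is left to a statistics package.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic with its between-groups and within-groups degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([value for g in groups for value in g])

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data: three groups of three cases
print(one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A large F means the group means differ by more than the spread within groups would suggest.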
Factorial analysis of variance (Fig. 17.14, 15)
• A table of means: two variables and means of a third
• eg. Mean visits to theatre by gender by student status

Mean number of visits in three months
| Status | Male | Female |
| --- | --- | --- |
| F/T student/no paid work | 3.1 | 1.5 |
| F/T student/paid work | 1.6 | 5.4 |
| P/T student - F/T job | 1.4 | 2.6 |
| P/T student/Other | 3.5 | 3.2 |

• Status not significant and gender not significant
• But for status x gender: F = 3.681, Sig. = 0.019, which is <0.05, so null hypothesis rejected: there is a significant relationship.
Correlation (Fig. 17.16)
• Watched sport by income: weak positive: r = 0.46
• Played sport by income: weak negative: r = -0.44
• Sport exp. by income: strong positive: r = 0.91
Correlation (Fig. 17.18)
• Correlation coefficient (r) expresses the relationship numerically
• No relationship: r = 0
• Exact relationship: r = +1 (positive) or -1 (negative)
• Correlation matrix shows correlations between a number of variables
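The correlation coefficient can be calculated directly from paired values. This is a minimal sketch of Pearson's r; the function name and the example figures are hypothetical and are not survey data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # How far x and y move together, relative to how much each varies on its own
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))     # positive, but not exact
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))    # exact negative relationship
```

Applying the same function to every pair of variables produces a correlation matrix.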
Correlation matrix (simplified Fig. 17.18)
| | Income | Sport | Watch sport | Visit park | Meal | Sport exp. |
| --- | --- | --- | --- | --- | --- | --- |
| Income | 1.00 | | | | | |
| Sport | -.44** | 1.00 | | | | |
| Watch sport | .46** | -.68** | 1.00 | | | |
| Visit park | .02 | .27 | -.29* | 1.00 | | |
| Meal | .08 | .45** | -.29* | -.04 | 1.00 | |
| Sport exp. | .91** | -.37** | .38 | .06 | .12 | 1.00 |

* = significant at the 0.05 level
** = significant at the 0.01 level
Regression
Fits a best-fit ‘regression line’ to a scatterplot (Fig. 17.21)
Regression: best fit may be a curve (Fig. 17.22)
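For the straight-line case, the best-fit line is the one that minimises the squared vertical distances between the points and the line (ordinary least squares). A minimal sketch, with a hypothetical function name and made-up data:

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x  # the line passes through the point of means
    return slope, intercept

# Hypothetical data lying exactly on y = 1 + 2x
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

Fitting a curve, as in Fig. 17.22, needs a different model and is not covered by this sketch.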
Multi-variate analysis
• Multiple regression has one dependent variable and
a number of independent, influencing, variables
• One development: Structural Equation Modelling
explores inter-relationships between a number of
variables
• Cluster and factor analysis: combine large numbers
of variables into groups – eg. lifestyle or personality
groups