Chapter 17: Statistical Analysis

CONTENTS
• The statistics approach
• Statistical tests
  – Types of data and appropriate tests
  – Chi-square
  – Comparing two means: the t-test
  – A number of means: one-way analysis of variance
  – A table of means: factorial analysis of variance
  – Correlation
  – Linear regression
  – Multiple regression
  – Factor and cluster analysis

The statistics approach
• Probabilistic statements
• The normal distribution
• Probabilistic statement formats
• Significance
• The null hypothesis
• Dependent and independent variables

A. J. Veal & S. Darcy (2014) Research Methods for Sport Studies and Sport Management: A practical guide. London: Routledge

Probabilistic statements
• Descriptive: e.g. 10% of adults play tennis
• Comparative: e.g. 10% play tennis, but 12% play golf
• Relational: e.g. 15% of people with high incomes play tennis but only 7% of people with low incomes do so: there is a positive relationship between tennis-playing and income
• However: when based on a sample, the above statements must be made in a probabilistic format

Probabilistic statements contd
• We can be 95% confident that the proportion of adults that plays tennis is between 9% and 11%
• The proportion of golf players is significantly higher than the proportion of tennis players (at the 95% level of probability)
• There is a positive relationship between level of income and level of tennis playing (at the 95% level)
• (See discussion of confidence intervals: Chapt. 13)

Probabilistic statement formats
• 95% probability
  – sometimes expressed as 5%
  – sometimes as 0.05
• 99% probability is also used
  – also expressed as 1% or 0.01
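The statement "we can be 95% confident that the proportion is between 9% and 11%" can be illustrated with the usual normal-approximation confidence interval for a proportion. This is a minimal sketch, not the book's procedure: the function name `proportion_confidence_interval` is hypothetical, and the sample size of 3,457 is an assumption chosen so that the margin is about one percentage point, since the slides do not state a sample size.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    p: sample proportion (0 to 1); n: sample size;
    z: 1.96 for the 95% level, 2.576 for the 99% level.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

# Hypothetical sample: 10% of 3,457 respondents play tennis (n is assumed).
low, high = proportion_confidence_interval(0.10, 3457)
print(f"95% confidence interval: {low:.1%} to {high:.1%}")
```

Passing `z=2.576` gives the wider interval that corresponds to the 99% (0.01) format.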
Normal distribution (Fig. 17.1): a. Drawing repeated samples (theory)
[Figure: values from repeated samples plotted around the population value; horizontal axis: sample values, from -4 to +4 either side of the population value]

Normal distribution contd: b. Normal distribution/curve
[Figure: the sample values forming a normal curve centred on the population value; horizontal axis: sample values, from -4 to +4]

Normal curve (Fig. 13.1)
[Figure: normal curve; vertical axis: number of samples; horizontal axis: standard errors, from -4 to +4 either side of the population value; 95% of samples in the centre, 2.5% in each tail]

Significance
• Statistically significant: unlikely to have happened by chance (i.e. highly probable that the finding is real)
• Level of significance is affected by sample size (not by population size)
• The probability of a finding happening by chance is related to the normal curve and similar theoretical distributions
• But NB: small differences or weak relationships may not be socially or managerially significant, even when they are statistically significant

Null hypothesis
• H0 – Null hypothesis: there is no significant difference or relationship
• H1 – Alternative hypothesis: there is a significant difference or relationship
• e.g.
  – H0: tennis and golf participation levels are the same
  – H1: tennis and golf participation levels are significantly different

Dependent and independent variables
[Figure: Independent variables 1, 2 and 3, each linked to the dependent variable]
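The point on the Significance slide that the level of significance depends on sample size can be sketched with a two-proportion z-test of the null hypothesis that tennis and golf participation are the same. This is an illustration under stated assumptions, not the book's method: it treats the 10% and 12% figures as coming from two independent samples of equal size (a simplification), the sample sizes are invented, and the function name `two_proportion_p_value` is hypothetical.

```python
import math

def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for H0: the two population proportions are equal."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical sample sizes; 10% and 12% are the example figures from the slides.
p_small = two_proportion_p_value(0.10, 0.12, 500, 500)
p_large = two_proportion_p_value(0.10, 0.12, 5000, 5000)
for label, p in (("n = 500 per group", p_small), ("n = 5,000 per group", p_large)):
    verdict = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"{label}: p = {p:.3f} -> {verdict}")
```

With these assumed sizes the same two-point difference is not significant at the 0.05 level in the smaller samples but is significant in the larger ones.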
Statistical tests

| Task | Format of data | No. of variables | Types of variable | Test |
| --- | --- | --- | --- | --- |
| Relationship between 2 variables | Crosstabulation of frequencies | 2 | Nominal | Chi-square |
| Difference between 2 means – paired | Means: for a whole sample | 2 | Two scale/ordinal | t-test – paired |
| Difference between 2 means – independent samples | Means: for 2 subgroups | 2 | 1. scale/ordinal (means); 2. nominal (2 groups only) | t-test – independent samples |

Statistical tests contd

| Task | Format of data | No. of variables | Types of variable | Test |
| --- | --- | --- | --- | --- |
| Relationship between 2 variables | Means: for 3+ subgroups | 2 | 1. scale/ordinal (means); 2. nominal (3+ groups) | One-way analysis of variance |
| Relationship between 3 or more variables | Means: crosstabulated | 3+ | 1. scale/ordinal (means); 2. two or more nominal | Factorial analysis of variance |

Statistical tests contd

| Task | Format of data | No. of variables | Types of variable | Test |
| --- | --- | --- | --- | --- |
| Relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Correlation |
| Linear relationship between 2 variables | Individual measures | 2 | Two scale/ordinal | Linear regression |
| Linear relationship between 3+ variables | Individual measures | 3+ | Three or more scale/ordinal | Multiple regression |
| Relationships between large nos of variables | Individual measures | Many | Large nos of scale/ordinal | Factor analysis; Cluster analysis |

Data
• Extended version of Campus Sporting Life survey with
  – additional variables
  – additional cases
• See Appendix 17.2
• SPSS used, as in Chapt. 16
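The first row of the tests table pairs a crosstabulation of two nominal variables with the chi-square test. The slides run it in SPSS; the sketch below only illustrates what the statistic measures, using standard-library Python. The counts are invented, the function names `chi_square_statistic` and `p_value_df1` are hypothetical, and the significance calculation assumes a 2 x 2 table (one degree of freedom).

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a crosstabulation (list of rows of counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def p_value_df1(chi2):
    """Significance of a chi-square value with one degree of freedom (2 x 2 table)."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: rows = two student-status groups, columns = male, female.
counts = [[30, 20], [15, 35]]
chi2 = chi_square_statistic(counts)
print(f"chi-square = {chi2:.3f}, significance = {p_value_df1(chi2):.3f}")
```

Applied to the chi-square value of 6.522 reported in Fig. 17.5, `p_value_df1` gives approximately 0.011, which agrees with the reported significance if that crosstabulation is 2 x 2; the slides do not state the degrees of freedom.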
Chi-square (χ²)
• Testing the relationship between two variables presented in a frequency crosstabulation
• Null/alternative hypotheses:
  – H0: there is no relationship between student status and gender in the population
  – H1: there is a relationship between status and gender in the population
• Findings (Fig. 17.5):
  – Value of Chi-square: 6.522
  – Significance: 0.011
  – Less than 0.05 (5%)
  – Conclusion: H0 rejected, H1 accepted: there is a relationship

Comparing two means: t-test
• Paired samples: whole sample: comparing means for two variables
• Independent samples: sample divided into two groups (e.g. males and females) and comparing means for one variable

Comparing 2 means: t-test: Paired samples (Fig. 17.9)
• Example 1: Compare average times played sport in last 3 months (12.2) with average times visited national parks (9.8)
• Difference is 2.4
• Value of t is 1.245
• Significance is 0.219, which is larger than 0.05
• Null hypothesis is accepted: difference is not significant

Comparing 2 means: t-test: Independent samples (Fig. 17.10)
• Compare course costs for males ($110.00 pa) and females ($136.60)
• Difference is $26.60
• Value of t is -1.245
• Significance is 0.219
• Null hypothesis is accepted: difference is not significant

One-way analysis of variance (ANOVA) (Figs 17.11, 17.13)
• Means of one variable for groups defined by another variable
• F-test rather than t-test
• e.g. Means of times played sport by student status:
  – F/T student/no paid work: mean = 9.7 times in 3 months
  – F/T student/paid work: 9.6 times
  – P/T student – F/T job: 19.1 times
  – P/T student – Other: 12.2 times
• Value of F: 2.485, Significance 0.072, which is greater than 0.05
• Null hypothesis accepted: no significant relationship between status and sport
• But for ‘going out for a meal’: F = 6.64 and Sig. = 0.001, which is less than 0.05, so null hypothesis rejected: there is a significant relationship

Factorial analysis of variance (Figs 17.14, 17.15)
• A table of means: two variables and means of a third
• e.g. Mean visits to theatre by gender by student status

Mean number of visits in three months

| Status | Male | Female |
| --- | --- | --- |
| F/T student/no paid work | 3.1 | 1.5 |
| F/T student/paid work | 1.6 | 5.4 |
| P/T student – F/T job | 1.4 | 2.6 |
| P/T student/Other | 3.5 | 3.2 |

• Status not significant and gender not significant
• But for status x gender: F = 3.681, Sig. = 0.019, which is < 0.05, so null hypothesis rejected: there is a significant relationship

Correlation (Fig. 17.16)
• Watched sport by income: weak positive: r = 0.46
• Played sport by income: weak negative: r = -0.44
• Sport exp. by income: strong positive: r = 0.91

Correlation (Fig. 17.18)
• Correlation coefficient (r) expresses the relationship numerically
• No relationship: r = 0
• Exact relationship: r = 1 (positive) or -1 (negative)
• Correlation matrix shows correlations between a number of variables

Correlation matrix (simplified Fig. 17.18)

| | Income | Sport | Watch sport | Visit park | Meal | Sport exp. |
| --- | --- | --- | --- | --- | --- | --- |
| Income | 1.00 | | | | | |
| Sport | -.44** | 1.00 | | | | |
| Watch sport | .46** | -.68** | 1.00 | | | |
| Visit park | .02 | .27 | -.29* | 1.00 | | |
| Meal | .08 | .45** | -.29* | -.04 | 1.00 | |
| Sport exp. | .91** | -.37** | .38 | .06 | .12 | 1.00 |

* = significant at the 0.05 level
** = significant at the 0.01 level

Regression
• Fits a best-fit ‘regression line’ to a scatterplot (Fig. 17.21)
• The best fit may be a curve (Fig. 17.22)

Multi-variate analysis
• Multiple regression has one dependent variable and a number of independent, influencing, variables
• One development: Structural Equation Modelling explores inter-relationships between a number of variables
• Cluster and factor analysis: combine large numbers of variables into groups, e.g. lifestyle or personality groups
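The correlation coefficient and the best-fit regression line described in the correlation and regression slides can both be computed from the same sums of squares. The sketch below is illustrative only: the data values are invented (they are not the Campus Sporting Life survey data), and the function names `pearson_r` and `regression_line` are hypothetical.

```python
import math

def _sums(xs, ys):
    """Sums of squares and cross-products about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy

def pearson_r(xs, ys):
    """Correlation coefficient r: 0 = no relationship, +1 or -1 = exact relationship."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def regression_line(xs, ys):
    """Least-squares slope and intercept of the line y = intercept + slope * x."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: income ($000s) and annual sport expenditure ($) for six people.
income = [10, 20, 30, 40, 50, 60]
sport_exp = [120, 150, 240, 260, 330, 400]

r = pearson_r(income, sport_exp)
slope, intercept = regression_line(income, sport_exp)
print(f"r = {r:.2f}; sport_exp = {intercept:.1f} + {slope:.2f} x income")
```

In this sketch income is treated as the independent variable and sport expenditure as the dependent variable; swapping the arguments to `regression_line` fits a different line, whereas r is the same either way.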