CHAPTER TWELVE
ANALYSING DATA I: QUANTITATIVE DATA ANALYSIS

QUANTITATIVE ANALYSIS
Analysis should always have a purpose:
• Describe.
• Compare.
• Examine similarities.
• Examine differences.
The issue is not just HOW we collect data, but how we generate useful information from it.

DESCRIPTIVE STATISTICS
Summarise and organise data.

Measures of central tendency
• Mean: the average (sum of scores / number of scores).
• Mode: the most common value, i.e. the ‘typical’ value.
• Median: the middle value when the scores are placed in order.

Findings can be presented in a number of ways:

Frequency tables
How often do you go training in a week? (n = number of responses)

Times per week    n     %
1                 17    25.0
2                 29    42.6
3                 22    32.4

Always provide a table of results.

USING GRAPHS AND CHARTS
Only present a graph/chart if it illustrates something.
These describe data; they do not explain anything.

INFERENTIAL STATISTICS
Allow you to make inferences from data.
Use at least 2 variables.
‘What effect does the independent variable have on the dependent variable?’
Causality: is A caused by B?

TYPES OF TEST
1. Parametric tests. These tests use interval or ratio data (see Chapter 6 for a reminder). Parametric tests assume that the data is drawn from a normally distributed population (i.e. the data is not skewed) and that the groups have the same variance (or spread) on the variables being measured.
2. Non-parametric tests. These are used with ordinal or nominal data, and do not make any assumptions about the characteristics of the sample in terms of its distribution.

TESTS OF ASSOCIATION
CORRELATION
Correlations investigate the relationship between two variables consisting of interval or ratio data. A correlation can indicate:
• Whether there is a relationship between the two variables.
• The direction of the relationship, i.e. whether it is positive or negative.
• The strength, or magnitude, of the relationship.
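As a rough illustration of the three points above, the sketch below computes Pearson's correlation coefficient (r) in plain Python. The function name pearson_r and the training and fitness figures are hypothetical, made up purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How the two variables vary together, and how each varies on its own.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: training sessions per week and a fitness score.
sessions = [1, 2, 3, 4, 5]
fitness = [2, 4, 5, 4, 5]

r = pearson_r(sessions, fitness)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 2), direction)  # 0.77 positive
```

The sign of r gives the direction of the relationship and its distance from zero gives the strength. Whether the relationship is statistically significant is a separate question that r alone does not answer.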
Correlation scores (r) range from 1 to -1:
r = 1 perfect positive correlation
r = -1 perfect negative correlation
r = 0 no correlation

A strong correlation does not necessarily mean a causal relationship! e.g. lectures attended positively correlates with final grade. It may be that:
more lectures attended = more interest
more interest = higher grade
In that case the link between attendance and grade is a spurious relationship: a third variable (interest) lies behind it.

TESTING DIFFERENCES
Tests of difference generally assess whether differences between two samples are likely to have occurred by chance, or whether they are the result of the effect of a particular variable.

THE INDEPENDENT SAMPLES T-TEST
This examines whether the mean scores of two different groups can be considered as being significantly different. It can be used when:
• The data is interval or ratio in nature.
• The groups are randomly assigned (hence, you should use an ANOVA rather than a t-test to compare differences between males and females, as gender is not randomly determined when you come to assign your groups).
• The two groups are independent of each other.
• The variance, or spread, in the two groups is equal.

PAIRED SAMPLES T-TEST
The paired t-test measures whether the mean of a single group is different when measured at different times.

ANALYSIS OF VARIANCE (ANOVA)
ANOVA is similar in nature to the independent t-test; however, it allows you to ascertain differences between more than two groups. If you are looking to explore gender differences, then this is a more appropriate test to use than an independent t-test as it does not assume that participants have been randomly assigned to each group.

THE MANN-WHITNEY TEST
An alternative to the independent t-test. Used when the data is ordinal, or otherwise does not meet the assumptions of a parametric test. This test works by ranking the scores rather than using the actual values: all scores are ranked together (so the lowest score would be ranked ‘1’, the next lowest ‘2’ and so on), ignoring the group to which each participant belonged.
The principle of the test is that if the two groups did not differ, then the sum of the ranks in each group should be about the same (for groups of equal size).

THE WILCOXON SIGNED RANK TEST
Similar to the Mann-Whitney test; however, it examines differences where the two sets of scores are from the same participants (effectively it is a non-parametric alternative to the paired samples t-test).

THE KRUSKAL-WALLIS TEST
This is a non-parametric alternative to the ANOVA test, and can be used to identify differences between three or more independent groups.

WHICH TEST SHOULD I USE?
The type of data that you collect will be important in your final choice of test:

Nominal
Consider a chi-squared test if you are interested in differences in frequency counts using nominal data, for example comparing whether month of birth affects the sport that someone participates in.

Ordinal
If you are interested in the relationship between two variables, then use Spearman’s correlation. If you are looking for differences between independent groups, then a Mann-Whitney test may be appropriate. If the groups are paired, however, then a Wilcoxon signed rank test is appropriate. If there are three or more groups then consider a Kruskal-Wallis test.

Interval or ratio
Are you looking to identify relationships between two variables? If so, consider the use of a Pearson’s correlation. If there are three or more variables, then consider multiple regression. If you are concerned with differences between scores, then t-tests or ANOVA may be appropriate. If you want to identify differences within one group, then a paired samples t-test should be used. If you are comparing two randomly assigned groups, then use an independent samples t-test. If you are looking to compare two non-randomly assigned groups, or three or more groups, then use ANOVA.

INTERPRETING THE FINDINGS
SPSS/Excel will tell you the correlation and tell you if it is significant or not. Significance concerns how likely it is that a result like yours could have arisen by chance alone.
Take two groups and measure average height… the height of group A is 165cm and that of group B is 155cm… are they different?
You will be given a score, e.g. p = 0.05:
• If there were really no difference between the groups, a difference at least this large would turn up by chance about 5% of the time.
• The smaller the p-value, the harder the result is to put down to chance alone. This is not the same as a 95% likelihood that the difference was a consequence of treatment.

MISTAKES SOMETIMES MADE IN QUANTITATIVE ANALYSIS
• Choosing an incorrect statistical test, often through applying parametric tests to non-parametric data.
• Designing the questionnaire so that the data is in the incorrect format for the appropriate statistical test; for example, the format of the response may provide you with ordinal data when you require interval data.
• Misinterpreting a p-value, or deciding upon an inappropriate level of significance.
• Deciding upon a level of significance after undertaking the analysis.

SUMMARY
1. Descriptive statistics allow you to organise and summarise your data. Inferential statistics allow you to draw inferences regarding the association or difference between two or more variables.
2. Inferential tests will provide you with a ‘p-value’. The p-value indicates how likely it is that an association or difference (depending upon the test) at least as large as the one you found would arise by chance alone. A p-value of 0.05 indicates that, if there were no actual difference or association, a result like yours would be expected in only 5 cases out of 100.
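One way to see what a p-value measures is to reshuffle the data. The sketch below is not a t-test: it swaps in an exact permutation test, chosen only because it needs nothing beyond the Python standard library. The heights are hypothetical values picked so that the group means match the 165cm and 155cm example, and permutation_p_value is an illustrative name, not a function from SPSS or Excel.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference between two means.

    Returns the share of all possible regroupings of the pooled scores whose
    difference in means is at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    size_a = len(group_a)
    extreme = 0
    total = 0
    # Try every way of splitting the pooled scores into groups of the same sizes.
    for chosen in combinations(range(len(pooled)), size_a):
        picked = set(chosen)
        a = [pooled[i] for i in picked]
        b = [pooled[i] for i in range(len(pooled)) if i not in picked]
        if abs(mean(a) - mean(b)) >= observed - 1e-9:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical heights in cm (assumed data): means are 165 and 155.
heights_a = [160, 163, 165, 167, 170]
heights_b = [150, 153, 155, 157, 160]

p_value = permutation_p_value(heights_a, heights_b)
print(round(p_value, 3))  # 0.016
```

With these made-up figures only 4 of the 252 possible regroupings give a gap of 10cm or more, so chance alone rarely produces a difference this large. Trying every regrouping is only practical for small groups like these.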