Spearman’s? Chi-squared? Mann-Whitney? Choosing Your Test

Choosing the correct technique
• Choose your technique BEFORE collecting your data!
• Your choice will depend on:
    - What you want to test
    - What sort of data you can collect
• Once you’ve chosen your test, it will tell you how much data you must collect

What do you want to test?
• For Correlation (e.g. between hours in day-care and GCSE scores)
• For Association (e.g. is there an association between the gender of the advertiser in personal ads and whether physical attraction or resources are advertised?)
• For a Difference Between Two Cases (e.g. difference in recall of semantically similar and dissimilar words)

Testing for Correlation
Correlation can be positive, negative or zero
• Positive correlation: as one variable increases, so does the other. E.g. the higher the parental IQ, the higher the child’s IQ
• Negative correlation: as one variable increases, the other decreases. E.g. the more hours in day-care per week, the lower the GCSE points score
• No correlation: one variable increasing has no consistent effect on the other

Types of correlation
Two kinds of correlation:
• Straight line correlation: this measures how close your data are to a straight line graph. Data must be continuous (interval level).
• Rank correlation: this measures whether things are in the same order, but they don’t have to be in a straight line.
  Data do not have to be continuous, but they must be at least ordinal.

Choosing A Correlation Coefficient
• Straight line correlation: Pearson’s Product Moment Correlation Coefficient
    - Best to use if the data actually are near a straight line
    - If you have lots of data, easier to calculate than Spearman’s (use a spreadsheet)
    - Lets you work out the equation of a “best-fit” line
• Rank correlation: Spearman’s Rank Correlation Coefficient
    - Data do not need to be close to a straight line
    - Needs an absolute minimum of 4 data pairs, but more is better
    - Not valid if you have too many “ties”

Testing for Association
Chi-Squared Association Index
• This lets you investigate whether there’s any association between two factors: are they linked? E.g.:
    - Are gender and what is advertised in a “Personal Ad” linked?
    - If they are associated, then knowing that an advertiser is female tells us they are more likely to advertise physical attractiveness, say
• To do this test, we need:
    - Numbers of people/items etc. in categories (in the example, it would be the numbers of ads from each sex advertising resources or physical attractiveness)
    - Expected frequencies of at least 5 in each category
    - Data that need only be nominal/categorical

Testing For A Difference
• Difference between numbers of items in two or more categories (e.g. numbers of males and females choosing a certain A-level)
• Difference between averages (mean or median) (e.g. difference in scores on a memory test for semantically similar and dissimilar words)

Difference between numbers of items in 2+ categories
Your data need only be nominal/categorical
• Chi-squared: testing for a difference
• Sign Test
In either case, the null hypothesis is that there is no difference between the categories. So if images are classified as “beginning”, “middle” or “end”, equal numbers are expected in each class; for male and female students, approximately equal numbers of each are expected.

Which one to use?
• Chi-squared: testing for a difference
    - Use this if there are 3 or more categories (e.g. when comparing the number of images recalled from the beginning, middle and end of a sequence)
    - You can also use it to compare 2 categories
    - Data must be frequencies: numbers of people/items
• Sign Test
    - Use to see whether there are significantly more individuals in one category than in another, e.g. more males than females doing a particular A-level

Difference Between Averages
To choose the right test for averages, you must ask:
• Are the data paired or not? E.g.:
    - Data on “matched pairs”
    - Two test results for the same person (repeated measures)
• For paired data, is the size of the difference important, or just the fact there is a difference? E.g.:
    - For a change in pulse rate, the size is important
    - For a change in self-esteem, just “better” or “worse” is more useful
• Are the data likely to be normally distributed?
    - Only continuous data can be normal
    - You can check visually whether the data are normal with a diagram
[Diagram: distribution plot with axes “Value” and “Frequency”]

Which Test for Averages?
• Paired Data
    - If the size of the difference is unimportant (or you only have ordinal data), use the sign test
    - If the size of the difference is important, but the data are not normally distributed, use the Wilcoxon Signed Rank test
    - If the data are normally distributed, use the paired t-test
• Unpaired Data
    - If the data are not normally distributed, use the Mann-Whitney U-test
    - If the data are normally distributed, use the unpaired t-test
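
The rules for choosing a test for averages can be written as a small decision function. This is only a sketch in Python: the function name choose_average_test and its parameters are made up for illustration and do not come from any statistics package. It also assumes that when the size of the difference is unimportant, the sign test is chosen whether or not the data are normal, which is one reading of the rules.

```python
def choose_average_test(paired: bool, normal: bool, size_matters: bool = True) -> str:
    """Suggest a test for a difference between two averages.

    paired       -- True for matched pairs or repeated measures
    normal       -- True if the data are likely to be normally distributed
    size_matters -- for paired data only: False if just the direction of the
                    difference matters, or if the data are only ordinal
    """
    if paired:
        # Assumption: the sign test rule is checked first, before normality.
        if not size_matters:
            return "sign test"
        if not normal:
            return "Wilcoxon signed rank test"
        return "paired t-test"
    # Unpaired data: only normality decides the test.
    return "unpaired t-test" if normal else "Mann-Whitney U-test"


# Example: pulse rate before and after exercise for the same people,
# where the size of the change matters and the data look roughly normal.
print(choose_average_test(paired=True, normal=True))
```

A function like this only records the choice of test. Carrying out the test itself still needs the appropriate calculation or a statistics package.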