Data Analysis Decision Matrix
By Dr. Kevin Kaeochinda
Capstone II Workshop Series

How to use this Data Analysis Matrix:
• Write down your hypothesis and identify your independent variable(s) (IV) and your dependent variable(s) (DV)
  – IV: Type of treatment (e.g., therapy or no therapy); DV: Outcome (e.g., depression)
• How many do you have of each?
  – For example: Treatment (therapy or no therapy) is ONE I.V. with two levels
  – Depression is ONE D.V. with one level (although it is continuous)
• Are you measuring the outcome(s) on the same population or on different populations? If you apply both levels of Treatment (no therapy and then therapy) to the same group of people, then your samples are “dependent”
• Are your variables Categorical, Ordinal, or Interval?
  – Categorical/Nominal: Two or more categories with no true order (e.g., Male and Female; hair color)
  – Ordinal: Same as categorical, but the categories have a logical order (e.g., first place, second place, third place, etc.)
  – Interval/Continuous: Has order, and there are values in between (e.g., income in dollars)

Other things to consider:
• Do you have any groupings?
  – Are you going to compare male versus female, ethnic groups, age groups, or another grouping (e.g., hours spent on Facebook)?
  – Write down your groups so you can see them visually.
• For most tests, you should check the parametric assumptions before conducting a parametric test (if they are not met, a non-parametric equivalent may be available)
  – Normality
  – Homogeneity of variances

Descriptive & Frequencies
• You should always run a “Frequency” analysis prior to any other analysis
  – Also break down your analysis by group to get means/modes/medians and standard deviations
• Include a histogram or other graphical representation to get a “feel” for your data
  – How many N’s? How many males or females? Students or non-students? How many missing?

1 D.V.
No. of D.V.’s | Nature of I.V. | Nature of D.V. | Analysis to perform
1 | 0 I.V.’s (1 population), e.g., whole sample | Interval & Normal | One-sample t-test
1 | 0 I.V.’s (1 population), e.g., whole sample | Ordinal or Interval | Wilcoxon signed rank test
1 | 1 I.V. with 2 levels (independent groups), e.g., Female/Male, Students/Non-students | Interval & Normal | 2 independent sample t-test
1 | 1 I.V. with 2 levels (independent groups), e.g., Female/Male, Students/Non-students | Ordinal or Interval | Wilcoxon-Mann Whitney test
1 | 1 I.V. with 2+ levels (independent groups) | Interval & Normal | One-way ANOVA
1 | 1 I.V. with 2+ levels (independent groups) | Ordinal or Interval | Kruskal Wallis
1 | 1 I.V. with 2 levels (dependent/matched groups) | Interval & Normal | Paired sample t-test
1 | 1 I.V. with 2 levels (dependent/matched groups) | Ordinal or Interval | Wilcoxon signed ranks test
1 | 2 or more I.V.’s | Interval & Normal | Factorial ANOVA
1 | 2 or more I.V.’s | Ordinal or Interval | Ordered logistic regression
1 | 1 interval I.V. | Interval & Normal | Correlation and/or simple linear regression
1 | 1 interval I.V. | Ordinal or Interval | Non-parametric correlation
1 | 1 interval I.V. | Categorical | Simple logistic regression
1 | 1 or more interval I.V.’s and/or 1 or more categorical I.V.’s | Interval & Normal | Multiple regression; Analysis of Covariance (ANCOVA)
1 | 1 or more interval I.V.’s and/or 1 or more categorical I.V.’s | Categorical | Multiple logistic regression; Discriminant analysis

2+ D.V.’s
No. of D.V.’s | Nature of I.V. | Nature of D.V. | Analysis to perform
2+ | 1 I.V. with 2 or more levels (independent groups) | Interval & Normal | One-way MANOVA
2+ | 2+ I.V.’s | Interval & Normal | Multivariate multiple linear regression
2+ | 0 I.V.’s | Interval & Normal | Factor analysis
2+ sets of 2+ | 0 I.V.’s | Interval & Normal | Canonical correlation

The above analyses are advanced. Please see me before trying to conduct the analyses boxed in red.
Retrieved and simplified from: http://goo.gl/ohT922

One Sample t-test
• The one-sample t-test is used to determine whether a sample comes from a population with a specific mean. This population mean is not always known, but is sometimes hypothesized. For example, you want to show that a new teaching method for pupils struggling to learn English grammar can improve their grammar skills to the national average.
Your sample would be the pupils who received the new teaching method, and your population mean would be the national average score.

In SPSS
• Analyze >> Compare Means >> One-Sample T-test
• Transfer your Dependent Variable (in this case, dep_score) to the Test Variable(s) box.
• In the Test Value box, enter the population mean to compare to the sample mean.
• Click OK

How to interpret your output
The write-up:
• Dep_score is Dependent Score
• “A single-sample t-test was conducted to compare the Dependent Scores of all participants to the average Dependent Score of students at MCU. The results were significant, t(9)=4.47, p<.001.”

Wilcoxon Signed Rank Test
• Non-parametric equivalent of the dependent (paired-samples) t-test
• For example, you could use a Wilcoxon signed-rank test to understand whether there was a difference in smokers' daily cigarette consumption before and after a 6-week hypnotherapy program (i.e., your dependent variable would be "daily cigarette consumption", and your two related groups would be the cigarette consumption values "before" and "after" the hypnotherapy program).
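For readers working outside SPSS, the signed-rank idea in the example above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure. The before/after cigarette counts are hypothetical, the function name wilcoxon_signed_rank is made up for this sketch, and the p-value uses the large-sample normal approximation without tie or continuity corrections, which is rough for a sample this small.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, z, two_sided_p) using the normal approximation."""
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p


# Hypothetical daily cigarette counts for 10 smokers (not from the slides).
before = [20, 15, 30, 25, 18, 22, 12, 40, 16, 28]
after = [12, 14, 18, 20, 21, 15, 10, 25, 10, 17]

w_plus, z, p = wilcoxon_signed_rank(before, after)
print(f"W+ = {w_plus}, z = {z:.2f}, p = {p:.3f}")
```

For real analyses, a library routine such as scipy.stats.wilcoxon is preferable to a hand-rolled version like this one.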
In SPSS
• Analyze >> Nonparametric Tests >> Related Samples (scan data option)
• In the Objective tab: Customize Analysis
• Fields tab: Move the Pre and Post test variables over to Test Fields
• In the Settings tab: tick Wilcoxon matched-pairs signed rank (2 samples)
• Click the Run button at the bottom
• In the output, double-click the window under Hypothesis Test Summary
• The Model Viewer will open, with the test results shown on the right side

The write-up:
• Pre-test: Health problems before smoking
• Post-test: Health problems after starting smoking
• “A Wilcoxon signed-rank test indicated that health problems were greater after participants started smoking (M=4.2, SD=1.99) than before (M=1.9, SD=1.20), test statistic = 55.00, p < .01.”

Independent Samples T-test
• In this analysis, you test whether a DV (e.g., Writing score) differs between the two independent levels (e.g., Female and Male) of one IV (e.g., Gender).

In SPSS
• Analyze >> Compare Means >> Independent Samples t-Test
• Writing is the Test Variable(s) – the DV
• Gender is the Grouping Variable, with groups defined as 1 and 2 – the IV
• In the resulting output, first look at Levene’s Test for Equality of Variances. If it is non-significant, report the t-test in the row “Equal variances assumed”. If it is significant, report the t-test in the row “Equal variances not assumed”.
• This example’s Levene’s test is non-significant, so we assume equal variances

The write-up:
• “An independent samples t-test was used. There was a significant difference in writing scores between females (M=88.2, SD=4.81) and males (M=73.2, SD=10.52), t(8)=2.90, p < .05.”
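The choice between the two output rows comes down to two formulas for the standard error and the degrees of freedom. The sketch below computes both from summary statistics, using the means and SDs from the write-up above. It assumes 5 participants per group, which is inferred from df = 8 and is not stated in the slides. The function name t_from_summary is hypothetical. The sketch does not run Levene's test or compute a p-value.

```python
from math import sqrt


def t_from_summary(m1, s1, n1, m2, s2, n2, equal_var=True):
    """Return (t, df) for two independent groups from means, SDs, and sizes."""
    v1, v2 = s1 ** 2, s2 ** 2
    if equal_var:
        # "Equal variances assumed" row: pooled variance, df = n1 + n2 - 2.
        df = n1 + n2 - 2
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # "Equal variances not assumed" row: Welch's t with
        # Welch-Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Means and SDs come from the write-up; n = 5 per group is an ASSUMPTION
# inferred from df = 8 (group sizes are not given in the text).
t_pooled, df_pooled = t_from_summary(88.2, 4.81, 5, 73.2, 10.52, 5)
t_welch, df_welch = t_from_summary(88.2, 4.81, 5, 73.2, 10.52, 5,
                                   equal_var=False)

print(f"Equal variances assumed:     t({df_pooled}) = {t_pooled:.2f}")
print(f"Equal variances not assumed: t({df_welch:.2f}) = {t_welch:.2f}")
```

Under these assumptions the pooled version reproduces the reported t(8) = 2.90. Because the two groups are the same size, Welch's t is identical and only the degrees of freedom shrink (to about 5.6).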
Wilcoxon-Mann Whitney test
• This is the non-parametric counterpart of the independent samples t-test, used when your dependent variable is ordinal.
• For example, you could use the Mann-Whitney U test to understand whether attitudes towards pay discrimination, where attitudes are measured on an ordinal scale, differ based on gender (i.e., your dependent variable would be "attitudes towards pay discrimination" and your independent variable would be "gender", which has two groups: "male" and "female").

In SPSS
• Analyze >> Nonparametric Tests >> Independent Samples
• Objective tab: Customize analysis should be ticked
• Fields tab: Move your DV (AttitudePay, the attitude towards pay discrimination) to Test Fields and Gender to Groups
• Settings tab: Tick Customize Tests, then tick Mann-Whitney U (2 samples)
• Click Run
• Double-click the results table in the Output window to open the Model Viewer
• The test results appear on the right side of the Model Viewer
• Reporting: “A Mann-Whitney U test was used to test for group differences in attitudes to pay discrimination. Females’ attitudes to pay discrimination (M=3.0, SD=1.58) did not differ significantly from males’ (M=2.6, SD=1.52), U = 10.50, z = -.430.”

One-Way ANOVA
• A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable (with two or more categories) and a normally distributed interval dependent variable, and you wish to test for differences in the means of the dependent variable across the levels of the independent variable.
• For example, you might test whether high school, college, and Sunday soccer players differ in training time.

In SPSS
• Analyze >> Compare Means >> One-way ANOVA
• TrainingTime (DV) goes in the Dependent List and Soccer (IV) goes into the Factor list.
• Click the Post-Hoc button and make sure Tukey is selected. This will be important after the initial test.
• Press OK
• You will need to look at two boxes in the Output. The first (the ANOVA table) shows whether there are any differences among the groups.
• If the first box is significant, then you can also report from the second box (the post-hoc comparisons).
• Reporting for two boxes: “A one-way ANOVA was conducted to compare the effect of Soccer Player Type on Training Time in High School, College, and Sunday soccer players. There was a significant effect of soccer player type on training time, F(2,7)=159.92, p<.001. Post-hoc comparisons using the Tukey HSD test indicated that College soccer players had significantly greater training times (M=19.5, SD=1.73) than High School soccer players (M=7.33, SD=1.15) and Sunday soccer players (M=1.00, SD=.00). It was also found that High School soccer players had greater training times than Sunday soccer players.”

Kruskal Wallis Test
• The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test that can be used to determine whether there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable.
• For example, your D.V. is Exam performance on a continuous scale of 0-100 and your I.V. is Work, with three independent groups (levels): On-campus, Off-campus, and No Work.
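To make "ANOVA on ranks" concrete, here is a minimal Python sketch of the H statistic for the three Work groups. The exam scores are hypothetical, the function name kruskal_wallis_h is made up for this illustration, and no tie correction is applied. The p-value relies on the chi-square approximation, which is only rough for groups this small. With three groups (df = 2) the chi-square upper tail has the closed form exp(-H/2), so no statistics library is needed.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Return (H, df) for a list of groups, without a tie correction."""
    # Pool all scores, remembering which group each one came from.
    pooled = sorted((x, g) for g, xs in enumerate(groups) for x in xs)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # tied scores share their average rank
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += avg_rank
        i = j + 1
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    h = 12 / (n * (n + 1)) * sum(
        r ** 2 / len(xs) for r, xs in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    return h, len(groups) - 1


# Hypothetical exam scores (0-100); these are NOT the workshop data.
on_campus = [72, 85, 78, 90]
off_campus = [55, 60, 58]
no_work = [88, 76, 93]

h, df = kruskal_wallis_h([on_campus, off_campus, no_work])
# For df = 2 only: chi-square survival function is exactly exp(-H / 2).
p = exp(-h / 2)
print(f"H({df}) = {h:.2f}, p = {p:.3f}")
```

For more than three groups, or when many scores are tied, a library routine such as scipy.stats.kruskal is the safer choice.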
In SPSS
• Analyze >> Nonparametric Tests >> Independent Samples
• Objective tab: Customize analysis
• Fields tab: TestScores (DV) in Test Fields, Work (IV) in Groups
• Settings tab: tick Kruskal-Wallis 1-Way ANOVA (k samples)
• Click Run
• In the Output window, double-click the results table to open the Model Viewer
• The test results and a graph of the groups appear in the Model Viewer
• Reporting: “A Kruskal-Wallis H test was conducted. There was no statistically significant difference between Worker types (H(2)=5.80, p = .06), with a mean rank of 7 for On-campus, 2 for Off-campus, and 7 for No Work.”*
  – * You can get the “Mean rank” by mousing over the groups in the Model Viewer graph.

Paired sample t-test
• The dependent t-test (called the paired-samples t-test in SPSS) compares the means of two related groups on the same continuous dependent variable.
• For example, students enter an anxiety workshop, with pre-scores being anxiety levels before the program and post-scores being anxiety levels afterward (both measured on a continuous scale of 0-100).

In SPSS
• Analyze >> Compare Means >> Paired Sample t-Test
• PreAnxiety and PostAnxiety variables are entered as one pair (Variable 1 and Variable 2)
• Click OK

The write-up:
• “A paired-samples t-test was conducted to test whether students entering an anxiety-reducing program had significantly less anxiety afterward. A pre-test on anxiety level (M=61.0, SD=26.45) and a post-test on anxiety level (M=30.3, SD=18.92) were recorded. There was a significant difference between pre- and post-test levels of anxiety, t(9)=9.36, p <.001.”
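The paired-samples t statistic is simply a one-sample t-test on the pre-minus-post differences, which the short Python sketch below illustrates. The anxiety scores are hypothetical (10 students, so df = 9 as in the write-up, but not the workshop data), and the function name paired_t is made up for this example. Rather than computing a p-value, it compares t with the two-tailed .05 critical value for df = 9 taken from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test."""
    # Work on the within-person differences (positive = anxiety went down).
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    # t = mean difference / standard error of the mean difference
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical anxiety scores (0-100) for 10 students; NOT the workshop data.
pre = [80, 65, 90, 45, 70, 55, 85, 30, 60, 75]
post = [50, 40, 60, 20, 35, 30, 55, 10, 25, 45]

t, df = paired_t(pre, post)

# Two-tailed .05 critical value of t for df = 9, from a standard t table.
T_CRIT_DF9 = 2.262
significant = abs(t) > T_CRIT_DF9
print(f"t({df}) = {t:.2f}, significant at .05: {significant}")
```

The design choice here is to reduce the two related columns to a single column of differences first; every later step then matches the one-sample t-test.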