Inferential Statistics
ARIEL F. MELAD
2020 (EDITED)

Learning Objectives
At the end of the module, you should be able to:
- review the types of variables and the different levels of measurement
- differentiate between parametric and non-parametric tests
- choose the appropriate statistical tool for a test of difference or a test of relationship
- perform basic statistical analysis using statistical software
- interpret and report the statistical results of the analysis

Topics
- Summary Statistics
- Appropriate Statistical Methods
- Most common statistical tests
  - Independent samples t-test
  - Paired t-test
  - One-way Analysis of Variance (ANOVA)
  - Chi-square test
  - Pearson correlation

Summary Statistics
Summary of descriptive and graphical statistics

| Chart | Variable type | Purpose | Summary statistics |
|---|---|---|---|
| Pie chart or bar chart | One categorical | Shows frequencies/percentages/proportions | Class percentages |
| Stacked/multiple bar | Two categorical | Compares proportions within groups | Percentages within groups |
| Histogram | One scale | Shows distribution of results | Mean and standard deviation |
| Scatter graph | Two scale | Shows relationship between two variables | Correlation coefficient |
| Boxplot | One scale/one categorical | Compares spread of values | Median and IQR |
| Line chart | Scale by time | Displays changes over time; comparison of groups | Means by time point |

Appropriate Statistical Methods
Things to consider:
1. Which are the dependent and independent variables?
2. What type of measurement scale/data is involved?
3. How many variables are there?
4. Is it a test of difference or a test of relationship?

Common single comparison tests

| Comparing | Dependent (outcome) variable | Independent (explanatory) variable | Parametric test (data is normally distributed) | Non-parametric test (ordinal/skewed data) |
|---|---|---|---|---|
| Average of two independent groups | Scale | Nominal (binary) | Independent samples t-test | Mann-Whitney test/Wilcoxon rank sum |
| Average of 3+ independent groups | Scale | Nominal | One-way ANOVA | Kruskal-Wallis test |
| Average difference between paired (matched) samples, e.g. weight before and after | Scale | Time/condition variable | Paired t-test | Wilcoxon signed rank test |
| 3+ measurements on the same subject | Scale | Time/condition variable | Repeated measures ANOVA | Friedman test |

Tests of Association

| Comparing | Dependent (outcome) variable | Independent (explanatory) variable | Parametric test (data is normally distributed) | Non-parametric test (ordinal/skewed data) |
|---|---|---|---|---|
| Relationship between 2 continuous variables | Scale | Scale | Pearson's correlation coefficient | - |
| Predicting the value of one variable from the value of a predictor variable, or looking for a significant relationship | Scale | Any | Simple linear regression | Transform the data |
| Same as above, with a binary outcome | Nominal (binary) | Any | Logistic regression | - |
| Assessing the relationship between two categorical variables | Categorical | Categorical | - | Chi-square test |

Statistical Significance
Statisticians often choose a cut-off point under which a p-value must fall for a finding to be considered statistically significant. If the p-value is less than or equal to 0.05 (5%), the result is deemed statistically significant, i.e., there is a significant relationship between the variables. Use the p-value as an indicator of statistical significance.

Two forms of inferential statistics exist: tests that examine associations and tests that examine differences.

Most common statistical tests
1. Descriptive Analytics uses percentages, frequency counts, means, ranks and weighted means.

Example 1. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: What is the profile of the participants in terms of sex, age, initial BMI (interpretation) and final BMI (interpretation)?
Variables: sex, age, initial and final BMI (interpretation).
Types of data: all variables are categorical.

Descriptive Analytics
Interpretation: The table shows the profile of the participants in terms of sex. Of the participants, 18 (46.2%) are male and 21 (53.8%) are female. This shows that a slight majority of the participants in the weight management activity are female.

According to Craft, Carroll and Lustyk (2014), in their study entitled "Gender Differences in Exercise Habits and Quality of Life Reports", women reported significantly higher exercise and quality-of-life levels than men. Women reported exercising for weight loss and toning more than men, whereas men reported exercising for enjoyment more than women.

The table shows the initial weight classification of the participants before the weight management activity. Of the participants, 21 (53.8%) have a normal weight, 11 (28.2%) are overweight and only 7 (17.9%) are obese. This shows that most of the participants had a normal weight before the activity was conducted.

According to the U.S. National Library of Medicine, regular exercise and physical activity may help you control your weight, reduce your risk of heart disease, help your body manage blood sugar and insulin levels, improve your mental health and mood, reduce your risk of some cancers, and more.

Can you interpret? The mean age (x̄) is 23.10 years.

Can you interpret this data?

Most common statistical tests
1. Independent samples t-test
Independent variable: categorical (dichotomous)
Dependent variable: scale/continuous
Use: A t-test is used to compare the means of two independent groups or categories.
Plot: Box plots or histograms of the dependent variable for each group (a histogram of differences applies to paired data).
Important: the dependent variable should be approximately normally distributed within each group.
Interpretation: If the p-value is < 0.05, there is a significant difference between the means of the two groups. Report the means of the two groups, or the mean difference and confidence interval from the SPSS output, to describe the difference.

Example 1. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: Is there a significant difference between the weight goals of the male and female participants before the weight management activity?
Variables: sex and weight goal
Independent variable: sex
Dependent variable: weight goal
Sex is categorical with only 2 groups (dichotomous).
Weight goal is scale/numerical data.

Sample write-up:
"An independent samples t-test was conducted to examine whether there was a significant difference between the male and female groups in relation to their weight goal before the conduct of the weight management activity. The test revealed a statistically significant difference between the male and female groups, since the p-value is below the 0.05 level of significance (t = 2.058, df = 36, p = .047). The male group (mean = 61.67, SD = 6.86) reported a significantly higher weight goal than the female group (mean = 56.67, SD = 8.11)."

It can be concluded that the male participants were more confident about weight loss before the weight management activity than the female participants.

For men, exercise itself was the best predictor of quality of life. In other words, higher levels of exercise in men were associated with higher quality of life. This suggests that men may be able to improve their quality of life with increased exercise, no matter what reasons for exercise they give (Craft, Carroll & Lustyk, 2014).
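The t statistic and degrees of freedom that SPSS reports for this kind of comparison can be reproduced by hand. The following is a minimal Python sketch, assuming the equal-variances (pooled) form of the independent samples t-test. The function name `independent_t` and the weight-goal values are hypothetical and are not taken from the "Weight Management" file. The sketch stops at t and df; the p-value still has to be read from the t distribution, which SPSS does for you.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_error
    return t, df


# Hypothetical weight goals (illustration only, not the course data).
male_goal = [60, 65, 58, 70, 62, 66]
female_goal = [55, 52, 60, 57, 50, 58]

t, df = independent_t(male_goal, female_goal)
print(f"t = {t:.3f}, df = {df}")
```

A positive t here means the first group has the higher mean, which is how the direction of the difference in the write-up is read.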
One-way ANOVA (Analysis of Variance)
Independent variable: categorical (at least three categories)
Dependent variable: scale/continuous
Use: One-way ANOVA is used to detect differences in the means of 3 or more independent groups. ANOVA is the generalized version of the independent samples t-test.
Plot: Box plots or confidence interval plots
Important: the dependent variable must be normally distributed.

Note: Post-hoc tests allow you to determine where the significant differences lie. When the ANOVA is found to be significant, you must examine which pairs of groups differ significantly: in the post-hoc tests table, look at the mean differences between the different pairs.

Note: There are many post-hoc tests to choose from when doing an ANOVA. The Scheffe post-hoc test should be selected when equal variances are assumed, and the Games-Howell post-hoc test should be selected when they are not.

Example 2. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: Does weight loss among the participants differ by frequency of exercise?
Variables: weight loss and frequency of exercise
Independent variable: frequency of exercise
Dependent variable: weight loss
Frequency of exercise is categorical with 3 groups (3x, 4x and 5x a week).
Weight loss is scale/numerical data.

Sample write-up:
"A one-way ANOVA was conducted to examine whether there were statistically significant differences in weight loss after the weight management activity among participants with different frequencies of exercise. The results revealed no statistically significant difference in weight loss by frequency of exercise, since the p-value is above the 0.05 level of significance (F = 0.146, p = 0.864)."

Paired T-test
Use: The paired t-test is used to test for a significant difference between two related (paired) means.
Important: Both sets of measurements must be scale variables.
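As a rough illustration of what the paired t-test computes, the sketch below works on the difference between each subject's two measurements: t is the mean difference divided by its standard error, with n - 1 degrees of freedom. This is a minimal Python sketch, not the SPSS procedure itself; the function name `paired_t` and the BMI values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired t-test on matched measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # One difference per subject: first measurement minus second.
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    # Standard error of the mean difference.
    std_error = stdev(differences) / sqrt(n)
    return mean(differences) / std_error, n - 1


# Hypothetical BMI values for five subjects (illustration only).
bmi_before = [24.1, 27.3, 22.8, 30.2, 25.5]
bmi_after = [23.0, 26.1, 22.0, 28.7, 24.6]

t, df = paired_t(bmi_before, bmi_after)
print(f"t = {t:.3f}, df = {df}")
```

Because the test uses only the within-subject differences, the two lists must be in the same subject order.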
Example: Before-and-after observations on the same subjects (e.g. blood pressure measurements before and after exercise)

Example 3. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: Is there a significant difference in the BMI of the participants before and after the weight management activity?
Group 1: BMI before (scale variable)
Group 2: BMI after (scale variable)

Sample write-up:
"A paired t-test was conducted to examine whether there was a statistically significant difference in the BMI of the participants before and after the weight management activity. The results revealed a statistically significant difference in the BMI of the participants before and after the activity, since the p-value is below the 0.05 level of significance (t = 18.09, df = 38, p < .001). The mean BMI of the participants before the activity (mean = 24.9, SD = 3.65) is higher than after the activity (mean = 23.75, SD = 3.62). This corresponds to an average decrease of about 1.20 points in the participants' BMI after the weight management activity. This suggests that the weight management activity was effective."

Chi-square Test
Use: The Pearson chi-square test is used to test whether a statistically significant relationship exists between two categorical variables. It accompanies a crosstabulation of the two variables.
Independent variable: categorical
Dependent variable: categorical

Example: educational attainment and use of contraceptives
Independent variable: educational attainment (e.g. elementary graduate, high school graduate, college graduate, etc.)
Dependent variable: use of contraceptives (e.g. Yes or No)

Example 4. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: Is there a significant relationship between the initial weight and initial BMI of the participants?
Independent: initial weight
Dependent: initial BMI
Initial weight is categorical (e.g. 20-30, 30-40, etc.).
Initial BMI is categorical (e.g. 10-15, 16-20, etc.).

Sample write-up:
"A Pearson chi-square test was conducted to examine whether there was a relationship between the initial weight and initial BMI of the participants. The results revealed a significant relationship between the two variables (chi-square value = 67.10, df = 20, p < .001), since the p-value is below the 0.05 level of significance. This means that the higher the initial weight, the higher the BMI (see Tables 1 and 2)."

Pearson Correlation
Use: Correlation tests (Pearson correlation) are used to examine the relationship between two quantitative/numerical variables. They measure the strength and direction of the relationship between the variables.

The correlation coefficient ranges from -1 to +1. A negative correlation indicates that high values on one variable are associated with low values on the other. A positive correlation indicates that high values on one variable are associated with high values on the other.

Example 1: A positive correlation between salary and expenditures means that the higher your salary, the higher your expenses.
Example 2: A negative correlation between the number of absences and exam scores means that the more absences you have in class, the lower your exam scores.
Example 3: No correlation between your height and your expenditures.

The p-value tells you whether the correlation between the variables is statistically significant (p < .05).
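The coefficient described above can also be computed by hand, which makes its sign and range easier to see. The following is a minimal Python sketch using only the standard library; the function names `pearson_r` and `direction` and the absence/score values are hypothetical and only mirror the absences example. It does not compute the p-value, which SPSS reports alongside r.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("both variables need one value per subject")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def direction(r):
    """Describe the direction of the relationship from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical data: number of absences and exam score for six students.
absences = [0, 1, 2, 4, 5, 7]
scores = [92, 88, 85, 75, 70, 60]

r = pearson_r(absences, scores)
print(f"r = {r:.3f} ({direction(r)} relationship)")
```

With these made-up values, scores fall as absences rise, so the sign of r comes out negative.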
Strength of the relationship:
- .10 to .29: weak relationship
- .30 to .49: moderate relationship
- .50 and above: strong relationship

The sign of the coefficient does not indicate the strength of the relationship; -.50 is the same strength as +.50 but in the opposite direction. The symbol for the correlation coefficient is r.

Example 5. Let us examine data in SPSS.
Open the file "Weight Management".
Q1: Is there a significant relationship between the weight and BMI of the participants?
Independent: weight
Dependent: BMI
Both variables are numerical/quantitative.

Sample write-up:
"A Pearson correlation analysis was conducted to examine whether there is a relationship between weight and BMI. The results revealed a significant and strong positive relationship (r = .913, p < .001). The higher the weight, the higher the BMI (see Table 1)."

"The result of the present study is similar to that of Islam et al. (2017), which showed that weight and height are significantly correlated among males and females."

Try this!
Is weight significantly correlated with expenditures? Comment below.

References
Davonish, D. (n.d.). Exploring relationships using SPSS inferential statistics (part II).