Type I error upper limit (α) is ALWAYS 0.05
- p > 0.05: accept H0, reject HA
- p < 0.05: reject H0, accept HA
- If the P is low, H0 has to go!

H0 = null hypothesis: the variables do not depend on one another, the groups do not differ, the factor has no effect, etc.
HA = alternative hypothesis: the variables depend on one another, the groups differ, the factor has an effect, etc.

Dependent variable: the measured value (e.g. cm, concentration, ml)
Independent variable: the value that is not affected (e.g. age)

Statistical language
- Comparing (differ): $H_0: \bar{x}_A = \bar{x}_B$; $H_A: \bar{x}_A \neq \bar{x}_B$
- Correlation (dependent): $H_0: r = 0$; $H_A: r \neq 0$
- Effect: $H_0: \sum_{r=1}^{4} (f_0 - f_{exp})^2 = 0$, $z = 0$; $H_A: \sum_{r=1}^{4} (f_0 - f_{exp})^2 \neq 0$, $z \neq 0$

Topic 1 (Comparing 2 Independent Samples)
- Dispersion of data? Compare variance or range (larger variance/range = higher dispersion)
- Symmetrical data distribution? Compare skewness (closer to 0 = more symmetric; skewness < 0 = skewed to the left; skewness > 0 = skewed to the right; -0.5 < skewness < 0.5 = fairly symmetric)
- Flattened data distribution? Compare kurtosis (lower kurtosis = flatter distribution)
- Higher variability? Compare standard deviation (high SD = high variability)
- Accurate estimate of mean? Compare SE (high SE = low accuracy of the mean)
- Leptokurtic? Compare kurtosis (K > 0)
- Mesokurtic? Compare kurtosis (K = 0)
- Platykurtic? Compare kurtosis (K < 0)

Normality - Shapiro-Wilk Test
- Descriptive Statistics → select variables → Normality → Shapiro-Wilk (unselect the other box above) → Histogram
- If the p-value is > 0.05, accept H0 = the data follow a normal distribution (if p < 0.05, reject H0 and accept HA)

Mann-Whitney Test (U-test)
- Your data did NOT follow a normal distribution! If the data are not normal, you can no longer compare means; you must compare the U values (based on ranks), not the means.
- If the data do not fit a normal distribution, the test is nonparametric (the data cannot be described by the parameters of a normal distribution)
- Select Nonparametrics → Comparing 2 independent samples

Confidence limits of the mean (upper and lower limit)
- H0: mean = value given, HA: mean ≠ value given
- Find the upper and lower limits. Is the given value within these limits? Yes: accept H0. No: reject H0.
- E.g. the value given in the question is 3
- Upper limit = 4.41, lower limit = 2.83. Hence the sample mean (the midpoint of the 2 limits) = 3.62
- In conclusion, 2.83 < 3 < 4.41, so the mean of 3.62 does not differ significantly from 3 and we accept H0

Homogeneity Test (Do the samples have similar data trends? e.g. similar variance)
- Levene’s Test = the sample sizes are the same (same N value)
  - Descriptive Statistics → t-test, independent by group → options → select Levene’s Test → collect the test value and p-value
  - Compare the p-value to 0.05 (H0 = homogeneous, HA = not homogeneous)
- Brown & Forsythe Test = the sample sizes are different (different N value)
  - Descriptive Statistics → t-test, independent by group → options → select Brown & Forsythe Test → collect the test value (F(1,df)) and p-value
  - Compare the p-value to 0.05 (H0 = homogeneous, HA = not homogeneous)

t-test: compares the means of the samples
- Student t-test (2 sample t-test): your samples were homogeneous! Descriptive Statistics → 2 sample t-test
- Cochran-Cox test (t-test with separate variance estimates): your samples were NOT homogeneous! Descriptive Statistics → C-test

Topic 2 (Comparing Independent and Dependent Samples)
- Ordinal? Qualitative data (language, detail) → Wilcoxon Test
- Metric?
  Quantitative data (numbers) → Normality test

Normality - Shapiro-Wilk Test
- Descriptive Statistics → select variables → Normality → Shapiro-Wilk (unselect the other box above) → Histogram
- If the p-value is > 0.05, accept H0 = the data follow a normal distribution (if p < 0.05, reject H0 and accept HA)

Wilcoxon Test
- Your data are either qualitative or do NOT follow a normal distribution
- One sample depends on the other

Paired t-test
- Your data follow a normal distribution
- One sample depends on the other
- Descriptive Statistics → Dependent Sample → get the test value (t) and p-value
- Compare the p-value to the Type I error upper limit (0.05) for a conclusion

Topic 3 (Correlation, r-value)

Scatter Graphs
- Graphs → Scatter Graph
- Analyze the line of best fit (is the line a good representative of the data points on the graph?)
- Check for any obvious outliers (a single data point that really stands out from the others)

R-Value
- If 0.6 < |r| < 0.8, the correlation is strong; if 0.4 < |r| < 0.6, the correlation is moderate
- Descriptive Statistics → Correlation matrix → select variables → Options (select the box with value r) → collect the r-value and p-value

Topic 4 (Correlation, dependent by group)

Scatter Graphs
- Graphs → Scatter Graph
- Analyze the line of best fit (is the line a good representative of the data points on the graph?)
- Check for any obvious outliers (a single data point that really stands out from the others)

R-Value
- Descriptive Statistics → Nonparametrics → Correlation, dependent by group → Compute: Detailed Report → Variables

Topic 5 (Chi-Squared Test)

2x2 Table
- Nonparametrics → 2x2 Table → enter the values from the table into the 4 boxes
- Values in the table are ≤ 10: write down the value of “Yates correction of chi-square” and its p-value
- Values in the table are > 10: write down the value of “Chi-squared (df1)” and its p-value
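
To cross-check the 2x2 table output by hand, the two chi-squared values can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not the statistics program's own routine: the function names `chi_square_2x2` and `report_2x2` and the example counts are hypothetical, and "values are ≤ 10" is read here as "at least one of the 4 cells is ≤ 10".

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Return (chi2, p) for the 2x2 table [[a, b], [c, d]], df = 1."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be greater than 0")
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction: reduce |ad - bc| by n/2 (not below 0).
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / denom
    # With df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def report_2x2(a, b, c, d):
    """Pick the statistic by the rule in these notes and compare p to 0.05.

    Assumption: the Yates value is used when at least one cell is <= 10.
    """
    use_yates = min(a, b, c, d) <= 10
    chi2, p = chi_square_2x2(a, b, c, d, yates=use_yates)
    decision = "reject H0" if p < 0.05 else "accept H0"
    return use_yates, chi2, p, decision


if __name__ == "__main__":
    # Hypothetical counts, only to show the call.
    print(report_2x2(12, 5, 6, 14))
```

The decision string follows the same rule as the rest of the notes: p < 0.05 means reject H0, otherwise accept H0.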