An introduction to statistics

Book Contents
Presenting and writing about statistics

Course Outline
[Table: Chapter, Topic, Minimum no. of weeks allotted. Milestones shown: end Jan, end Feb, PRELIMS EXAM, end March, end April, FINALS EXAM.]

To become a biologist, you need to “know” statistics
• UNDERSTAND what others describe
• ANALYZE your own results

How can you make/interpret these visualizations?
[Figures from:]
• Gordon et al. 2010. PLoS ONE. DOI: 10.1371/journal.pone.0010905
• Ranasinghe et al. 2013. BMC Public Health 13(1):797
• Nanchahal et al. 2018. EBioMedicine 33: DOI: 10.1016/j.ebiom.2018.06.022
• Badrawy et al. 2016. Int J Stem Cells 9(1): 145-151

ASSIGNMENT: Line Graphs, Bar Graphs, Scatterplot, Box and whiskers plot, Pie Graph, Histogram, Area Graph

Biologists do not generalize from a single observation
• Variability is inherent among organisms
• Bias and error are possible during observations

Replicated observations can:
• DETECT variability
• DECREASE bias and error

Statistics helps by:
• INVESTIGATING the distribution (to DETECT variability)
• CALCULATING reasonable estimates (to DECREASE bias and error)

OUR MAIN OBJECTIVE
Answer questions by hypothesis testing based on type of observation

Data: a set of values with respect to qualitative or quantitative variables
Types of data: Measurements, Ranks, Frequencies

Measurements (also known as interval data)
ex. height, mass, pH
2 kinds:
• CONTINUOUS: meaningfully expressed with decimals, e.g. length in cm: [6, 12.4, 8.32]
• DISCRETE: can only be integers (individual counts), e.g. no. of pills/patient: patient A = 4 pills, patient B = 3 pills

Ranks (also known as ordinal data)
Take note:
• ranks remove inherent gaps in variability
• must be analyzed using non-parametric tests
Examples:
• seriousness of infection (none, light, medium, heavy)
• results of a questionnaire (1 = poor; 5 = excellent)

Frequencies (categorical data)
• different from DISCRETE measurements: individual counts of each organism/category
• some features of organisms seem impossible to quantify; the only way is to get total counts per category
• ex. cancer or non-cancer, mutant or non-mutant, different species of turtles
• usually analyzed using the chi-squared (χ²) test or logistic regression

QUIZ
Give 2 examples each of the types of data (be more specific):
1. Measurement, ex. height, mass, pH
2. Ranks, ex. seriousness of infection (none, light, medium, heavy); results of a questionnaire (1 = poor; 5 = excellent)
3. Categorical data, ex. cancer or non-cancer, mutant or non-mutant, different species of turtles

There are 2 main “statistical” questions

DIFFERENCE?
On average, is “x” more / less / bigger / smaller than “y”?
How much one data set is in comparison to other data set/s
[Example figures: The Effect of Ethanolic Extract of Urtica dioica Leaves on High Levels of Blood Glucose and Gene Expression of Glucose Transporter 2 (Glut2) in Liver of Alloxan-Induced Diabetic Mice; endothelial nitric oxide synthase (eNOS)]

RELATIONSHIP?
As “x” increases, does “y” increase? decrease? show no change? BY WHAT AMOUNT?
How much one data set varies with another data set
[Example figure: Relationship between body mass index (BMI) and risk for diabetes in US Health Professionals, derived from data extracted from Chan et al. [2]. BMI is universally expressed in units of kg/m²; underweight: under 18.5, normal weight: 18.5 to 25, overweight: 25 to 30, obese: over 30.]

Hypothesis testing steps:
1. State the QUESTION
2. Formulate the NULL HYPOTHESIS
3. Create a VISUALIZATION
4. Calculate the TEST STATISTIC
5. Determine SIGNIFICANCE
6. Draw an INFERENCE

1. State the QUESTION
2 kinds:
• BIOLOGICAL: overall theme investigated
• STATISTICAL: specific; states the data set/s
example:
• BIOLOGICAL (theme): Is hypertension linked to obesity?
• STATISTICAL (data sets): Does increased systolic blood pressure positively vary with high body mass index?
reminder: There can be one (1) BIOLOGICAL question investigated by multiple STATISTICAL questions

2. Formulate the NULL HYPOTHESIS
Also 2 kinds, but in statement format:
• BIOLOGICAL (overall theme investigated): Hypertension is not linked to obesity.
• STATISTICAL (specific; states the data set/s): Increased systolic blood pressure does not vary with high body mass index.
The preliminary assumption must be NO difference / relationship

3. Create a VISUALIZATION
Table OR Graph? Histogram, Bar Chart, Box-Whiskers Plot, Scatter Plot, etc.
LECTURE: how to choose the appropriate one!
LABORATORY: create them from data sets!
For relationship tests using measurements, the most common and basic is the “scatterplot”
example: Pormousa et al. 2015. Dependence Modelling 3(1): DOI: 10.1515/demo-2015-0016

4. Calculate the TEST STATISTIC
Measures the effect size of the difference/relationship relative to the amount of variability
In this example:
• Pearson r: 0.76 (test statistic). The Pearson correlation coefficient is a “parametric” test of relationships: a measure of the linear correlation between two variables X and Y. It takes a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.
• r²: 0.58 (effect size); 0 lowest : 1 highest value

5. Determine SIGNIFICANCE
Probability of getting the effect just by chance if the null hypothesis was true
WE CAN USE: critical value or p-value
[Figure labels: test statistic, sample size, sig. probability]

6. Draw an INFERENCE
IF, for example, we obtained p < 0.05 (test statistic > critical value)
What to write: “Therefore, we accept/reject the null hypothesis (H0).”
That is what you DECIDE, not what you WRITE IN TEXT.
If you decide to REJECT H0, STATE your ALTERNATE hypothesis (HA):
Increased systolic blood pressure positively varies with high body mass index.
TOGETHER, we can write in the RESULTS:
Increased systolic blood pressure positively varies with high body mass index (Pearson r=0.76, p<0.05).

REMEMBER: Performing statistical tests does not actually allow you to prove anything conclusively
• there are always inherent biases / limitations, do your best to identify them
• understand results as “concepts in context”

BE CAREFUL! Though hypothesis tests are meant to be reliable, two types of errors can occur:
• false positive (Type I error)
• false negative (Type II error)
https://www.abtasty.com/blog/type-1-and-type-2-errors/
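
To tie steps 4 to 6 together (calculate the test statistic, determine significance, draw an inference), here is a minimal sketch in Python of the blood pressure and BMI example. The data values are invented for illustration and are not from the lecture, so they will not reproduce r = 0.76. The function names pearson_r and t_from_r are likewise assumptions made for this sketch. It uses the critical-value route rather than a p-value: r is converted to a t statistic with n − 2 degrees of freedom and compared against 2.306, the standard two-tailed critical t for alpha = 0.05 and 8 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero variance has no correlation")
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """Convert r to a t statistic with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1 - r * r))


# HYPOTHETICAL data for 10 patients (not from the lecture)
bmi = [19.5, 21.0, 22.4, 24.1, 25.3, 27.0, 28.6, 30.2, 32.5, 35.0]  # kg/m^2
sbp = [112, 118, 115, 121, 127, 124, 133, 130, 141, 138]            # mmHg

# Two-tailed critical t for alpha = 0.05 and df = 10 - 2 = 8 (from a t table)
T_CRITICAL = 2.306

r = pearson_r(bmi, sbp)           # step 4: test statistic
r_squared = r * r                 # effect size, between 0 and 1
t = t_from_r(r, len(bmi))
reject_h0 = abs(t) > T_CRITICAL   # step 5: significance via critical value

print(f"Pearson r = {r:.2f}, r^2 = {r_squared:.2f}, t = {t:.2f}")
if reject_h0:                     # step 6: state HA, not the decision itself
    print("Systolic blood pressure varies with body mass index (p < 0.05).")
else:
    print("No significant relationship was detected (p >= 0.05).")
```

The critical value depends on the sample size, so T_CRITICAL must be looked up again if the number of patients changes. A statistics package would normally report the exact p-value instead.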