Let’s flip a coin
Making Data-Based Decisions

We’re going to flip a coin 10 times. What results do you think we will get?

The Research Question…

Hypotheses:
• Null hypothesis: A coin toss will result in 50% heads and 50% tails. Expected data: equal numbers of heads and tails.
• Alternative hypothesis 1: Heads will occur more often than tails. Expected data: more heads than tails.
• Alternative hypothesis 2: Heads will occur less often than tails. Expected data: fewer heads than tails.

Testing: Flip a coin 10 times.

Results - out of 10 flips
What is the minimum number of heads that you would expect if:
• the null hypothesis is correct? Why? (A coin toss will result in 50% heads and 50% tails.)
• alternative hypothesis 1 is correct? Why? (Heads will occur more often than tails.)
• alternative hypothesis 2 is correct? Why? (Heads will occur less often than tails.)

Results, coin flipped 10 times
8 or 80% of the flips were heads.
Which hypothesis does this support?
• null hypothesis: A coin toss will result in 50% heads and 50% tails.
• alternative hypothesis 1: Heads will occur more often than tails.
• alternative hypothesis 2: Heads will occur less often than tails.
Is it possible that we could have gotten 8 heads if the other hypotheses were correct, too?

What if we could actually calculate the likelihood of getting at least 8 heads? For instance:
• If the null hypothesis is correct, what is the probability that we could have actually gotten 8?
• If alternative hypothesis 1 is correct, what is the probability that we could have actually gotten 8?

Testing the null hypothesis is the easiest…why?

The problem is….
• Null hypothesis: equal numbers of heads and tails are predicted.
• Alternative 1: how many more heads would we expect?
WE KNOW EXACTLY WHAT TO EXPECT FOR THE NULL, BUT HAVE NO IDEA WHAT VALUES TO EXPECT FOR THE ALTERNATIVE

Let’s set up a simulation….
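One way such a simulation could be set up is sketched below in Python. It assumes the null hypothesis holds (a fair coin, probability of heads 0.5) and repeats the 10-flip sample many times, tallying how many heads each sample produced. The function names, the number of samples, and the seed are illustrative choices, not part of the slides.

```python
import random
from collections import Counter


def simulate_heads_counts(num_samples, flips_per_sample=10, seed=None):
    """Tally the number of heads per sample for a fair coin (null hypothesis)."""
    rng = random.Random(seed)  # a seed is an assumption, used only for repeatability
    counts = Counter()
    for _ in range(num_samples):
        # One sample: flip the coin flips_per_sample times and count the heads.
        heads = sum(rng.random() < 0.5 for _ in range(flips_per_sample))
        counts[heads] += 1
    return counts


def proportion_at_least(counts, threshold):
    """Share of samples whose number of heads is at least `threshold`."""
    total = sum(counts.values())
    return sum(n for heads, n in counts.items() if heads >= threshold) / total


counts = simulate_heads_counts(num_samples=10_000, seed=1)
for heads in range(11):
    print(heads, counts[heads], counts[heads] / 10_000)
print("Estimated chance of at least 8 heads:", proportion_at_least(counts, 8))
```

Each run with a different seed gives slightly different tallies, which is the same sampling variability the coin flips themselves show.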
Let’s flip a coin 10 times as one sample
• How many heads would you expect to get? Explain.
TO DETERMINE PROBABILITY, REPEAT A LOT OF TIMES AND SEE HOW OFTEN WE GET AT LEAST 8 HEADS

Simulation – 20 samples
[Dot plot: count of samples (vertical axis) against number of heads per sample, 0 to 10 (horizontal axis)]

Number of heads    Total times occurred    Probability of occurring
0                  0                       0.0
1                  1                       0.05
2                  0                       0.0
3                  3                       0.15
4                  3                       0.15
5                  5                       0.25
6                  4                       0.2
7                  2                       0.1
8                  1                       0.05
9                  1                       0.05
10                 0                       0.0

Question…
• Why don’t we get 5 heads every time we flip a coin 10 times?
• Why are some values not represented? We didn’t get any samples with 0, 2, or 10 heads.

The Impact of Sampling
• We are sampling.
• We don’t expect every sample to look exactly like the population.
• There is going to be variability because of chance.

Simulation – 10000 samples

Number of heads    Total times occurred    Probability of occurring
0                  5                       0.0005
1                  92                      0.009
2                  461                     0.046
3                  1154                    0.115
4                  1981                    0.198
5                  2537                    0.245
6                  2063                    0.206
7                  1117                    0.117
8                  479                     0.048
9                  101                     0.010
10                 6                       0.0006

The Big Question…
What is the likelihood (probability) of having AT LEAST 8 heads in our sample (getting 8, 9, or 10 heads)?
p = 0.048 (8 heads) + 0.010 (9 heads) + 0.0006 (10 heads), so p = 0.0586
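For a fair coin, the chance of at least 8 heads in 10 flips can also be worked out exactly instead of simulated, which gives a check on the simulated value. The sketch below assumes the null hypothesis (probability of heads 0.5) and sums the binomial probabilities for 8, 9, and 10 heads. The helper name `prob_at_least` is illustrative and not from the slides.

```python
from math import comb


def prob_at_least(k, n, p_heads=0.5):
    """Exact probability of k or more heads in n independent flips."""
    return sum(
        comb(n, i) * p_heads**i * (1 - p_heads) ** (n - i)
        for i in range(k, n + 1)
    )


exact = prob_at_least(8, 10)
print(f"{exact:.4f}")  # (45 + 10 + 1) / 1024, about 0.0547
```

The simulated value of 0.0586 is an estimate of this exact probability. Simulated estimates vary from run to run and get closer to the exact value as the number of samples grows.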
p = 0.0586, so the likelihood of this occurring is about 6 times out of 100.

Logic of Statistical Testing
Inferring from samples: INFERENTIAL STATISTICS
• Scientists collect data from a sample and determine whether or not that sample provides EVIDENCE AGAINST the null hypothesis.
• If the null hypothesis is true, what is the probability we would have randomly chosen a sample with the values we observed?

Analysis: By looking at the probability of obtaining at least 80% heads (8 or more) in a sample of 10 flips, we can make a decision.
THIS PROBABILITY IS OFTEN CALLED THE P VALUE

Our Example
The likelihood of getting at least 8 heads out of 10 in our sample if the null hypothesis were actually true is p = 0.0586, meaning it would occur roughly 6 times out of 100.
Do you consider this value low or high? Do you think it provides enough evidence against the null hypothesis?

Statistical Significance
Need a cut point for the p-value. Common “cut points”: 0.05, 0.01, 0.001

If P value < 0.05,
• You say the result is “statistically significant” and you reject the null hypothesis.
• If the null hypothesis is true, randomly getting the observed sample is unlikely.
• This provides evidence against the null hypothesis and we would REJECT the null hypothesis, suggesting one of the alternative hypotheses is correct.

If P value > 0.05,
• You say the results were “not statistically significant”.
• If the null hypothesis is true, randomly getting the observed sample is reasonably likely.
• This does not provide evidence against the null hypothesis and we would FAIL TO REJECT the null hypothesis. This does not prove the null hypothesis is true; it means the data do not give enough evidence to support the alternative hypotheses.
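The cut-point rule above can be written as a small Python function. This is a minimal sketch: the function name `decide` is illustrative, and treating a p-value exactly equal to the cut point as "fail to reject" is an assumption, since the slides only cover the cases below and above 0.05.

```python
def decide(p_value, alpha=0.05):
    """Apply the cut-point rule: reject the null only when p_value < alpha."""
    if p_value < alpha:
        return "statistically significant: reject the null hypothesis"
    # Assumption: a p-value equal to alpha is treated as not significant.
    return "not statistically significant: fail to reject the null hypothesis"


print(decide(0.0586))  # the coin example, with the 0.05 cut point
print(decide(0.0586, alpha=0.01))
```

With the 0.05 cut point, the coin example's p = 0.0586 falls just above the line, so the 8 heads out of 10 do not count as statistically significant.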
Statistical Tests/Hypothesis Testing/Inferential Test:
• All statistical tests provide a P-value: the probability that results at least as extreme as yours would have occurred if the null hypothesis were true.
• They use information from your data (mean, standard deviation, etc.) to figure out a probability based upon a population that meets the null hypothesis (much like our coin simulation).
• You use the p-value to make a data-driven decision.

Question: What do you think would happen to the probability of getting 80% heads if we had flipped more?
• 16 heads out of 20? p = 0.01
• 40 heads out of 50? p < 0.0001
Increasing your sample size decreases the chance that your results will be impacted by errors or chance factors that might mask differences.

Example Hypotheses and p-value
Cut-off value is 0.05

Null hypothesis: The mean life-span is 15 years.
P-value: 0.078
Decision: Do not reject the null hypothesis (P > 0.05)
Interpretation: There is not enough evidence to suggest the mean life-span is not 15 years.

Null hypothesis: The correlation between amount of nutrient and growth is 0.
P-value: 0.010
Decision: Reject the null hypothesis (P < 0.05)
Interpretation: There is evidence to suggest the correlation is not zero.

Null hypothesis: The mean height of plants exposed to sunlight equals the mean height of plants not exposed to light.
P-value: 0.0001
Decision: Reject the null hypothesis (P < 0.05)
Interpretation: There is evidence to suggest light makes a difference in plant growth.
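Returning to the sample-size question (80% heads in 10, 20, or 50 flips), the effect can be checked with a short Python sketch. It assumes a fair coin and a one-sided p-value, meaning the chance of at least the observed number of heads. The slides do not say how their p-values were computed, so these numbers may differ slightly from the quoted ones. The function name `one_sided_p_value` is illustrative.

```python
from math import comb


def one_sided_p_value(heads, flips):
    """Chance of `heads` or more heads in `flips` flips of a fair coin."""
    # Every sequence of flips is equally likely, so count the favorable ones.
    favorable = sum(comb(flips, i) for i in range(heads, flips + 1))
    return favorable / 2**flips


for heads, flips in [(8, 10), (16, 20), (40, 50)]:
    print(f"{heads} heads out of {flips}: p = {one_sided_p_value(heads, flips):.6f}")
```

The proportion of heads is 80% in all three cases, yet the p-value shrinks as the number of flips grows, because large departures from 50% become harder to get by chance in bigger samples.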