Significance Tests …and their significance

Remember how a sampling distribution of means is created? Take a sample of size 500 from the US. Record the mean self-esteem. Repeat with many more samples, recording each sample's mean. If the population mean is 25, you might get this.

[Figure: sample means piling up along a self-esteem axis running from 15 to 40.]
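The procedure just described can be imitated on a computer. What follows is only a rough sketch under stated assumptions: the slides give the mean (25) and the sample size (500), but the population standard deviation of 10, the normal shape of the population, and the function name simulate_sample_means are all made up for illustration.

```python
import random
import statistics


def simulate_sample_means(pop_mean=25.0, pop_sd=10.0, n=500, n_samples=1000, seed=0):
    """Draw n_samples samples of size n and return the mean of each one.

    pop_sd and the normal population are assumptions, not values from the slides.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means()
# Center of the simulated sampling distribution (should sit near pop_mean).
print(round(statistics.fmean(means), 2))
# Spread of the sample means (compare with pop_sd / sqrt(n)).
print(round(statistics.stdev(means), 3))
```

A histogram of the returned means would give the stacked-up picture from the slide.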
The sample means would stack up in a normal curve. A normal sampling distribution.

[Figure: normal sampling distribution of sample means centered on self-esteem = 25 (z = 0), with a self-esteem axis from 15 to 40, a z axis from -3 to 3, and 2.5% of sample means in each tail.]

The sample size affects the sampling distribution:
Standard error = population standard deviation / square root of sample size
s.e. = σ/√n
But in fact we use our sample's standard deviation as an estimate of the population's:
s = √( Σ(Y – Y-bar)² / (n – 1) )

And if we increase our sample size (n)… Our repeated sample means will be closer to the true mean, and our standard error of the sampling distribution is smaller.

[Figure: two sampling distributions on z axes from -3 to 3, each with 2.5% in each tail; the curve for the larger sample is narrower.]

The range of particular middle percentages gets smaller:

[Figure: the 95% range marked on the self-esteem axis (15 to 40) for both curves.]

We use that measuring stick to say two things. Besides constructing a confidence interval, we can also do a significance test.
1. If my sample is in the middle specified percent, the population's mean is within this range. (Confidence Interval)
2.
If the population mean is the same as a guess of mine, then my sample's mean would have to fall within this range to have been drawn from the middle specified percent. (Significance Test)

[Figure: normal curve on a z axis from -3 to 3 showing the middle 68% within ±1z, the middle 95% within ±1.96z, and the middle 99.7% within ±3z.]

We know that if you have your sampling distribution centered on the population mean:
• 16% of samples' means would be larger than +1z and 16% would be smaller than -1z, for a total of 32% outside that range.
• 2.5% of samples' means would be larger than +1.96z and 2.5% would be smaller than -1.96z, for a total of 5% outside that range.
• About 0.135% of samples' means would be larger than +3z and about 0.135% would be smaller than -3z, for a total of about 0.27% outside that range.

But you remember that we don't normally know the actual mean for the population. But what if we guessed? What if we specified a value that might be the population mean?

If we guessed a mean… If our guess is correct, our sample's mean should be among the common samples that would have been drawn from a population with that guessed mean. If it is not, it is likely that the sample did not come from such a population.

[Figure: sampling distribution centered on the guess, z axis from -3 to 3, with a sample mean marked far out in one tail. "What if my sample's mean were here?"]

One way to tell whether our sample's mean was generated by such a population is to place our sampling distribution over the guessed mean to see if the sample mean is among the middle 99% or 95% of samples that would be generated by such a mean.

[Figure: the middle 95% (within ±1.96z of the guess) marked on the curve, with the sample mean outside it. "What if my sample's mean were here?"]

It is among the rare 5% of possible means.
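Tail areas like these do not have to be memorized; they can be computed for any z. This is a minimal sketch using only Python's standard library. The helper name upper_tail is invented for illustration, and the code assumes the sampling distribution is exactly normal.

```python
import math


def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for z in (1.0, 1.96, 3.0):
    one_tail = upper_tail(z)          # share of sample means beyond +z
    middle = 1 - 2 * one_tail         # share within -z .. +z
    print(f"z = {z}: {one_tail:.4%} in each tail, {middle:.2%} in the middle")
```

Doubling the one-tail area gives the total share of samples outside the range, which is the quantity a two-tailed test cares about.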
Essentially, a significance test for a mean tells you how likely a sample mean as extreme as yours would be if it came from a population whose mean equals your guess.

What you do is figure out what your sample's z-score is relative to your guessed mean. If z is larger than 1.96 or smaller than -1.96, a sample mean that extreme would turn up less than 5% of the time in such a "guess population." Reject the guess!

[Figure: sampling distribution centered on the guess with the middle 95% marked and the sample mean beyond +1.96z.]

For example: If our guess was that self-doubt scores in the population averaged 20 on a scale from 1 – 50, we'd place a guess as below.

[Figure: self-doubt axis from 16 to 28 with the guess marked at 20.]

We guess 20, but our sample of size 100 has a mean of 25 and a standard deviation of 10.

Let's build a sampling distribution around our guess, 20: sample of size 100; s.d. = 10.
s.e. = 10/√100 = 10/10 = 1

[Figure: sampling distribution centered on 20 with the self-doubt axis aligned to z-scores, so that the sample mean, Y-bar = 25, sits at z = 5.]

Our sample appears to be larger than a critical value of 1.96 (outer 5% of samples) or even 2.58 (outer 1% of samples).

How many z's is our sample mean away from our guess?
z = (Y-bar – µo) / s.e.
z = (25 – 20) / 1 = 5

Indeed, our sample z-score is 5, well above 1.96 or 2.58. Reject the guess! Looking in Appendix B, a sample mean this far above the guess has a .0000287% chance of turning up if the population mean is 20!
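The self-doubt example can be reproduced in a few lines. This is a sketch rather than a definitive implementation: the function name z_test_mean and its return layout are hypothetical, and it uses the normal curve (not t), as the example does.

```python
import math


def z_test_mean(sample_mean, guess, sd, n):
    """Return (s.e., z, upper-tail area, two-tailed area) for a guessed mean."""
    se = sd / math.sqrt(n)                         # standard error of the mean
    z = (sample_mean - guess) / se                 # distance from guess in s.e. units
    upper = 0.5 * math.erfc(z / math.sqrt(2))      # area to the right of z
    two_tailed = math.erfc(abs(z) / math.sqrt(2))  # area beyond +/-|z|
    return se, z, upper, two_tailed


se, z, upper, two_tailed = z_test_mean(sample_mean=25, guess=20, sd=10, n=100)
print(se, z)                       # 1.0 5.0
print(f"{upper * 100:.7f} %")      # upper-tail area as a percentage
print(abs(z) > 1.96, abs(z) > 2.58)  # beyond both critical values?
```

The upper-tail area is the one-tailed figure; for a two-tailed test the relevant area is twice as large.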
Conducting a Test of Significance for the Mean

By slapping the sampling distribution for the mean over a guess of the mean, Ho, we can find out whether our sample could plausibly have been drawn from a population where the mean is equal to our guess.
1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed)
2. Set critical z (z = +/- 1.96) or t
3. Make guess or null hypothesis, Ho: µ = 0; Ha: µ ≠ 0
4. Collect and analyze data
5. Calculate z or t: z = (Y-bar – µo) / s.e.
6. Make a decision about the null hypothesis (reject or fail to reject)
7. Find the p-value

1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed).

[Figure: sampling distribution of sample means, s.e. calculated by s/√n, centered on the guess, µo. "What if my sample mean were here?"]

α-level refers to how unlikely a sample's mean would have to be before you'd reject your guess. The scientific standard is typically .05 probability: if a sample mean as extreme as yours would turn up less than 5% of the time in a population whose mean is what you guessed, you'd reject the guess (the null hypothesis). α-level could be set at .10, .01, etc.

One- or two-tailed test refers to the rejection region in your sampling distribution. If your α-level were .05, in a two-tailed test your rejection region would be the outer 2.5% of each tail. A two-tailed test implies a directionless null hypothesis such as Ho: µ = 0.
If your α-level were .05, in a one-tailed test your rejection region would be the outer 5% of one of the tails. A one-tailed test implies a directional null hypothesis such as Ho: µ ≤ 0 or Ho: µ ≥ 0.

The idea: If I have good reason to think a parameter would be above a particular value, then I only need to set the guess at that value or less (Ho: µ ≤ 0) and look to see if the sample statistic is in the rare 5% of possible samples above the null. If it is in the extreme low end, I won't reject the null!

[Figure: sampling distribution centered on the guess, µo, with 5% shaded in the upper tail. "What if my sample mean were here or there?"]

2. Set critical z (z = +/- 1.96) or t

There is a z- or t-score that corresponds with that proportion of the area in the tails of the curve (area in the tails of the sampling distribution). For example, these areas in the right tail correspond with these z values:

Area in right tail:   .10    .05    .025   .01    .005
z:                    1.28   1.65   1.96   2.33   2.58

• We use t instead of z to be more accurate.
• t curves are symmetric and bell-shaped like the normal distribution. However, the spread is more than that of the standard normal distribution: the tails are fatter. df = 1, 2, 3, and so on, approaching normal as df exceeds 120.
• The reason for using t is that we use the sample standard deviation (s) rather than the population standard deviation (σ) to calculate the standard error. Since s will vary from sample to sample, the variability in the sampling distribution ought to be greater than in the normal curve. t has a larger spread, more accurately reflecting the likelihood of extreme samples, especially when sample size is small.
• The larger the degrees of freedom (n – 1), the closer the t curve is to the normal curve.
This reflects the fact that the standard deviation s approaches σ for large sample size n.
• Even though z-scores based on the normal curve will work for larger samples (n > 120), SPSS uses t for all tests because it works for small samples and large samples alike.

3. Make guess or null hypothesis:
Ho: µ = 0 (or some other value)
Ha: µ ≠ 0

The guess refers to the value that you will feel comfortable with declaring is true for the population unless your sample evidence suggests otherwise. In science, we wouldn't want to assert something based on a sample unless we had extremely good evidence. The null is a default assumption, such as saying previous research says the mean is µo. In more advanced statistics, we will typically use null hypotheses that declare "no difference between groups" or "no relationship between variables." The alternative is typically consistent with your research hypothesis or expectations.

The hypotheses above refer to a two-tailed test. Hypotheses for one-tailed tests would be like this:
Ho: µ ≤ 0 (or some other value); Ha: µ > 0
Ho: µ ≥ 0; Ha: µ < 0

4. Collect and analyze data.

Once you've established your assumptions and what you are testing for, you can get into data analysis. Note that this ordering of steps helps prevent you from "peeking into" the data to establish your assumptions and tests. Basing tests on sample information sets up predetermined outcomes, which is bad! If calculating inferential statistics by hand, you would need to find your mean and standard deviation for each variable.

5.
Calculate z or t:
z or t = (Y-bar – µo) / s.e.
s.e. for z = σ/√n
s.e. for t = s/√n

Calculating the test statistic will tell you how many standard errors away from the null hypothesis your sample statistic is. Corresponding with the z or t value is an area under the curve that tells what proportion or percentage of sample means would have been that far away if your null hypothesis were correct.

6. Make a decision about the null hypothesis.

Is your sample statistic more standard errors away from your guess or null hypothesized value than your critical z or t?
If it is farther out:
• It meets the criteria for implausibly rare that you established from the outset.
• You would reject the null, saying it is unlikely your sample could have come from a population where that null value is true.
If it is not more extreme than your critical z or t:
• It is not an unlikely occurrence as established from the outset.
• You would fail to reject the null, saying that your guess remains plausible: your sample could reasonably have come from a population with that null value.

7. Find the p-value.

The p-value will tell you the actual likelihood that you'd get a sample statistic as far away from the null value or the guess as yours is, if your null or guess were true for your population. To find p, look in a z or t table to find the proportion of the area in the tails of the curve that corresponds with the z or t that you calculated for your sample statistic. Remember to be sure you keep track of whether you are doing a two-tailed (p * 2) or one-tailed (p) test.

Another Example of a Significance Test of the mean or proportion.

An administrator read that "snob universities" have over 50% of student GPAs over 3.5. He wants to determine whether SJSU is a "snob university." He decides:
1. To use an alpha-level of .05 with a one-tailed test.
2. That the critical z or t will be 1.65.
3. Thinking SJSU is a "snob university," he sets his null as: Ho: Π ≤ .5; Ha: Π > .5
4. He collects data from 500 randomly selected SJSU students and finds that .40 have GPAs above 3.5.
5. He calculates z:
s.e. = √(p(1 – p)/N) = √(.4(.6)/500) = √(.24/500) = .022
z = (p – Πo) / s.e. = (.4 – .5) / .022 = -4.55
6. Making a decision about the null is easy. He sees that his sample proportion is lower than the null value of .5, which is consistent with the null of .5 or less. He fails to reject the null.
7. In finding the p-value, he sees that if the population value were .5, he'd have less than .0001 chance of getting a sample proportion that low. He has good evidence that SJSU is not a "snob university."

[Figure: sampling distribution centered on the guess, .50, with the upper 5% rejection region shaded and our sample far out in the lower tail.]
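The administrator's arithmetic can be checked with a short sketch. The function name z_test_proportion is invented for illustration, and the standard error follows the slide in using the sample proportion p. Because the slide rounds s.e. to .022 before dividing, the unrounded z comes out slightly different from the hand calculation.

```python
import math


def z_test_proportion(p_hat, null_value, n, critical_z=1.65):
    """One-tailed (upper) z test for a proportion, as in the GPA example."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)           # s.e. from the sample proportion
    z = (p_hat - null_value) / se                     # standard errors away from the null
    lower_tail = 0.5 * math.erfc(-z / math.sqrt(2))   # chance of a proportion this low or lower
    reject = z > critical_z                           # rejection region is the upper tail only
    return se, z, lower_tail, reject


se, z, lower_tail, reject = z_test_proportion(p_hat=0.40, null_value=0.50, n=500)
print(round(se, 3), round(z, 2))
print(lower_tail < 0.0001)   # is a proportion this low that rare under the null?
print(reject)                # False means: fail to reject Ho: proportion <= .5
```

Only a large positive z would count against this null; a large negative z lands on the null's own side.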
One final note:
– The tests we typically use in sociology have assumptions of large sample sizes.
– When conducting tests with small sample sizes, some restrictions apply:
  • When working with means, we typically have to assume the population values are normally distributed.
  • When working with proportions, we must use a binomial probability distribution.
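To illustrate the last point, here is a sketch of an exact binomial tail probability for a small sample. The numbers (8 of 10 cases above the cutoff, null proportion .5) are hypothetical and do not come from the slides; the function name binomial_upper_tail is likewise made up.

```python
import math


def binomial_upper_tail(successes, n, null_proportion):
    """P(X >= successes) when X follows Binomial(n, null_proportion)."""
    return sum(
        math.comb(n, k) * null_proportion ** k * (1 - null_proportion) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical small sample: 8 of 10 cases, tested against a null proportion of .5.
p_value = binomial_upper_tail(8, 10, 0.5)
print(round(p_value, 4))   # one-tailed p-value, no normal approximation involved
```

With samples this small, the exact binomial sum replaces the z table lookup.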