Soc. 504, Lecture 11: Introduction to Statistical Inference

A. Two ways we use statistics
1. describe the distribution of a variable or the relationship among variables within a population or a sample
2. draw inferences about the distribution of a variable or the relationships between variables in the population from data for a sample drawn from that population
a. infer the value of a population parameter from a sample statistic (confidence intervals)
b. test hypotheses about the value of the population parameter based on sample statistics

B. Statistical inference and randomness (chance)
1. We can make inferences or test hypotheses about the value of population parameters from sample statistics only if some random process can influence the values of the sample data
a. probability sample: a sample in which we know the probability of every case in the population being included in the sample
(1) randomization (random assignment)--experiments
(2) with observational data, the simplest type of probability sample is a simple random sample, in which every case in the population has an equal probability of being selected
(a) stratified random sample
(b) different sampling fractions for different groups, with weighting
(3) some researchers use statistical inference ritualistically even when no random process could affect the values of their statistics, because “statistical significance” has come to imply that a result is noteworthy
(a) statistical inference is meaningless with “snowball” or other “convenience” samples
b. random measurement or coding error
2. statistical inference requires some random process that could affect the values of our variables; such a process allows us to use the laws of probability to assess the likelihood that the value of an observed statistic is due to chance

C. Terminology and notation for statistical inference
1. Sample statistics versus population parameters
a. we refer to summary values for population data as parameters
(1) we typically denote them with Greek letters (e.g., σ for the standard deviation, σ² for the variance, μ for the mean, ρ (“rho”) for the correlation)
(2) we sometimes denote them with capital letters (N for population size)
b. we refer to summary values for sample data as statistics
(1) we typically denote them with the Roman alphabet (e.g., X̄ for the mean of Xi, s for the standard deviation, s² for the variance, r for the correlation coefficient)
(2) we sometimes use lower-case letters (n for sample size)
2. Distributions
a. frequency and percentage distributions for variables (based on sample or population data)
b. normal and standard normal distributions
c. probability distributions
d. sampling distributions

D. Probability and probability distributions
1. Probability is the long-run relative frequency with which an outcome occurs.
2. pi = fi/N
3. A probability distribution is analogous to a frequency or a percentage distribution: it lists every possible outcome of an event along with its likelihood
a. we can describe a probability distribution by its mean, variance, and shape
(1) the mean of a probability distribution is its expected value
(a) E(Y) = Σ pi Yi
(2) the variance of a probability distribution = Σ pi [Yi − E(Y)]², or Σ pi (Yi − μ)²
(3) the standard deviation of a probability distribution = √( Σ pi [Yi − E(Y)]² )
b. e.g., the probability distribution of the outcomes of throwing an honest (6-sided) die
(1) its mean (or expected value) = Σ pi Yi = 21/6 = 3.5
(2) its variance = the sum of the probability-weighted squared deviations of scores from the expected value: σ² = Σ pi [Yi − E(Y)]² = 2.92, and σ = √2.92 = 1.7

E. The effect of sample size on probability distributions
1. The bigger your sample, the closer the value of any sample statistic will be to its corresponding population parameter. Our textbook calls this relationship the “empirical rule”; it is usually called the “law of large numbers” (“law of large samples”).
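The expected-value and variance formulas from section D can be checked numerically for the honest-die example. A minimal Python sketch (an illustration, not part of the lecture):

```python
# Mean, variance, and standard deviation of the probability distribution
# for an honest six-sided die (the example from section D above).
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # each face is equally likely

expected_value = sum(p * y for y in outcomes)                    # E(Y) = sum of p_i * Y_i
variance = sum(p * (y - expected_value) ** 2 for y in outcomes)  # sum of p_i * (Y_i - E(Y))^2
std_dev = variance ** 0.5

print(round(expected_value, 2))  # 3.5
print(round(variance, 2))        # 2.92
print(round(std_dev, 2))         # 1.71
```

The same three lines work for any discrete probability distribution: replace the equal probability p with a list of per-outcome probabilities.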
2. Consider the proportion of females [p(f)] in a population and how its sample estimate changes with the size of the sample we take from this population.
3. Like other sample statistics, p(f) is a random variable: given random sampling, its value will vary across samples
4. Example: consider the probability of randomly drawing just one female from a population in which p = .5 as the size of the sample gets larger
a. If we randomly draw a one-person sample from this population, we get the following probability distribution:

#f   pf    pf·yf
0    1/2   0
1    1/2   1/2
     1     1/2

The expected number of females [E(Y)] in a one-person sample (Σ pf·yf) = 1/2
b. the probability distribution if we randomly draw a 2-person sample
(1) four possible outcomes: FF, FM, MF, MM, each with a 1/4 probability
(2) the probability distribution looks like this:

#f   pf    pf·yf
0    1/4   0
1    1/2   1/2
2    1/4   1/2
     1     1

(3) in a 2-person sample we’d expect to get 1 female
c. What if we randomly draw a three-person sample?
(1) 8 possible 3-person samples (2^3)
(2) probability distribution for a 3-person sample from a population in which pf = .5:

#f   pf    pf·yf
0    1/8   0
1    3/8   3/8
2    3/8   6/8
3    1/8   3/8
     1     12/8 = 1.5

(3) in a 3-person sample we’d expect to get 1.5 females
d. probability distribution for the number of females in a random sample of 4 people
(1) 16 possible 4-person samples (2^4):

#f   pf      pf·yf
0    1/16    0
1    4/16    4/16
2    6/16    12/16
3    4/16    12/16
4    1/16    4/16
     16/16   32/16 = 2

(2) we would expect to get 2 females if n = 4
e. Notice that the probability of getting no females declines (1/2, 1/4, 1/8, 1/16) as the sample size increases (1, 2, 3, 4).
f. the formula for the likelihood of getting no females (one particular value of a dichotomous variable with p = .5) is P(#f = 0) = .5^n
(1) this says that the probability of getting this particular outcome depends only on the sample size
(2) the probability of getting no females in a 5-person sample = .5^5 = 1/32
(3) the probability of getting no females in a 10-person sample = .5^10 = 1/1024
(4) the probability of getting no females in a sample of 100 = .5^100
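The female-count tables above follow the binomial distribution, so they can be generated for any sample size rather than enumerated by hand. A short Python sketch (an illustration, not from the lecture):

```python
from math import comb

def female_count_distribution(n, p=0.5):
    """Return {number of females: probability} for a random sample of size n."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# Reproduce the 4-person table: probabilities 1/16, 4/16, 6/16, 4/16, 1/16
dist4 = female_count_distribution(4)
expected_females = sum(k * pk for k, pk in dist4.items())
print(dist4[0])           # 0.0625 = 1/16, the chance of no females
print(expected_females)   # 2.0, the expected number of females when n = 4

# P(no females) = .5 ** n shrinks quickly as the sample size grows
print(0.5 ** 5)   # 0.03125 = 1/32
print(0.5 ** 10)  # 0.0009765625 = 1/1024
```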
F. The normal distribution
1. Shape of the normal distribution: bell-shaped, meaning
a. symmetrical
b. single mode
c. mean and median = mode
2. There are a large number of normal distributions, with different means and different standard deviations
3. A standard(ized) normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1
a. the Z-score distribution is a standard normal distribution
4. If Y is normally distributed, a fixed proportion of the area of the normal curve corresponds to each z-score for Y
a. a z-score of −1.96 is always at the 2.5th percentile of the normal distribution, and a z-score of 1.96 is always at the 97.5th percentile
b. this means that we can assess the likelihood of any sample value by transforming it into a z-score and then using Table A to determine what proportion of the area under the normal curve corresponds to that z-score
5. When we use Table A to find what proportion of the area under the normal curve corresponds to a standard score, we are essentially using a nonlinear transformation
6. Review how Table A is set up

G. Sampling distribution of a sample statistic
1. A sampling distribution of a sample statistic is a theoretical distribution of the values of a sample statistic that we would get if we drew an infinite number of probability samples of a fixed size from the same population.
2. A sampling distribution for a statistic is a probability distribution that shows all possible values of a sample statistic for some variable along with the probability of getting each value.
3. You can think of a sampling distribution as a probability distribution for a sample statistic for some variable (e.g., Ȳ, s, p, r, b, a)
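The areas that Table A tabulates (section F) can also be computed directly from the standard normal's cumulative distribution function, which the Python standard library exposes through the error function. A minimal sketch (illustrative only):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Proportion of the area under the standard normal curve at or below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# z = -1.96 and z = +1.96 cut off the bottom and top 2.5% of the curve,
# so about 95% of the area lies between them.
print(round(normal_cdf(-1.96), 3))                     # 0.025
print(round(normal_cdf(1.96), 3))                      # 0.975
print(round(normal_cdf(1.96) - normal_cdf(-1.96), 2))  # 0.95
```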
4. You would get a sampling distribution for a statistic if you drew an infinite number of samples from the population, calculated the statistic of interest for each sample, kept track of those values, made a frequency distribution of them, and then calculated the relative frequency (i.e., the probability) of each value. The resulting sampling distribution would tell you the probability of drawing another sample in which the statistic of interest has a particular value.
5. Statistical theory about the sampling distribution tells us what the distribution of a statistic would look like if we drew all possible samples, calculated a statistic for some variable Y for each sample, and constructed the distribution for that statistic.
6. We need to know the mean, variability, and shape of the sampling distribution of a statistic
a. a sampling distribution involves the same principle as a frequency distribution: whereas a frequency distribution presents the frequency of each of a set of possible values, a sampling distribution for a statistic like the mean or the proportion shows the probability of drawing a sample with each possible value of the statistic in which you’re interested
7. We can describe the sampling distribution of a normally distributed statistic by its mean and standard deviation
a. the standard deviation of a sampling distribution for a statistic is called its standard error (e.g., “the standard error of the proportion,” “the standard error of the mean”)
b. according to the law of large numbers, the larger the size of the sample (n), the closer the value of the sample statistic (p, Ȳ) will be to the population parameter (π, μ)
8. Central limit theorem: as n increases, the shape of the sampling distribution of a statistic approaches a normal distribution with expected value E(Ȳ) = μ and standard error σȲ = σ/√n [transparency]
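The thought experiment in item 4, drawing sample after sample and tabulating the statistic, can be approximated by simulation. A minimal Python sketch (the population values here are invented for illustration and are not from the lecture):

```python
import random

random.seed(11)  # reproducible draws

# Hypothetical population: 100,000 scores with mean ~100, sd ~15
population = [random.gauss(100, 15) for _ in range(100_000)]

n = 50         # fixed sample size
draws = 2_000  # many (not infinite!) samples

sample_means = []
for _ in range(draws):
    sample = random.sample(population, n)       # simple random sample
    sample_means.append(sum(sample) / n)        # the statistic we track

# The mean of this approximate sampling distribution sits close to the
# population mean, and its spread approximates the standard error sigma/sqrt(n).
grand_mean = sum(sample_means) / draws
print(round(grand_mean, 1))  # close to 100
```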
a. rule of thumb: the sample must have at least 30 observations for the sampling distribution of the mean to be normal with E(Ȳ) = μ and standard error σȲ = σ/√n
b. if n ≥ 30, the sampling distribution of the mean will be normal, even for variables like income whose distribution is not normal
9. Example: the sampling distribution of the mean
a. mean of the sampling distribution of the mean: μȲ = μ
(1) any given sample mean Ȳ may be larger or smaller than the population mean, μ, but the average sample mean across many random samples will equal the population mean
b. variance of the sampling distribution of the mean: σȲ² = σ²/n
c. standard deviation of the sampling distribution of the mean: σȲ = σ/√n; e.g., with σ = 3.5 and n = 125:
(1) σȲ = 3.5/√125 = 3.5/11.18 = .313
(2) the standard error of the mean of the sampling distribution, σȲ, is much less than the standard deviation in our sample
d. the sample mean Ȳ obeys the law of large numbers: the larger the n, the smaller the standard error
e. How big must a sample be before its sampling distribution is normal with E(Ȳ) = μ? Rule of thumb: 30 cases [transparency]
10. If a sampling distribution is normal, we know its mean and standard deviation, so we can use the normal curve in Table A to determine the likelihood of getting a sample statistic of any particular value.
11. Estimating a population parameter from a sample statistic
a. the statistical concepts of error and bias
b. a statistic is an unbiased estimator of a population parameter if the expected value of the statistic equals the value of the population parameter; e.g., E(Ȳ) = μ and E(p) = π, but E(s) ≠ σ
c. even if a statistic is an unbiased estimator of a parameter, however, that doesn’t mean it is an error-free measure of the parameter
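The standard-error arithmetic in item 9c, and the law-of-large-numbers pattern in 9d, can be verified in a few lines. A Python sketch (the sample sizes beyond 125 are added here for illustration):

```python
from math import sqrt

sigma = 3.5  # population standard deviation from the worked example

# Standard error of the mean: sigma / sqrt(n) shrinks as n grows
for n in (125, 500, 2000):
    se = sigma / sqrt(n)
    print(n, round(se, 3))

# n = 125 reproduces the lecture's value: 3.5 / 11.18 = .313
```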