Bias - s3.amazonaws.com

Chapter 7 Notes Sampling Distributions AP Stats Name__________________________ 7.1 What is a Sampling Distribution? Parameter – a number that describes some characteristic of the population. The value of a parameter is usually not known because we cannot examine the entire population. Statistic – number that describes some characteristics of a sample. μ – for the population mean and 𝑥̅ for the sample mean. ρ – (greek letter rho) for the population proportion and 𝑝̂ (p-hat) the sample proportion is used to represent the unknown parameter ρ EXAMPLE: Identify the population, the parameter, the sample, and the statistics in each of the following settings. a) A pediatrician wants to know the 75th percentile for the distribution of heights of 10 year old boys, so she takes a sample of 50 patients and calculates Q3 = 56 inches. The population is all 10 year old boys, parameter in the 75th percentile, or Q3. The sample is the 50 10 yr old boys included in the sample, the statistic is the sample Q3 = 56 inches. b) A Pew Research Center Poll asked 1102 12 to 17 year olds in the US if they have a cell phone. Of the respondents, 71% said yes. The population is all 12 to 17 yr old in the US, Parameter is the proportion p with cell phones. The sample is the 1102 12 to 17 yr old in the sample; the statistic is the sample proportion with a cell phone p-hat = 0.71 SAMPLING VARIABILITY sampling variability: the value of a statistic varies in repeated random sampling ACTIVITY: Draw the class Dot Plot below: Proportion of red chips is p = ½ 1. Choose 20 red chips 2. Write the proportion of red chips 3. Record that proportion on the class dot plot 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Describe the distribution: Definition: The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population. Be careful of interchanging the following terms: 1) Population distribution describes individuals from the population 2) Sample distribution describes individuals from the sample 3) Sampling Distribution describes how a statistic varies in many samples from the population Describing Sampling Distributions The fact that statistics from samples have definite sampling distributions allows us to answer the question: “How trustworthy is a statistic as an estimator of the parameter?” To answer this: consider the center spread and shape. The dot plot on the right shows the approximate sampling distribution of p-hat. The center is very close to the parameter value of 0.5, the parameter value. Understanding a sampling distribution: If we took all possible samples of 20 chips from the population, calculated p-hat for each sample and then found the mean of all those p-values, we’d get exactly 0.5. We say that p-hat is an unbiased estimator of p. Definition: A statistic used to estimate a parameter is an unbiased estimator if the mean of its sampling distribution is equal to the true value of the parameter being estimated. Unbiased does not mean perfect! An unbiased estimator will almost always provide an estimate that is not equal to the value of the population parameter. It is called “unbiased” because in repeated samples, the estimates won’t consistently be too high or consistently too low. n=100 n=1000 Larger samples have a clear advantage over smaller samples. They are much more likely to produce an estimate close to the true value of the parameter. The variability of a statistic is described by the spread of its sampling distribution. This spread is determined primarily by the size of the random sample. Larger samples give smaller spread. The spread of the sampling distribution does not depend on the size of the population, as long as the population is at least 10 times larger than the sample. Bias, variability, and shape We can think of the true value of the population parameter as the bull’s- eye on a target and of the sample statistic as an arrow fired at the target. Both bias and variability describe what happens when we take many shots at the target. Bias means that our aim is off and we consistently miss the bull’s-eye in the same direction. Our sample values do not center on the population value. High variability means that repeated shots are widely scattered on the target. Repeated samples do not give very similar results. The lesson about center and spread is clear: given a choice of statistics to estimate an unknown parameter, choose one with no or low bias and minimum variability. The blue line represents the true number of tanks. Note the different shapes. Which statistic gives the best estimator? Why? Why n-1 when calculating sample variance and standard deviation? 𝑠2 = 1 ∑(𝑥𝑖 − 𝑥̅ )2 𝑛−1 If we replace 1/n with 1/(n-1) then the variance becomes a biased estimator. (a very watered down explanation) P 429 1,3,5,7,9,11,13,17-20 Identify the population, the parameter, the sample, and the statistic in each setting. 1. STOP SMOKING. A random sample of 1000 people who signed a card saying they intended to quit smoking were contacted nine months later. It turned out that 210 (21%) of the sampled individuals that had not smoked over the past six months. Population: people who signed a card saying that they intend to quit smoking Parameter of interest: proportion of the population who signed the cards saying they would not smoke who actually quit smoking Sample: a random sample of 1,000 people who signed the cards Sample statistics: 𝑝̂ = 0.21 3. Hot turkey. Tom is cooking a large turkey breast for a holiday meal. He wants to be sure that the turkey is safe to eat, which requires a minimum internal temperature of 165° F. Tom uses a thermometer to measure the temperature of the turkey meat at four randomly chosen points. The minimum reading in the sample is 170° F. Population: all the turkey meat. Parameter of interest: minimum temperature. Sample: 4 randomly chosen points in the turkey. Sample statistic: sample minimum = 170F. For each boldface number in 5 and 7, 1) state whether it is a parameter or a statistic and 2) use appropriate notation to describe each number for example p = 0.65 5. Get your bearings. A large container of ball bearings has mean diameter 2.5003 centimeters (cm). This is within the specifications for acceptances of the container by the purchases. By chance, an inspector chooses 100 bearings from the container that have mean diameter 2.5009 cm. Because, this is outside the specified limits, the container is mistakenly rejected. μ = 2.5003 is a parameter (related to the population of all the ball bearings in the container) 𝑥̅ = 2.5009 is a statistic (related to the sample of 100 ball bearings). 7. Unlisted Numbers. A telemarketing firm in LA uses a device that dials residential phone numbers in that city at random. Of the first 100 numbers dialed, 48%, are unlisted. This is not surprising because 52% of all LA residential phones are unlisted. 𝑝̂ = 0.48 is a statistic (related to the sample of 100 numbers dialed) and p = 0.52 is a parameter (related to the population of all residential phone numbers in Los Angeles). 9. Doing homework. A school newspaper article claims that 60% of the students at a large high school did their assigned homework last week. Some skeptical AP Statistics students want to investigate whether this claim is true, so they choose an SRS of 100 students from the school to interview. What values of the sample proportion 𝑝̂ would be consistent with the claim that the population proportion of students who completed all their homework is p = 0.60? To find out, we used Fathom software to simulate choosing 250 SRSs of size n = 100 students from a population in which p = 0.60. The figure below is the dotplot of the sample proportion 𝑝̂ of students who did all their homework. a) Is this the sampling distribution of 𝑝̂ ? Justify your answer. This is not the exact sampling distribution because that would require a value of 𝑝̂ for all possible samples of size 100. However, it is an approximation of the sampling distribution that we created through simulation. b) Describe the distribution. Are there any obvious outliers? The distribution is centered at 0.60 and is reasonably symmetric and bell-shaped. Values vary from about 0.47 to 0.74. The values at 0.47, 0.73 and 0.74 are outliers. c) Suppose that 45 of the 100 students in the actual sample say that they did all their homework last week. What would you conclude about the newspaper article’s claim? Explain. If we found that only 45 students said that they did all their homework last week, we would be skeptical of the newspaper’s claim that 60% of students did their homework last week. None of the simulated samples had a proportion this low. 11. Doing Homework. Refer to #9. a) Make a graph of the population distribution given that there are 3,000 students in the school. REMEMBER THESE ARE INDIVIDUALS SINCE IT IS A POPULATION DISTRIBUTION b) Sketch a possible graph of the distribution of sample data for the SRS of size 100 taken by the AP Statistics students. REMEMBER THESE ARE INDIVIDUALS SINCE IT IS A SAMPLE DISTRIBUTION 13. During winter months outside temperatures at the Starnes’s cabin in Colorado can stay well below freezing (32° F or 0° C) for weeks at a time. To prevent the pipes from freezing, Mrs. Starnes sets the thermostat at 50° F. The manufacturer claims that the thermostat allows variation in home temperature that follows a Normal distribution with σ = 3° F. To test this claim, Mrs. Starnes programs her digital thermometer to take an SRS of n = 10 readings during a 24 hour period. Suppose the thermostat is working properly and that the actual temperatures in the cabin vary according to a Normal distribution with mean μ = 50° F and standard deviation σ = 3° F. The Fathom screen shot below shows the results of taking 500 SRSs of 10 temperature readings from a population distribution that’s N(50,3) and recording the sample variance s2 each time. a) Describe the approximate sampling distribution The approximate sampling distribution is skewed to the right with a center at 9° F2. The values vary from about 2 to 27.5° F2 b) Suppose that the variance from an actual sample is sx2 = 25. What would you conclude about the thermostat manufacture’s claim? Explain. A sample variance of 25 is quite large compared with what we would expect, since only one out of 500 SRSs had a variance that high. It suggests that the manufacturer’s claim is false and that the thermostat actually has more variability than claimed. 17. IRS AUDITS the IRS plans to examine an SRS of individual federal income tax returns from each state. One variable of interest is the proportion of returns claiming itemized deductions. The total number of tax returns in each state varies from over 15 million in California to about 240,000 in Wyoming. a) Will the sampling variability of the sample proportion change from state to state if an SRS of 2000 tax returns is selected in each state? Explain your answer. Since the smallest number of total tax returns (i.e., the smallest population) is still more than 10 times the sample size, the variability of the sample proportion will be (approximately) the same for all states. b) Will the sampling variability of the sample proportion change from state to state if an SRS of 1% of all tax returns is selected in each state? Explain your answer. Yes. It will change—the sample taken from Wyoming will be about the same size, but the sample in, for example, California will be considerably larger, and therefore the variability of the sample proportion will be smaller. 18. Predict the election Just before a presidential election, a national opinion poll increases the size of its weekly random sample form the usual 1500 people to 4000 people. a) Does the larger random sample reduce the bias of the poll result? A larger sample does not reduce the bias of a poll result. If the sampling technique results in bias, simply increasing the sample size will not reduce the bias. b) Does it reduce the variability of the result? Explain. A larger sample will reduce the variability of the result. More people means more information which means less variability. 19. The figure below shows histograms of four sampling distributions of different statistics intended to estimate the same parameter. a) Which statistics are unbiased estimators? Justify your answer Graph (c) shows an unbiased estimator because the mean of the distribution is very close to the population parameter. b) Which statistic does the best job of estimating the parameter? Explain. The graph in (b) shows the statistic that does the best job at estimating the parameter. Although it is biased, the bias is small and the statistic has very little variability. 20. A sample of teens. A study of the health of teenagers plans to measure the blood cholesterol of an SRS of 13 to 16 year olds. The researchers will report the mean 𝑥̅ from their sample as an estimate of the mean cholesterol level μ in this population. a) Explain to someone who knows no statistics what it means to say that 𝑥̅ is an unbiased estimator of μ. If we choose many samples, the average of the 𝑥̅ values from these samples will be close to μ. In other words, the sampling distribution of 𝑥̅ is centered at the population mean μ we are trying to estimate. b) The sample result 𝑥̅ is an unbiased estimator of the population μ no matter what size SRS the study chooses. Explain to someone who knows no statistics why a large random sample gives more trustworthy results than a small random sample. A larger sample will give more information and, therefore, more precise results. The variability in the distribution of the sample average decreases as the sample size increases. 7.2 SAMPLE PROPORTIONS http://www.rossmanchance.com/applets/Reeses3/ReesesPieces.html Describe the sampling distribution: Describe the sampling distribution: SHAPE: Roughly Symmetric, unimodal, and bell shaped. A Normal curve would approximate this distribution fairly well. SHAPE: Roughly symmetric, unimodal, and bell shaped. A Normal curve would approximate this distribution fairly well. CENTER: The mean of 400 sample proportions is 0.440. This is quite close to the actual proportion of 0.45. CENTER: The mean of 400 sample proportions is 0.451. This is quite close to the actual proportion of 0.45. SPREAD: The standard deviation of the 400 values of 𝑝̂ from these samples is 0.101. SPREAD: The standard deviation of the 400 values of 𝑝̂ is 0.067. So what happens as n increases: For a specific value of p, the standard deviation gets smaller as n gets bigger. The value of σp depends on both p and n. Go back to the applet and change p to .15 FORMULAS: SUMMARY OF SAMPLING DISTRIBUTION of 𝑝̂ Choose an SRS of size n from a population of size N with proportion p of successes. 1) Let 𝑝̂ be the sample proportion of successes __context of problem_________________ 2) μ = p = _____ from problem 3) Check 10% Rule n< 1/10N if so then, σ= or it could be given in problem 4) Check it is approximately normal: np> 10 and n(1 – p) >10. EXAMPLE of Using Normal Approximation of 𝑝̂ A polling organization asks an SRS of 1500 first-year college students how far away their home is. Suppose that 35% of all first-year students actually attend college within 50 miles of home. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value? State: Find P(0.33< 𝑝̂ < 0.37) Plan: n = 1500 p = 0.35 (the proportion of first year students who attend college within 50 miles of home) 𝜇𝑝̂ = p = 0.35 10% rule: We assume that there are more than 10(1500) = 15,000 first year college students 𝑝(1−𝑝) 𝜎𝑝̂ = √ 𝑛 0.35(0.65) =√ 1500 =0.0123 Check for Approximately Normal: np = 1500(.35) = 525 > 10 and n(1-p)=1500(.65)=975 > 10 Therefore we can consider the distribution of 𝑝̂ to be Approximately Normal. Draw and Shade the Normal Distribution Do: For 𝑝̂ = 0.33 z = 0.33−0.35 0.0123 = -1.63 For 𝑝̂ = 0.37 z= 0.37−0.35 0.0123 = 1.63 P(0.33 < 𝑝̂ < 0.37) = P(-1.63 < z < 1.63) = 0.8968 Conclude: About 90% of all SRS’s of size 1500 will give a result within 2 percentage points of the truth about the population. P 431 21-24 Page 439 27,29,33,35,37,41 21. A newpaper poll reported that 73% of respondents liked business tycoon Donald Trump. The number 73% is a) a population b) a parameter c) a sample d) a statistic e) an unbiased estimator 22. The name for the pattern of values that a statistic takes when we sample repeatedly from the same population is a) the bias of the statistic b) the variability of the statistic d) the distribution of the sample data c) the population distribution e) the sampling distribution of the statistic 23. If we take simple random sample of size n = 500 from a population of size 5,000,000 the variability of our estimate will be a) Much less than the variability for a sample of size n = 500 from a population of size 50,000,000 b) Slightly less than the variability for a sample of size n = 500 from a population of size 50,000,000. c) About the same as the variability for a sample size n = 500 from a population of size 50,000,000. d) Slightly greater than the variability for a sample of size n = 500 from a population of size 50,000,000 e) Much greater than the variability for a sample of n = 500 from a population of size 50,000,000. 24. Increasing the sample size of an opinion poll will a) reduce the bias of the poll result b) reduce the variability of the poll result c) reduce the effect of nonresponse on the poll. d) reduce the variability of opinions e) all of the above. Page 439 Explain why you cannot use the methods of this section to find the desired probability 33. Hispanic Workers A factory employs 3000 unionized workers of whom 30% are Hispanic. The 15-member union executive committee contains 3 Hispanics. What would be the probability of 3 or fewer Hispanics if the executive committee were chosen at random for all the workers? The Normal condition is not met here. Np = 15(.3) = 4.5 < 10 No greater than or = to How would you find this probability using a different method? It is a binomial distribution. Binary Yes, Independent Yes, Fixed number of successes Yes, Probability of Success = Yes 15 15 ( ) . 30 . 715 + ⋯ + ( ) . 33 . 712 = 0.2969. 0 3 35. Do you drink the cereal milk? A USA Today Poll asked a random sample of 1012 US adults what they do with the milk in the bowl after they have eaten the cereal. Of the respondents, 67% said they drink it. Suppose that 70% of US adults actually drink the cereal mile. Let 𝑝̂ be the proportion of people in the sample who drink the cereal milk. a) What is the mean of the sampling distribution of 𝑝̂ ? Why? The mean of the sampling distribution is the same as the population proportion so it is μp-hat = p = 0.67 b) Find the standard deviation of the sampling distribution of 𝑝̂ . Check to see if the 10% condition is met. The population (all U.S. adults) is clearly at least 10 times as large as the sample (the 1012 surveyed adults) so the 10% condition is met. c) Is the sampling distribution of 𝑝̂ approximately Normal? Check to see if the Normal condition is met The sampling distribution is approximately Normal since np = 1012(.70) = 708.4> 10 and n(1-p) = 1012(0.30)=303.60>10 Therefore approximately Normal. d) Find the probability of obtaining a sample of 1012 adults in which 67% or fewer say they drink the cereal mile. Do you have any doubts about the results of this poll? Draw the Normal Curve: z = 0.67 - .70 = -2.08 P(p-hat < .67) = P(z < -2.08) = 0.0188 0.0144 This is a fairly unusual result if 70% of the population actually drink the cereal milk. 37. Do you drink the cereal milk? What sample size would be required to reduce the standard deviation of the sampling distribution to ½ the value you found in Exercise 35B? Justify your answer. 4048 since n is in the denominator, I would need to double the denominator to divide by two. However, since the n is square root, I would to multiply it by 4, so that the the square root of n would be doubled, hence dividing the original standard deviation by 2. 41. On time shipping. You mail order company advertises that is ships 90% of its orders within three working days. You select an SRS of 100 of the 5000 orders received in the past week for an audit. The audit reveals that 86 of these orders were shipped on time. a) If the company really ships 90% of its order on time, what is the probability that the proportion in an SRS of 100 orders is as small as the proportion in your sample or smaller? DRAW: b) A critic says, “Aha! You claim 90%, but in your sample the one time percentage is lower than that. So the 90% claim is wrong.” Explain in simple language why your probability calculation in a) shows theat the result of the sample does not refute the 90% claim. 7.3 Sample Means A is one sample of 100 households B is a distribution of the sample means from many samples of size 100 http://onlinestatbook.com/stat_sim/sampling_dist/index.html Example: The height of young women follows a Normal distribution with mean μ = 64.5 inches and standard deviation σ = 2.5 inches. a) Find the probability that a randomly selected young woman is taller than 66.5 inches. Show you work. Let x = the height of a randomly selected young woman. Normal? Yes since Normal distributed μ = 64.5 and σ = 2.5 inches Find P(x > 66.5) DRAW A NORMAL CURVE x = 66.5 z =(66.5 – 64.5)/2.5 = 0.80 P(z > 0.80) = 0.2119 The probability of choosing a young woman at random whose height exceeds 66.5 inches is about 0.21 b) Find the probability that the mean height of an SRS of 10 young woman exceeds 66.5 inches. Show your work. μx-bar = 64.5 10% condition Met? We can assume there are atleast 100 young women in the population σx-bar = σ/squareroot(n) = 2.5/squareroot(10) = 0.79 DRAW NORMAL CURVE Normal? Yes since population is normal Z =(66.5 – 64.5)/.79 = 2.53 P(z > 2.53) = 0.0057 It is very unlikely, less than 1 % chance, that we would choose an SRS of 10 young women whose average height exceeds 66.5 inches. Since the standard deviation of the sampling distribution is much smaller, the distribution is much tighter around the mean, thus there is less area (which equals probability) from which to draw a sample from versus picking a woman a random from the population distribution. P. 441 43-46, 454 49,51,53,55 The magazine Sports Illustrated askes a random sample of 750 Division 1 college athletes, “Do you believe performance enhancing drugs are a problem in college sports?” Suppose that 30% of all Division 1 athletes that these drugs are a problem. Let 𝑝̂ be the sample proportion who say that these drugs are a problem. 43. The sampling distribution of 𝑝̂ has mean a) 225 b) 0.30 c) none of these d) 0.017 e) 0 44. The standard deviation of the sampling distribution is about a) 0.0006 b) 0.033 c) 0.017 d) 1 e) none of these 45. Decreasing the sample size from 750 to 375 would multiply the standard deviation by a) 2 b) √2 c) ½ d) 1 √2 e) none of these 46. The sampling distribution of 𝑝̂ is approximately Normal because a) There are atleast 7570 Division I college athletes b) np = 225 and n(1-p) = 525 c) a random sample was chosen d) a large sample size like n = 750 guarantees it. e) The sampling distribution of 𝑝̂ always have this shape. 49. Songs on an iPod. David’s ipod has about 10,000 songs. This distribution of the play times for these songs is heavily skewed to the right with a mean of 225 seconds and a standard deviation of 60 seconds. Suppose we choose an SRS of 10 songs from this population and calculate the mean play time x-bar of theses songs. What are the mean and the standard deviation of the sampling distribution of x-bar? The mean is μx-bar = 225 second and σx-bar = 60/squareroot(10) = 18.974seconds. These results do not depend on the shape of the distribution of the individual play times. These results do not depend of the shape of the distribution of individual play times. 51. Refer to 49. How many songs would you need to sample if you wanted the standard deviation of the sampling distribution of x-bar to be 30 seconds? Justify your answer. 53. Larger Sample. Suppose that the blood cholesterol of all mean aged 20 to 34 follows the Normal Distribution with mean μ = 188 milligrams per deciliter and standard deviation σ = 41 mg/dl. a) Choose an SRS of 100 men from this population. What is the sampling distribution of x-bar? b) Find the probability that x-bar estimates μ within ±3 mg/dl. Show your work. c) Choose an SRS of 1000 men form this population. Now what is the probability that x-bar falls within ±3 mg/dl of μ? Show your work. In what sense is the larger sample “better”? 55. Bottling cola. A bottling company uses a filing machine to fill plastic bottles with cola. The bottles are supposed to contain 300 milliliters. In fact, the contents vary according to a Normal Distribution with mean μ = 298 ml and standard deviation σ = 3 ml. a) What is the probability that an individual bottle contains less than 295 ml? Show your work b) What is the probability that the mean contents of six randomly selected bottles is less than 295 ml? Show your work. REVIEW QUESTIONS: Sharing music online. A sample survey reports that 29% of Internet users download music files online, 21% share music files from their computers, and 12% both download and share music. Make a Venn Diagram that displays this information. What percent of Internet users neither download nor share music files? The Central Limit Theorem: http://onlinestatbook.com/stat_sim/sampling_dist/index.html The Central Limit Theorem Draw an SRS of size n form any population with mean μ and standard deviation σ. The Central Limit Theorem (CLT) says that when n is large (n > 30), the sampling distribution of the sample mean x-bar is approximately normal. NORMAL CONDITIONS FOR SAMPLE MEANS 1) If the population is Normal, the so is the sampling distribution of x-bar, no matter what size n is. 2) If the population is not Normal, the CLT tells us that the sampling distribution of x-bar will be approximately Normal if n >30 P 455 57,59,61,63,65-68 57. What does the CLT say? Asked what the central limit theorem says, a student replies, “As you take larger and larger samples form a population, the histogram of the sample values looks more and more Normal.” Is the student right? Explain your answer. No. The histogram of the sample values will look like the population distribution, whatever it might happen to be. The central limit theorem says that the histogram of the distribution of sample means (from many large samples) will look more and more Normal. 59. Songs on an iPod. Refer to #49. a) Explain why you cannot safely calculate the probability the mean play time x-bar is more than 4 minutes (240 seconds) for an SRS of 10 songs. Since the distribution of the play times of the population of songs is heavily skewed to the right, a sample size of 10 will not be enough for the Normal approximation to be appropriate. b) Suppose we take an SRS of 36 songs instead. Explain how the central limit theorem allows ys to find the probability that the mean play times is more than 240 seconds. Then calculate this probability. Show your work. With a sample size of 36, we now have enough observations in our sample for the Central Limit Theorem to apply. 61. Airline passengers get heavier. In response to the increasing weight of airline passengers, the Federal Aviation Administration (FAA) in 2003 told airlines to assume that passengers average 190 pounds in the summer, including clothes and carry-on baggage. But passengers vary, and the FAA did not specify a standard deviation. A reasonable standard deviation is 35 pounds. Weights are not Normally distributed, especially when the population includes both men and women, but they are not very non-Normal. A commuter plan carries 30 passengers. a) Explain why you cannot the calculate the probability that a randomly selected passenger weights more than 200 pounds This probability cannot be calculated, because we do not know the shape of the distribution of the weights. b) Find the probability that the total weight of the passengers of a full flight exceeds 6,000 pounds. Show your work. 63. More on Insurance. An insurance company knows that in the entire population of homeowners, the mean annual loss from fire is μ = $250 and the standard deviation of the loss is σ = $300. The distribution of losses is strongly right skewed: many policies have $0 loss, but a few have large losses. If the company sells 10,000 policies, can it safely base its rates on the assumption that its average loss will be no greater than $275? 65 – 66. Scores on the mathematics part of the SAT exam in a recent year were roughly Normal with mean 515 and standard deviation 114. You choose an SRS of 100 students and average their SAT Math scores. Suppose that you do this many, many times. 65. The mean of the average scores you get should be close to a. 515 b. 5.15 c. 51.5 d. 0 e. none of the above 66. The standard deviation of the average scores you get should be close to a. 114 b. 1.14 c. 11.4 d. 1 e. none of these 67. A new born baby has extremely low birth weight (ELBW) if it weighs less than 1000 grams. A study of the health of such children in later years examined a random sample of 219 children. Their mean weight at birth was x-bar = 810 grams. This sample mean is an unbiased estimator of the mean weight μ in the population of all ELBW babies, which means that a) in all possible samples of size 219 from this population, the mean of the values of x-bar will equal 810. b) in all possible samples of size 219 from this population, the mean of the values x-bar will equal μ. c) as we take larger and larger samples from this population, x-bar will get closer and closer to μ. d) in all possible samples of size 219 from this population, the values of x-bar will have a distribution that is close to Normal e) the person measuring the children’s weights does so without any systematic error. 68. The number of hours a light bulb burns before failing varies from light bulb to light bulb. The distribution of burnout times is strongly skewed to the right. The central limit theorem says a) as we look at more and more bulbs, their average burnout time gets ever closer to the mean μ for all bulbs of this type. b) the average burnout time of a large number of bulbs has a distribution of the same shape (strongly skewed) as the population distribution. c) the average burnout time of a large number of bulbs has a distribution with similar shape but not as extreme (skewed, but not as strongly) as the population distribution. d) the average burnout time of a large number of bulbs has a distribution that is close to Normal e) the average burnout time of a large number of bulbs has a distribution that is exactly Normal.

Bias - s3.amazonaws.com

Related documents

Products

Support

Bias - s3.amazonaws.com

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib