01.11.2022, 12:01 Lesson activity: Sampling distributions: Attempt review
Started on: Thursday, 27 October 2022, 1:10 PM
State: Finished
Completed on: Tuesday, 1 November 2022, 12:01 PM
Time taken: 4 days 22 hours
Marks: 64.00/64.00
Grade: 10.00 out of 10.00 (100%)
https://altclass.online/mod/quiz/review.php?attempt=7639&cmid=1243

Question 1
Correct. Mark 5.00 out of 5.00

Sampling or sample-to-sample variability

You know that a good sample is representative of the population and therefore it can be used to draw conclusions about the population as a whole. It is important to note that sampling variability occurs and that it is an expected phenomenon. Small differences between samples are to be expected. Since each subject is an individual, responses will vary.

Watch a video to answer a few questions about the concept of sampling variability:

A statistic is a characteristic of the population: [FALSE] (other option: TRUE).
You are right! A statistic is any quantity computed from a sample, whereas a quantity computed from the entire population is referred to as a population parameter or population characteristic! Mark 1.00 out of 1.00

In the video example, each of the sample means of turtle weights represents [a statistic] (other option: a population parameter).
You are right! Each of the sample means in the video example represents a statistic! Mark 1.00 out of 1.00

In the video example, all the sample means of turtle weights [vary a little] (other option: are identical).
You are right!
Indeed, the sample means in the video example vary a little, which is expected. (Other option: vary a lot.) Mark 1.00 out of 1.00

Question 2
Correct. Mark 18.00 out of 18.00

The following exercise will help you start investigating the properties of the sampling distribution of sample means.

The applet below uses a uniform probability distribution of X, where the random variable X has an equal probability of taking any integer value between 10 and 30. It is like sampling with replacement from the set of integers {10, 11, 12, 13, ..., 28, 29, 30}: when simulating a random sample, after each successive number is selected for the sample, the number is “replaced” back into the population {10, 11, 12, 13, ..., 28, 29, 30}, so the same number may be selected again.

The actual mean, μ, of this population distribution of values {10, 11, 12, 13, ..., 28, 29, 30} is equal to [20] (other options: 15, 21).
You are right! It is equal to $(10 + 11 + \dots + 30)/21 = 20$, or just $(10 + 30)/2 = 20$ for a discrete uniform probability distribution. Mark 1.00 out of 1.00

What you are observing in the applet below: 100 random samples are simulated from this distribution. The mean of each such random sample, $\bar{x}$, is computed, and the frequency distribution of all 100 values of the sample means is constructed.

First, check the box at n=1 to display the frequency distribution of sample means for samples of size n=1. When each sample consists of just 1 value, the mean of each of the 100 random samples drawn from {10, 11, 12, 13, ..., 28, 29, 30} is equal to the single value in that sample. That is why the red-colored frequency histogram of the one hundred sample means for n=1 is identical to the [uniform probability distribution of X, where the random variable X is equally likely to have any value between 10 and 30] (other option: I don't know).
Right!
Mark 1.00 out of 1.00

Next, check the box n=5, and you will have simulated 100 random samples of size 5 from the population distribution of {10, 11, 12, 13, ..., 28, 29, 30}. The values of the sample means are automatically calculated for each of the 100 samples, each consisting of 5 values randomly drawn from {10, 11, 12, 13, ..., 28, 29, 30}, and these 100 sample means, $\bar{x}$, are then used to construct the yellow-colored frequency histogram displayed in the applet above.

You would expect the sample means of each of these 100 samples to be [not necessarily equal, but rather close, to the actual population mean you have computed above] (other option: exactly equal to the actual population mean you have computed above).
You are right! The resulting yellow histogram shows there is a lot of sample-to-sample variability in the value of sample means. For some samples, the sample mean is around 15, and for other samples it is around 24. A sample of size 5 from the population won’t always provide precise information about the true mean in the population, μ, which is actually equal to 20. Mark 1.00 out of 1.00

What if a larger sample is selected? To investigate the effect of sample size on the behavior of sample means, select 100 samples of size 10 by checking the n=10 box. The green-colored frequency histogram of the resulting sample mean values is [centered around 20, the actual mean of the population distribution] (other option: not centered around 20, the actual mean of the population distribution). Correct! Mark 1.00 out of 1.00
and spreads [less] (other option: more). Correct! Mark 1.00 out of 1.00
around the population mean value of μ compared to the red and yellow distributions of sample means for smaller sample sizes n.
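The applet experiment described above can be imitated with a short simulation. The following is only a minimal sketch, not the applet's actual code: the function names `simulate_sample_means` and `summarize`, the fixed seed, and the use of the standard deviation as the measure of spread are all assumptions made for illustration.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=100, low=10, high=30, rng=None):
    """Draw num_samples samples of size n, with replacement, from the
    integers low..high and return the list of their sample means."""
    if rng is None:
        rng = random.Random()
    means = []
    for _ in range(num_samples):
        # randint includes both endpoints, so 10 and 30 can both be drawn
        sample = [rng.randint(low, high) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means


def summarize(sizes=(1, 5, 10, 20, 30, 60), seed=0):
    """Return {n: (center, spread)} for the simulated sample means."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    summary = {}
    for n in sizes:
        means = simulate_sample_means(n, rng=rng)
        summary[n] = (statistics.mean(means), statistics.stdev(means))
    return summary


if __name__ == "__main__":
    for n, (center, spread) in summarize().items():
        print(f"n={n:>2}: mean of sample means = {center:.2f}, SD = {spread:.2f}")
```

For every sample size, the center of the simulated sample means should stay near 20, while the spread should shrink as n grows, which is the pattern the histograms in the activity are meant to show.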
Check the remaining boxes for sample sizes n=20, n=30, and n=60. You notice that the resulting histograms tend to be [centered around 20, the actual mean of the population distribution] (other option: not centered around 20, the actual mean of the population distribution). Correct! Mark 1.00 out of 1.00
which tells you that the sample mean indeed is [an unbiased] (other options: an unprejudiced, a biased) estimator of a population mean, μ. You are right! Mark 1.00 out of 1.00
It also tells you that as the sample size n increases, the sampling distribution of sample means spreads [less] (other option: more) around the population mean value, μ. Correct! Mark 1.00 out of 1.00
This means that as the sample size n increases, the sample mean becomes a more precise estimate of the population mean, μ.

For the largest sample size, n=60, the frequency histogram of the resulting sample mean values is ocean-wave-colored, and most values in this histogram are relatively [close to] (other option: far from) the actual or true population mean value of μ, equal to 20. Correct! Mark 1.00 out of 1.00
For smaller sample sizes, most values are relatively [farther away from] (other option: closer to) the actual/true population mean value of μ. Correct! Mark 1.00 out of 1.00

This lesson activity also introduces (or reminds you of) the Central Limit Theorem, which says that when a sample contains enough observations randomly drawn from a population, then, no matter what the shape of the population distribution is, the sampling distribution of sample means is well approximated by a normal curve.

Take a look again at the distributions of sample means from 100 samples of size n=30 and n=60. Do they look normal (bell-shaped and symmetric) to you? [Kind of..] (other option: They look absolutely abnormal!) Right! Mark 1.00 out of 1.00

Do not confuse the sampling distribution with the sample distribution. The sampling distribution considers the distribution of sample statistics (e.g.
sample mean), whereas the sample distribution is basically the distribution of a particular sample taken from the population.

Does the introduction to the lesson activity make sense? There is no wrong answer here. Options: Yes! Got it! / No / Not quite

Question 3
Correct. Mark 17.00 out of 17.00

In a random sample of 100 people, we find that 20 are left-handed. What can we say about the proportion of the population who are left-handed?

In the previous lesson activities, we discussed the idea of a random variable - a single data point drawn from a probability distribution described by parameters (like parameters p and n for the binomial or μ and σ for the normal). But we are seldom interested in just one data point - we generally have a mass of data which we summarize by determining means, medians and other statistics. The fundamental step we take in inferential statistics is to consider those statistics as themselves being random variables, drawn from their own distributions. This is a big advance, and one that has challenged generations of statisticians who have tried to work out what distributions we should assume these statistics are drawn from.

Perhaps at the end of the course I will have time to introduce you to simulation-based bootstrap methods. For example, the question posed at the beginning of this part could be answered by taking our observed data of 20 left-handed and 80 right-handed individuals, repeatedly resampling one hundred observations from this data set, with replacement, and looking at the distribution of the observed proportion of left-handed people. But these simulations are clumsy and time-consuming, especially with large data sets, and in more complex circumstances it is not straightforward to work out what should be simulated.
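The resampling idea just described can be sketched in a few lines. This is a hedged illustration only: the function name `bootstrap_proportions`, the number of resamples (2000), and the fixed seed are assumptions, not part of the lesson.

```python
import random
import statistics


def bootstrap_proportions(left=20, right=80, num_resamples=2000, seed=0):
    """Resample the observed data with replacement and return the
    proportion of left-handers found in each resample."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    observed = [1] * left + [0] * right  # 1 = left-handed, 0 = right-handed
    n = len(observed)
    proportions = []
    for _ in range(num_resamples):
        resample = rng.choices(observed, k=n)  # n draws with replacement
        proportions.append(sum(resample) / n)
    return proportions


if __name__ == "__main__":
    props = bootstrap_proportions()
    print(f"mean of resampled proportions: {statistics.mean(props):.3f}")
    print(f"SD of resampled proportions:   {statistics.stdev(props):.3f}")
```

The resampled proportions should cluster around the observed 0.2, and their spread gives a simulation-based sense of how much a sample proportion from 100 people can vary.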
In contrast, formulae derived from probability theory provide both insight and convenience, and always lead to the same answer since they don't depend on a particular simulation. But the flip side is that this theory relies on assumptions, and we should be careful not to be deluded by the impressive algebra into accepting unjustified conclusions.

Suppose we draw samples of different sizes from a population containing exactly 20% left-handed and 80% right-handed people (so the true population probability of success, that is, of drawing a left-handed person, p, is equal to 0.2). Calculate the probability of observing different possible proportions of left-handers (the possible values of sample proportions of left-handers resulting from different random samples).

Note: Of course, this is the wrong way round - we want to use the known sample to learn about the unknown population - but we can only get to this conclusion by first exploring how a known population gives rise to different samples.

The simplest case is a sample of 1 (n = 1), when the observed proportion must be either 0 (right-hander) or 1 (left-hander), and these events occur with probability 0.8 and 0.2, respectively.

Use the applet below: set "The number of trials: n = 1" and set "the probability of success at each trial: p = 0.2". You will be able to see the resulting sampling distribution for the sample proportions of left-handed people in all possible random samples of size n=1. Do not confuse the resulting probability histogram with the sample distribution: each particular sample of n=1 people taken from the population will have either x=0 (the single person in your sample is right-handed) or x=1 (the single person in your sample is left-handed)!

Then take two individuals at random (n = 2).
Then the probability of 0 left-handers will be 0.64 ($0.8 \times 0.8$), the probability of 1 left-hander will be 0.32 ($2 \times 0.2 \times 0.8$), and the probability of 2 left-handers will be 0.04 ($0.2 \times 0.2$). (Hint: to display the corresponding probabilities, set the value of x, the count of left-handers, equal to 0, then 1, and then 2.)

Similarly, you can use probability theory to work out the probability distribution for the observed numbers of left-handers in the 5-, 10-, 50- and 100-person samples.

Consider a population of 1000 people, 200 of whom are left-handed and the rest are right-handed. If you select 100 people at random but with replacement, then, using $P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$, the probability that 20 are left-handed and 80 are right-handed is $P(X = 20) = \binom{100}{20}\, 0.2^{20} (1 - 0.2)^{100-20} \approx 0.0993$. (Go through the previous lesson activity to refresh the binomial distribution formula.) If you select without replacement, then the probability is $\binom{200}{20} \binom{800}{80} / \binom{1000}{100} \approx 0.1047$.

These distributions are already familiar to you. They are based on what is known as the binomial distribution, and can tell you the probability, for example, of getting at least 30% left-handed people if you sample 100.

The mean of the random variable is also known as its expectation, and in all these samples we expect a proportion of 0.2, or 20%: each distribution of the count of left-handers that you have simulated above has $np$ as its mean, which corresponds to a proportion of $np/n = p = 0.2$. The standard deviation of the count is equal to $\sqrt{np(1-p)}$ (equivalently, $\sqrt{p(1-p)/n}$ for the proportion) and is usually referred to as the standard error, to distinguish it from the standard deviation of the population distribution from which it derives.

You may notice that as the sample size n increases, the probability distributions tend to be more symmetric, regular, and normal in shape.
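The probabilities quoted above can be checked numerically. The sketch below is illustrative only; the helper names `binomial_pmf` and `hypergeometric_pmf` are assumptions, and the without-replacement case is computed with the hypergeometric formula, which is the standard model for sampling without replacement from a finite population.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent draws (with replacement), success prob p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeometric_pmf(x, population, successes, n):
    """P(X = x) when n people are drawn WITHOUT replacement from a
    population that contains the given number of successes."""
    return (comb(successes, x) * comb(population - successes, n - x)
            / comb(population, n))


if __name__ == "__main__":
    # Two people drawn at random, p = 0.2
    for x in range(3):
        print(f"n=2, x={x}: {binomial_pmf(x, 2, 0.2):.2f}")

    # 100 people from a population of 1000 with 200 left-handers
    print(f"with replacement:    {binomial_pmf(20, 100, 0.2):.4f}")
    print(f"without replacement: {hypergeometric_pmf(20, 1000, 200, 100):.4f}")

    # Probability of at least 30% left-handers in a sample of 100
    tail = sum(binomial_pmf(x, 100, 0.2) for x in range(30, 101))
    print(f"P(X >= 30) = {tail:.4f}")
```

The same `binomial_pmf` helper can be looped over x = 0..n for n = 5, 10, 50 and 100 to reproduce the distributions shown in the applet.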
Do not confuse the sampling distribution with the sample distribution: the sampling distribution for a proportion of left-handed people refers to the distribution of sample statistics (e.g. sample proportion), whereas the sample distribution is basically the distribution of a particular sample taken from the population. In particular, the sampling distribution of sample proportions refers to the distribution of sample proportions from all possible random samples of a particular size n.

Question 4
Correct. Mark 24.00 out of 24.00

Sampling Distributions of a Sample Proportion

In this part of the lesson activity, you will take a quick and informal look at the role that sampling distributions play in learning about population characteristics. The two examples here show how a sampling distribution provides important information in an estimation setting and in a hypothesis testing setting.

In an estimation situation, you need to understand sampling variability to assess how close an estimate is likely to be to the actual value of the corresponding population characteristic. Published research reports often include statements about a margin of error. This margin of error is based on an assessment of sampling variability as described by a sampling distribution. This is illustrated in the following example.

Will Cash Become a Thing of the Past?

The article “Most Americans Foresee Death of Cash in Their Lifetime” used data from a random sample of 1024 adults to estimate the proportion of all adults in the United States who think it is likely that within their lifetime, the United States will become a cashless society with all purchases being made by credit card, debit card, or some other form of electronic payment.
Of the 1024 people surveyed, 635 indicated that they thought this was likely, resulting in a sample proportion of $\hat{p} = 635/1024 \approx 0.62$. The population proportion who think that a cashless society is likely probably isn’t exactly 0.62. How accurate is this estimate likely to be?

To answer this question, you can use what you know about the sampling distribution of $\hat{p}$ for random samples of size n=1024. Using the general rules described in the video above, you know three things about the sampling distribution of $\hat{p}$:

Rule 1. The sampling distribution of the sample proportion $\hat{p}$ is centered at [the actual population proportion of all adults in the U.S. who think it is likely that within their lifetime, society will become cashless, p] (other options: I don't know, zero). Correct! Mark 1.00 out of 1.00
This means that the $\hat{p}$ values from random samples cluster around [the actual value of the population proportion, p, which is not known] (other options: I don't know, zero). Correct! Mark 1.00 out of 1.00
This is only true for [random samples] (other option: convenience samples). Correct! Mark 1.00 out of 1.00
The study above says that this sample of 1024 adults was a [random sample] (other option: convenience sample). Correct! Mark 1.00 out of 1.00

Rule 2. The standard deviation of the sample proportion $\hat{p}$ describes how tightly the $\hat{p}$ values from different possible random samples of size 1024 cluster around p, the actual population proportion of all adults in the U.S. who think it is likely that within their lifetime, society will become cashless. Because p is unknown, it is estimated by substituting $\hat{p}$: standard deviation($\hat{p}$) $= \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{0.62\,(1 - 0.62)/1024} \approx 0.015$.

Rule 3.
Because: $n\hat{p} = 1024 \times 0.62 = 634.88$ expected successes in the sample of 1024 adults, which is [more than] (other options: less than, equal to) 10 (OK, Mark 1.00 out of 1.00),
AND $n(1-\hat{p}) = 1024 \times (1 - 0.62) = 389.12$ expected failures in the sample of 1024 adults, which is [more than] (other options: less than, equal to) 10 (OK, Mark 1.00 out of 1.00),
THEN the sampling distribution of $\hat{p}$ [may be] (other option: may not be) approximated by a normal distribution. OK. Mark 1.00 out of 1.00

By using this information and what you know about normal distributions, you can now get a sense of the accuracy of the estimate $\hat{p} = 0.62$. For any variable described by a normal distribution, about 95% of the values are within 2 standard deviations of the center. (Hint: you may use the applet at the bottom of the page to refresh your memory.)

Since the sampling distribution of $\hat{p}$ is approximately [normal] (other options: abnormal, paranormal) (Correct! Mark 1.00 out of 1.00) and is centered at [the actual population proportion of all adults in the U.S. who think it is likely that within their lifetime, society will become cashless, p] (other option: zero) (Correct! Mark 1.00 out of 1.00), you now know that about 95% of all possible random samples will produce a sample proportion $\hat{p}$ that is within 2 × standard deviation($\hat{p}$) = 2 × 0.01516 = 0.03032 of the actual value of the population proportion p. So, a margin of error = 2 × standard deviation($\hat{p}$) = 0.03032 can be reported.

This tells you that the sample estimate $\hat{p} = 0.62$ is likely to be within 0.03032 of the actual proportion of U.S. adults who think that the United States will become a cashless society in their lifetime. Alternatively, you could say that plausible values for the actual proportion of U.S.
adults, p, who think so are those between 0.62 - margin of error ≈ 0.59 and 0.62 + margin of error ≈ 0.65.

This example shows that the sampling distribution of the sample proportion is the key to assessing the accuracy of the estimate of the true population proportion, p.

Use the applet by dragging the values of the lower and upper boundaries of the corresponding intervals along the lines:
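The three rules and the margin-of-error arithmetic from the cashless-society example can be collected into one small calculation. This is a sketch under stated assumptions: the function name `proportion_summary` and its dictionary keys are hypothetical, the check uses the common "at least 10 expected successes and failures" convention, and the multiplier of 2 is the rounded 95% rule used in the text.

```python
from math import sqrt


def proportion_summary(p_hat, n, multiplier=2.0):
    """Check the normal-approximation condition and compute the estimated
    standard deviation of p_hat, the margin of error, and the interval."""
    expected_successes = n * p_hat
    expected_failures = n * (1 - p_hat)
    # Assumed convention: both expected counts should be at least 10
    normal_ok = expected_successes >= 10 and expected_failures >= 10
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = multiplier * standard_error
    return {
        "expected_successes": expected_successes,
        "expected_failures": expected_failures,
        "normal_ok": normal_ok,
        "standard_error": standard_error,
        "margin": margin,
        "lower": p_hat - margin,
        "upper": p_hat + margin,
    }


if __name__ == "__main__":
    # Rounded sample proportion from the example: 635/1024 is about 0.62
    summary = proportion_summary(0.62, 1024)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Computed without intermediate rounding, the standard deviation and margin of error may differ from the values quoted in the activity in the last displayed digit; the interval still rounds to 0.59 and 0.65.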