Topics for Today
Sampling Methods
Sample vs Population
Law of Large Numbers

Stat203 Fall 2011 – Week 5 Lecture 1

Importance of Sampling Method
When we have a data set, it is critical to know how the ___________ comprising the data were ________.
Can we measure the entire population?
- every _______ on earth
- every ______ with a ____________
If not, we’re forced to take a sample.
Did every member of the population have the same chance of being included?
Were individuals excluded?
Why do we care?

Sampling and Population
The results from analyzing our ______ can only ______ conclusions on the __________ they came from.
Can we analyze the results of a _______ of people in this class then make ___________ about “______”?

Sampling Methods
The ________ method used in any particular problem is determined by the researcher in the initial _____________ stage.
There are several types of sampling methods (note that these are different from the data collection methods discussed earlier).
They are broadly classified as
- __________
- ______

Non-Random Sampling Methods: Accidental or Convenience Sampling
A particular sample or subset of a population is used because it is convenient for the researcher.
Example: I may wish to obtain ___________ about young adults living in the greater _________ area and I choose __________ as my sample (because this is very __________ for me).
What are some of the reasons why this class is a ____ sample?
- Some people in this class may __________ to the population (young).
- The _____ of men to women in this class may be quite _________ than in the general population.
- People in this class are _______________ than in the general population.
etc.

Non-Random Sampling Methods: Volunteer Sampling
People _________ for your sample; most commonly this is now ______, but also street-corner and _____ interviews.
Example: A TV news broadcast or program asks its viewers to _____ in or go to their _______ and register a (Yes/No) opinion on a certain question.
- Are the viewers of this news broadcast ______________ of the adult public in the area?
- Even if they were, are those who ‘phone in’ or ‘e-mail in’ ____________________________?

Non-Random = Non-Representative
Non-random samples are generally ______ in some way.
Whether known or unknown, an individual with particular ______________ is ___________ to be ________.
When some individuals are more likely to be included, or some are specifically excluded, the sample is automatically not ______________ of the population at large.
Does this mean non-random samples are useless?

Random Sampling
Each member of the population has an _____________ of being selected for the sample.
(Note: In each of the non-random methods discussed above, this characteristic of a good sample is missing.)

Random Sampling Methods: Simple Random Sampling
How can we produce a SRS?
Suppose in a large class with 427 students we wish to select a simple random sample of 5 students. How do we do this?
i) Give each student a different ______ starting at 1 and going to 427.
ii) Use Table B (page 517) to choose the 5 people in the following way.
- Choose a _________________ in the random digit table.
- Disregard _________ selections.
- _________ numbers selected _______ the range of 1 to 427.

Random Sampling Methods: Systematic sampling
This method produces a random sample from an ________ list.
Example: A sample of size 10 is required from a list of size ____.
i) Divide the _______________ by the ___________ to get 10 groups of size 155 with 7 left over. (1557/10 = 155, remainder 7)
ii) Randomly choose one number between 1 and ___ (say 17 … or a number from a random number table).
iii) Keep adding ___ to the number chosen in (ii) until one number is chosen from ____ of the large groups.
The sample chosen consists of those items on the list in positions __, 172, ___, 482, … , 1412.
Let’s try this in class: it’s very easy.
Systematic sampling is often used in __________ __________ where every k-th item is taken out of the _______________.

Random Sampling Methods: Stratified sampling
Partition the population into ___________ groups (______) and take a SRS from each one of them.
The members of each stratum are similar in some way.
Example: __________ = all SFU students.
Each stratum could consist of one of the SFU _________ (Arts, Science, Health, Business etc.)
Take a ____________________ from each _______.

Random Sampling Methods: Cluster sampling
Cluster sampling is also called __________ sampling.
Partition the population into _____________ groups (clusters) and take a SRS of clusters. From each selected cluster choose a ___.
Example: A city is subdivided into districts, each of which is similar in _______________ to the whole city. A ___ of these districts is chosen and a ___ is chosen from each of these districts.
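
The random sampling methods above can be sketched in a few lines of Python. This is only an illustration, not the procedure from Table B: it swaps the random digit table for Python's `random` module, the function names are hypothetical, and the within-cluster step assumes an SRS is drawn inside each chosen cluster. The systematic example reuses the numbers from the slide (a list of 1557 items, a sample of 10, starting point 17).

```python
import random

def simple_random_sample(population_size, n, rng):
    """Draw n distinct labels from 1..population_size, each equally likely."""
    return sorted(rng.sample(range(1, population_size + 1), n))

def systematic_sample(list_size, n, start):
    """Take every k-th position, with k = list_size // n and 1 <= start <= k."""
    k = list_size // n
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    return [start + i * k for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of n_per_stratum members from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Take an SRS of clusters, then (assumption) an SRS within each chosen one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(203)  # fixed seed so the sketch is repeatable

# SRS of 5 students from a class labelled 1..427
srs = simple_random_sample(427, 5, rng)
print(srs)

# Systematic sample: 1557 // 10 = 155, start at 17, keep adding 155
positions = systematic_sample(1557, 10, start=17)
print(positions)
```

Because `rng.sample` never returns the same label twice and only draws from the stated range, the "disregard" steps used with a printed random digit table are not needed here.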

Let’s let someone else do the explaining as well and see which explanation works better for you:
http://youtu.be/xh4zxC1OpiA
and
http://youtu.be/wUwH7Slfg9E

Samples and Populations
We’ve discussed a number of different things that we _______ … _____, standard deviations, range, ___, median … etc.
Whether measured on an entire __________ or just a ______, these measures all have the same ______________.
But … We often use different symbols to denote whether they are measured on a sample or the entire population.

Parameters vs Statistics
A _________ is something measured on a __________.
A _________ is the same thing, but measured on the ______.
Parameters and statistics have exactly the same _______, ______________ and are __________ the same way.

Consider the following ___________ and the parameter which could be calculated for each:
- Median Age of _________________ in Vancouver
- Average wage of _______ of the International Machinists Union
- The standard deviation of the weight of widgets _____________________ by the Acme Widget Company
- Maximum temperature in July for major _______________

… but for larger populations, we could take only a ______.
- Median Age of all _____________ in Vancouver
- Average wage of ______________
- The standard deviation of the weight of _____________________ by the Acme Widget Company
Clearly these cannot be measured directly, so we take a ______ and use the sample to ‘guess’ at the true parameter.
This is _____________________ … and we all perform inference every day.
_____________________?

Question:
For a school project, a friend of yours randomly samples ____________________ and asks them if they’ve ever _____________.
2 people had been arrested and s/he calculates that __________ of people approximately her age have been arrested.
The next day you read a newspaper article about a nation-wide study of ______________________ university students your age concluding that _____ of people approximately your age have been arrested.
Which do you think is closer to the ____ percentage for the population? Why?

The Law of Large Numbers
You already know this law.
This says that as the size of a sample _________, the __________________ gets closer to the true mean of the population.
http://www.stat.tamu.edu/~west/ph/sampledist.html

Today’s Topics
Sampling Methods
- Non-Random (convenience, volunteer)
- Random
  o SRS
  o Stratified
  o Cluster
  o Systematic
Sample vs Population
- Usually have one or the other
- Statistic vs Parameter
  o Calculated the same
  o Mean the same
Law of Large Numbers
- As sample size increases, the sample mean approaches the population mean

Reading for next lecture
Chapter 6
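
To see the Law of Large Numbers from this lecture in action, here is a minimal simulation sketch in Python. The population of ages is made up purely for illustration, and the helper name `mean_abs_error` is hypothetical. For each sample size it draws many simple random samples and reports how far, on average, the sample mean (the statistic) lands from the population mean (the parameter).

```python
import random

rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 ages, invented for illustration only.
population = [rng.randint(18, 90) for _ in range(100_000)]
true_mean = sum(population) / len(population)  # the parameter

def mean_abs_error(n, reps=100):
    """Average distance between the sample mean and the population mean
    over `reps` simple random samples of size n."""
    total = 0.0
    for _ in range(reps):
        sample = rng.sample(population, n)
        sample_mean = sum(sample) / n  # the statistic
        total += abs(sample_mean - true_mean)
    return total / reps

for n in (10, 100, 1_000):
    print(n, round(mean_abs_error(n), 3))
```

The printed error should shrink as the sample size grows, which is the same reasoning that favours the large nation-wide study over the small school-project sample in the question above.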