Statistics 101 L – Laboratory 8

Drawing samples from populations requires some method of randomization. In this week’s lab activities you will use different methods to randomly select a sample from a population. In the process you will also learn something about the variation introduced by random sampling (called sampling variation).

Activity 1:

Your group will need the sheet with 100 rectangles on it for this activity. Do not study the sheet of rectangles. The rectangles represent a generic population of values. For example, the areas of the rectangles could represent the number of hours per week a student spends on Stat 101 outside of class and lab. In this activity we will investigate what can be learned about the areas of the rectangles using different sampling methods.

a) Each individual in your lab group should look at the sheet of rectangles for a few seconds and guess the average area of the rectangles on the entire sheet. The unit of measure is the background rectangle. Thus rectangle 7 has area 4 x 3 = 12. Compare your guess with those of the other members of your group. Discuss why you got different guesses. Record the individual guesses on the group answer sheet.

b) Each individual should now select five rectangles that, in her/his judgment, are representative of the population of rectangles on the entire sheet. Write down the area of each of the five rectangles and compute the sample mean area. How does this compare with your guess? How does it compare with the sample means of the judgment samples chosen by the others in your group? Again, discuss why you got different means. Record the judgment sample means on your group answer sheet.

c) We want to take a random sample from the 100 rectangles. We could cut out the rectangles, laminate them, put them in a bag, mix thoroughly and draw from the bag. Do you think this would give you a random sample? Explain briefly.

d) Devise a plan to produce a random sample from the population of rectangles using JMP.
e) Using the sampling plan you developed in d), take a simple random sample of 5 rectangles. Record the rectangle numbers and the corresponding areas on the group answer sheet. What is the sample mean area for your sample? How does it compare with your guesses and your judgment sample means?

f) If you were to repeat part e), would you come up with the same sample mean area? Explain briefly.

g) Which do you think will be a better guess for the population mean area of the 100 rectangles: the mean of a random sample of 5 rectangles or the mean of a random sample of 20 rectangles? Explain briefly.

Activity 2:

The population that we will sample from in this activity is the U.S. Senate (2002 edition). There are 100 members of this population. Some of the characteristics of the Senators that may be of interest are:

name = Senator’s last name
gender = M for male, F for female
party = D for Democratic, R for Republican, I for Independent
state = two-letter state abbreviation
years = years in Senate

One way to randomly sample items from a population is to put all of the items into a bag, mix them up and draw from the bag without replacement. We can’t throw all the Senators into a bag and mix them up, so we will use an applet on the web that simulates random selection, drawing without replacement, from the population of 100 Senators. Go to the course web site under computing and select the Sampling Senators link.

a) Choose a random sample of 10 U.S. Senators. Record each Senator’s name, gender, party and years in the Senate on the group answer sheet. Also record the proportion of females, the proportion of Republicans and the sample mean years in the Senate.

b) Obtain two more samples of 10 U.S. Senators. Again record the proportion of females, the proportion of Republicans and the sample mean years in the Senate.

c) Did the three samples produce the same sample statistics, i.e. sample proportions and sample means? Why?
d) In the 2002 Senate there were 87 males and 13 females; 50 Democrats, 49 Republicans and one Independent; and the mean years in the Senate was 12.38 years. Which of your samples comes closest to matching the true values in the population?

e) Because most of the time we take only one sample, what can be said about a single sample’s ability to match the population?

Activity 3:

In this activity we will look at how you can use JMP to select a simple random sample and also look at the idea of stratified sampling. The population we will look at consists of 1742 students at ISU who have taken Stat 101 and filled out the survey at the beginning of the semester in which they took Stat 101. The variable we will look at is the age of the student. The data are saved in a JMP file called SexAge_Stat101.JMP.

a) Open the JMP file. You can select a simple random sample from this file by going to Tables – Subset. In the dialog box click on Random – sample size: and fill in 100 for the size of the sample. Use Analyze – Distribution to answer the following questions. Turn in the JMP output.
i. What proportion of your sample is Female? Male?
ii. What is the average age for your sample?
iii. What is the median age for your sample?
iv. Describe the distribution of age for your sample.

b) The population can be stratified by gender: male and female. This creates two subpopulations (strata) of 950 females and 792 males. These two strata are saved in JMP files called Male_Age_Stat101.JMP and Female_Age_Stat101.JMP. Using these strata, obtain a stratified random sample of 50 males and 50 females. Use Analyze – Distribution to answer the following questions. Turn in the JMP output.
i. What are the average and median ages for your sample of males?
ii. What are the average and median ages for your sample of females?
iii. Combine your sample average age for males with your sample average age for females to come up with an average age for students in the entire population. Hint: You need to take into account that there are more females in the population than males.
iv. Compare your answer in b) iii. to that in a) ii.

Stat 101 L: Laboratory 8 – Answer Sheet

Names: _________________________ _________________________ _________________________ _________________________

Activity 1:

Name          Guessed Value          Judgment Mean

c) Do you think drawing laminated rectangles from a bag will give you a random sample? Explain briefly.

d) Plan for using JMP to select a random sample.

e) Random sample

Rectangle #          Area

Sample mean area = _______________________________

f) Would another random sample have the same sample mean area? Explain briefly.

g) Which is better, a random sample mean based on 5 or a random sample mean based on 20? Explain briefly.

Activity 2:

a) Sample of 10 U.S. Senators

Name          Gender          Party          Years

Summaries: p̂_F =          p̂_R =          ȳ =

b)
               Gender          Party          Years
Sample 2 Summaries: p̂_F =          p̂_R =          ȳ =
Sample 3 Summaries: p̂_F =          p̂_R =          ȳ =

c) Same sample statistics for all samples?

d) Which sample comes closest to the population values?

e) What can be said about a single sample’s ability to match the population?

Activity 3:

a) Simple Random Sample
i. What proportion of your sample is Female? Male?
ii. What is the average age for your sample?
iii. What is the median age for your sample?
iv. Describe the distribution of age for your sample.

b) Stratified Random Sample
i. What are the average and median ages for your sample of males?
ii. What are the average and median ages for your sample of females?
iii. Combine your sample average age for males with your sample average age for females to come up with an average age for students in the entire population. Hint: You need to take into account that there are more females in the population than males.
iv. Compare your answer in b) iii. to that in a) ii.
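
Supplementary sketch: The lab itself uses JMP, but the two sampling ideas in Activities 1 and 3 can also be illustrated in a few lines of Python. The sketch below is only an illustration, not the lab's required method. The function names simple_random_sample and stratified_mean are made up for this example, and the two sample means passed in at the bottom are hypothetical placeholders, not results from the lab data. The stratum sizes (950 females, 792 males) come from the Activity 3 description.

```python
import random

# Stratum sizes as stated in Activity 3 (950 females, 792 males).
N_FEMALE = 950
N_MALE = 792


def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement so every item has the same chance."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


def stratified_mean(mean_female, mean_male, n_female=N_FEMALE, n_male=N_MALE):
    """Combine stratum sample means, weighting each by its stratum's size."""
    total = n_female + n_male
    return (n_female * mean_female + n_male * mean_male) / total


if __name__ == "__main__":
    # Rectangles are numbered 1 to 100; pick 5 of them at random.
    rectangles = list(range(1, 101))
    print(simple_random_sample(rectangles, 5))

    # Hypothetical stratum sample means, for illustration only.
    print(stratified_mean(mean_female=19.0, mean_male=20.0))
```

Because the strata are sampled equally (50 and 50) but are not equally large in the population, a plain average of the two sample means would give males too much weight. Weighting by stratum size corrects for that.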