Statistics 3502/6304
Spring 2015
Prof. Suess

Name: __________________ (print: first last)
Class #: __________________

Project 1

Instructions: Simulation of the Central Limit Theorem.

In this project you will see the Central Limit Theorem (CLT) in action. The CLT says that the sampling distribution of the sample mean 𝑦̅ will be approximately normal, with a mean equal to the population mean µ and a standard deviation of σ/√n, also called the standard error.

To demonstrate the CLT you can use Minitab. To simulate values of a random variable on a computer, we use what is called a pseudo-random number generator. This just means that the algorithm in the software produces values that behave very much like truly random numbers. Pseudo-random number generators are commonly used in simulations to experiment with randomness.

Here are the steps to simulate the CLT using Minitab, drawing samples of size n = 1, 2, 10, and 30 from a Normal population with mean µ = 54 and standard deviation σ = 3. We will generate B = 1000 samples of size n from the population.

First, we will simulate a large number of random values from the Normal population to check that the pseudo-random number generator produces what we expect. If we take samples of size n = 1, we should see the same mean and standard deviation as the population.

1. From the pull-down menu, choose Calc > Random Data > Normal.
   a. Number of rows of data to generate: enter 1000
   b. Store in column(s): enter C1
   c. Mean: change 0.0 to 54
   d. Standard deviation: change 1.0 to 3
   e. Click OK.
2. Now plot the data using Graph > Histogram.
   a. Select With Fit.
   b. Click OK.
   c. Graph variables: select C1
   d. Click OK. (Give the graph as part of your report.)
   e. Compute the Descriptive Statistics and record the Mean and StDev in the Table below.
3. Comment on how closely the Fit matches the histogram.
4. Repeat for B = 10,000. (Give the graph as part of your report.)
5. Repeat for B = 1,000,000, if your computer can.
   (Give the graph as part of your report.) This step is optional if the previous simulation took a long time.
6. As B increases, does the Fit become better?

   B            Mean        StDev
   1000         ________    ________
   10,000       ________    ________
   1,000,000    ________    ________

Now we will simulate the CLT assuming a normal population. Start with n = 2 and repeat two more times, with n = 10 and n = 30.

1. Simulate the samples of size n in the rows of the spreadsheet. From the pull-down menu, choose Calc > Random Data > Normal.
   a. Number of rows of data to generate: enter 1000
   b. Store in column(s): enter C1-Cn (n will be 2, 10, or 30)
   c. Mean: change 0.0 to 54
   d. Standard deviation: change 1.0 to 3
   e. Click OK.
2. Compute the B sample means in the next column.
   a. Calc > Row Statistics…
   b. Select Mean.
   c. Input variables: enter all of the columns simulated.
   d. Store result in: enter the next column. If you simulated samples of size 2, enter C3. (For n = 10 enter C11; for n = 30 enter C31.)
   e. Name the column "means".
3. Now plot the means using Graph > Histogram.
   a. Select With Fit.
   b. Click OK.
   c. Graph variables: select C3, the means column. (For n = 10 select C11; for n = 30 select C31.)
   d. Click OK. (Give the graph as part of your report.)
   e. Compute the Descriptive Statistics and record the Mean and StDev in the Table below.
4. Comment on how closely the Fit matches the histogram.
5. Repeat for B = 1000. (Give the graph as part of your report.)
6. Repeat for B = 10,000. (Give the graph as part of your report.)
7. Repeat for B = 1,000,000, if your computer can.
8. As B increases, does the Fit become better?

   B            Mean        StDev
   1000         ________    ________
   10,000       ________    ________
   1,000,000    ________    ________

Does the CLT work better as the number of simulations B increases?

(Stat. 6304 students only) Redo the simulation experiment, simulating from the Binomial distribution with Number of trials: 100 and Event probability: 0.2.

To see your computer working on the simulation, open the Windows Task Manager. Right-click on the taskbar at the bottom of the screen and select Windows Task Manager.
Start Task Manager before you run your simulations. If your computer has more than one core, you should see 100% utilization of all cores on your computer.
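
For anyone who wants to cross-check the Minitab results, the normal-population simulation described in these instructions can be sketched in Python. This is only an illustrative sketch and not part of the assignment: the function names simulate_sample_means and summarize and the fixed seed are assumptions introduced here, while µ = 54, σ = 3, the sample sizes n, and B come from the instructions. It uses only the Python standard library.

```python
import math
import random
import statistics

MU = 54.0     # population mean from the instructions
SIGMA = 3.0   # population standard deviation from the instructions


def simulate_sample_means(n, b, seed=0):
    """Return b sample means, each computed from a Normal(MU, SIGMA) sample of size n.

    The fixed default seed is an assumption made so that runs are repeatable.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(b):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        means.append(sum(sample) / n)
    return means


def summarize(n, b, seed=0):
    """Return (mean, stdev) of the b simulated sample means, as recorded in the table."""
    means = simulate_sample_means(n, b, seed)
    return statistics.mean(means), statistics.stdev(means)


if __name__ == "__main__":
    b = 1000  # number of simulated samples (B in the instructions)
    for n in (1, 2, 10, 30):
        mean, sd = summarize(n, b)
        se = SIGMA / math.sqrt(n)  # standard error predicted by the CLT
        print(f"n={n:2d}  B={b}  mean={mean:.3f}  stdev={sd:.3f}  sigma/sqrt(n)={se:.3f}")
```

The printed StDev for each n should land near σ/√n, and the Mean near 54. Changing b to 10,000 or 1,000,000 mirrors the "Repeat for B = ..." steps, although the largest value may run slowly in pure Python.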