Lab2: Probability Sampling and Discrete Distributions
Michael Akritas

Outline
- Probability Sampling
  - Simulating the Binomial Random Variable
- Binomial Approximation to Hypergeometric Probabilities

Sampling From a PMF

To obtain a sample of size 100 from the sample space {1, 2, 3, 4, 5} with respective probabilities 0.1, 0.2, 0.4, 0.2 and 0.1, do:
1. Enter the numbers 1-5 in C1. Enter the corresponding probabilities in C2.
2. Calc > Random Data > Discrete > Enter "100" into Generate; enter "C3" into Store in column(s); enter "C1" into Values in; enter "C2" into Probabilities in > OK

To view the empirical probabilities (or sample proportions) use:
Stat > Basic Statistics > Display Descriptive Statistics > Enter C3 in Variables and also in By variables (optional). Click on Statistics and select "N total", "Percent", "Cumulative percent". OK, OK
Copy and paste the output into a Word document.

Bar Graph and Dot Histogram of a PMF

Columns C1 and C2 of the previous slide contain the pmf

x      1    2    3    4    5
p(x)   0.1  0.2  0.4  0.2  0.1

- To do a bar graph of it use: Graph > Bar Chart > Select "A function of a variable", OK > Enter "C2" in Graph variables, enter "C1" in Categorical variable, OK
- Copy the bar graph by right-clicking on its margin and paste it into a Word document.
- An alternative visual graph of a pmf is the dot histogram: Graph > Scatterplot > Select "Simple" > OK > Enter "C2" (the probabilities column) for Y and "C1" (the sample space column) for X > OK
  Copy the dot graph and paste it into a Word document.
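
For readers without Minitab, the sampling and tabulation steps from the Sampling From a PMF slide can be sketched in Python. This is only an illustration using the standard library; the function names `sample_from_pmf` and `empirical_table` and the seed value are assumptions made for this sketch, not part of the lab.

```python
import random
from collections import Counter


def sample_from_pmf(values, probs, size, seed=None):
    """Draw `size` independent values, each value chosen with its probability."""
    rng = random.Random(seed)  # seed is optional; fixed here only for repeatability
    return rng.choices(values, weights=probs, k=size)


def empirical_table(sample):
    """Return {value: (count, percent, cumulative percent)} in increasing order."""
    counts = Counter(sample)
    n = len(sample)
    table = {}
    running = 0
    for value in sorted(counts):
        running += counts[value]
        table[value] = (counts[value], 100 * counts[value] / n, 100 * running / n)
    return table


values = [1, 2, 3, 4, 5]              # plays the role of column C1
probs = [0.1, 0.2, 0.4, 0.2, 0.1]     # plays the role of column C2
sample = sample_from_pmf(values, probs, size=100, seed=1)  # plays the role of C3

for value, (count, pct, cum_pct) in empirical_table(sample).items():
    print(f"{value}: count={count:3d}  percent={pct:5.1f}  cumulative={cum_pct:5.1f}")
```

The percent column plays the role of the empirical probabilities; with a sample of size 100 they should be close to, but not exactly equal to, the values in the pmf.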

Bar Graph of the Empirical Probabilities

We will use the raw data given in column C3 to do a bar graph of the empirical probabilities (or sample proportions):
- Graph > Bar Chart > Select "Counts of unique values", OK > Enter "C3" in Categorical variables, click on Chart options and select Show Y as Percent, OK, OK
  Copy and paste the graph into the Word document.
- An alternative way to do the bar graph of the empirical probabilities is to do a histogram: Graph > Histogram > Choose "Simple", OK > Enter "C3" in Graph variables, click on Scale and on Y-Scale Type and select Percent > OK, OK
  Copy and paste the graph into the Word document.

Probability Sampling from the Binomial PMF

If each of 50 people tosses a coin 10 times, and each records the number of heads, how variable do you expect the 50 numbers to be? We can answer this question by simulation.
- Generate 50 observations from the Bin(n=10, p=0.5) distribution: Calc > Random Data > Binomial > Enter "50" into Generate; enter "C1" into Store in column(s); enter "10" into Number of trials; enter "0.5" into Probability of success > OK
- View the sample of 50 observations numerically and graphically as described in the context of probability sampling.

Comparison of Empirical and True Probabilities

Since empirical probabilities imitate the true ones, the previous question (how variable do we expect the 50 numbers to be) can also be answered by a bar graph of the Bin(10, 0.5) pmf.
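
The coin-tossing simulation and its comparison with the true pmf can also be sketched in Python. This is a minimal illustration with the standard library only; the helper names `simulate_binomial` and `binomial_pmf` and the seed are assumptions made for this sketch, and each binomial observation is generated by literally counting heads in 10 simulated tosses.

```python
import random
from collections import Counter
from math import comb


def simulate_binomial(n_trials, p, size, seed=None):
    """Return `size` counts of successes, each from `n_trials` independent trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n_trials)) for _ in range(size)]


def binomial_pmf(k, n_trials, p):
    """P(X = k) for X ~ Bin(n_trials, p)."""
    return comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)


# 50 people, 10 tosses each, fair coin
heads = simulate_binomial(n_trials=10, p=0.5, size=50, seed=2)
counts = Counter(heads)

print(" k  empirical  true")
for k in range(11):
    print(f"{k:2d}  {counts[k] / 50:9.3f}  {binomial_pmf(k, 10, 0.5):.3f}")
```

Rerunning with different seeds gives a feel for how much the empirical column moves around the true column when only 50 observations are available.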
- Obtain the pmf of X ∼ Bin(10, 0.5). Enter the numbers 0-10 in rows 1-11 of column C2 and then: Calc > Probability Distributions > Binomial > click on "Probability", enter "10" in Number of trials, enter "0.5" in Probability of success, enter "C2" in Input column, enter "C3" in Optional storage > OK.
- Do a probability dot histogram as described in the context of probability sampling.

Binomial Approximation to Hypergeometric Probabilities

1. Use commands as before to generate the pmf of Binomial(10, 0.3) (input column C2, storage column C4).
2. Use commands similar to the above to generate the pmf of Hypergeometric(n=10, N=100, M=30) (input column C2, storage column C5).
3. Use commands similar to the above to generate the pmf of Hypergeometric(n=10, N=1,000, M=300) (input column C2, storage column C6).
4. Use commands similar to the above to generate the pmf of Hypergeometric(n=10, N=10,000, M=3,000) (input column C2, storage column C7).

Copy and paste the above pmfs into a Word document. Comment on the rule n ≤ 0.05 × N for satisfactory approximation, as well as on the quality of the approximation as N increases. Comment on whether the Bin(10, 0.3) pmf provides a good approximation to the Hypergeometric(n=10, N=10,000, M=4,000) pmf.
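
Steps 1-4 above can be cross-checked with a short Python sketch. It assumes the parametrization suggested by the slide (n is the sample size, N the population size, M the number of successes in the population); the function names and the choice of the largest absolute difference as a summary of approximation quality are assumptions made for this illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, n, N, M):
    """P(X = k) successes when n items are drawn without replacement
    from a population of N items, M of which are successes."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


n = 10
support = range(n + 1)  # the values 0-10, as in column C2
binom = [binomial_pmf(k, n, 0.3) for k in support]

max_abs_diff = {}
for N, M in [(100, 30), (1_000, 300), (10_000, 3_000)]:
    hyper = [hypergeometric_pmf(k, n, N, M) for k in support]
    # Largest pointwise gap between the two pmfs over k = 0, ..., 10
    max_abs_diff[N] = max(abs(b - h) for b, h in zip(binom, hyper))
    print(f"N={N:>6}, n/N={n / N:.3f}, max |binomial - hypergeometric| = {max_abs_diff[N]:.5f}")
```

Printing n/N next to the gap makes it easy to relate each case to the n ≤ 0.05 × N rule when writing the comments the lab asks for.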