Week 8 – Key Terms and Concepts

Probability Models

We discussed random variables last week and the way in which we can assign outcomes of a given variable specific probabilities. We’ll focus this week on three unique probability distributions that can be used in certain circumstances to help us answer questions. But first, we need to cover Bernoulli trials, a key component of many of these probability distributions.

Bernoulli Trials

A Bernoulli trial is a trial of a random event that has all of the following properties:

1. The trial has two possible outcomes, and only two! One outcome is called a success and the other is called a failure. We assign these names arbitrarily based on the question at hand.
2. The probability of success and of failure do not change, and we know them in advance. The probability of success is labeled with the letter p, and the probability of failure is labeled with the letter q (which is equal to 1 – p).
3. Each individual trial is independent of any other trial. Getting a success or failure on one trial has no effect on the result of the next trial. In many cases, the trials are technically not independent, as one trial will cause the population size to decrease and thus affect the following probability. However, if this happens, we can proceed as if the trials are independent as long as the sample size is less than 10% of the population.

If all three conditions here are met, we have a Bernoulli trial!

Geometric Model

The Geometric model is used whenever we are trying to answer the question “how many trials will it take to obtain our first success?” We can only use this model if the trials are Bernoulli trials, so be careful to check those conditions before proceeding. The only thing we need to know for this model is the probability of success, p. Everything else comes from this.
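To make the Geometric question concrete, here is a minimal simulation sketch in Python. It repeats Bernoulli trials until the first success and reports the average number of trials over many runs. The function name, the seed, and the value p = 0.2 are assumptions chosen for illustration, not part of the course material.

```python
import random

def trials_until_first_success(p, rng):
    """Run Bernoulli trials with success probability p until the first success.

    Returns the number of trials used, counting the successful one.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    trials = 1
    # rng.random() is uniform on [0, 1), so it falls below p with probability p.
    while rng.random() >= p:
        trials += 1
    return trials

rng = random.Random(8)  # fixed seed so the run is repeatable (arbitrary choice)
p = 0.2                 # hypothetical probability of success
counts = [trials_until_first_success(p, rng) for _ in range(10_000)]
print("average trials to first success:", sum(counts) / len(counts))
```

Each call draws independent trials with the same p, so the simulated trials satisfy the Bernoulli conditions by construction.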
The probability of succeeding on the xth trial is simply the probability of failing x – 1 times (with a probability of q each time) multiplied by the probability of succeeding on the last trial, p.

For a random variable X, where X is the number of trials needed to get our first success, we can model the distribution using a Geometric model with parameter p:

$X \sim \mathrm{Geom}(p)$
$P(X = x) = q^{x-1} p$
$E(X) = 1/p$
$SD(X) = \sqrt{q / p^2}$

Poisson Distribution

In some cases, we already know the average number of times some event occurs within a given time period, instead of having to find it. In the cases where these events are not very likely, we can model the number of occurrences using a Poisson distribution.

For a random variable Y, where Y represents the number of times some random event occurs within a specified time window, we can find the probability that Y takes a particular value y with a Poisson distribution, as long as we know the average number of events that occur in that window (λ).

$Y \sim \mathrm{Poi}(\lambda)$
$P(Y = y) = \frac{\lambda^y}{e^{\lambda} \, y!}$
$E(Y) = \lambda$

Always be sure to check that the value of λ you have is in the correct unit of time.

To find Poisson values in Google Sheets or Excel, use the following formula:

=POISSON(x, mean, cumulative)

x: The number of events you are trying to find the probability for
mean: λ, the average number of events in your time period
cumulative: If “FALSE”, will give only the probability of that exact x value; if “TRUE”, will give the total probability of all values from 0 up to and including the x entered here

Binomial Distribution

There are many times in which we only have a set number of trials to work from, and want to know how many of a certain type of event (success) will occur within that set. In this case, both the sample size and the probability of success are needed in order to calculate probabilities. These must be Bernoulli trials!
For a random variable Z, where Z represents the number of successes (each with probability p) in a sample of size n, we can find the probability that Z takes a particular value z with a Binomial distribution.

$Z \sim \mathrm{Binom}(n, p)$
$P(Z = z) = \binom{n}{z} p^z q^{n-z}$, where $\binom{n}{z} = \frac{n!}{z! \, (n-z)!}$
$E(Z) = np$
$SD(Z) = \sqrt{npq}$

To find Binomial probabilities in Google Sheets or Excel, use the following formula:

=BINOMDIST(num_successes, num_trials, prob_success, cumulative)

num_successes: The number of successes you are trying to calculate the probability for
num_trials: The sample size n
prob_success: The probability of success p
cumulative: If “FALSE”, will give only the probability of exactly the number of successes you input; if “TRUE”, will give the total probability of all numbers of successes from 0 up to and including the one entered here
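For readers who want to check the spreadsheet results by hand, the Poisson and Binomial formulas from these notes can be sketched in Python. The helpers below are hypothetical: their names and arguments only mirror the spreadsheet formulas for readability, they are not how Google Sheets or Excel compute the values, and the example inputs are made up.

```python
from math import comb, exp, factorial

def poisson(x, mean, cumulative):
    """P(Y = x) for Y ~ Poi(mean), or P(Y <= x) when cumulative is True."""
    def pmf(y):
        # lambda^y / (e^lambda * y!)
        return mean ** y / (exp(mean) * factorial(y))
    if cumulative:
        return sum(pmf(y) for y in range(x + 1))
    return pmf(x)

def binomdist(num_successes, num_trials, prob_success, cumulative):
    """P(Z = z) for Z ~ Binom(n, p), or P(Z <= z) when cumulative is True."""
    q = 1 - prob_success
    def pmf(z):
        # (n choose z) * p^z * q^(n - z)
        return comb(num_trials, z) * prob_success ** z * q ** (num_trials - z)
    if cumulative:
        return sum(pmf(z) for z in range(num_successes + 1))
    return pmf(num_successes)

# Hypothetical inputs, chosen only for illustration.
print(poisson(2, 3, False))          # exactly 2 events when the average is 3
print(binomdist(3, 10, 0.5, False))  # exactly 3 successes in 10 trials
print(binomdist(3, 10, 0.5, True))   # 3 or fewer successes in 10 trials
```

The cumulative branch simply adds up the exact probabilities from 0 through the requested value, which is what the “TRUE” setting of the cumulative argument represents.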