Chapter 5: Random Variables and Discrete Probability Distributions
http://www.landers.co.uk/statistics-cartoons/

5.1-5.2: Random Variables - Goals
• Be able to define what a random variable is.
• Be able to differentiate between discrete and continuous random variables.
• Describe the probability distribution of a discrete random variable.
• Use the distribution and properties of a discrete random variable to calculate the probability of an event.

Random Variables
A random variable is a function that assigns a unique numerical value to each outcome in a sample space. The rule for a random variable may be given by a formula, a table, or words. Random variables can be either discrete or continuous.

Probability Distribution of a Random Variable
• The probability distribution of a random variable gives all of its possible values and the probability of each of them.
• The probability mass function (pmf) gives the probability that a discrete random variable is equal to some specific value. In symbols, p(x) = P(X = x).

Outcome      x1  x2  …
Probability  p1  p2  …

Examples: Probability Histograms
[Figure: two probability histograms labeled #1; x-axis: Outcomes (1 to 4 and 0 to 5), y-axis: Probability (0 to 0.6)]
[Figure: two probability histograms labeled #2; x-axis: Outcomes (1 to 4 and 0 to 5), y-axis: Probability (0 to 0.6)]

Properties of a Valid Probability Distribution
1. $0 \le p_i \le 1$ for every $i$
2. $\sum_i p_i = 1$

Example: Discrete Random Variable
In a standard deck of cards, we want to know the probability of drawing a certain number of spades when we draw 3 cards with replacement. Let X be the number of spades that we draw.
a) What is the distribution?
b) Is this a valid distribution?
c) What is the probability that you draw at least 1 spade?
d) What is the probability that you draw at least 2 spades?

Example: Discrete (cont.)
[Figure: probability histogram titled "Spades Example"; x-axis: Number of Spades (0 to 3), y-axis: Probability (0 to 0.6)]

Example: Discrete Random Variable
In a standard deck of cards, we want to know the probability of drawing a certain number of spades when we draw 3 cards. Let X be the number of spades that we draw.
a) What is the distribution?
b) Is this a valid distribution?
c) What is the probability that you draw at least 1 spade?
d) What is the probability that you draw at least 2 spades?

5.3: Mean, Variance, and Standard Deviation for a Discrete Variable - Goals
• Be able to use a probability distribution to find the mean of a discrete random variable.
• Calculate means using the rules for means (not in the book).
• Be able to use a probability distribution to find the variance and standard deviation of a discrete random variable.
• Calculate variances (standard deviations) using the rules for variances for both correlated and uncorrelated random variables (not in the book).

Formula for the Mean of a Random Variable
$E(X) = \mu = \mu_X = \sum_i x_i p_i$

Example: Expected Value
What is the expected value of the following:
a) A fair 4-sided die

X            1     2     3     4
Probability  0.25  0.25  0.25  0.25

Rules for Means
Rule 1: If X is a random variable and a and b are fixed numbers, then: $\mu_{a+bX} = a + b\mu_X$
Rule 2: If X and Y are random variables, then: $\mu_{X \pm Y} = \mu_X \pm \mu_Y$
Rule 3: If X is a random variable and g is a function of X, then: $E[g(X)] = \sum_i g(x_i) p_i$

Example: Expected Value
An individual who has automobile insurance from a certain company is randomly selected. Let X be the number of moving violations for which the individual was cited during the last 3 years. The distribution of X is

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05

a) Verify that E(X) = 0.60.
b) If the cost of insurance depends on the following function of accidents, g(x) = 400 + (100x - 15), what is the expected value of the cost of the insurance?

Example: Expected Value
Five individuals who have automobile insurance from a certain company are randomly selected.
Let X and Y be two different accident profiles in this insurance company:

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05

Y     0     1     2     3
p(y)  0.40  0.35  0.15  0.10

E(X) = 0.60, E(Y) = 0.95
c) What is the expected value of the total number of accidents for the five people if 2 of them have the distribution of X and 3 have the distribution of Y?

Example: Expected Value
An individual who has automobile insurance from a certain company is randomly selected. Let X be the number of moving violations for which the individual was cited during the last 3 years. The distribution of X is

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05

d) Calculate $E(X^2)$.

Variance of a Random Variable
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
$\mathrm{Var}(X) = \sigma^2 = \sigma_X^2$
$\mathrm{Var}(X) = E[(X - \mu_X)^2] = \sum_i (x_i - \mu_X)^2 p_i = E(X^2) - (E(X))^2$
$\sigma_X = \sqrt{\mathrm{Var}(X)}$

Example: Variance
An individual who has automobile insurance from a certain company is randomly selected. Let X be the number of moving violations for which the individual was cited during the last 3 years. The distribution of X is

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05

e) Calculate Var(X).

Rules for Variance
Rule 1: If X is a random variable and a and b are fixed numbers, then: $\sigma^2_{a+bX} = b^2 \sigma^2_X$
Rule 2: If X and Y are independent random variables, then: $\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y$
Rule 3: If X and Y have correlation ρ, then: $\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y \pm 2\rho\sigma_X\sigma_Y$

Example: Variance
An individual who has automobile insurance from a certain company is randomly selected. Let X be the number of moving violations for which the individual was cited during the last 3 years. The distribution of X is

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05

a) Calculate the variance of this distribution.
b) If the cost of insurance depends on the following function of accidents, g(x) = 400 + (100x - 15), what is the standard deviation of the cost of the insurance?

Example: Variance
Five individuals who have automobile insurance from a certain company are randomly selected.
Let X and Y be two different independent accident profiles in this insurance company:

X     0     1     2     3
p(x)  0.60  0.25  0.10  0.05
Var(X) = 0.74

Y     0     1     2     3
p(y)  0.40  0.35  0.15  0.10
Var(Y) = 0.95

What is the standard deviation of 2X - 3Y?

5.4/5.5: Binomial and Poisson Distributions - Goals
• Determine when the random variable X can be modeled using the binomial or Poisson distributions.
• Calculate the probability, mean, and standard deviation when X has a binomial or Poisson distribution.

Properties of a Binomial Experiment
BInS
• Binary: There are only two possible outcomes for each trial.
• Independent: The outcomes of the trials are independent.
• n: The experiment consists of n identical trials, where n is fixed.
• Success: For each trial, the probability p of success must be the same.

Binomial Setting: Example
Do the following use the binomial setting?
1. Rolling a fair 4-sided die five times and observing whether the number showing is a 1 or not.
2. In a drug trial, 20 patients with the same condition are given a drug and some are given a placebo to see if the drug is effective or not.
3. In quality control we want to see if a particular product is 'not acceptable'. We take 20 random samples from an assembly line that uses different machines to produce the product.

Binomial Distribution
The binomial random variable maps each outcome in a binomial experiment to a real number, and is defined to be the number of successes in n trials.
• X ~ B(n, p)

Examples of Binomial Distribution
1. In a clinical trial, a patient's condition may improve or not. We study the number of patients who improved.
2. Was a sales transaction considered pleasant? The binomial distribution describes the number of pleasant transactions.
3. In quality control we assess the number of defective items in a lot of goods.

Binomial Probabilities
Suppose X is a binomial random variable with n trials and probability of success p. Then
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$
$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$

Example: Binomial Distribution
Suppose 20% of all copies of a particular textbook fail a certain binding strength test. Let's check a batch of 15 such textbooks.
a) Is this a binomial distribution?
b) What is the chance that there are no defective textbooks?
c) What is the chance that we get fewer than 3 defective textbooks?
d) What is the chance that we get more than 2 defective textbooks?

Example: Binomial Distribution (cont)
[Figure: histogram of the binomial distribution with n = 15, p = 0.2; x-axis: # of defective textbooks (0 to 15), y-axis: Proportion]

Histograms of Binomial Distributions
[Figure: three histograms of P(X = x) against Number of successes (0 to 10) for n = 10 with p = 0.25, p = 0.5, and p = 0.75]

Cumulative Probabilities (CDF)
The cumulative probability function is defined as the following probability: P(X ≤ x).

Binomial Distribution: Mean and Standard Deviation
If X ~ B(n, p), then
$E(X) = \mu_X = np$
$\sigma_X = \sqrt{np(1 - p)}$

Example: Binomial Distribution (cont)
Suppose 20% of all copies of a particular textbook fail a certain binding strength test. Let's check a batch of 15 such textbooks.
e) What are the mean and standard deviation of the number of textbooks that will fail the binding test?

Poisson Random Variable
• The Poisson random variable is a count of the number of times a specific event occurs during a given interval.
• Examples:
– The number of people who enter the Union from noon to 1 pm.
– The number of α-particles emitted from Uranium-238 in 1 minute.
– The number of DNA fragments found from a sequencing experiment.
– The number of dead trees in a square mile of forest.

Poisson Experiment
1. The probability that a particular event will occur in a given interval (of time, length, volume, etc.)
is the same for all units of equal size and is proportional to the size of the unit.
2. The number of events that occur in any interval is independent of the number that occur in any other non-overlapping interval.
3. The probability that more than one event occurs in a unit of measure is negligible for very small-sized units.

Poisson Distribution
X ~ Poisson(λ)
$p(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots, \quad \lambda > 0$
$\mu_X = \sigma_X^2 = \lambda, \quad \sigma_X = \sqrt{\lambda}$

Example: Poisson Distribution
An IT consultant receives an average of 3 calls per hour. Let X be the number of calls the consultant receives. Assume X follows a Poisson distribution.
a) What is the chance that the consultant receives exactly one call during the next hour?
b) What is the chance that the consultant receives more than one call during the next hour?
c) What is the chance that the consultant receives exactly 5 calls during the next two hours?
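
The IT consultant example can be checked numerically. The sketch below uses only the Python standard library; the helper name poisson_pmf and the variable names are illustrative choices, not from the slides. Part (c) assumes the rate for a two-hour interval is 2 × 3 = 6, which relies on the Poisson experiment property that the rate is proportional to the size of the interval.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam). Hypothetical helper, not from the slides."""
    return math.exp(-lam) * lam**x / math.factorial(x)


lam_hour = 3.0  # average number of calls per hour (given in the example)

# (a) exactly one call during the next hour
p_a = poisson_pmf(1, lam_hour)

# (b) more than one call: complement of "0 calls or 1 call"
p_b = 1.0 - poisson_pmf(0, lam_hour) - poisson_pmf(1, lam_hour)

# (c) exactly 5 calls in two hours; assumption: the rate scales to 2 * 3 = 6
lam_two_hours = 2 * lam_hour
p_c = poisson_pmf(5, lam_two_hours)

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```

Part (b) uses the complement rule rather than summing the infinite tail, since P(X > 1) = 1 - P(X = 0) - P(X = 1).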