MATH 2560 C F03 Elementary Statistics I

LECTURE 17: Random Variables.

Outline.
⇒ definition of random variable;
⇒ discrete random variables;
⇒ continuous random variables;
⇒ normal distributions as probability distributions.

1 Definition of Random Variable

Random Variable. A random variable is a variable whose value is a numerical outcome of a random phenomenon.

Example. Let X be the number of heads when we toss a coin four times. X = 2 if the outcome is HTTH; X = 1 if the outcome is TTTH. Thus, the possible values of X are 0, 1, 2, 3, 4. We call X a random variable.

2 Discrete Random Variables

Discrete Random Variable. A discrete random variable X has a finite number of possible values. The probability distribution of X lists the values and their probabilities:

Value of X     x1   x2   x3   x4   ...   xk
Probability    p1   p2   p3   p4   ...   pk

The probabilities pi must satisfy two requirements:
1. 0 ≤ pi ≤ 1;
2. p1 + p2 + ... + pk = 1.

How to find the probability of any event: add the probabilities pi of the particular values xi that make up the event.

Example 4.15. Student grades in Accounting 210. Data: 18% A, 32% B, 34% C, 9% D, 7% F. A four-point scale (with A = 4) is used. The grade on the four-point scale is a random variable X.

Value of X     0      1      2      3      4
Probability    0.07   0.09   0.34   0.32   0.18

Consider the event (student got a B or better) = (grade is 3 or 4). Its probability is:
P(X = 3 or X = 4) = P(X = 3) + P(X = 4) = 0.32 + 0.18 = 0.5.

Histograms are used to show probability distributions as well as distributions of data. Figure 4.5 displays probability histograms that compare the probability model for random digits (Example 4.10, Lecture 16) with the model given by Benford's Law (Example 4.9, Lecture 16).

Example 4.16. Probability distribution of X, the number of heads in four tosses of a coin. Two assumptions:
1) the coin is balanced: each toss is equally likely to give H or T;
2) the coin has no memory: tosses are independent.
Figure 4.6 lists the 16 possible outcomes.
From Figure 4.6 we obtain, for example: P(HTTH) = (0.5)^4 = 1/16 (the multiplication rule). The number of heads X has possible values 0, 1, 2, 3, 4. These values are not equally likely.
(X = 0) = (TTTT), and P(X = 0) = 1/16 = 0.0625.
(X = 2) can occur in six different ways (see Figure 4.6), and P(X = 2) = 6/16 = 0.375.
The result is:

Value of X     0        1      2       3      4
Probability    0.0625   0.25   0.375   0.25   0.0625

We can find the probability of any event involving the number of heads. For example:
1) P(tossing at least two heads) = P(X ≥ 2) = 0.375 + 0.25 + 0.0625 = 0.6875.
2) P(tossing at least one head) = P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.0625 = 0.9375.

Figure 4.7 is a probability histogram for this distribution. The shape of the histogram is symmetric.

3 Continuous Random Variable

Continuous Random Variable. A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event (see Figure 4.10).

Example. Random number generator. A generator (for example, =RAND() in Excel) returns a number between 0 and 1 as the outcome. Sample space: S = {all numbers x such that 0 ≤ x ≤ 1}.
⇒ The problem: how do we find the probability of an event such as (0.3 ≤ x ≤ 0.7)?
Solution: areas under a density curve.

Example 4.17. Uniform Distribution (Figure 4.9). The density curve has height 1 over the interval from 0 to 1. The area under the density curve is 1, and the probability of any event is the area under the density curve and above the event in question. For example, P(0.3 ≤ X ≤ 0.7) = 0.4, the area of a rectangle of width 0.4 and height 1. We call X in Example 4.17 a continuous random variable.

Properties of a continuous random variable:
1) probabilities are assigned to intervals of outcomes rather than to individual outcomes;
2) all continuous probability distributions assign probability 0 to every individual outcome (see Figure 4.9: P(X = 0.8) = 0). Thus, the two events (X > 0.8) and (X ≥ 0.8) have the same probability.
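
The probabilities in Example 4.16 and Example 4.17 can be checked numerically. The following Python sketch is not part of the original lecture; the function names heads_distribution and uniform_prob are hypothetical and chosen only for illustration. It assumes, as in Example 4.16, a balanced coin with independent tosses, so that all 16 outcomes are equally likely, and, as in Example 4.17, a density of height 1 on the interval from 0 to 1.

```python
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses=4):
    """Distribution of the number of heads in n_tosses tosses.

    Assumes a balanced coin and independent tosses, so every H/T
    sequence has the same probability 1 / 2**n_tosses.
    """
    outcomes = list(product("HT", repeat=n_tosses))  # 16 outcomes for n = 4
    dist = {k: Fraction(0) for k in range(n_tosses + 1)}
    for outcome in outcomes:
        dist[outcome.count("H")] += Fraction(1, len(outcomes))
    return dist


def uniform_prob(a, b):
    """P(a <= X <= b) for X uniform on [0, 1].

    The density has height 1, so the probability is the width of the
    part of [a, b] that lies inside [0, 1].
    """
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)


dist = heads_distribution()
for value, prob in dist.items():
    print(value, float(prob))

# Events built from the discrete distribution: add the probabilities.
p_at_least_two = sum(p for k, p in dist.items() if k >= 2)
p_at_least_one = 1 - dist[0]
print(float(p_at_least_two), float(p_at_least_one))

# Continuous case: an interval has positive probability, a single point has 0.
print(uniform_prob(0.3, 0.7), uniform_prob(0.8, 0.8))
```

Exact fractions are used for the coin so that the table values 1/16, 4/16, 6/16 can be compared without rounding; the uniform probability is a floating-point number, so it may differ from 0.4 in the last decimal places.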

4 Normal Distributions as Probability Distributions

⇒ A normal curve is a density curve (see Lectures 3-4).
⇒ Normal distributions N(µ, σ) are probability distributions: any density curve describes an assignment of probabilities.
⇒ If X has the N(µ, σ) distribution, then the standardized variable z = (X − µ)/σ is a standard normal variable having the distribution N(0, 1).

Example 4.18. Opinion poll: an SRS of 1500 American adults; p = 0.3 is the population parameter (30% say "drugs" is the most serious problem facing our schools). The statistic p̂ used to estimate p is a random variable that has approximately the N(0.3, 0.0118) distribution. It is an unbiased estimator of p. What is the probability that the poll result differs from the truth about the population by more than two percentage points?

1st way to solve the problem. By the addition rule:
P(p̂ < 0.28 or p̂ > 0.32) = P(p̂ < 0.28) + P(p̂ > 0.32).
Using Table B (after standardizing):
P(p̂ < 0.28) = P((p̂ − 0.3)/0.0118 < (0.28 − 0.3)/0.0118) = P(z < −1.69) = 0.0455.
P(p̂ > 0.32) = P((p̂ − 0.3)/0.0118 > (0.32 − 0.3)/0.0118) = P(z > 1.69) = 0.0455.
Hence,
P(p̂ < 0.28 or p̂ > 0.32) = P(p̂ < 0.28) + P(p̂ > 0.32) = 2 × 0.0455 = 0.0910.

2nd way to solve the problem. Let us first find the probability of the complement event:
P(0.28 ≤ p̂ ≤ 0.32) = P((0.28 − 0.3)/0.0118 ≤ z ≤ (0.32 − 0.3)/0.0118) = P(−1.69 ≤ z ≤ 1.69) = 0.9545 − 0.0455 = 0.9090.
Then, by the complement rule:
P(p̂ < 0.28 or p̂ > 0.32) = 1 − P(0.28 ≤ p̂ ≤ 0.32) = 1 − 0.9090 = 0.0910.

Figure 4.11 shows the probability as an area under a normal density curve (see Example 4.18).

5 Summary

1. A random variable is a variable taking numerical values determined by the outcome of a random phenomenon. The probability distribution of a random variable X tells us what the possible values of X are and how probabilities are assigned to those values.
2. A random variable X and its distribution can be discrete or continuous.
3. A discrete random variable has finitely many possible values.
The probability distribution assigns each of these values a probability between 0 and 1 such that the sum of all the probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the values that make up the event.
4. A continuous random variable takes all values in some interval of numbers. A density curve describes the probability distribution of a continuous random variable. The probability of any event is the area under the curve above the values that make up the event.
5. Normal distributions are one type of continuous probability distribution.
6. We can picture a probability distribution by drawing a probability histogram in the discrete case or by graphing the density curve in the continuous case.
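
As a numerical check of Example 4.18, the two solution routes (addition rule and complement rule) can be sketched in Python. This is an illustration, not part of the original lecture: the helper name normal_cdf is hypothetical, and it evaluates the normal cumulative probability with the standard library function math.erf instead of looking it up in a table. Because the code does not round z to 1.69, its answer is close to, but not exactly, the table-based value 0.0910.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X with the N(mu, sigma) distribution.

    Standardizes to z = (x - mu) / sigma and evaluates the standard
    normal cumulative probability through the error function.
    """
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 0.3, 0.0118  # distribution of the sample proportion in Example 4.18

# 1st way: addition rule for the two disjoint tails.
lower_tail = normal_cdf(0.28, mu, sigma)
upper_tail = 1.0 - normal_cdf(0.32, mu, sigma)
two_tails = lower_tail + upper_tail

# 2nd way: complement of the central interval.
central = normal_cdf(0.32, mu, sigma) - normal_cdf(0.28, mu, sigma)
via_complement = 1.0 - central

print(round(two_tails, 4), round(via_complement, 4))

# Table-style lookup with z rounded to two decimals, as in the lecture.
print(round(normal_cdf(-1.69), 4))
```

Both routes give the same probability, which illustrates that the addition rule and the complement rule are two descriptions of the same area under the density curve.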