Statistical Inference
Mr. Rizos

Linear Combinations of Random Variables

A random variable is a function that assigns a number to each outcome in the sample space ε. A random variable whose set of values is countable, i.e., can be listed as a sequence of numbers, is called a discrete random variable.

For any discrete probability function p(x), the following conditions must hold:

• Each value of p(x) is in the interval [0, 1]. That is, \( 0 \le p(x) \le 1 \) for all x.
• The sum of all the values of p(x) must be 1. That is, \( \sum_{x} p(x) = 1 \).

Consider a random variable X. We can scale this random variable by a parameter a and shift its location by another parameter b to obtain a new random variable Y = aX + b. This allows us to determine probabilities associated with Y by using the original probability distribution of X.

Expected Value of a Discrete Random Variable

The expected value of a discrete random variable, X, is denoted by E(X). It is also referred to as the mean of X, which is denoted by the symbol µ (Greek ‘m’).

\[ \mu = E(X) = \sum_{n} x_n \, p(x_n) \]

Expected values have the following linearity property: E(aX + b) = a E(X) + b.

Variance of a Discrete Random Variable

Variance is a measure of a distribution’s spread with respect to the mean. A small variance implies that the possible values of a distribution are close to the mean. The variance of a random variable, X, is given by Var(X), where:

\[ \operatorname{Var}(X) = E\left[(X - \mu)^2\right] = \sum_{n} (x_n - \mu)^2 \, p(x_n) \]

The following properties are useful for calculation purposes:

\[ \operatorname{Var}(X) = E(X^2) - \mu^2 \]
\[ \operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X) \]

Standard Deviation of a Discrete Random Variable

The standard deviation, σ, of a random variable, X, is the positive square root of the variance. That is,

\[ \sigma = \operatorname{SD}(X) = \sqrt{\operatorname{Var}(X)} \]

A useful property of the standard deviation is that it is expressed in the same units as the data values, and hence can be used to determine margins of error and confidence intervals in experiments.

Jim Rizos

Examples

1.
The probability distribution of X, the number of purchases that a person makes at a particular online store per month, is given in the following table.

Number of online orders, x    0     1     2     3     4
Pr(X = x)                     0.2   0.4   a     0.1   0.1

The online store has a $5 per month membership fee as well as a flat rate $12 express post delivery fee per online purchase.

(a) Determine the value of a.
(b) Express the person’s monthly spending, S, at this online store as a linear function of X.
(c) Use your answer from part (b) to complete the table below.

Monthly online spending, s
Pr(S = s)

(d) Determine the probability that the person spends more than $25 per month at this online store.
(e) Determine E(S) and Var(S).

2. If E(X) = 4 and Var(X) = 3, find:

(a) E(2X + 1)
(b) Var(2X + 1)
(c) SD(2X + 1)

Continuous Random Variables

If X is a continuous random variable with probability density function f(x), then:

• \( f(x) \ge 0 \) for any real number x.
• \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \)
• \( \Pr(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx \)
• Pr(X = x) = 0 for any real number x.

Expected Value of a Continuous Random Variable

\[ \text{Mean} = \mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \]

Variance and Standard Deviation of a Continuous Random Variable

\[ \operatorname{Var}(X) = E\left[(X - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 \]
\[ \sigma = \operatorname{SD}(X) = \sqrt{\operatorname{Var}(X)} \]

Examples

3. The random variable X has the following probability density function:

\[ f(x) = \begin{cases} a\left(x^2 - 2x - 3\right) & \text{if } 0 \le x \le 3 \\ 0 & \text{elsewhere} \end{cases} \]

Find the following statistics:

(a) E(X)
(b) Var(X)
(c) SD(X)
(d) Pr(X ≤ 2)

Consider another random variable, Y = 3X + 1.

(e) Determine Pr(Y ≥ 7).

A Linear Combination of Random Variables

For independent random variables X and Y and constants a and b:

• \( E(aX + bY) = aE(X) + bE(Y) \)
• \( \operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) \)

Examples

4. To get to university, a student takes two different buses. The time travelled on the first bus, X minutes, is a continuous random variable with a mean of 25 minutes and a standard deviation of 5 minutes.
The time travelled on the second bus, Y minutes, is a continuous random variable with a mean of 15 minutes and a standard deviation of 3 minutes.

Find the mean and standard deviation of the total time taken for the student to get to university, if the times taken for each part of the journey are independent.
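
The two rules for a linear combination of independent random variables can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the function name `linear_combination_stats` and its parameter names are illustrative assumptions. It applies E(aX + bY) = aE(X) + bE(Y) and Var(aX + bY) = a²Var(X) + b²Var(Y) to the two-bus question with a = b = 1, taking the independence stated in the question as given.

```python
import math


def linear_combination_stats(a, mean_x, sd_x, b, mean_y, sd_y):
    """Return (mean, sd) of aX + bY, assuming X and Y are independent.

    Hypothetical helper for illustration only.
    """
    # Expected values combine linearly.
    mean = a * mean_x + b * mean_y
    # Variances (not standard deviations) add, each scaled by the squared constant.
    variance = a**2 * sd_x**2 + b**2 * sd_y**2
    return mean, math.sqrt(variance)


# Total travel time T = X + Y, so a = b = 1.
mean_total, sd_total = linear_combination_stats(1, 25, 5, 1, 15, 3)
print(mean_total, round(sd_total, 2))  # prints: 40 5.83
```

Note that the standard deviations themselves are not added: 5 + 3 = 8 would overstate the spread. The variances 25 and 9 are added first, and the square root of their sum, 34, is taken at the end.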