Lecture 2: Review of statistics – one random variable
BUEC 333
Professor David Jacks

Random variables
Many of the things we will be interested in this class are random variables (RVs); that is, variables whose outcome is subject to chance. We do not know what value a RV will take until we observe it.
Examples: the outcome of a coin toss; the value of the S&P/TSX in the future; your starting salary after graduation.
The actual value taken by a RV is an outcome. We will use capital letters to denote a RV and lower case letters to denote particular outcomes.
Example: roll two dice and call the sum of the values rolled X; a particular outcome might be x = 7.

Discrete versus continuous RVs
RVs are said to be discrete if they can take on only a finite (more generally, countable) set of values.
Examples:
1.) Outcome of a coin toss: {Heads, Tails}
2.) Outcome of rolling a die: {1, 2, 3, 4, 5, 6}
3.) Handedness of the next person you meet: {Left, Right}
RVs are said to be continuous if they can take on a continuum (that is, an uncountably infinite set) of values.
Examples:
1.) Value of the S&P/TSX one year from today: any positive number is possible.
2.) Your starting salary after graduation: any positive(?) number is possible.

Probability
Associated with every possible outcome of a RV is a probability, which tells us how likely a particular outcome is. Pr(X = x) denotes the probability that the random variable X takes the value x.
Think of Pr(X = x) as the proportion of times that x occurs in the "long run" (in many repeated trials).

Properties of probability
1.) Probabilities of individual outcomes lie between 0 and 1.
a.) If Pr(X = x) = 0, then the outcome X = x never occurs.
b.) If Pr(X = x) = 1, then the outcome X = x always occurs.
2.) The sum of the probabilities of all possible individual outcomes always equals 1.

Probability distributions
Every RV has a probability distribution. A probability distribution describes the set of all possible outcomes of a RV and the probabilities associated with each possible outcome. This is summarized by a probability distribution function (or pdf).
Example 1: tossing a fair coin
Pr(X = Heads) = Pr(X = Tails) = 1/2
Example 2: rolling a die
Pr(X = 1) = Pr(X = 2) = … = Pr(X = 6) = 1/6
Example 3: number of times a laptop crashes before the midterm
Pr(X = 0) = 0.80, Pr(X = 1) = 0.10, Pr(X = 2) = 0.06, Pr(X = 3) = 0.03, Pr(X = 4) = 0.01

Cumulative probability distribution
An alternate way to describe a probability distribution is the cumulative distribution function (or cdf). It gives the probability that a RV takes a value less than or equal to a given value, Pr(X ≤ x).
Example: number of times a laptop crashes before a midterm; the cdf measures the probability that the laptop crashes at most a given number of times, e.g., Pr(X ≤ 1) = 0.80 + 0.10 = 0.90.
[Figure: cumulative probability distribution of the number of laptop crashes, x = 0 to 4]

The case of continuous RVs
Because continuous RVs can take an infinite number of values, the pdf and cdf cannot enumerate the probabilities of each of them. Instead we describe the pdf and cdf using functions.
Usual notation for the pdf is f(x). Usual notation for the cdf is F(x) = Pr(X ≤ x).
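To make the pdf/cdf relationship concrete, here is a short Python sketch; it is illustrative rather than part of the lecture. The crash probabilities are the ones given above, while the normal distribution's mean of 50 and standard deviation of 10 are assumed values chosen only to mimic the test-score figure that follows.

```python
# Illustrative sketch: the cdf F(x) = Pr(X <= x) of a discrete RV is a running sum
# of its pdf, while for a continuous RV it is an increasing function from 0 to 1.
# The crash probabilities come from the slides; the normal parameters (mean 50,
# sd 10) are assumptions made only for this example.
import math

# Discrete case: number of laptop crashes before the midterm.
pdf = {0: 0.80, 1: 0.10, 2: 0.06, 3: 0.03, 4: 0.01}
assert abs(sum(pdf.values()) - 1.0) < 1e-12   # probabilities sum to one

cdf, total = {}, 0.0
for x in sorted(pdf):
    total += pdf[x]          # F(x) accumulates the probabilities up to and including x
    cdf[x] = total
print("Pr(X <= 1) =", round(cdf[1], 2))       # 0.90: at most one crash

# Continuous case: a normal cdf, F(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2)))).
def normal_cdf(x, mu=50.0, sigma=10.0):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print("Pr(score <= 50) =", normal_cdf(50.0))          # 0.5 at the mean
print("Pr(score <= 75) =", round(normal_cdf(75.0), 4))
```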
Graphing pdfs and cdfs of continuous RVs
Because pdfs and cdfs of continuous RVs can also be complicated functions, we will just use pictures to represent them. We plot outcomes, x, on the horizontal axis and probabilities, f(x) or F(x), on the vertical axis. The cdf is an increasing function that ranges from zero to one.
Remember: the area under the pdf gives the probability that X lies in a particular interval. Therefore, the total area under the pdf must be equal to one; e.g., normally distributed test scores.
[Figure: pdf and cdf of normally distributed test scores on a 0–100 scale]

Describing RVs
The pdf and cdf effectively tell us "everything" we might want to know about a RV. But sometimes we only want to describe particular features of a probability distribution.
One feature of interest in this course is the expected value or mean of a RV. Another is the variance, a useful measure of the dispersion of a RV.

Expected values
Think of the expected value of a RV as its long run average over many repeated trials. More intuitively, it can be thought of as the "middle" of a probability distribution or a "good guess" of the value of a RV. More precisely, it is a probability-weighted average of all possible outcomes of X.
Example: laptop crashes before a midterm
f(0) = 0.80, f(1) = 0.10, f(2) = 0.06, f(3) = 0.03, f(4) = 0.01
E(X) = 0 * (0.80) + 1 * (0.10) + 2 * (0.06) + 3 * (0.03) + 4 * (0.01) = 0.35

Even more exciting facts about E(X)
The general case for a discrete RV when X can take k values x_1, x_2, …, x_k with associated probabilities p_1, p_2, …, p_k:
E(X) = \sum_{i=1}^{k} p_i x_i
Reminder about sigma notation: sigma represents the sum of the terms that follow it, here the k products p_i x_i for i = 1, …, k.
We can think of E(X) as a mathematical operation just like (+, -, *, or /). Conveniently, it is also a linear operator, which means we can pass it through addition and subtraction operators. That is, if a and b are constants and X is a RV, then
E(a + bX) = a + E(bX) = a + bE(X)
Example: with a = 5, b = 10, and X the number of laptop crashes, E(5 + 10X) = 5 + 10 * (0.35) = 8.5.

Variance
Variance measures dispersion: how "spread out" a probability distribution is. A large (small) variance means a RV is likely to take a wide (narrow) range of values.
Formally, if X takes one of k possible values x_1, x_2, …, x_k with associated probabilities p_1, p_2, …, p_k:
\sigma_X^2 = Var(X) = E[(X - \mu_X)^2] = \sum_{i=1}^{k} p_i (x_i - \mu_X)^2, where \mu_X = E(X).

Variance and standard deviation
Because Var(X) is measured in the square of the scale of X, we often prefer the standard deviation:
\sigma_X = \sqrt{Var(X)}
which is measured on the same scale as X.
Example: variance of laptop crashes
Var(X) = (0 – 0.35)^2 * (0.80) + (1 – 0.35)^2 * (0.10) + (2 – 0.35)^2 * (0.06) + (3 – 0.35)^2 * (0.03) + (4 – 0.35)^2 * (0.01) = 0.6475
so that \sigma_X = \sqrt{0.6475} ≈ 0.80.

An important aside and a preview of things to come
In some sense, a RV's probability distribution, expected value, and variance are abstract concepts. More precisely, they are population parameters, characteristics of the whole set of possible observations of a RV.
As econometricians, our goal is to estimate these parameters (with varying degrees of precision). We do that by computing statistics from a sample of data drawn from the population. The usefulness of econometrics comes from learning about population parameters using only a sample of data.
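To tie the mean, variance, and estimation preview together, the following Python sketch (illustrative, not from the slides) computes the population parameters of the laptop-crash distribution and then compares them with a statistic computed from a simulated sample; the sample size of 10,000 and the seed are arbitrary choices.

```python
# Illustrative sketch: population parameters of the laptop-crash distribution,
# the linearity rule E(a + bX) = a + b*E(X), and a preview of estimation by
# comparing the population mean with a sample mean from simulated draws.
# (The sample size and seed below are arbitrary, not from the lecture.)
import random

outcomes = [0, 1, 2, 3, 4]
probs = [0.80, 0.10, 0.06, 0.03, 0.01]

# Population parameters: E(X) = sum_i p_i x_i and Var(X) = sum_i p_i (x_i - mu)^2.
mu = sum(p * x for p, x in zip(probs, outcomes))                 # 0.35
var = sum(p * (x - mu) ** 2 for p, x in zip(probs, outcomes))    # 0.6475
sigma = var ** 0.5                                               # about 0.80
print(f"E(X) = {mu:.4f}, Var(X) = {var:.4f}, sd(X) = {sigma:.4f}")

# Linearity of expectation with a = 5 and b = 10.
print(f"E(5 + 10X) = {5 + 10 * mu:.2f}")

# Estimation preview: a statistic (the sample mean) computed from a sample
# drawn from the population, compared with the population parameter E(X).
random.seed(333)
n = 10_000
sample = random.choices(outcomes, weights=probs, k=n)
sample_mean = sum(sample) / n
print(f"Sample mean from {n} draws: {sample_mean:.4f} (population mean: {mu:.2f})")
```

With 10,000 draws the sample mean typically lands close to 0.35, which is the sense in which statistics computed from a sample estimate population parameters.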