ECON 300 Econometrics Review
Dennis C. Plott
University of Illinois at Chicago, Department of Economics
www.dennisplott.com
https://uic.blackboard.com
Fall 2014

Contents

1 Analytical Tools
  1.1 Subscripts
  1.2 Growth Rates
  1.3 Question
  1.4 Rule of 70 (72)
  1.5 Slope (of a Linear Line)
  1.6 Tangent Lines
  1.7 Percentage Points vs. Percent Change
2 Fundamentals of Probability
  2.1 Random Variables and Their Probability Distributions
    2.1.1 Discrete Random Variables
    2.1.2 Continuous Random Variables
  2.2 Joint Distributions, Conditional Distributions, and Independence
    2.2.1 Joint Distributions and Independence
    2.2.2 Conditional Distributions
  2.3 Features of Probability Distributions
    2.3.1 A Measure of Central Tendency: The Expected Value
    2.3.2 Properties of Expected Values
    2.3.3 Another Measure of Central Tendency: The Median
    2.3.4 Measures of Variability: Variance and Standard Deviation
    2.3.5 Standardizing a Random Variable
  2.4 Features of Joint and Conditional Distributions
    2.4.1 Measures of Association: Covariance and Correlation
    2.4.2 Variance of Sums of Random Variables
  2.5 The Normal and Related Distributions
    2.5.1 The Normal Distribution
    2.5.2 The Normal Distribution with Stata
    2.5.3 Normal probability calculation
    2.5.4 The Standard Normal Distribution
    2.5.5 Additional Properties of the Normal Distribution
    2.5.6 The t Distribution
    2.5.7 Student's t-distribution with Stata
    2.5.8 t-distribution Probabilities
    2.5.9 Graphing Tail Probabilities
  2.6 Probability Tables
3 Fundamentals of Mathematical Statistics
  3.1 Populations, Parameters, and Random Sampling
    3.1.1 Sampling
  3.2 Finite Sample Properties of Estimators
    3.2.1 Estimators and Estimates
    3.2.2 Unbiasedness
    3.2.3 The Sampling Variance of Estimators
    3.2.4 Efficiency
  3.3 Interval Estimation and Confidence Intervals
4 Using Stata to Do Monte Carlo Experiments
  4.1 Generating Uniformly Distributed Random Numbers
    4.1.1 Functions
    4.1.2 Graph the Uniformly Distributed Random Variable
    4.1.3 Generating a Discrete Random Variable: Example Simulating the Rolls of a Die
    4.1.4 Graph the Uniformly Distributed Random Variable
  4.2 The Law of Large Numbers and the Frequentist Notion of Probability
  4.3 Tossing Coins with Stata
5 Principles of Economics: A Review
  5.1 Non Sequitur: Random Famous People Who Majored in Economics
  5.2 Microeconomics versus Macroeconomics
  5.3 Supply & Demand Basics
  5.4 Secondary (Tertiary) Effects
  5.5 Marginal Analysis & Utility
  5.6 Diminishing (Marginal) Returns
  5.7 Positive vs. Normative
  5.8 Principles of Macroeconomics: A Review

List of Figures

1 Probability Distribution for a Six-Sided Die
2 Pick a Number, Any Number
3 A Continuous Probability Distribution for the Spinner
4 Probability Distribution for Six-Sided Dice, Using Standardized Z
5 Probability Distribution for Fair Coin Flips, Using Standardized Z
6 Correlations: Scatter Plots and Linear Fit Lines
7 Normal Distribution
8 Bias vs. Efficiency
9 Change in Mean, Variance, and Mean and Variance Example
10 Dilbert Company Economist
11 Utility
12 Diminishing Returns

The notes provided are meant to be useful, but do not necessarily contain everything needed to do well in the course.
The remaining components include: textbook reading, in-class notes, and applying the material by solving problems.

Recommended Readings
– Wooldridge (2013)
  ∗ Appendix B
  ∗ Appendix C
– Gujarati (2003)
  ∗ Appendix A

Recommended Videos
– Hans Rosling "The Best Stats You've Ever Seen"
– Hans Rosling "New Insights on Poverty"

Note: you are not expected to completely understand the Stata "code"¹ in the blue boxes at this point. It is included in the Review notes to give you a sense of what Stata can do and to demonstrate, in an ad hoc manner, the concept being discussed. Throughout the course I will work through any and all Stata code you will use, step-by-step, so don't panic.

¹ A note for the computer science people: throughout the course I will refer to the Stata "code" as code. I am using this term loosely.

1 Analytical Tools

1.1 Subscripts

• An individual observation often corresponds to a specific economic unit, such as a person, household, corporation, firm, organization, country, state, city or other geographical region. These are often represented with a subscript ($i$).
  – For example, if using income data collected on individuals, then $income_i$ represents the income for the $i$th individual.
• The time subscript ($t$) allows us to be more specific about the timing of a variable
• The subscript $t$ (current), $t-1$ (past), or $t+1$ (future) tells us in which period something is happening
• The time subscript ($t$) can represent data of any frequency; e.g. annual, quarterly, monthly, etc.
• Examples:
  – if we have annual data and $t = 2010$, then $t-1 = 2009$ and $t+1 = 2011$
  – if we have monthly data and $t$ = January 2010, then $t-1$ = December 2009 and $t+1$ = February 2010

1.2 Growth Rates

\[ \text{growth rate} = \%\Delta X = \frac{X_{t+1} - X_t}{X_t} \times 100\% = \left( \frac{X_{t+1}}{X_t} - 1 \right) \times 100\% \]

Example: Suppose your nominal wage per hour is currently $W_t = \$12.00$ and your boss states you will get a sixty cent raise per hour; i.e., $W_{t+1} = \$12.60$. What is this in percentage terms?

\[ \%\Delta W = \frac{W_{t+1} - W_t}{W_t} \times 100\% = \frac{\$12.60 - \$12.00}{\$12.00} \times 100\% = 5\% \]

At this rate, assuming you received a five-percent raise per year, how long would it take for your salary to double?

1.3 Question

• Which headline would you prefer if you are planning on purchasing a new house several years in the future? Assume you currently do not own any property.
  1. "Housing Prices Have Soared, Doubling in the Last Decade"
  2. "The Market for Housing Grew Moderately at 7% Last Year, Consistent with the Ten Year Average"
For simplicity, assume the relevant growth rate for these two respective headlines remains constant.

1.4 Rule of 70 (72)

• Rule of 70 (72): the approximate amount of time (e.g. years) it takes for the level of a variable growing at a constant rate to double.

\[ T_2 \approx \frac{70}{R} \]

where
  – $T_2$: approximate time for the variable to double
  – $R$: constant growth rate in percent
• Use 70 for numbers ending in 0, 2, 5, 7, and 10
• Use 72 for numbers ending in 3, 4, 6, 8, and 9

Continuing from the previous subsection, it would take approximately $70/5 = 14$ years for your salary to double. The same rule shows that the two headlines in Section 1.3 describe roughly the same growth: at 7% per year, prices double in about $70/7 = 10$ years.

1.5 Slope (of a Linear Line)

[Figure: the line $y = mx + b$ through the points $(x_1, y_1)$ and $(x_2, y_2)$, with the rise, the run, and the y-intercept marked]

\[ \text{slope} = \frac{\text{rise}}{\text{run}} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{y_1 - y_2}{x_1 - x_2} = \frac{\Delta y}{\Delta x} = m, \qquad b = y\text{-intercept} \]

1.6 Tangent Lines

• tangent (line): a line that just touches a curve at one point, without cutting across it.

[Figure: a curve $y = f(x)$ with a line tangent to it at one point]

1.7 Percentage Points vs. Percent Change

• A percent change is relative, while a change in percentage points is absolute.
• Generally speaking, percentage points should be used to measure the difference between two percentages, since this gives a clearer view of the difference than a percent change does.
• Examples:
  – If we say that the number of female CEOs increases by 3%, we mean that the number rises by 3% of the current number of female CEOs. If we say the share increases by 3 percentage points, we mean that the number of female CEOs rises by 3% of the total number of CEOs. So if 5% of all CEOs are female, a 3% increase would barely be noticeable, since it raises the share of female CEOs to only 5.15% of the total number of CEOs ($5\% \times 1.03$). However, if the share of female CEOs increases by 3 percentage points, then 8% of all CEOs would be female. Quite a difference.
  – Let's say that a poll in year 1 shows that 10% of the population supports consuming children for food (to take a hopefully absurd example). In year 2 the poll shows a 20% decrease in support compared to year 1. However, in year 3, the same number has gone up by 25% compared to year 2. Many people would get the impression that the number of children-as-food supporters in year 3 is higher than in year 1, but that is actually not the case. In year 1, 10% supported eating children. The next year, the number fell by 20%. 20% of 10% is 2%, which means that 8% supported consuming children. Then the number of supporters increased by 25%. 25% of 8% is 2%, so the total is back up to 10%. If we had used percentage points, we could just say that in year 2 the number of supporters fell by 2 percentage points, and that it increased by the same number of percentage points in year 3, making it much clearer that the level of support was the same in year 1 and year 3.
• Non Sequitur: See Jonathan Swift's "A Modest Proposal"

2 Fundamentals of Probability

2.1 Random Variables and Their Probability Distributions

• Experiment: any procedure that can, at least in theory, be infinitely repeated and has a well-defined set of outcomes.
• Random Variable: numerical outcome of a random trial; examples: number from the roll of a die, survey responses from a randomly drawn sample, and estimators.
• A random variable $X$ is a variable whose numerical value is determined by chance, the outcome of a random phenomenon
• Bernoulli (or binary) random variable: a random variable that can only take on the values zero and one

2.1.1 Discrete Random Variables

• A discrete random variable has a countable number of possible values, such as 0, 1, and 2; in the cases considered here it takes on a finite number of values (as in the roll of a die)
• A probability distribution $P[X_i]$ for a discrete random variable $X$ assigns probabilities to the possible values $X_1$, $X_2$, and so on
• For example, when a fair coin is flipped it can be heads or tails; when a fair six-sided die is rolled, there are six equally likely outcomes, each with a 1/6 probability of occurring
  – The die can be 1, 2, 3, 4, 5, or 6
  – Figure 1 shows this probability distribution

Figure 1: Probability Distribution for a Six-Sided Die

2.1.2 Continuous Random Variables

• A continuous random variable can take on any value over a specified range. In practice we often loosely refer to variables as "continuous" if they take on a large number of values. For example, we often call years of education a continuous variable even though it typically takes on only roughly 20 values.
• Examples of continuous random variables include time and distance, since both can take on any value in an interval
  – As another example, Figure 2 shows a spinner for randomly selecting a point on a circle
• A continuous probability density curve shows the probability that the outcome is in a specified interval as the corresponding area under the curve
  – This is illustrated for the case of the spinner in Figure 3

Figure 2: Pick a Number, Any Number

Figure 3: A Continuous Probability Distribution for the Spinner

• Cumulative Distribution Function, or "CDF" (sometimes loosely called the cumulative density function): measures the probability that a random variable takes on a value at or below a specified value. The term applies to both discrete and continuous random variables. It is often denoted with a capital letter function and a lowercase argument, as in $G(x)$. Note that the argument is a number, not a random variable. For random variable $X$, $G(x) = \Pr(X \le x)$.
• Probability Density Function, or "PDF" (probability distribution in the case of a discrete random variable): describes how probability is spread over the possible values. Note that in the case of a continuous random variable, the PDF does not measure a probability, since the probability that a continuous random variable takes on any particular value is zero; probabilities are areas under the PDF.
• The CDF probabilities associated with a standard (mean zero, variance one) normal random variable are shown in Table G.1 in Wooldridge (2013).
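When the printed table is not at hand, the same CDF values can be computed numerically. The following is a minimal sketch in Python rather than Stata (an assumption made only to keep the example self-contained; it is not part of the course code), using the standard library's `statistics.NormalDist`. The helper name `G` simply mirrors the notation above.

```python
from statistics import NormalDist

# Standard normal random variable: mean 0, variance 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def G(x):
    """CDF of the standard normal: probability that Z is at or below x."""
    return Z.cdf(x)

# The argument is a number, not a random variable.
below_mean = G(0.0)            # half of the probability lies below the mean
below_196 = G(1.96)            # close to 0.975
# The probability of an interval is a difference of two CDF values.
within_one_sd = G(1.0) - G(-1.0)

print(round(below_mean, 4), round(below_196, 4), round(within_one_sd, 4))
```

Because the normal distribution is continuous, the probability of hitting any single value is zero, so it makes no difference here whether the inequality is written as strict or weak.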
2.2 Joint Distributions, Conditional Distributions, and Independence

2.2.1 Joint Distributions and Independence

• Two random variables are independent if the probability that one takes on any particular value is unrelated to the value the other variable takes on.²
  – Observations in a simple random sample are independent. If I survey people at random, the answers one person gives to questions will be on average unrelated to the other respondents' answers.
  – In linear regression, we often talk about a weaker condition, so-called "mean independence": $E[u|X] = 0$. This condition says the expected value of random variable $u$ (which in this example is zero) is unrelated to the value random variable $X$ takes on.
  – If two variables are independent or even just mean independent, they also have zero correlation and zero covariance. See below for these terms.

² Technically, the condition is written as $g_{xy}(x, y) = g_x(x) \cdot g_y(y)$, where $g_{xy}(x, y)$ is the joint PDF (integrated over a range, it gives the joint probability that $X$ is in the specified range at the same time as $Y$ is in the specified range) and $g_x(x)$ and $g_y(y)$ are the PDFs of $X$ and $Y$, respectively (technically called the "marginal" PDFs).

2.2.2 Conditional Distributions

• The symbol $|$ means "given that" or "conditional on", as in Pr(purple people eater | one eye, one horn) = probability of being a purple people eater given that you have one eye and one horn, or E[drinks last weekend | fraternity member] = expected number of drinks consumed last weekend by a fraternity member.
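Independence and conditioning can be checked by brute force on a small discrete example. The sketch below is in Python rather than Stata (an assumption for self-containment) and uses two fair six-sided dice as hypothetical random variables; the helper `conditional` is an illustrative name, not something from the course materials. It verifies that the joint distribution factors into the marginals and then computes two conditional probabilities.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two fair dice: each ordered pair (x, y) has probability 1/36.
faces = range(1, 7)
joint = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}

# Marginal distributions: sum the joint probabilities over the other variable.
p_x = {x: sum(joint[(x, y)] for y in faces) for x in faces}
p_y = {y: sum(joint[(x, y)] for x in faces) for y in faces}

# Independence: every joint probability equals the product of the marginals.
independent = all(joint[(x, y)] == p_x[x] * p_y[y] for x, y in joint)

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given) for events on pairs (x, y)."""
    p_given = sum(p for xy, p in joint.items() if given(xy))
    p_both = sum(p for xy, p in joint.items() if given(xy) and event(xy))
    return p_both / p_given

# Knowing the first die tells us nothing about the second die...
p_y6_given_x1 = conditional(lambda xy: xy[1] == 6, lambda xy: xy[0] == 1)
# ...but knowing that the total is at least 10 changes the odds for the first die.
p_x6_given_total_ge10 = conditional(lambda xy: xy[0] == 6, lambda xy: sum(xy) >= 10)

print(independent, p_y6_given_x1, p_x6_given_total_ge10)
```

The second conditional probability differs from the unconditional 1/6, which is what it means for the first die and the total to be dependent.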
2.3 Features of Probability Distributions

2.3.1 A Measure of Central Tendency: The Expected Value

• Expected value: $E[X]$ = population average
  – The expected value (or mean) of a discrete random variable $X$ is a weighted average of all possible values of $X$, using the probability of each $X$ value as weights:

\[ \mu = E[X] = \sum_i X_i P[X_i] \tag{17.1} \]

2.3.2 Properties of Expected Values

• Recall that expected value is a linear operator; that is, for any random variables $X$ and $Y$, and for any numbers $a$, $b$, and $c$, $E[aX + bY + c] = aE[X] + bE[Y] + c$. For example, if the mean income in some population is $50,000, and you give everybody $1,000 ($= c$), the mean income becomes $51,000. If you then convert everyone's income to euros at €1.00 per $1.33 ($a = 1/1.33$), mean income becomes $51,000/1.33 ≈ €38,346.
  – Note that $c$ is just a number (it is not random), so its expected value is the number itself.

2.3.3 Another Measure of Central Tendency: The Median

• Median: the midpoint of the data. The median is the number above which lie half the observed numbers and below which lie the other half. The median is not sensitive to outliers.
• Mode: the most frequently occurring value.

2.3.4 Measures of Variability: Variance and Standard Deviation

• Variance: the expected squared deviation of a random variable from its mean. If the population mean of a random variable $X$ is $\mu$, the variance is defined as $E[(X - \mu)^2]$.
  – The sample analog of the variance in a dataset, also known as an estimator (see below) of the variance, replaces $\mu$ with the sample average $\bar{X}$, adds up the squared deviations, and divides by $N - 1$:

\[ \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2, \]

  where $N$ is the sample size and $i$ indexes the observations of the dataset. This also happens to be an unbiased (see below) estimator of the variance (whereas dividing by $N$ instead of $N - 1$ produces a downward biased estimator).
  – It is common to denote the population variance using $\sigma^2$, where $\sigma$ is the lowercase Greek letter sigma, which represents the standard deviation
  – Standard deviation: the square root of the variance. This also measures how much variation there is in the data, but it is more useful because it is measured in units of the original data (as opposed to squared units with the variance). The standard deviation of a population is often denoted $\sigma$.
  – Note that the common usage of "the standard deviation" refers to the sample estimator concept in a particular dataset. But in statistics and econometrics there is also a population concept defined in terms of random variables. This is why it is meaningful to talk about things like "the standard deviation of an estimator" even though in practice we typically have only one sample.³
• The variance of a discrete random variable $X$ can also be seen as a weighted average, for all possible values of $X$, of the squared difference between $X$ and its expected value, using the probability of each $X$ value as weights:

\[ \sigma^2 = E[(X - \mu)^2] = \sum_i (X_i - \mu)^2 P[X_i] \tag{17.2} \]

• The standard deviation $\sigma$ is still the square root of the variance

2.3.5 Standardizing a Random Variable

• Standardizing a random variable means subtracting off the mean and dividing by the standard deviation; if $X$ has a mean of $\mu$ and a standard deviation of $\sigma$, then standardized $X$ is

\[ Z = \frac{X - \mu}{\sigma} \tag{17.3} \]

• No matter what the initial units of $X$, the standardized random variable $Z$ has a mean of 0 and a standard deviation of 1
• The standardized variable $Z$ measures how many standard deviations $X$ is above or below its mean:
  – If $X$ is equal to its mean, $Z$ is equal to 0
  – If $X$ is one standard deviation above its mean, $Z$ is equal to 1
  – If $X$ is two standard deviations below its mean, $Z$ is equal to −2
• Figure 4 and Figure 5 illustrate this for the case of dice and fair coin flips, respectively

2.4 Features of Joint and Conditional Distributions

2.4.1 Measures of Association: Covariance and Correlation

• Covariance: a general measure of relatedness of two random variables, analogous to the variance. If the population mean of a random variable $X$ is $\mu_X$ and that of $Y$ is $\mu_Y$, then the covariance is defined as $E[(X - \mu_X)(Y - \mu_Y)]$.
  – The sample estimator of the covariance in a dataset replaces the $\mu$'s with the sample averages, adds up the cross-products of the deviations, and divides by $N - 1$:

\[ \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}) \]

  – The covariance does not have very meaningful units, and its magnitude is hard to interpret. But the sign tells you whether $X$ and $Y$ are positively or negatively related.
• You may recall that the correlation, a number between −1 and +1, standardizes the covariance by dividing by the standard deviation of each variable. It measures the strength, but not the magnitude (slope), of any linear relationship between $X$ and $Y$. It has no units, and is typically denoted with an "r" or "R" if it is a sample estimate, and a $\rho$ when it is a population concept.

³ Note that when we do so, we are considering the situation before we have collected the sample; $X_i$ represents what we might get from a random draw from the population, not the actual data.
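The sample covariance and correlation formulas above can be applied by hand to a tiny dataset. The sketch below is in Python rather than Stata (an assumption for self-containment), and the data on hours studied and exam scores are hypothetical, invented purely for illustration. It also shows why the covariance is hard to interpret: changing the units of one variable rescales the covariance but leaves the correlation unchanged.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the N - 1 divisor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation r: covariance divided by both sample standard deviations."""
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Hypothetical data: hours studied and exam score for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 60.0, 61.0, 70.0, 77.0]

cov_hours = sample_cov(hours, score)
r_hours = sample_corr(hours, score)

# Measuring study time in minutes multiplies the covariance by 60 but not r.
minutes = [60.0 * h for h in hours]
cov_minutes = sample_cov(minutes, score)
r_minutes = sample_corr(minutes, score)

print(cov_hours, cov_minutes)
print(r_hours, r_minutes)
```

Note that `sample_cov(xs, xs)` is just the sample variance, which is why the same helper can supply both standard deviations in the denominator of the correlation.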
Figure 4: Probability Distribution for Six-Sided Dice, Using Standardized Z

Figure 5: Probability Distribution for Fair Coin Flips, Using Standardized Z

• The linear distinction is important: in extreme cases two variables could even be perfectly related but have zero correlation (if that relationship is nonlinear)! (For example, if $y = x^2$ and $x$ is distributed symmetrically around zero, $y$ and $x$ would be perfectly related. However, $y$ and $x$ would have zero correlation: a straight line fitted between $y$ and $x$ would have zero slope. Draw a picture to see why.) Let's look at this in Stata. The following Stata code (Listing 1) produces Figure 6.

```stata
clear all
set seed 12345

* Perfect Positive Relationship (r = 1)
set obs 50
gen x1 = rnormal()
su
gen y1 = x1
graph twoway (lfit y1 x1) ///
    (scatter y1 x1, title("Perfect Positive Relationship (r = 1)")), ///
    name(corr1)
corr y1 x1

* Positive Relationship (r = 0.67)
set obs 50
gen x4 = rnormal()
gen u1 = rnormal()
su
gen y4 = x4 + u1
graph twoway (lfit y4 x4) ///
    (scatter y4 x4, title("Positive Relationship (r = 0.67)")), ///
    name(corr4)
corr y4 x4

* Perfect Negative Relationship (r = -1)
set obs 50
gen x2 = rnormal()
su
gen y2 = -x2
graph twoway (lfit y2 x2) ///
    (scatter y2 x2, title("Perfect Negative Relationship (r = -1)")), ///
    name(corr2)
corr y2 x2

* Perfect Quadratic Relationship (r = 0)
* A much larger sample is used so the sample correlation is close to zero.
set obs 100000
gen x3 = rnormal()
su
gen y3 = x3*x3
graph twoway (lfit y3 x3) (scatter y3 x3, ///
    title("Perfect Quadratic Relationship (r = 0)")), ///
    name(corr3)
corr y3 x3

* Combine the four panels into a single figure
graph combine corr1 corr4 corr2 corr3
```

Listing 1: Correlations: Scatter Plots and Linear Fit Lines

  – In a bivariate (one $Y$, one $X$) linear regression only, $R^2$ is literally the squared correlation between $Y$ and $X$. In a multivariate regression this interpretation does not hold. This will be discussed when we get to regression.

Figure 6: Correlations: Scatter Plots and Linear Fit Lines

2.4.2 Variance of Sums of Random Variables

• In general, for random variables $X$ and $Y$ and numbers $a$, $b$, and $c$,

\[ Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y) + 2ab\,Cov(X, Y). \]

Listing 2 provides an ad hoc numerical check (not a formal proof) using Stata.

```stata
clear all
set seed 12345
set obs 10000
gen x = rnormal()
gen y = rnormal()
gen a = 3
gen b = -4
gen c = 7.5
gen z = a*x + b*y + c
su

* Sample variances of x and y and their covariance
correlate x y z, covariance
return list
gen var_x = r(Var_1)
gen var_y = r(Var_2)
gen cov_xy = r(cov_12)

* Sample variance of z
su z, d
return list
gen var_z = r(Var)

* Var(aX + bY + c) ?=? a^2 Var(x) + b^2 Var(y) + 2abCov(x, y)
di var_z
* 24.503485
di a*a*var_x + b*b*var_y + 2*a*b*cov_xy
* 24.503484
* Var(aX + bY + c) = a^2 Var(x) + b^2 Var(y) + 2abCov(x, y)
* 24.503484 = 24.50348
```

Listing 2: Variance of Sums of Random Variables

Note that:
  – Since $c$ is just a number, it does not affect the variance. If you gave everybody in the room $5, it would raise mean wealth in the room, but not affect the variation in wealth.
  – If $X$ and $Y$ are independent, as is assumed for different observations in a random sample (in a simple random sample, the data are "independent and identically distributed" or "iid"), then the covariance term disappears, so $Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y)$
  – Why are the constants $a$ and $b$ squared? Recall that the variance is the expected value of the squared deviations. So if you multiply all the data by 2, the variance goes up by a factor of four, not 2. The standard deviation goes up by a factor of 2.
  – Note from the definition of covariance that the covariance of a variable with itself is the variance: $Cov(X, X) = Var(X)$.
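The covariance term only matters when the two variables are related, so it is worth checking the formula on data where they are. The sketch below is in Python rather than Stata (an assumption for self-containment); the seed, the sample size, and the way `y` is built from `x` are arbitrary illustrative choices. It compares the variance of $aX + bY + c$ with the formula, once including and once omitting the covariance term.

```python
import random

def cov(xs, ys):
    """Sample covariance with the N - 1 divisor; cov(xs, xs) is the sample variance."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

rng = random.Random(12345)      # arbitrary seed, used only for reproducibility
n = 10_000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Build y so that it is positively correlated with x.
y = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

a, b, c = 3.0, -4.0, 7.5
z = [a * xi + b * yi + c for xi, yi in zip(x, y)]

# Left-hand side: Var(aX + bY + c), computed directly from z.
lhs = cov(z, z)
# Right-hand side: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
rhs = a**2 * cov(x, x) + b**2 * cov(y, y) + 2 * a * b * cov(x, y)
# Omitting the covariance term is only correct when X and Y are uncorrelated.
rhs_without_cov = a**2 * cov(x, x) + b**2 * cov(y, y)

print(lhs, rhs, rhs_without_cov)
```

The first two numbers agree up to floating-point rounding, because the identity holds exactly for sample moments computed with the same divisor. The third differs noticeably, and the constant `c` plays no role in any of them.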
2.5 The Normal and Related Distributions

2.5.1 The Normal Distribution

• The density curve for the normal distribution is graphed in Figure 7
• The probability that the value of $Z$ will be in a specified interval is given by the corresponding area under this curve
• These areas can be determined by consulting statistical software or a table, such as Table G.1 in Wooldridge (2013)
• Many things follow the normal distribution (at least approximately):
  – The weights of humans, dogs, and tomatoes
  – The lengths of thumbs, widths of shoulders, and breadths of skulls
  – Scores on IQ, SAT, and GRE tests
  – The number of kernels on ears of corn, ridges on scallop shells, hairs on cats, and leaves on trees

Figure 7: Normal Distribution

• The central limit theorem is a very strong result for empirical analysis that builds on the normal distribution
• The central limit theorem states that:
  – if $Z$ is a standardized sum of $N$ independent, identically distributed (discrete or continuous) random variables with a finite, nonzero standard deviation, then the probability distribution of $Z$ approaches the normal distribution as $N$ increases

2.5.2 The Normal Distribution with Stata

First, let us plot the normal density function using twoway function, which plots $y = f(x)$. It makes no difference if x and y are existing variables. The Stata function normalden returns the value of the standard normal density $\phi(x)$ for a given $x$. To plot the density over $[-6, 6]$ use

```stata
clear all
pwd
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
* Plot the standard normal density over [-6, 6] and save the graph
twoway function y = normalden(x), range(-6 6) ///
    title("Standard Normal Density") ///
    saving(normal, replace)
```

The saving option will save the graph as a Stata graph (*.gph). We have saved the graph so that we can access it later. It will appear in your default directory. Including this option is often a good idea.
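The height that a normal density function returns can be written out directly from the formula $\phi(x) = e^{-x^2/2} / \sqrt{2\pi}$ and, more generally, $\frac{1}{s\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-m}{s}\right)^2}$ for mean $m$ and standard deviation $s$. The sketch below is in Python rather than Stata (an assumption for self-containment); `normal_density` is a hypothetical helper name, and it is cross-checked against the standard library's `statistics.NormalDist`.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_density(x, m=0.0, s=1.0):
    """Density at x of a normal distribution with mean m and standard deviation s."""
    z = (x - m) / s
    return exp(-0.5 * z * z) / (s * sqrt(2.0 * pi))

# Evaluate the standard normal density on a coarse grid over [-6, 6].
grid = [i / 2 for i in range(-12, 13)]
heights = [normal_density(x) for x in grid]

peak = max(heights)                      # the curve is highest at x = 0
tail = normal_density(6.0)               # practically zero by the edge of the range
# Cross-check the formula against the standard library at an arbitrary point.
gap = abs(normal_density(1.3, 1.0, 0.8) - NormalDist(mu=1.0, sigma=0.8).pdf(1.3))

print(round(peak, 4), tail < 1e-7, gap < 1e-12)
```

The peak equals $1/\sqrt{2\pi}$, roughly 0.3989, and the density at $x = 6$ is so small that a plotting range of $[-6, 6]$ captures essentially all of the visible curve.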
The plot of any normal density can be obtained by modifying the command only slightly. Enter help normalden to find normalden(x,s) for a normal density with mean 0 and standard deviation s, and normalden(x,m,s) for a normal density with mean m and standard deviation s. The following code illustrates the use of different line patterns (lpattern) on a graph.

    clear all
    pwd
    cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
    * overlay three normal densities, each with its own line pattern
    twoway function y = normalden(x), range(-5 5) ///
        || function y = normalden(x,0.8), ///
           range(-5 5) lpattern(dash) ///
        || function y = normalden(x,1,0.8), ///
           range(-5 5) lpattern(dashdot) ///
        ||, title("Normal Densities") ///
           legend(label(1 "N(0,1)") label(2 "N(0,0.8^2)") label(3 "N(1,0.8^2)")) ///
           saving(normal_3, replace)

Each of the three curves gets its own legend entry. For help with legends enter help legend_options. We have used the graph separator notation || rather than () because it clearly marks the end of one function and the beginning of another.

2.5.3 Normal probability calculation

Now calculate some normal probabilities. The cdf for the standardized normal variable Z is so widely used that it is given its own special symbol, Φ(z) = P(Z ≤ z). It is used in probability calculations as

\[
P[a \le X \le b] = P\left[\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right] = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)
\]

The Stata function normal computes the function Φ(z) = P(Z ≤ z). To compute the left tail probability P[Z ≤ 1.33] = Φ(1.33) use

    scalar n_tail = normal(1.33)
    di "lower tail probability N(0,1) < 1.33 is " n_tail

The result is

    lower tail probability N(0,1) < 1.33 is .90824086

For example, if X ∼ N(3, 9), then

\[
P[4 \le X \le 6] = P[0.33 \le Z \le 1] = \Phi(1) - \Phi(0.33) = 0.8413 - 0.6293 = 0.2120
\]

The commands are

    scalar prob = normal((6-3)/3) - normal((4-3)/3)
    di "probability 4<=N(3,9)<=6 is " prob

The result is

    probability 4<=N(3,9)<=6 is .21078609

The small difference from 0.2120 arises because the hand calculation rounds 1/3 to 0.33.

To compute percentiles of the standard normal distribution, which are used as critical values for tests or in the calculation of interval estimates, use invnormal. For example, for the 95th percentile of the standard normal distribution use

    scalar n_95 = invnormal(.95)
    di "95th percentile of standard normal = " n_95

Producing

    95th percentile of standard normal = 1.6448536

2.5.4 The Standard Normal Distribution

2.5.5 Additional Properties of the Normal Distribution

• Any linear transformation of a normally distributed random variable is also normally distributed. This allows us to transform a variable and look up probabilities that it takes on values in particular ranges using standard tables, such as those in Appendix G in Wooldridge (2013).
  – For example, if X is normally distributed with mean 3 and variance 4, then
    \[
    \Pr(X < -1) = \Pr\left(\frac{X-3}{2} < -2\right) = \Pr(Z < -2),
    \]
    where Z is (often) used as a symbol for a standard normal random variable. According to the table, this probability is 0.0228. (Do you see this? How would you instead calculate Pr(X > −1)?)
• The sum of independent (see below for definition), normally distributed random variables is also normally distributed.

2.5.6 The t Distribution

• When the mean of a sample from a normal distribution is standardized by subtracting the mean of its sampling distribution and dividing by the standard deviation of its sampling distribution, the resulting Z variable
  \[
  Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{N}}
  \]
  has a normal distribution.
• W.S. Gosset determined (in 1908) the sampling distribution of the variable that is created when the mean of a sample from a normal distribution is standardized by subtracting µ and dividing by its standard error (≡ the estimated standard deviation of an estimator):
  \[
  t = \frac{\bar{X}-\mu}{s/\sqrt{N}} \tag{17.7}
  \]
• Non sequitur: The original t-test was developed by William Sealy Gosset, a biochemist who worked for Guinness in the early 20th century. At this time, the Guinness brewery was taking the progressive step of hiring statisticians, chemists and other scientists to improve the quality and consistency of their ingredients and their beer. Gosset could only test small samples out of the entire quantity of beer brewed by Guinness while he worked there. He also tested samples of barley grown by the brewery. He came up with a statistical formula that would let him judge whether or not the very small samples reflected a much larger population (in this case, the ‘population’ would be all of the beer or barley used in Guinness’ operations). Previously, statistical research had focused only on testing with large samples.
• The exact distribution of t depends on the sample size:
  – as the sample size increases, we are increasingly confident of the accuracy of the estimated standard deviation
• Table G.2 at the end of the textbook shows some probabilities for various t-distributions that are identified by the number of degrees of freedom:

    degrees of freedom = # observations − # estimated parameters

2.5.7 Student’s t-distribution with Stata

The t-density function can be plotted similarly to the Normal. The shape of the t-distribution is determined by a single parameter called the degrees of freedom. To plot the t-density for 3 degrees of freedom use the tden function to generate the density values, and plot them as above.
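The single-density plot just described can be sketched as follows. This is a minimal illustration, not a listing from the original notes: the plotting range and title are assumed choices, and the display lines are added only to compare density heights with the standard normal at the center and in the tail.

```stata
* Sketch: plot the t(3) density on its own (range and title are illustrative)
twoway function y = tden(3,x), range(-5 5) title("t(3) density")

* Compare density heights with the standard normal at 0 and at 3
display "t(3) density at 0   = " tden(3,0)
display "N(0,1) density at 0 = " normalden(0)
display "t(3) density at 3   = " tden(3,3)
display "N(0,1) density at 3 = " normalden(3)
```

The t(3) density is lower than the standard normal at the center and higher in the tails, which is the difference the overlay plot makes visible.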
In fact we can overlay this plot against the standard normal to illustrate the differences:

    clear all
    cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
    * standard normal (solid) against t(3) (dashed)
    twoway function y = normalden(x), range(-5 5) ///
        || function y = tden(3,x), range(-5 5) lpattern(dash) ///
        ||, title("Standard Normal and t(3)") ///
           legend(label(1 "N(0,1)") label(2 "t(3)")) ///
           saving(t_3df, replace)

2.5.8 t-distribution Probabilities

Calculating t-distribution probabilities is accomplished using the Stata function ttail. This function returns the probability in the upper tail of the t-distribution. For example, to compute the probabilities that a t(3) random variable will be greater than 1.33, and then less than 1.33, use

    scalar t_tail = ttail(3,1.33)
    di "upper tail probability t(3) > 1.33 = " t_tail
    di "lower tail probability t(3) < 1.33 = " 1 - ttail(3,1.33)

These commands produce

    upper tail probability t(3) > 1.33 = .13779644
    lower tail probability t(3) < 1.33 = .86220356

Percentiles for the t-distribution use the Stata function invttail, which takes the degrees of freedom and the upper-tail probability. To compute the 95th and 5th percentiles for the t(3) distribution use

    scalar t3_95 = invttail(3,.05)
    di "95th percentile of t(3) = " t3_95
    di "5th percentile of t(3) = " invttail(3,.95)

Producing the results

    95th percentile of t(3) = 2.3533634
    5th percentile of t(3) = -2.3533634

2.5.9 Graphing Tail Probabilities

Illustrating tail probabilities is a useful skill. Find the 95th percentile of the t(38) distribution.

    di "95th percentile of t(38) = " invttail(38,.05)

The 95th percentile then is

    95th percentile of t(38) = 1.6859545

For the plot we first generate a graph of the tail area, to the right of 1.686, and recast the graph to an area plot. Second, we add the t-density function, a title, and some text along the horizontal axis. The place(s) option places the text to the “south”.

    clear all
    pwd
    cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
    * shaded right tail, then the full t(38) density on top
    twoway function y=tden(38,x), range(1.686 5) color(ltblue) recast(area) ///
        || function y=tden(38,x), range(-5 5) ///
           legend(off) plotregion(margin(zero)) ///
        ||, ytitle("f(t)") xtitle("t") ///
           text(0 1.686 "1.686", place(s)) ///
           title("Right-tail rejection region")

For a two-tail p-value, that is, the probability that a value from the t(38) distribution will be greater than 1.9216 or less than −1.9216, use

    clear all
    pwd
    cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
    * shade both tails, then draw the full t(38) density
    twoway function y=tden(38,x), range(1.9216 5) ///
           color(ltblue) recast(area) ///
        || function y=tden(38,x), range(-5 -1.9216) ///
           color(ltblue) recast(area) ///
        || function y=tden(38,x), range(-5 5) ///
        ||, legend(off) plotregion(margin(zero)) ///
           ytitle("f(t)") xtitle("t") ///
           text(0 -1.921578 "-1.9216", place(s)) ///
           text(0 1.9216 "1.9216", place(s)) ///
           title("Pr|t(38)| > 1.9216")

2.6 Probability Tables

The purpose of these four programs is to display the critical values from the t- and z-distributions. The critical values are given for a variety of alpha (α) levels.
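Critical values of this kind can also be computed directly with the built-in functions invttail and invnormal introduced earlier. The following is a minimal sketch, not part of the original notes; the three alpha levels and the choice of 38 degrees of freedom (taken from the graphing examples) are assumptions made for illustration. It is meant to be run from a do-file, since it uses /// line continuation.

```stata
* Sketch: two-tail critical values for a few illustrative alpha levels,
* for t(38) and for the standard normal
foreach a of numlist 0.10 0.05 0.01 {
    display "alpha = " %4.2f `a' ///
        "   t(38): " %6.4f invttail(38, `a'/2) ///
        "   z: " %6.4f invnormal(1 - `a'/2)
}
```

For a two-tail test each tail receives α/2, which is why the loop passes `a'/2 to invttail and 1 - `a'/2 to invnormal.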
To search for the probtabl package and then display the t and z tables, use

    findit probtabl
    ttable
    ztable

For more information see UCLA Probability Tables.

3 Fundamentals of Mathematical Statistics

Below are some more basic statistics concepts that you should understand before proceeding in this course. Many of these will be useful in the first problem set.

3.1 Populations, Parameters, and Random Sampling

• Population: the entire group of items that interests us
• The idea of a “population” is a central concept in statistics. Its characteristics are usually considered unknown, as it may be too costly (or in some cases impossible)⁴ to find them out for certain. Instead, we will attempt to use samples, along with statistical techniques, to estimate its characteristics.

⁴ Why impossible? Some populations are infinite (e.g., the set of all possible coin tosses) and some are unobservable (e.g., the “counterfactual”; more on this later).

3.1.1 Sampling

• Sample: the part of this population that we actually observe
• Simple random sample: each member of the population has an equal probability, a priori, of being in the sample.
• Stratified random sample: within defined subgroups (e.g., by gender or race) a random sample is taken, but subgroups may not be sampled at the rate at which the groups are represented in the population. Used, for example, to get larger samples of rare subgroups that are of interest to the survey takers.
• Statistical inference involves using the sample to draw conclusions about the characteristics of the population from which the sample came

3.2 Finite Sample Properties of Estimators

3.2.1 Estimators and Estimates

• Parameter: a characteristic of the population whose value is unknown, but can be estimated
• Estimator: a sample statistic that will be used to estimate the value of the population parameter; that is, a procedure for estimating a population parameter or value with a sample. (For example, “add up all the data and divide by the number of observations” gives x̄, the sample mean.) Estimators are random variables because (or in contexts where) the sample is randomly chosen. For example, the value we get for a sample average will depend on who happens to be chosen for our sample, analogous to the randomness of the outcome of rolling a die.
• Estimate: the specific value of the estimator that is obtained in one particular sample; a number obtained from applying an estimator to a particular sample. This is NOT random.
• Sampling variation: the notion that because samples are chosen randomly, the sample average will vary from sample to sample, sometimes being larger than the population mean and sometimes lower
• Standard Error: the estimated standard deviation of an estimator. It measures how much we expect the estimate would vary from one similarly taken sample to another.
  – You may recall that the standard error of the sample mean, x̄, is s/√N, where s is the standard deviation in the data and N is the sample size.
  – So for example, in a survey of 2,500 consumers on annual holiday spending with a mean of $750 and a standard deviation of $1200, the standard error of the mean would be $1200/√2500 = $1200/50 = $24. This means in repeated random samples of 2,500 consumers, we would expect x̄ to vary by about $24 from one sample to another.
• A sample statistic is an unbiased estimator of a population parameter if the mean of the sampling distribution of this statistic is equal to the value of the population parameter
• Because the mean of the sampling distribution of x̄ is µ, x̄ is an unbiased estimator of µ
• One way of gauging the accuracy of an estimator is with its standard deviation:
  – If an estimator has a large standard deviation, there is a substantial probability that an estimate will be far from its mean
  – If an estimator has a small standard deviation, there is a high probability that an estimate will be close to its mean
• The sampling distribution of a statistic is the probability distribution or density curve that describes the population of all possible values of this statistic
  – For example, it can be shown mathematically that if the individual observations are drawn from a normal distribution, then the sampling distribution for the sample mean is also normal
  – Even if the population does not have a normal distribution, the sampling distribution of the sample mean will approach a normal distribution as the sample size increases
• It can be shown mathematically that the sampling distribution for the sample mean has the following mean and standard deviation, where µ and σ are the population mean and standard deviation:

\[
\text{Mean of } \bar{x} = \mu \tag{1}
\]
\[
\text{Standard deviation of } \bar{x} = \frac{\sigma}{\sqrt{N}} \tag{2}
\]

3.2.2 Unbiasedness

• Unbiased (said of estimators): the expected value of the sample estimator is equal to the population parameter. The sample mean is unbiased, E[x̄] = µ, where µ is the population average, if the data come from a random sample.
  – Upward biased: the expected value of the estimator is above the true value of the population parameter; e.g., E[x̄] > µ. Downward biased is the opposite.
• Figure 8 demonstrates the bias-efficiency trade-off with a dartboard.
• Figure 9 demonstrates how distributions change when the mean, the variance, and both the mean and variance are altered, using climate change as the example.

Figure 8: Bias vs. Efficiency

Figure 9: Change in Mean, Variance, and Mean and Variance Example

3.2.3 The Sampling Variance of Estimators

3.2.4 Efficiency

• Efficient estimator: the estimator with the smallest variance; i.e., the estimator whose sampling distribution is most concentrated around the target parameter.
  – For example, if x₁ and x₂ are two unbiased estimators (unbiasedness being the property we care about most), then we prefer to use the estimator with the smaller variance.
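Unbiasedness and efficiency can be checked by simulation. The sketch below is an illustration rather than part of the original notes: the program name onesample, the sample size of 25, the N(0,1) population, the 2,000 replications, and the seed are all assumed for the example. It compares two estimators of µ for a normal population, the sample mean and the sample median, both of which are unbiased in this setting.

```stata
clear all
set seed 300            // arbitrary seed so the run is reproducible

* Hypothetical helper: draw one sample of 25 observations from N(0,1)
* and return its sample mean and sample median
program define onesample, rclass
    drop _all
    set obs 25
    generate x = rnormal(0, 1)
    summarize x, detail
    return scalar xbar = r(mean)
    return scalar xmed = r(p50)
end

* Repeat the experiment 2,000 times and keep both estimates from each sample
simulate xbar = r(xbar) xmed = r(xmed), reps(2000) nodots: onesample

* Both averages should be near 0 (unbiased); the standard deviation of xbar
* should be near sigma/sqrt(N) = 1/5 and smaller than that of xmed
summarize xbar xmed
```

The standard deviations reported by summarize estimate the spread of each sampling distribution. Under these assumptions the sample mean should show the smaller spread, which makes it the more efficient of the two estimators.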
3.3 Interval Estimation and Confidence Intervals

• A confidence interval measures the reliability of a given statistic such as x̄
• The general procedure for determining a confidence interval for a population mean can be summarized as:
  1. Calculate the sample average x̄
  2. Calculate the standard error of x̄ by dividing the sample standard deviation s by the square root of the sample size N
  3. Select a confidence level (such as 95 percent) and look in Table G.2 with N − 1 degrees of freedom to determine the t-value t* that corresponds to this probability
  4. A confidence interval for the population mean is then given by:
     \[
     \bar{x} \pm t^{*} \frac{s}{\sqrt{N}}
     \]
• Notably, a confidence interval does not depend on the size of the population
• This may at first seem surprising: if we are trying to estimate a characteristic of a large population, then wouldn’t we also need a large sample?
• The size of the population doesn’t matter because the chances that the luck of the draw will yield a sample whose mean differs substantially from the population mean depend on the size of the sample and on the chances of selecting items that are far from the population mean
  – That is, not on how many items there are in the population

4 Using Stata to Do Monte Carlo Experiments

4.1 Generating Uniformly Distributed Random Numbers

A continuously distributed random variable that is equally likely to take any value between zero and one has a standard uniform probability distribution. Such a variable can be created in Stata with the uniform() function. Generate 10,000 draws from a standard uniform distribution and inspect the results in the browser.

    set obs 10000
    gen u=uniform()
    browse

Listing 3: Generating Uniformly Distributed Random Numbers

Stata’s uniform random number generator returns a number between zero and one, exclusive of one itself.

4.1.1 Functions

uniform is the name of a function in Stata.
Function names in Stata must always be followed by an open parenthesis with no intervening spaces. Why no intervening spaces? Because otherwise Stata will think the name is the name of a variable, and not a function. The pair of parentheses surrounds the argument or arguments of the function. In this case the uniform function has no arguments, but the parentheses are needed anyway. If a function has more than one argument, then the arguments are separated by commas.

4.1.2 Graph the Uniformly Distributed Random Variable

Graph the random variable by issuing the following command:

    histogram u, bin(40)

Listing 4: Graph the Uniformly Distributed Random Variable

For continuous variables, Stata has an internal rule for deciding how many bars (or bins) to use in constructing the graph; here the bin(40) option sets the number of bins to 40. Notice that the density of the random variable is essentially constant throughout its range, which is why the distribution has the name “uniform”.

4.1.3 Generating a Discrete Random Variable: Example Simulating the Rolls of a Die

The uniform random number generator is a building block for creating virtually any random variable. We will illustrate this by using it to simulate rolling a die. The following commands simulate ten rolls of a die. First, however, the number of observations in Stata must be set to ten, and in order to do this the memory must be cleared.

    clear
    set obs 10
    gen x=int(6*uniform())+1
    browse

Listing 5: Generating a Discrete Random Variable

Here is what the generate command does. First, uniform() returns a uniformly distributed random number between 0 and 1, not including one itself. 6*uniform() therefore returns a random number between 0 and 6, not including 6 itself. This becomes the argument to the int() function, which truncates the fractional part from the number, returning an integer between 0 and 5 inclusive (“int” is short for integer). For example, int(5.999999) becomes 5.
Finally, 1 is added to this, giving integers between 1 and 6 inclusive. Since uniform() results in numbers uniformly distributed between zero and one, the six final integers assigned to the variable x also have equal probability.

4.1.4 Graph the Uniformly Distributed Random Variable

Graph the random variable by issuing the following command:

    histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=10) name(g1, replace)

Listing 6: Graph the Uniformly Distributed Random Variable

The horizontal grid line is drawn at one-sixth, the probability of any particular side of the die facing up.

4.2 The Law of Large Numbers and the Frequentist Notion of Probability

The frequentist notion of probability rests on the law of large numbers: as the sample size (or number of trials) increases towards infinity, the sample proportion favorable to an event approaches (“settles down to”) the probability of the event. We will illustrate this by increasing the number of rolls of the die, and noticing that the sample distribution of outcomes settles down to the theoretical discrete uniform distribution of one-sixth probability for each side of the die. In order to do this, repeat the commands above, each time changing the number of observations to 50, then 200, then 10,000. Also, change the name for each graph as indicated below. The easiest way to do this is to single-click on each command in the Review window, and then edit it in the Command window as necessary.

    clear
    set obs 50
    gen x=int(6*uniform())+1
    histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=50) name(g2, replace)

    clear
    set obs 200
    gen x=int(6*uniform())+1
    histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=200) name(g3, replace)

    clear
    set obs 10000
    gen x=int(6*uniform())+1
    histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=10000) name(g4, replace)

Listing 7: Law of Large Numbers

Finally, view all the graphs together with the following command:

    graph combine g1 g2 g3 g4, title(Law of Large Numbers)

Listing 8: Law of Large Numbers

4.3 Tossing Coins with Stata

We can use the random uniform distribution generator in Stata to simulate coin tossing:

    * clear first so that x from the die example does not already exist
    clear
    set obs 10000
    gen x = 1 if uniform() > .5
    replace x = 0 if x == .

Listing 9: Tossing Coins

Checking the binomial distribution: treat these 10,000 flips as 1,000 “experiments” in which we toss the coin ten times and record the number of heads each time.

    * casino numbers the flips; trial groups them into blocks of ten
    gen casino = _n
    gen trial = int((casino-1)/10)
    egen heads = sum(x), by(trial)
    histogram heads, bin(11)
    tab heads

Listing 10: Tossing Coins

Non Sequitur: A paper by Diaconis et al. (2007) used a high-speed camera that photographed people flipping coins. The three researchers determined that a coin is more likely to land facing the same side on which it started. If tails is facing up when the coin is perched on your thumb, it is more likely to land tails up. The authors suggest that in coin-tossing there is a particular “dynamical bias” that causes a coin to be slightly more likely to land the same way up as it started. How much more likely? At least 51 percent of the time, the researchers claim, and possibly as much as 55 percent to 60 percent, depending on the flipping motion of the individual.

5 Principles of Economics: A Review

Economics is the study of mankind in the ordinary business of life. – Alfred Marshall

Figure 10: Dilbert Company Economist

. . . the dismal science – Thomas Carlyle

• Economics: The social science that studies the choices that individuals, businesses, governments, and entire societies make as they cope with scarcity and the incentives that influence and reconcile those choices.
• Scarcity: Our inability to satisfy all our wants.
• Incentive: A reward that encourages an action or a penalty that discourages one.

5.1 Non Sequitur: Random Famous People Who Majored in Economics

5. John Elway – Hall of Fame NFL quarterback (Stanford)
4. Bob Barker – TV Game Show host on The Price Is Right (Drury College)
3. Mick Jagger – Rolling Stones (London School of Economics)
2. William Shatner – “Khaaaan!” (McGill University)⁵
1. Arnold Schwarzenegger – Body Builder/Actor/Governor/Terminator (University of Wisconsin)

⁵ Non sequitur: If you have not experienced William Shatner’s version of “Rocketman”, then please treat yourself.

5.2 Microeconomics versus Macroeconomics

• Microeconomics: The study of the choices that individuals and businesses make, the way these choices interact in markets, and the influence of governments.
• Macroeconomics: The study of the performance of the national economy and the global economy.

5.3 Supply & Demand Basics

• Demand
  1. Law of Demand: ceteris paribus, the higher the price of a good, the smaller is the quantity demanded of it; the lower the price of a good, the larger is the quantity demanded of it.
  2. Elasticity of Demand: Measures the change in quantity demanded in response to a change in price.
  3. Determinants (Shifters)
     (a) Tastes or Preferences
     (b) Number of Buyers
     (c) Income
         – Normal
         – Inferior
     (d) Prices of Related Goods
         – Complement
         – Substitute
     (e) Consumer Expectations
• Supply
  1. Law of Supply: ceteris paribus, the higher the price of a good, the greater is the quantity supplied of it.
  2. Determinants (Shifters)
     (a) Resource (Input) Prices
     (b) Technology
     (c) Taxes and Subsidies
     (d) Prices of Other Goods
     (e) Producer Expectations
     (f) Number of Sellers
• Supply & Demand
  1. Movement versus Shift
  2. Equilibrium
  3. Price Controls
     – Price Ceiling
     – Price Floor
     – Binding versus Unbinding
     – Surplus
     – Shortage

5.4 Secondary (Tertiary) Effects

• Secondary effects: unintended consequences of economic actions that may develop slowly over time as people react to events
• Examples:
  1. In many cities, public officials have imposed rent controls on apartments. The primary effect of this policy, the effect policy makers focus on (i.e. the one they sell to the public), is to keep rents from rising. Over time, however, fewer apartments get built because renting them becomes less profitable. Moreover, existing rental units deteriorate because owners have plenty of tenants anyway. Thus, the quality and quantity of housing may decline as a result of what appears to be a reasonable measure to keep rents from rising. Another issue is ignoring how landlords would respond by legally circumventing the policy; e.g., tacking on ‘required’ extras to the apartment, such as curtains that the tenant must rent from the landlord for $250 a month in addition to the rent.
  2. A macro-related example is Congress passing the Smoot-Hawley Tariff Act of 1930 during the Great Depression. At first the tariff seemed successful. However, the United States’ trading partners retaliated. Overall, world trade decreased by approximately 66% between 1929 and 1934!

5.5 Marginal Analysis & Utility

• Marginal analysis: an examination of the effects of additions to or subtractions from a current situation
• Utility: a measure of pleasure or satisfaction
• See Figure 11 for an example.
Figure 11: Utility

5.6 Diminishing (Marginal) Returns

• (Law of) diminishing (marginal) returns: The principle that as successive increments of a variable resource are added to a fixed resource, the marginal product of the variable resource will eventually decrease.
• See Figure 12 for an example.

Figure 12: Diminishing Returns

5.7 Positive vs. Normative

• Normative (questions) economic analysis: addresses questions that involve value judgments. It concerns what ought to happen rather than what did, will, or would happen.
• Positive (questions) economic analysis: addresses factual questions, usually concerning choices or outcomes. It concerns what did, will, or would happen.
• Economics could convince reasonable people of the truth of the “positive” (“what is”) predictions, but economics cannot end a “normative” (“what should be”) disagreement.
• Note: It is not always possible, or at least not always easy, to separate normative from positive statements. That said, this course focuses almost exclusively on positive economic analysis; i.e. the did, will, or would. In answering any exam or problem set question I do not care about my, your, or someone else’s opinion.
• Note: The word positive does not mean that the answer admits no doubt. On the contrary, all answers, particularly those involving predictions, involve some degree of uncertainty. Rather, in this context, positive simply means that the prediction concerns a factual matter.

5.8 Principles of Macroeconomics: A Review

• Gross Domestic Product (GDP)
  1. Intermediate Goods: Products that are purchased for resale or further processing or manufacturing.
  2. Gross Domestic Product (GDP): The market value of all final goods and services produced within a country during a given time period.
  3. Gross national product (GNP): The value of final goods and services produced by residents of a country, even if the production takes place outside the country.
  4. Expenditure (Output) Approach: The method that adds all expenditures made for final goods and services to measure the gross domestic product.
     (a) Personal Consumption Expenditures
         – durable goods (lasting 3 years or more)
         – nondurable goods
         – services
     (b) Gross Private Domestic Investment
         – All final purchases of machinery, equipment, and tools by business enterprises.
         – All construction.
         – Changes in inventories.
     (c) Government Purchases
         – Includes spending by all levels of government (federal, state and local).
         – Includes all direct purchases of resources (labor in particular).
     (d) Net Exports
         – Exports minus Imports
         – Imports: Spending by individuals, firms, and governments for goods and services produced in foreign nations.
         – Exports: Goods and services produced in a nation and sold to buyers in other nations.
  5. Bureau of Economic Analysis (BEA): An agency of the U.S. Department of Commerce that compiles the national income and product accounts.
  6. Real versus Nominal
     – Real GDP: The value of final goods and services produced in a given year when valued at the prices of a reference base year.
     – Nominal GDP: The value of the final goods and services produced in a given year valued at the prices that prevailed in that same year. It is a more precise name for GDP.
  7. Real GDP per person (capita): Real GDP divided by the population.
  8. Criticisms
     (a) Household production
     (b) Underground economic activity
     (c) Health and life expectancy
     (d) Leisure time
     (e) Environmental quality
     (f) Political freedom and social justice
• The Price Level & Inflation
  1. Price level: The average level of prices.
  2. Inflation [Deflation]: A persistently rising [falling] price level.
  3. Why Is Inflation [Deflation] Problematic?
     (a) Redistribution of Income
     (b) Redistribution of Wealth
     (c) Lowers Real GDP and Employment
     (d) Diverts Resources from Production
  4. Hyperinflation: An inflation rate of 50 percent a month or higher that grinds the economy to a halt and causes a society to collapse.
  5. Measurement
     (a) Consumer Price Index (CPI): An index that measures the average of the prices paid by urban consumers for a fixed basket of consumer goods and services.
     (b) GDP Deflator
     (c) Personal Consumption Expenditure (PCE) Deflator
  6. Cost-of-living adjustment (COLA): An automatic increase in the incomes (wages) of workers when inflation occurs; guaranteed by a collective bargaining contract between firms and workers.
• Unemployment
  1. Working-age population: The total number of people aged 16 years and over who are not in jail, hospital, or some other form of institutional care.
  2. Labor force: The sum of the people who are employed and who are unemployed.
  3. Unemployment rate: The percentage of the people in the labor force who are unemployed.
  4. Labor force participation rate: The percentage of the working-age population who are members of the labor force.
  5. Types
     – Frictional Unemployment: The unemployment that arises from normal labor turnover, that is, from people entering and leaving the labor force and from the ongoing creation and destruction of jobs.
     – Structural Unemployment: The unemployment that arises when changes in technology or international competition change the skills needed to perform jobs or change the locations of jobs.
     – Cyclical Unemployment: The higher than normal unemployment at a business cycle trough and the lower than normal unemployment at a business cycle peak.
  6. Criticisms
     – Part-time employment
     – Discouraged worker: A marginally attached worker who has stopped looking for a job because of repeated failure to find one. These are former workers who have left the labor force because they have not been able to find employment.
  7. Natural unemployment rate: The unemployment rate when the economy is at full employment; that is, natural unemployment as a percentage of the labor force.
  8. Full employment: A situation in which the unemployment rate equals the natural unemployment rate. At full employment, there is no cyclical unemployment; all unemployment is frictional and structural.
• Macroeconomic Policy
  – Fiscal Policy
    ∗ Expansionary Fiscal Policy: An increase in government purchases of goods and services, a decrease in net taxes, or some combination of the two for the purpose of increasing aggregate demand and expanding real output.
    ∗ Contractionary Fiscal Policy: A decrease in government purchases of goods and services, an increase in net taxes, or some combination of the two, for the purpose of decreasing aggregate demand and thus controlling inflation.
    ∗ Discretionary Fiscal Policy: Deliberate changes in taxes (tax rates) and government spending by Congress to promote full employment, price stability, and economic growth.
  – Monetary Policy
    ∗ The Fed’s Policy Tools
      · Open market operation (OMO): The purchase or sale of government securities (U.S. Treasury bills and bonds) by the Federal Reserve in the loanable funds market.
      · Quantitative easing: buying securities beyond the short-term Treasury securities that are usually involved in open market operations. The Fed began purchasing 10-year Treasury notes to keep their interest rates from rising. Interest rates on home mortgage loans typically move closely with interest rates on 10-year Treasury notes. The Fed also purchased certain mortgage-backed securities (MBS). The Fed’s objectives in keeping interest rates on 10-year Treasury notes low and in purchasing mortgage-backed securities were to keep interest rates on mortgages low and to keep funds flowing into the mortgage market in order to help stimulate demand for housing.
References

Gujarati, D. N. (2003). Basic Econometrics. McGraw-Hill/Irwin.

Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach. Thomson South-Western.