ECON 300 Econometrics Review

ECON 300
Econometrics
Review
Dennis C. Plott
University of Illinois at Chicago
Department of Economics
www.dennisplott.com
https://uic.blackboard.com
Fall 2014
ECON 300 – Econometrics
Review
Fall 2014
Contents
1 Analytical Tools
1.1 Subscripts
1.2 Growth Rates
1.3 Question
1.4 Rule of 70 (72)
1.5 Slope (of a Linear Line)
1.6 Tangent Lines
1.7 Percentage Points vs. Percent Change
2 Fundamentals of Probability
2.1 Random Variables and Their Probability Distributions
2.1.1 Discrete Random Variables
2.1.2 Continuous Random Variables
2.2 Joint Distributions, Conditional Distributions, and Independence
2.2.1 Joint Distributions and Independence
2.2.2 Conditional Distributions
2.3 Features of Probability Distributions
2.3.1 A Measure of Central Tendency: The Expected Value
2.3.2 Properties of Expected Values
2.3.3 Another Measure of Central Tendency: The Median
2.3.4 Measures of Variability: Variance and Standard Deviation
2.3.5 Standardizing a Random Variable
2.4 Features of Joint and Conditional Distributions
2.4.1 Measures of Association: Covariance and Correlation
2.4.2 Variance of Sums of Random Variables
2.5 The Normal and Related Distributions
2.5.1 The Normal Distribution
2.5.2 The Normal Distribution with Stata
2.5.3 Normal probability calculation
2.5.4 The Standard Normal Distribution
2.5.5 Additional Properties of the Normal Distribution
2.5.6 The t Distribution
2.5.7 Student’s t-distribution with Stata
2.5.8 t-distribution Probabilities
2.5.9 Graphing Tail Probabilities
2.6 Probability Tables
3 Fundamentals of Mathematical Statistics
3.1 Populations, Parameters, and Random Sampling
3.1.1 Sampling
3.2 Finite Sample Properties of Estimators
3.2.1 Estimators and Estimates
3.2.2 Unbiasedness
3.2.3 The Sampling Variance of Estimators
3.2.4 Efficiency
3.3 Interval Estimation and Confidence Intervals
4 Using Stata to Do Monte Carlo Experiments
4.1 Generating Uniformly Distributed Random Numbers
4.1.1 Functions
4.1.2 Graph the Uniformly Distributed Random Variable
4.1.3 Generating a Discrete Random Variable: Example Simulating the Rolls of a Die
4.1.4 Graph the Uniformly Distributed Random Variable
4.2 The Law of Large Numbers and the Frequentist Notion of Probability
4.3 Tossing Coins with Stata
5 Principles of Economics: A Review
5.1 Non Sequitur: Random Famous People Who Majored in Economics
5.2 Microeconomics versus Macroeconomics
5.3 Supply & Demand Basics
5.4 Secondary (Tertiary) Effects
5.5 Marginal Analysis & Utility
5.6 Diminishing (Marginal) Returns
5.7 Positive vs. Normative
5.8 Principles of Macroeconomics: A Review
List of Figures
1 Probability Distribution for a Six-Sided Die
2 Pick a Number, Any Number
3 A Continuous Probability Distribution for the Spinner
4 Probability Distribution for Six-Sided Dice, Using Standardized Z
5 Probability Distribution for Fair Coin Flips, Using Standardized Z
6 Correlations: Scatter Plots and Linear Fit Lines
7 Normal Distribution
8 Bias vs. Efficiency
9 Change in Mean, Variance, and Mean and Variance Example
10 Dilbert Company Economist
11 Utility
12 Diminishing Returns
The notes provided are meant to be useful, but do not necessarily contain everything needed to do well in the
course. The remaining components include: textbook reading, in-class notes, and applying the material by
solving problems.
Recommended Readings
– Wooldridge (2013)
∗ Appendix B
∗ Appendix C
– Gujarati (2003)
∗ Appendix A
Recommended Videos
– Hans Rosling “The Best Stats You’ve Ever Seen”
– Hans Rosling “New Insights on Poverty”
Note: you are not expected to completely understand the Stata “code” (see footnote 1) in the blue boxes at this point. It is
included in the Review notes to give you a sense of what Stata can do and demonstrate, in an ad hoc manner,
the concept being discussed. Throughout the course I will work through any and all Stata code you will use,
step-by-step, so don’t panic.
1 Analytical Tools
1.1 Subscripts
• An individual observation often corresponds to a specific economic unit, such as a person, household,
corporation, firm, organization, country, state, city or other geographical region. These are often represented
with a subscript (i).
– For example, if using income data collected on individuals, then $income_i$ represents the income of
the i-th individual.
• The time subscript (t) allows us to be more specific about the timing of a variable.
• The subscript t (current), t − 1 (past), or t + 1 (future) tells us in which period something is happening.
• The time subscript (t) can index data of any frequency, e.g., annual, quarterly, or monthly.
• Examples:
– if we have annual data and t = 2010, then t − 1 = 2009 and t + 1 = 2011
– if we have monthly data and t = January 2010, then t−1 = December 2009 and t+1 = February 2010
1.2 Growth Rates
$$\text{growth rate} = \%\Delta X = \frac{X_{t+1} - X_t}{X_t} \times 100\% = \left(\frac{X_{t+1}}{X_t} - 1\right) \times 100\%$$
Example: Suppose your nominal wage per hour is currently $W_t$ = $12.00 and your boss states you will get a
sixty-cent raise per hour; i.e., $W_{t+1}$ = $12.60. What is this in percentage terms?
$$\%\Delta W = \frac{W_{t+1} - W_t}{W_t} \times 100\% = \frac{\$12.60 - \$12.00}{\$12.00} \times 100\% = 5\%$$
1 A note for the computer science people: throughout the course I will refer to the Stata “code” as code. I am using this term
loosely.
At this rate, assuming you received a five-percent raise per year, how long would it take for your salary to
double?
1.3 Question
• Which headline would you prefer if you are planning on purchasing a new house several years in the future?
Assume you currently do not own any property.
1. “Housing Prices Have Soared Doubling in the Last Decade”
2. “The Market for Housing Grew Moderately at 7% Last Year, Consistent with the Ten Year Average”
For simplicity, assume the relevant growth rate for these two respective headlines remains constant.
1.4 Rule of 70 (72)
• Rule of 70 (72): the approximate amount of time (e.g., years) it takes for the level of a variable growing
at a constant rate to double.
$$T_2 \approx \frac{70}{R}$$
where
– $T_2$: approximate time for the variable to double
– $R$: constant growth rate, in percent
• Use 70 for numbers ending in 0, 2, 5, 7, and 10
• Use 72 for numbers ending in 3, 4, 6, 8, and 9
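The rule comes from solving $(1 + R/100)^T = 2$ for T, which gives $T = \ln(2)/\ln(1 + R/100)$; because ln(2) ≈ 0.693 and ln(1 + r) ≈ r for small r, this is roughly 70/R. The following is a minimal Stata sketch of that comparison, meant to be run from a do-file. The growth rates of 5 and 7 percent are illustrative choices, not part of the original listings.

```stata
* Exact doubling time versus the Rule of 70 (illustrative growth rates)
foreach R in 5 7 {
    scalar exact  = ln(2) / ln(1 + `R'/100)   // solves (1 + R/100)^T = 2
    scalar approx = 70 / `R'                   // Rule of 70
    display "R = `R'%: exact = " %6.2f exact "   rule of 70 = " %6.2f approx
}
```

The exact doubling times are about 14.2 and 10.2 years, so the approximation is close at these growth rates.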
Continuing from the previous subsection, it would take approximately 70/5 = 14 years for your salary to double.
1.5 Slope (of a Linear Line)
[Figure: the line y = mx + b, with the rise from y1 to y2, the run from x1 to x2, and the y-intercept marked]
$$\text{slope} = \frac{\text{rise}}{\text{run}} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{y_1 - y_2}{x_1 - x_2} = \frac{\Delta y}{\Delta x} = m$$
b = y-intercept
1.6 Tangent Lines
• tangent (line): A line that just touches a curve at one point, without cutting across it.
[Figure: a curve y = f(x) and the line tangent to it at a single point]
1.7 Percentage Points vs. Percent Change
• A percent change is relative, while a change in percentage points is absolute.
• Generally speaking, percentage points should be used to measure the difference between two percentages,
since this gives a clearer view of the difference than a percent change does.
• Examples:
– If we say that the number of female CEOs increases by 3%, we mean that the number increases by
3% of the current number of female CEOs. If we say the share increases by 3 percentage points,
we mean that the number of female CEOs increases by 3% of the total number of CEOs. So if 5% of
all CEOs are female, a 3% increase would hardly be noticeable, since it raises the share of female
CEOs to only 5.15% of the total number of CEOs (5% × 1.03). However, if the share of female CEOs
increases by 3 percentage points, this implies 8% of all CEOs would be female. Quite a difference.
– Let’s say that a poll in year 1 shows that 10% of the population supports consuming children for
food (to take a hopefully absurd example). In year 2 the poll shows a 20% decrease in support
compared to year 1. However, in year 3, the same number has gone up by 25% compared to year 2.
Many people would get the impression that the number of children-as-food supporters in year 3 is
higher than in year 1, but that’s actually not the case.
In year 1, 10% supported eating children. The next year, the number fell by 20%. 20% of 10% is
2%, which means that 8% supported consuming children. Then the number of supporters increased
by 25%. 25% of 8% is 2%, so the total is back up to 10%.
If we had used percentage points, we could just say that in year 2 the number of supporters fell by
2 percentage points, and that the number of supporters increased by the same number of percentage
points in year 3, making it much clearer that the share of supporters was the same in year 1
and year 3.
• Non Sequitur: See Jonathan Swift’s “A Modest Proposal”
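The arithmetic in the two examples above can be checked directly. This is a minimal Stata sketch, intended for a do-file; the scalar names are made up for illustration, and the first example assumes the total number of CEOs is held fixed.

```stata
* Percent change versus percentage-point change (numbers from the examples above)
scalar share = 5                                       // female CEOs, percent of all CEOs
display "after a 3% increase:          " (share * 1.03)   // 5.15
display "after 3 percentage points up: " (share + 3)      // 8

* Poll example: a 20% fall followed by a 25% rise
scalar year1 = 10
scalar year2 = year1 * (1 - 0.20)                      // 8
scalar year3 = year2 * (1 + 0.25)                      // back to 10
display "year 2 = " year2 "   year 3 = " year3
```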
2 Fundamentals of Probability
2.1 Random Variables and Their Probability Distributions
• Experiment: any procedure that can, at least in theory, be infinitely repeated and has a well-defined set of
outcomes.
• Random Variable: numerical outcome of a random trial; examples: number from the roll of a die, survey
responses from a randomly drawn sample, and estimators.
• A random variable X is a variable whose numerical value is determined by chance, the outcome of a
random phenomenon
• Bernoulli (or binary) random variable: A random variable that can only take on the values zero and one
2.1.1 Discrete Random Variables
• A discrete random variable has a countable number of possible values, such as 0, 1, and 2; takes on a finite
number of values (as in the roll of a die)
• A probability distribution $P[X_i]$ for a discrete random variable X assigns probabilities to the possible
values $X_1$, $X_2$, and so on
• For example, when a fair coin is flipped it can be heads or tails; when a fair six-sided die is rolled, there
are six equally likely outcomes, each with a 1/6 probability of occurring
– The die can be 1, 2, 3, 4, 5, or 6
– Figure 1 shows this probability distribution
Figure 1: Probability Distribution for a Six-Sided Die
2.1.2 Continuous Random Variables
• A continuous random variable can take on any value over a specified range. In practice, we often
loosely refer to variables as “continuous” if they take on a large number of values. For example, we often
call years of education a continuous variable even though it typically takes on only roughly 20 values.
• Examples of continuous random variables include time and distance since both can take on any value in
an interval
– As another example, Figure 2 shows a spinner for randomly selecting a point on a circle
• A continuous probability density curve shows the probability that the outcome is in a specified interval as
the corresponding area under the curve
– This is illustrated for the case of the spinner in Figure 3
Figure 2: Pick a Number, Any Number
Figure 3: A Continuous Probability Distribution for the Spinner
• Cumulative Distribution Function, or “CDF”: measures the probability that a random variable takes on a
value at or below a specified value. Often denoted with a capital-letter function and a lowercase argument,
as in G(x). Note that the argument is a number, not a random variable. For random variable X,
G(x) = Pr(X ≤ x).
• Probability Density Function, or “PDF” (called a probability distribution in the case of a discrete random
variable). Note that in the case of a continuous random variable, the PDF does not measure a probability,
since the probability that a continuous random variable takes on any particular value is zero.
• The CDF probabilities associated with a standard – mean zero, variance one – normal random variable
are shown in Table G.1 in Wooldridge (2013).
2.2 Joint Distributions, Conditional Distributions, and Independence
2.2.1 Joint Distributions and Independence
• Two random variables are independent if the probability that one takes on any particular value is unrelated
to the value the other variable takes on (see footnote 2).
2 Technically, the condition is written as $g_{xy}(x, y) = g_x(x) \cdot g_y(y)$, where $g_{xy}(x, y)$ is the joint PDF (integrated over a range,
it gives the joint probability that X is in the specified range at the same time as Y is in the specified range) and $g_x(x)$ and $g_y(y)$
are the PDFs of X and Y, respectively (technically called the “marginal” PDFs).
– Observations in a simple random sample are independent. If I survey people at random, the answers
one person gives to questions will be on average unrelated to the other respondents’ answers.
– In linear regression, we often talk about a weaker condition, so-called “mean independence”:
E[u|X] = 0. This condition says the expected value of random variable u (which in this example is
zero) is unrelated to the value random variable X takes on.
– If two variables are independent or even just mean independent, they also have zero correlation and
zero covariance. See below for these terms.
2.2.2 Conditional Distributions
• The vertical bar | means “given that” or “conditional on”, as in Pr(purple people eater | one eye, one horn) =
probability of being a purple people eater given that you have one eye and one horn, or E[drinks last
weekend | fraternity member] = expected number of drinks consumed last weekend by a fraternity member.
2.3 Features of Probability Distributions
2.3.1 A Measure of Central Tendency: The Expected Value
• Expected value: E[X] = population average
– The expected value (or mean) of a discrete random variable X is a weighted average of all possible
values of X, using the probability of each X value as weights:
$$\mu = E[X] = \sum_i X_i P[X_i] \tag{17.1}$$
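As an illustration of equation (17.1), the sketch below computes the expected value of the fair six-sided die from Figure 1 as a probability-weighted sum. It is a minimal example for a do-file; the variable names x, p, and xp are chosen here for illustration.

```stata
* Expected value of a fair six-sided die: sum over i of X_i * P[X_i]
clear
set obs 6
gen double x  = _n        // possible values 1, ..., 6
gen double p  = 1/6       // each value is equally likely
gen double xp = x * p     // value times its probability
quietly summarize xp
display "E[X] = " r(sum)
```

The displayed value should be 3.5 (up to floating-point rounding), which is not itself a possible outcome of a single roll.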
2.3.2 Properties of Expected Values
• Recall that expected value is a linear operator; that is, for any random variables X and Y, and for any
numbers a, b, and c, E[aX + bY + c] = aE[X] + bE[Y] + c. For example, if the mean income in some
population is $50,000, and you give everybody $1,000 (= c), the mean income becomes $51,000. If you then
convert everyone’s income to euros at €1.00/$1.33 (= a), mean income becomes $51,000/1.33 ≈ €38,346.
– Note that c is just a number (it is not random), so its expected value is the number itself.
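Linearity can also be seen in simulated data, because sample means obey the same rule. The following minimal Stata sketch (for a do-file) uses hypothetical incomes; the standard deviation of 10,000 and the sample size of 1,000 are arbitrary illustrative choices.

```stata
* Linearity of the mean: transform the data, or transform the mean
clear all
set seed 12345
set obs 1000
gen double income = 50000 + 10000*rnormal()    // hypothetical incomes in dollars
gen double euros  = (income + 1000) / 1.33     // add $1,000, then convert to euros
quietly summarize income
scalar by_rule = (r(mean) + 1000) / 1.33       // apply the same steps to the mean
quietly summarize euros
display "mean of transformed data: " r(mean)
display "transforming the mean:    " by_rule
display "population example:       " ((50000 + 1000) / 1.33)
```

The first two lines of output should agree up to rounding, and the last line reproduces the population calculation in the text.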
2.3.3 Another Measure of Central Tendency: The Median
• Median: the midpoint of the data. The median is the number above which lie half the observed numbers
and below which lie the other half. The median is not sensitive to outliers.
• Mode: the most frequently occurring value.
2.3.4 Measures of Variability: Variance and Standard Deviation
• Variance: the expected squared deviation of a random variable from its mean. If the population mean of a
random variable X is µ, the variance is defined as $E[(X - \mu)^2]$.
– The sample analog of the variance in a dataset, also known as an estimator (see below) of the variance,
replaces µ with the sample average $\bar{X}$, adds up the squared deviations, and divides by N − 1:
$$\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2,$$
where N is the sample size and i indexes the observations of the dataset. This also happens
to be an unbiased (see below) estimator of the variance (whereas dividing by N instead of N − 1
produces a downward-biased estimator).
– It is common to denote the population variance using $\sigma^2$, where σ is the lowercase Greek letter sigma,
which represents the standard deviation.
– Standard deviation: the square root of the variance. This also measures how much variation there is in
the data, but it is more useful because it is measured in units of the original data (as opposed to
squared units with the variance). The standard deviation of a population is often denoted σ.
– Note that the common usage of “the standard deviation” refers to the sample estimator concept in
a particular dataset. But in statistics and econometrics there is also a population concept defined
in terms of random variables. This is why it is meaningful to talk about things like “the standard
deviation of an estimator” even though in practice we typically have only one sample (see footnote 3).
• The variance of a discrete random variable X can also be seen as a weighted average, for all possible values
of X, of the squared difference between X and its expected value, using the probability of each X value as
weights:
$$\sigma^2 = E[(X - \mu)^2] = \sum_i (X_i - \mu)^2 P[X_i] \tag{17.2}$$
• The standard deviation σ is still the square root of the variance
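To make the distinction between the population variance in (17.2) and the N − 1 sample estimator concrete, here is a minimal Stata sketch for the fair six-sided die, intended for a do-file. The variable and scalar names are illustrative assumptions.

```stata
* Population variance and sd of a fair six-sided die, as probability-weighted sums
clear
set obs 6
gen double x  = _n
gen double p  = 1/6
gen double xp = x * p
quietly summarize xp
scalar mu = r(sum)                     // E[X] = 3.5
gen double dev2p = (x - mu)^2 * p      // squared deviation times its probability
quietly summarize dev2p
scalar sigma2 = r(sum)                 // 35/12, about 2.917
display "variance = " sigma2 "   sd = " sqrt(sigma2)

* Treating the six values as a sample instead divides by N - 1 = 5
quietly summarize x
display "sample variance = " r(Var)    // 17.5/5 = 3.5
```

The two numbers differ because summarize treats the six rows as data drawn from a population, not as the population's full list of equally likely outcomes.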
2.3.5 Standardizing a Random Variable
• Standardizing a random variable means subtracting off the mean and dividing by the standard deviation;
if X has a mean of µ and a standard deviation of σ, then standardized X is
$$\frac{X - \mu}{\sigma}$$
– This results in a new random variable which has a mean of zero and a standard deviation of 1.
• To standardize a random variable X, we subtract its mean µ and then divide by its standard deviation σ:
$$Z = \frac{X - \mu}{\sigma} \tag{17.3}$$
• No matter what the initial units of X, the standardized random variable Z has a mean of 0 and a standard
deviation of 1
• The standardized variable Z measures how many standard deviations X is above or below its mean:
– If X is equal to its mean, Z is equal to 0
– If X is one standard deviation above its mean, Z is equal to 1
– If X is two standard deviations below its mean, Z is equal to −2
• Figure 4 and Figure 5 illustrate this for the case of dice and fair coin flips, respectively
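The same operation can be applied to data using the sample mean and sample standard deviation. A minimal Stata sketch for a do-file follows; the variable with mean 10 and standard deviation 3 is a hypothetical example.

```stata
* Standardizing a simulated variable
clear all
set seed 12345
set obs 10000
gen double x = rnormal(10, 3)           // hypothetical variable: mean 10, sd 3
quietly summarize x
gen double z = (x - r(mean)) / r(sd)    // subtract the mean, divide by the sd
summarize x z
```

In the summarize output, z should show a mean of essentially zero and a standard deviation of 1, whatever the units of x.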
2.4 Features of Joint and Conditional Distributions
2.4.1 Measures of Association: Covariance and Correlation
• Covariance: a general measure of relatedness of two random variables analogous to the variance. If
the population mean of a random variable X is $\mu_X$ and that of Y is $\mu_Y$, then the covariance is defined as
$E[(X - \mu_X)(Y - \mu_Y)]$.
– The sample estimator of the covariance in a dataset replaces the µ’s with the sample averages,
adds up the products of the deviations, and divides by N − 1:
$$\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})$$
– The covariance does not have very meaningful units, and its magnitude is hard to interpret. But the
sign tells you whether X and Y are positively or negatively related.
• You may recall that the correlation, a number between −1 and +1, standardizes the covariance by dividing
by the standard deviation of each variable. It measures the strength, but not the magnitude (slope) of any
linear relationship between X and Y . It has no units, and is typically denoted with an “r” or “R” if it is
a sample estimate, and a ρ when it is a population concept.
3 Note that when we do so, we are considering the situation before we have collected the sample; $X_i$ represents what we might
get from a random draw from the population, not the actual data.
Figure 4: Probability Distribution for Six-Sided Dice, Using Standardized Z
• The linear distinction is important: in extreme cases two variables could even be perfectly related but have
zero correlation (if that relationship was nonlinear)! (For example, if $y = x^2$, y and x would be perfectly
related. However, if x is distributed symmetrically around zero, y and x would have zero correlation: a
straight line fitted between y and x would have zero slope. Draw a picture to see why.) Let’s look at this
in Stata. The following Stata code (Listing 1) produces Figure 6.
Figure 5: Probability Distribution for Fair Coin Flips, Using Standardized Z
clear all
set seed 12345

* Perfect Positive Relationship (r = 1)
set obs 50
gen x1 = rnormal()
su
gen y1 = x1
graph twoway (lfit y1 x1) ///
    (scatter y1 x1, title("Perfect Positive Relationship (r = 1)")), ///
    name(corr1)
corr y1 x1

* Positive Relationship (r = 0.67)
set obs 50
gen x4 = rnormal()
gen u1 = rnormal()
su
gen y4 = x4 + u1
graph twoway (lfit y4 x4) ///
    (scatter y4 x4, title("Positive Relationship (r = 0.67)")), ///
    name(corr4)
corr y4 x4

* Perfect Negative Relationship (r = -1)
set obs 50
gen x2 = rnormal()
su
gen y2 = -x2
graph twoway (lfit y2 x2) ///
    (scatter y2 x2, title("Perfect Negative Relationship (r = -1)")), ///
    name(corr2)
corr y2 x2

* Perfect Quadratic Relationship (r = 0)
set obs 100000
gen x3 = rnormal()
su
gen y3 = x3 * x3
graph twoway (lfit y3 x3) (scatter y3 x3, ///
    title("Perfect Quadratic Relationship (r = 0)")), ///
    name(corr3)
corr y3 x3

graph combine corr1 corr4 corr2 corr3
Listing 1: Correlations: Scatter Plots and Linear Fit Lines
– Only in a bivariate (one Y, one X) linear regression is $R^2$ literally the squared correlation between Y
and X. In a multivariate regression this interpretation does not hold. This will be discussed when we
get to regression.
Figure 6: Correlations: Scatter Plots and Linear Fit Lines
2.4.2 Variance of Sums of Random Variables
• In general, for random variables X and Y and numbers a, b, and c,
$$\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y).$$
Listing 2 provides an ad hoc numerical check using Stata.
clear all
set seed 12345
set obs 10000
gen x = rnormal()
gen y = rnormal()
gen a = 3
gen b = -4
gen c = 7.5
gen z = a*x + b*y + c
su
correlate x y z, covariance
return list
gen var_x = r(Var_1)
gen var_y = r(Var_2)
gen cov_xy = r(cov_12)
su z, d
return list
gen var_z = r(Var)
* Var(aX + bY + c) ?=? a^2 Var(x) + b^2 Var(y) + 2abCov(x, y)
di var_z
* 24.503485
di a*a*var_x + b*b*var_y + 2*a*b*cov_xy
* 24.503484
* Var(aX + bY + c) = a^2 Var(x) + b^2 Var(y) + 2abCov(x, y)
* 24.503484 = 24.50348
Listing 2: Variance of Sums of Random Variables
Note that:
– Since c is just a number, it does not affect the variance. If you gave everybody in the room $5, it
would raise mean wealth in the room, but not affect the variation in wealth.
– If X and Y are independent, as is assumed for different observations in a random sample (in a
simple random sample, the data are “independent and identically distributed” or “iid”), then the
covariance term disappears, so
$$\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$$
– Why are the constants a and b squared? Recall that the variance is the expected value of the squared
deviations. So if you multiply all the data by 2, the variance goes up by a factor of four, not 2. The
standard deviation goes up by a factor of 2.
– Note from the definition of covariance that the covariance of a variable with itself is the variance:
Cov(X, X) = Var(X).
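The point about squared constants can be checked on simulated data. This minimal Stata sketch (for a do-file) doubles every observation and compares the spread before and after; the sample size and variable names are illustrative assumptions.

```stata
* Multiplying the data by 2: variance rises by a factor of 4, sd by a factor of 2
clear all
set seed 12345
set obs 10000
gen double x  = rnormal()
gen double x2 = 2 * x                            // multiply every observation by 2
quietly summarize x
scalar var_x = r(Var)
scalar sd_x  = r(sd)
quietly summarize x2
display "variance ratio = " (r(Var) / var_x)     // 4, up to rounding
display "sd ratio       = " (r(sd) / sd_x)       // 2, up to rounding
```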
2.5 The Normal and Related Distributions
2.5.1 The Normal Distribution
• The density curve for the normal distribution is graphed in Figure 7
• The probability that the value of Z will be in a specified interval is given by the corresponding area under
this curve
• These areas can be determined by consulting statistical software or a table, such as Table G.1 in Wooldridge
(2013)
• Many things follow the normal distribution (at least approximately):
– the weights of humans, dogs, and tomatoes
– The lengths of thumbs, widths of shoulders, and breadths of skulls
– Scores on IQ, SAT, and GRE tests
– The number of kernels on ears of corn, ridges on scallop shells, hairs on cats, and leaves on trees
Figure 7: Normal Distribution
• The central limit theorem is a very strong result for empirical analysis that builds on the normal distribution
• The central limit theorem states that:
– if Z is a standardized sum of N independent, identically distributed (discrete or continuous) random
variables with a finite, nonzero standard deviation, then the probability distribution of Z approaches
the normal distribution as N increases
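A small simulation can make the theorem visible. The sketch below is a minimal Stata example for a do-file and is not one of the course listings: it sums 30 independent uniform draws, standardizes the sum using the uniform distribution's mean of 1/2 and variance of 1/12, and plots the result against a normal curve. The choices of 30 terms and 10,000 replications are arbitrary.

```stata
* Central limit theorem: standardized sums of uniform draws look normal
clear all
set seed 12345
set obs 10000                                   // 10,000 simulated sums
gen double s = 0
forvalues j = 1/30 {
    quietly replace s = s + runiform()          // add one more uniform(0,1) draw
}
gen double z = (s - 30*0.5) / sqrt(30/12)       // subtract the mean, divide by the sd
summarize z
histogram z, normal                             // overlay a normal density
```

Even though each uniform draw is flat rather than bell-shaped, the histogram of z should sit close to the overlaid normal curve.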
2.5.2 The Normal Distribution with Stata
First, let us plot the normal density function using twoway function, which plots y = f(x). It makes no
difference whether x and y are existing variables. The Stata function normalden returns the value of the
standard normal density φ(x) for a given x. To plot the density over [−6, 6] use
clear all
pwd
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
twoway function y = normalden(x), range(-6 6)   ///
    title("Standard Normal Density")            ///
    saving(normal, replace)
The saving option will save the graph as a Stata graph (*.gph). We have saved the graph so that we can access
it later. It will appear in your default directory. Including this option is often a good idea.
The plot of any normal density can be obtained by modifying the command only slightly. Enter help
normalden to find a normalden(x,s) for a normal density with mean 0 and standard deviation s, and
normalden(x,m,s) for a normal density with mean m and standard deviation s. The following code illustrates
the use of different line patterns (lpattern) on a graph.
clear all
pwd
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
twoway function y = normalden(x), range(-5 5)          ///
    || function y = normalden(x, 0.8),                 ///
       range(-5 5) lpattern(dash)                      ///
    || function y = normalden(x, 1, 0.8),              ///
       range(-5 5) lpattern(dashdot)                   ///
    || , title("Normal Densities")                     ///
    legend(label(1 "N(0,1)") label(2 "N(0,0.8^2)")     ///
           label(3 "N(1,0.8^2)"))                      ///
    saving(normal_3, replace)
Each of the three curves gets its own legend label. For help with legends enter help legend_option. We have
used the graph separator notation || rather than () because it clearly marks the end of one function and the
beginning of another.
2.5.3 Normal probability calculation
Now calculate some normal probabilities. The cdf for the standardized normal variable Z is so widely used that
it is given its own special symbol, Φ(z) = P(Z ≤ z). It is used in probability calculations as
$$P[a \le X \le b] = P\left[\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right] = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$
The Stata function normal computes the function Φ(z) = P(Z ≤ z). To compute the left tail probability
P[Z ≤ 1.33] = Φ(1.33) use
scalar n_tail = normal(1.33)
di "lower tail probability N(0,1) < 1.33 is " n_tail
The result is
lower tail probability N(0,1) < 1.33 is .90824086
For example, if X ∼ N(3, 9), then
$$P[4 \le X \le 6] = P[0.33 \le Z \le 1] = \Phi(1) - \Phi(0.33) = 0.8413 - 0.6293 = 0.2120$$
The commands are
scalar prob = normal((6-3)/3) - normal((4-3)/3)
di "probability 4<=N(3,9)<=6 is " prob
The result is
probability 4<=N(3,9)<=6 is .21078609
The small difference from 0.2120 arises because the table calculation rounds (4 − 3)/3 to 0.33.
To compute percentiles of the standard normal distribution which are used as critical values for tests or in the
calculation of interval estimates, use invnormal. For example, for the 95th percentile of the standard normal
distribution use
scalar n_95 = invnormal(.95)
di "95th percentile of standard normal = " n_95
Producing
95th percentile of standard normal = 1.6448536
2.5.4 The Standard Normal Distribution
2.5.5 Additional Properties of the Normal Distribution
• Any linear transformation of a normally distributed random variable is also normally distributed. This
allows us to transform a variable and look up probabilities that it takes on values in particular ranges
using standard tables, such as those in Appendix G in Wooldridge (2013).
– For example, if X is normally distributed with mean 3 and variance 4, then
\[
\Pr(X < -1) = \Pr\left(\frac{X-3}{2} < \frac{-1-3}{2}\right) = \Pr(Z < -2),
\]
where Z is (often) used as a symbol for a standard normal random variable. According to the table,
this probability is 0.0228. (Do you see this? How would you instead calculate Pr(X > −1)?)
– The sum of independent (see below for definition), normally distributed random variables is also
normally distributed.
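The table lookup above can also be checked in Stata. The following is a minimal sketch, assuming it is run from a do-file; the scalar names z, p_below, and p_above are illustrative and not part of the original notes.

```stata
* X ~ N(3, 4), so the standard deviation is sqrt(4) = 2
scalar z = (-1 - 3)/2
* Pr(X < -1) = Phi(-2), roughly 0.0228
scalar p_below = normal(z)
* Pr(X > -1) is the complement, roughly 0.9772
scalar p_above = 1 - normal(z)
display "Pr(X < -1) = " p_below
display "Pr(X > -1) = " p_above
```

The second probability answers the question posed in the text: because the total area under the density is one, Pr(X > −1) = 1 − Pr(X < −1).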
2.5.6 The t Distribution
• When the mean of a sample from a normal distribution is standardized by subtracting the mean of its
sampling distribution and dividing by the standard deviation of its sampling distribution, the resulting Z
variable
\[
Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{N}}
\]
has a normal distribution.
• W.S. Gosset determined (in 1908) the sampling distribution of the variable that is created when the mean
of a sample from a normal distribution is standardized by subtracting its mean and dividing by its standard error
(≡ the estimated standard deviation of an estimator):
\[
t = \frac{\bar{X}-\mu}{s/\sqrt{N}} \tag{17.7}
\]
• Non sequitur: The original t-test was developed by William Sealy Gosset, a biochemist who worked for
Guinness in the early 20th century. At this time, the Guinness brewery was taking the progressive step of
hiring statisticians, chemists and other scientists to improve the quality and consistency of their ingredients
and their beer.
Gosset could only test small samples out of the entire quantity of beer brewed by Guinness while he worked
there. He also tested samples of barley grown by the brewery. He came up with a statistical formula that
would let him prove whether or not the very small samples reflected a much larger population (in this case,
the ‘population’ would be all of the beer or barley used in Guinness’ operations). Previously, statistical
research had focused only on testing with large samples.
• The exact distribution of t depends on the sample size,
– as the sample size increases, we are increasingly confident of the accuracy of the estimated standard
deviation
• Table G.2 at the end of the textbook shows some probabilities for various t-distributions that are identified
by the number of degrees of freedom:
degrees of freedom = # observations − # estimated parameters
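To see how the t-distribution depends on the degrees of freedom, the sketch below prints the upper 2.5 percent critical value for several degrees of freedom and compares it with the standard normal value. The particular degrees of freedom are arbitrary choices for illustration; invttail and invnormal are the same Stata functions used elsewhere in these notes.

```stata
* Upper 2.5% critical values of t for increasing degrees of freedom
foreach df in 5 30 100 1000 {
    display "t(" `df' ") critical value = " invttail(`df', .025)
}
* The corresponding standard normal critical value (about 1.96)
display "N(0,1) critical value = " invnormal(.975)
```

The t critical values shrink toward the normal value as the degrees of freedom grow, which matches the statement that a larger sample makes the estimated standard deviation more reliable.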
2.5.7 Student’s t-distribution with Stata
The t-density function can be plotted similarly to the Normal. The t-distribution shape is determined by a
single parameter called the degrees of freedom. To plot the t-density for 3 degrees of freedom use the tden
function to generate the density values, and plot them as above. In fact we can overlay this plot against the
standard normal to illustrate the differences
clear all
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
twoway function y = normalden(x), range(-5 5)               ///
    || function y = tden(3,x), range(-5 5) lpattern(dash)   ///
    ||, title("Standard Normal and t(3)")                   ///
    legend(label(1 "N(0,1)") label(2 "t(3)"))               ///
    saving(t_3df, replace)
2.5.8 t-distribution Probabilities
Calculating t-distribution probabilities is accomplished using the Stata function ttail. This function returns
the probability in the upper tail of the t-distribution. For example, to compute the probabilities that a t(3)
random variable will be greater than 1.33, and then less than 1.33, use
scalar t_tail = ttail(3,1.33)
di "upper tail probability t(3) > 1.33 = " t_tail
di "lower tail probability t(3) < 1.33 = " 1-ttail(3,1.33)
These commands produce
upper tail probability t(3) > 1.33 = .13779644
lower tail probability t(3) < 1.33 = .86220356
Percentiles of the t-distribution are computed with the Stata function invttail, whose second argument is the
probability in the upper tail. To compute the 95th and 5th percentiles of the t(3) distribution use
scalar t3_95 = invttail(3,.05)
di "95th percentile of t(3) = " t3_95
di "5th percentile of t(3) = " invttail(3,.95)
Producing the results
95th percentile of t(3) = 2.3533634
5th percentile of t(3) = -2.3533634
2.5.9 Graphing Tail Probabilities
Illustrating tail probabilities is a useful skill. First, find the 95th percentile of the t(38) distribution.
di "95th percentile of t(38) = " invttail(38,.05)
The 95th percentile then is
95th percentile of t(38) = 1.6859545
For the plot we first generate a graph of the tail area, to the right of 1.686, and recast the graph to an area plot.
Second we add the t-density function, a title and locate some text along the horizontal axis. The place(s) option
places the text to the “south”.
clear all
pwd
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
twoway function y=tden(38,x), range(1.686 5)       ///
        color(ltblue) recast(area)                 ///
    || function y=tden(38,x), range(-5 5)          ///
        legend(off) plotregion(margin(zero))       ///
    ||, ytitle("f(t)") xtitle("t")                 ///
        text(0 1.686 "1.686", place(s))            ///
        title("Right-tail rejection region")
For a two-tail p-value, shade both tails: the p-value is the probability that a value from the t(38) distribution
will be greater than 1.9216 or less than −1.9216. Use
clear all
pwd
cd "C:\Users\Dennis\Box Sync\ECON 300 Stata"
twoway function y=tden(38,x), range(1.9216 5)      ///
        color(ltblue) recast(area)                 ///
    || function y=tden(38,x), range(-5 -1.9216)    ///
        color(ltblue) recast(area)                 ///
    || function y=tden(38,x), range(-5 5)          ///
    ||, legend(off) plotregion(margin(zero))       ///
        ytitle("f(t)") xtitle("t")                 ///
        text(0 -1.921578 "-1.9216", place(s))      ///
        text(0 1.9216 "1.9216", place(s))          ///
        title("Pr|t(38)| > 1.9216")
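The shaded area in the two-tail graph can also be computed directly. The sketch below assumes the same t(38) distribution and the value 1.9216 from the text; the scalar names tstat and p_two are illustrative.

```stata
* Two-tail p-value: twice the upper-tail area beyond |t|
scalar tstat = 1.9216
scalar p_two = 2*ttail(38, abs(tstat))
display "two-tail p-value for |t(38)| > 1.9216 = " p_two
```

Because 1.9216 lies above the 95th percentile of t(38) (1.686), the upper-tail area is below 0.05 and the two-tail p-value is therefore below 0.10.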
2.6 Probability Tables
The purpose of these four programs is to display the critical values from the t- and z-distributions. The critical
values are given for a variety of alpha (α) levels.
findit probtabl
ttable
ztable
For more information see UCLA Probability Tables
3 Fundamentals of Mathematical Statistics
Below are some more basic statistics concepts that you should understand before proceeding in this course.
Many of these will be useful in the first problem set.
3.1 Populations, Parameters, and Random Sampling
• Population: the entire group of items that interests us
• The idea of a “population” is a central concept in statistics. Its characteristics are usually considered
unknown, as it may be too costly (or in some cases impossible)4 to find them out for certain. Instead, we
will attempt to use samples, along with statistical techniques, to estimate its characteristics.
3.1.1 Sampling
• Sample: the part of this population that we actually observe
• Simple random sample: each member of the population has an equal probability, a priori, of being in the
sample.
• Stratified random sample: within defined subgroups (e.g., by gender or race) a random sample is taken,
but subgroups may not be sampled at the rate at which the groups are represented in the population.
Used, for example, to get larger samples of rare subgroups that are of interest to the survey takers.
• Statistical inference involves using the sample to draw conclusions about the characteristics of the population
from which the sample came
3.2 Finite Sample Properties of Estimators
3.2.1 Estimators and Estimates
• Parameter: a characteristic of the population whose value is unknown, but can be estimated
• Estimator: a sample statistic that will be used to estimate the value of the population parameter; that is, a
procedure for estimating a population parameter or value with a sample (for example, “add up all the
data and divide by the number of observations”, which gives x̄, the sample mean). Estimators are considered
random variables because (or in contexts where) the sample is randomly chosen. For example, the
value we get for a sample average will depend on who happens to be chosen for our sample, analogous to the
randomness of the outcome of rolling a die.
• Estimate: the specific value of the estimator that is obtained in one particular sample; a number obtained
from applying an estimator to a particular sample. This is NOT random.
• Sampling variation: the notion that because samples are chosen randomly, the sample average will vary
from sample to sample, sometimes being larger than the population mean and sometimes lower
• Standard Error – the estimated standard deviation of an estimator. It measures how much we expect the
estimate would vary from one similarly taken sample to another.
– You may recall that the standard error of the sample mean, x̄, is s/√N, where s is the standard
deviation in the data and N is the sample size.
– So for example, in a survey of 2,500 consumers on annual holiday spending with a mean of $750 and
a standard deviation of $1200, the standard error of the mean would be $1200/√2500 = $1200/50 = $24. This means
that in repeated random samples of 2,500 consumers, we would expect x̄ to vary by about $24 from one
sample to another.
• A sample statistic is an unbiased estimator of a population parameter if the mean of the sampling
distribution of this statistic is equal to the value of the population parameter
• Because the mean of the sampling distribution of x̄ is µ, x̄ is an unbiased estimator of µ
• One way of gauging the accuracy of an estimator is with its standard deviation:
– If an estimator has a large standard deviation, there is a substantial probability that an estimate will
be far from its mean
– If an estimator has a small standard deviation, there is a high probability that an estimate will be
close to its mean
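The standard error arithmetic in the holiday spending example can be reproduced in Stata. This is a small sketch using the numbers from the text; the scalar names s, N, and se are illustrative.

```stata
* Holiday spending survey: s = 1200, N = 2500
scalar s  = 1200
scalar N  = 2500
* Standard error of the sample mean: s/sqrt(N) = 1200/50 = 24
scalar se = s/sqrt(N)
display "standard error of the mean = " se
```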
4 Why impossible? Some populations are infinite (e.g., the set of all possible coin tosses) and some are unobservable (e.g., the
“counterfactual”; more on this later).
• The sampling distribution of a statistic is the probability distribution or density curve that describes the
population of all possible values of this statistic
– For example, it can be shown mathematically that if the individual observations are drawn from a
normal distribution, then the sampling distribution for the sample mean is also normal
– Even if the population does not have a normal distribution, the sampling distribution of the sample
mean will approach a normal distribution as the sample size increases
• It can be shown mathematically that the sampling distribution for the sample mean has the following
mean and standard deviation:
\[
\text{Mean of } \bar{x} = \mu \tag{1}
\]
\[
\text{Standard deviation of } \bar{x} = \frac{\sigma}{\sqrt{N}} \tag{2}
\]
3.2.2 Unbiasedness
• Unbiased – said of estimators: when the expected value of the sample estimator is equal to the population
parameter. The sample mean is unbiased, E[x̄] = µ, where µ is the population average, if the data come from
a random sample.
– Upward biased: if the expected value of the estimator is above the true value of the population
parameter; e.g., E[x̄] > µ. Downward biased is the opposite.
• Figure 8 demonstrates the bias-efficiency trade-off with a dartboard.
• Figure 9 demonstrates how distributions change when the mean, the variance, or both are altered, using
climate change as the example.
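The unbiasedness of the sample mean stated above follows from the linearity of expected values. The derivation below is a short sketch, assuming a random sample x_1, ..., x_N in which every observation has expected value µ; the indexed notation is introduced here for illustration.

```latex
\[
E[\bar{x}]
= E\left[\frac{1}{N}\sum_{i=1}^{N} x_i\right]
= \frac{1}{N}\sum_{i=1}^{N} E[x_i]
= \frac{1}{N}\cdot N\mu
= \mu
\]
```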
Figure 8: Bias vs. Efficiency
Figure 9: Change in Mean, Variance, and Mean and Variance Example
3.2.3 The Sampling Variance of Estimators
3.2.4 Efficiency
• Efficient estimator: estimator with the smallest variance; i.e., the estimator whose sampling distribution is
most concentrated around the target parameter.
– For example, if x1 and x2 are two unbiased estimators (unbiasedness being the property we care about most),
then we prefer to use the estimator with the smaller variance.
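A small Monte Carlo experiment can illustrate the comparison of two unbiased estimators. The sketch below is an illustration under stated assumptions rather than part of the original notes: each of 1,000 rows is treated as one sample of 25 draws from N(0,1), and the sample mean and sample median are compared as estimators of the population mean of zero. The seed, the sample size, and the variable names are arbitrary choices.

```stata
clear all
set seed 300
* Each row is one sample; 1,000 samples in total
set obs 1000
* 25 standard normal draws per sample, stored as x1 ... x25
forvalues i = 1/25 {
    generate x`i' = rnormal()
}
* Two estimators of the population mean, computed sample by sample
egen xbar = rowmean(x1-x25)
egen xmed = rowmedian(x1-x25)
* Compare the centers and spreads of the two sampling distributions
summarize xbar xmed
```

Both averages should be close to zero, while the standard deviation of xbar should be near 1/√25 = 0.2 and smaller than that of xmed, so the sample mean is the more efficient of the two for normal data.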
3.3 Interval Estimation and Confidence Intervals
• A confidence interval measures the reliability of a given statistic such as x̄
• The general procedure for determining a confidence interval for a population mean can be summarized as:
1. Calculate the sample average x̄
2. Calculate the standard error of x̄ by dividing the sample standard deviation s by the square root of
the sample size N
3. Select a confidence level (such as 95 percent) and look in Table G.2 with N − 1 degrees of freedom to
determine the t-value that corresponds to this probability
4. A confidence interval for the population mean is then given by:
\[
\bar{x} \pm t^{*} \frac{s}{\sqrt{N}}
\]
• Notably, a confidence interval does not depend on the size of the population
• This may first seem surprising: if we are trying to estimate a characteristic of a large population, then
wouldn’t we also need a large sample?
• The reason why the size of the population doesn’t matter is that the chances that the luck of the draw
will yield a sample whose mean differs substantially from the population mean depends on the size of the
sample and the chances of selecting items that are far from the population mean
– That is, not on how many items there are in the population
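The four-step confidence interval procedure can be sketched in Stata. The numbers below reuse the holiday spending survey from earlier in these notes (mean 750, standard deviation 1200, N = 2,500); applying them to a 95 percent interval is an illustration added here, and the scalar names are arbitrary.

```stata
* Steps 1 and 2: sample average and its standard error
scalar xbar  = 750
scalar s     = 1200
scalar N     = 2500
scalar se    = s/sqrt(N)
* Step 3: t-value for 95% confidence with N - 1 degrees of freedom
scalar tstar = invttail(N - 1, .025)
* Step 4: lower and upper limits of the interval
scalar lower = xbar - tstar*se
scalar upper = xbar + tstar*se
display "95% confidence interval: " lower " to " upper
```

With this many degrees of freedom the t-value is close to 1.96, so the interval is roughly 750 ± 47. Note that the population size never enters the calculation.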
4 Using Stata to Do Monte Carlo Experiments
4.1 Generating Uniformly Distributed Random Numbers
A continuously distributed random variable that is equally likely to take any value between zero and one has a
standard uniform probability distribution. Such a variable can be created in Stata with the uniform() function.
Generate 10,000 draws from a standard uniform distribution and inspect the results in the browser.
set obs 10000
gen u=uniform()
browse
Listing 3: Generating Uniformly Distributed Random Numbers
Stata’s uniform random number generator returns a number between zero and one, exclusive of one itself.
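Rather than inspecting 10,000 values in the browser, the draws can be summarized. The sketch below makes two assumptions that are not in the original listing: it sets a seed (the value 12345 is arbitrary) so that the draws are reproducible, and it uses runiform(), the documented name of the uniform generator in recent Stata versions.

```stata
clear
* Fix the seed so the same draws are produced on every run
set seed 12345
set obs 10000
generate u = runiform()
* Mean should be near 0.5 and the standard deviation near 0.289
summarize u
```

The reported minimum and maximum should both lie inside the unit interval, consistent with the description of the generator above.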
4.1.1 Functions
uniform is the name of a function in Stata. Function names in Stata must always be followed by an open
parenthesis with no intervening spaces. Why no intervening spaces? Because otherwise Stata will think the name
is the name of a variable, and not a function. The pair of parentheses surrounds the argument or arguments of
the function. In this case the uniform function has no arguments, but the parentheses are needed anyway. If a
function has more than one argument, then the arguments are separated by commas.
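The calling conventions just described can be tried with display. The functions below were chosen only as illustrations of zero, one, and several arguments.

```stata
* No arguments: the parentheses are still required
display uniform()
* One argument
display sqrt(16)
* Several arguments, separated by commas
display max(3, 7, 5)
display normalden(0, 1, 0.8)
```

The second command displays 4 and the third displays 7; the first produces a different random number on each call unless a seed is set.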
4.1.2 Graph the Uniformly Distributed Random Variable
Graph the random variable by issuing the following command:
histogram u, bin(40)
Listing 4: Graph the Uniformly Distributed Random Variable
For continuous variables, Stata has an internal rule for deciding how many bars (or bins) to use in constructing
the graph; the bin(40) option overrides that rule and requests 40 bins. Notice that the density of the random
variable is essentially constant throughout its range, which is why the distribution has the name “uniform”.
4.1.3 Generating a Discrete Random Variable: Example Simulating the Rolls of a Die
The uniform random number generator is a building block for creating virtually any random variable. We will
illustrate this by using it to simulate rolling a die. The following commands simulate ten rolls of a die. First,
however, the number of observations in Stata must be set to ten, and in order to do this the memory must be
cleared.
clear
set obs 10
gen x=int(6*uniform())+1
browse
Listing 5: Generating a Discrete Random Variable
Here is what the generate command does. First, uniform() returns a uniformly distributed random number
between 0 and 1, not including one itself. 6*uniform() therefore returns a random number between 0 and 6, not
including 6 itself. This becomes the argument to the int() function, which truncates the fractional part from the
number, returning an integer between 0 and 5 inclusive – “int” is short for integer. For example, int(5.999999)
becomes 5. Finally, 1 is added to this, giving integers between 1 and 6 inclusive. Since uniform() results in
numbers uniformly distributed between zero and one, the six final integers assigned to the variable x also have
equal probability.
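The individual steps of the generate command can be checked one at a time. This is a minimal sketch; the seed is an arbitrary assumption added for reproducibility, and runiform() is used as the documented name of the uniform generator in recent Stata versions.

```stata
* int() truncates the fractional part
display int(5.999999)
* A draw just below one maps to the largest face, 6
display int(6*0.999999) + 1
* Ten simulated rolls and their frequencies
clear
set seed 300
set obs 10
generate x = int(6*runiform()) + 1
tabulate x
```

The first two commands display 5 and 6. With only ten rolls, the tabulated frequencies will usually be far from one-sixth each.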
4.1.4 Graph the Uniformly Distributed Random Variable
Graph the random variable by issuing the following command:
histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=10) name(g1, replace)
Listing 6: Graph the Uniformly Distributed Random Variable
The horizontal grid line is drawn at one-sixth, the probability of any particular side of the die facing up.
4.2 The Law of Large Numbers and the Frequentist Notion of Probability
The limit in the frequentist notion of probability is the law of large numbers, that is, as the sample size – or
number of trials – increases towards infinity, the sample proportion favorable to an event approaches – “settles
down to” – the probability of the event. We will illustrate this by increasing the number of rolls of the die, and
noticing that the sample distribution of outcomes settles down to the theoretical discrete uniform distribution of
one-sixth probability for each side of the die.
In order to do this, repeat the commands above, each time changing the number of observations to be 50,
then 200, then 10,000. Also, change the name for each graph as indicated below. The easiest way to do this is
to single-click on each command in the Review window, and then edit it in the Command window as necessary.
clear
set obs 50
gen x=int(6*uniform())+1
histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=50) name(g2, replace)

clear
set obs 200
gen x=int(6*uniform())+1
histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=200) name(g3, replace)

clear
set obs 10000
gen x=int(6*uniform())+1
histogram x, discrete fraction yline(.166667) xlabel(#6) title(n=10000) name(g4, replace)
Listing 7: Law of Large Numbers
Finally, view all the graphs together with the following command:
graph combine g1 g2 g3 g4, title(Law of Large Numbers)
Listing 8: Law of Large Numbers
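The repeated blocks above can also be written as a loop in a do-file. The sketch below is an alternative to editing commands by hand, not the method used in the notes; it names the graphs after the sample size (g10, g50, g200, g10000) instead of g1 to g4, and the seed is an arbitrary assumption.

```stata
set seed 300
foreach n in 10 50 200 10000 {
    * clear drops the data but keeps graphs already held in memory
    clear
    set obs `n'
    generate x = int(6*runiform()) + 1
    histogram x, discrete fraction yline(.166667) xlabel(#6) ///
        title(n=`n') name(g`n', replace)
}
graph combine g10 g50 g200 g10000, title(Law of Large Numbers)
```

As the sample size grows, the bars in the combined graph should settle toward the horizontal line at one-sixth.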
4.3 Tossing Coins with Stata
We can use the random uniform distribution generator in Stata to simulate coin tossing:
set obs 10000
gen x = 1 if uniform() > .5
replace x = 0 if x == .
Listing 9: Tossing Coins
Checking the binomial distribution: Treat these 10,000 flips as 1000 “experiments” in which we toss the coin
ten times and record the number of heads each time.
gen casino = _n
gen trial = int((casino-1)/10)
egen heads = sum(x), by(trial)
histogram heads, bin(11)
tab heads
Listing 10: Tossing Coins
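To check the simulated frequencies against the binomial distribution, the theoretical probabilities for ten tosses of a fair coin can be listed next to the tabulation. This is a sketch using Stata's binomialp function, which returns the probability of exactly k successes; the loop itself is an addition for illustration.

```stata
* Theoretical probability of k heads in 10 tosses of a fair coin
forvalues k = 0/10 {
    display "P(heads = " `k' ") = " binomialp(10, `k', .5)
}
```

For example, the probability of exactly five heads is 252/1024, or about 0.246, which the percentage reported by tab heads for five heads should approximate.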
Non Sequitur: A paper by Diaconis et al. (2007) used a high-speed camera to photograph people flipping
coins. The three researchers determined that a coin is more likely to land facing the same side on which it
started. If tails is facing up when the coin is perched on your thumb, it is more likely to land tails up. The
authors suggest that in coin-tossing there is a particular “dynamical bias” that causes a coin to be slightly
more likely to land the same way up as it started. How much more likely? At least 51 percent of the time, the
researchers claim, and possibly as much as 55 percent to 60 percent, depending on the flipping motion of the
individual.
5 Principles of Economics: A Review
Economics is the study of mankind in the ordinary business of life.
– Alfred Marshall
Figure 10: Dilbert Company Economist
. . . the dismal science
– Thomas Carlyle
• Economics: The social science that studies the choices that individuals, businesses, governments, and
entire societies make as they cope with scarcity and the incentives that influence and reconcile those
choices.
• Scarcity: Our inability to satisfy all our wants.
• Incentive: A reward that encourages an action or a penalty that discourages one.
5.1 Non Sequitur: Random Famous People Who Majored in Economics
5. John Elway – Hall of Fame NFL quarterback (Stanford)
4. Bob Barker – TV Game Show host on The Price Is Right (Drury College)
3. Mick Jagger – Rolling Stones (London School of Economics)
2. William Shatner – “Khaaaan!” (McGill University)5
1. Arnold Schwarzenegger – Body Builder/Actor/Governor/Terminator (University of Wisconsin)
5.2 Microeconomics versus Macroeconomics
• Microeconomics: The study of the choices that individuals and businesses make, the way these choices
interact in markets, and the influence of governments.
• Macroeconomics: The study of the performance of the national economy and the global economy.
5.3 Supply & Demand Basics
• Demand
1. Law of Demand: ceteris paribus, the higher the price of a good, the smaller is the quantity demanded
of it; the lower the price of a good, the larger is the quantity demanded of it.
2. Elasticity of Demand: Measures the change in quantity demanded in response to a change in price.
3. Determinants (Shifters)
(a) Tastes or Preferences
(b) Number of Buyers
(c) Income
– Normal
– Inferior
(d) Prices of Related Goods
– Complement
– Substitute
(e) Consumer Expectations
• Supply
1. Law of Supply: ceteris paribus, the higher the price of a good, the greater is the quantity supplied of
it.
2. Determinants (Shifters)
(a) Resource (Input) Prices
(b) Technology
(c) Taxes and Subsidies
(d) Prices of Other Goods
(e) Producer Expectations
(f) Number of Sellers
• Supply & Demand
1. Movement versus Shift
2. Equilibrium
3. Price Controls
– Price Ceiling
– Price Floor
– Binding versus Unbinding
– Surplus
– Shortage
5 Non sequitur: If you have not experienced William Shatner’s version of “Rocketman”, then please treat yourself.
5.4 Secondary (Tertiary) Effects
• Secondary effects: unintended consequences of economic actions that may develop slowly over time as
people react to events
• Examples:
1. In many cities, public officials have imposed rent controls on apartments. The primary effect of this
policy, the effect policy makers focus on (i.e. the one they sell to the public), is to keep rents from
rising. Over time, however, fewer apartments get built because renting them becomes less profitable.
Moreover, existing rental units deteriorate because owners have plenty of tenants anyway. Thus,
the quality and quantity of housing may decline as a result of what appears to be a reasonable
measure to keep rents from rising. Another issue is ignoring how landlords would respond in legally
circumventing the policy; e.g., tacking on ’required’ extras to the apartment such as curtains that the
tenant must rent from the landlord for $250 a month in addition to the rent.
2. A macro related example is Congress passing the Smoot-Hawley Tariff Act of 1930 during the Great
Depression. At first the tariff seemed successful. However, the United States’ trading partners
retaliated. Overall, world trade decreased by approximately 66% between 1929 and 1934!
5.5 Marginal Analysis & Utility
• Marginal analysis: an examination of the effects of additions to or subtractions from a current situation
• Utility: a measure of pleasure or satisfaction
• See Figure 11 for an example.
Figure 11: Utility
5.6 Diminishing (Marginal) Returns
• (law of) diminishing (marginal) returns: The principle that as successive increments of a variable resource
are added to a fixed resource, the marginal product of the variable resource will eventually decrease.
• See Figure 12 for an example.
Figure 12: Diminishing Returns
5.7 Positive vs. Normative
• Normative (questions) economic analysis: addresses questions that involve value judgments. It concerns
what ought to happen rather than what did, will, or would happen.
• Positive (questions) economic analysis: addresses factual questions, usually concerning choices or outcomes.
It concerns what did, will, or would happen.
• Economics could convince reasonable people of the truth of the “positive” (“what is”) predictions, but
economics cannot end a “normative” (“what should be”) disagreement.
• Note: It is not always possible, or at least not always easy, to separate normative from positive statements.
That said, this course focuses nearly exclusively on positive economic analysis; i.e., the did, will, or
would. In answering any exam or problem set question I do not care about my, your, or someone else’s
opinion.
• Note: The word positive does not mean that the answer admits no doubt. On the contrary, all answers,
particularly those involving predictions, involve some degree of uncertainty. Rather, in this context,
positive simply means that the prediction concerns a factual matter.
5.8 Principles of Macroeconomics: A Review
• Gross Domestic Product (GDP)
1. Intermediate Goods: Products that are purchased for resale or further processing or manufacturing.
2. Gross Domestic Product (GDP): The market value of all final goods and services produced within a
country during a given time period.
3. Gross national product (GNP): The value of final goods and services produced by residents of a
country, even if the production takes place outside the country.
4. Expenditure (Output) Approach: The method that adds all expenditures made for final goods and
services to measure the gross domestic product.
(a) Personal Consumption Expenditures
– durable goods (lasting 3 years or more)
– nondurable goods
– services
(b) Gross Private Domestic Investment
– All final purchases of machinery, equipment, and tools by business enterprises.
– All construction.
– Changes in inventories.
(c) Government Purchases
– Includes spending by all levels of government (federal, state and local).
– Includes all direct purchases of resources (labor in particular).
(d) Net Exports
– Exports minus Imports
– Imports: Spending by individuals, firms, and governments for goods and services produced
in foreign nations.
– Exports: Goods and services produced in a nation and sold to buyers in other nations.
5. Bureau of Economic Analysis (BEA): An agency of the U.S. Department of Commerce that compiles
the national income and product accounts.
6. Real versus Nominal
– Real GDP: The value of final goods and services produced in a given year when valued at the
prices of a reference base year.
– Nominal GDP: The value of the final goods and services produced in a given year valued at the
prices that prevailed in that same year. It is a more precise name for GDP.
7. Real GDP per person (capita): Real GDP divided by the population.
8. Criticisms
(a) Household production
(b) Underground economic activity
(c) Health and life expectancy
(d) Leisure time
(e) Environmental quality
(f) Political freedom and social justice
• The Price Level & Inflation
1. Price level: The average level of prices.
2. Inflation [Deflation]: A persistently rising [falling] price level.
3. Why Is Inflation [Deflation] Problematic?
(a) Redistribution of Income
(b) Redistribution of Wealth
(c) Lowers Real GDP and Employment
(d) Diverts Resources from Production
4. Hyperinflation: An inflation rate of 50 percent a month or higher that grinds the economy to a halt
and causes a society to collapse.
5. Measurement
(a) Consumer Price Index (CPI): An index that measures the average of the prices paid by urban
consumers for a fixed basket of the consumer goods and services.
(b) GDP Deflator
(c) Personal Consumption Expenditure (PCE) Deflator
6. Cost-of-living adjustment (COLA): An automatic increase in the incomes (wages) of workers when
inflation occurs; guaranteed by a collective bargaining contract between firms and workers.
• Unemployment
1. Working-age population: The total number of people aged 16 years and over who are not in jail,
hospital, or some other form of institutional care.
2. Labor force: The sum of the people who are employed and who are unemployed.
3. Unemployment rate: The percentage of the people in the labor force who are unemployed.
4. Labor force participation rate: The percentage of the working-age population who are members of
the labor force.
5. Types
– Frictional Unemployment: The unemployment that arises from normal labor turnover: from
people entering and leaving the labor force and from the ongoing creation and destruction of
jobs.
– Structural Unemployment: The unemployment that arises when changes in technology or
international competition change the skills needed to perform jobs or change the locations of
jobs.
– Cyclical Unemployment: The higher than normal unemployment at a business cycle trough and
the lower than normal unemployment at a business cycle peak.
6. Criticisms
– Part-time employment
– Discouraged worker: A marginally attached worker who has stopped looking for a job because of
repeated failure to find one. Former workers who have left the labor force because they have not
been able to find employment.
7. Natural unemployment rate: The unemployment rate when the economy is at full employment; that is, natural
unemployment as a percentage of the labor force.
8. Full employment: A situation in which the unemployment rate equals the natural unemployment
rate. At full employment, there is no cyclical unemployment: all unemployment is frictional and
structural.
• Macroeconomic Policy
– Fiscal Policy
∗ Expansionary Fiscal Policy: An increase in government purchases of goods and services, a
decrease in net taxes, or some combination of the two for the purpose of increasing aggregate
demand and expanding real output.
∗ Contractionary Fiscal Policy: A decrease in government purchases for goods and services, an
increase in net taxes, or some combination of the two, for the purpose of decreasing aggregate
demand and thus controlling inflation.
∗ Discretionary Fiscal Policy: Deliberate changes in taxes (tax rates) and government spending by
Congress to promote full employment, price stability, and economic growth.
– Monetary Policy
∗ The Fed’s Policy Tools
· Open market operation (OMO): The purchase or sale of government securities - U.S. Treasury
bills and bonds - by the Federal Reserve in the loanable funds market.
· Quantitative easing: buying securities beyond the short-term Treasury securities that are
usually involved in open market operations. The Fed began purchasing 10-year Treasury
notes to keep their interest rates from rising. Interest rates on home mortgage loans typically
move closely with interest rates on 10-year Treasury notes. The Fed also purchased certain
mortgage-backed securities (MBS). The Fed’s objectives in keeping interest rates on 10-year
Treasury notes low and in purchasing mortgage-backed securities were to keep interest rates on
mortgages low and to keep funds flowing into the mortgage market in order to help stimulate
demand for housing.
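Several of the accounting definitions in this macroeconomics review can be collected as formulas. The block below is only a restatement of the verbal definitions given above; the letters C, I, G, X, and M (consumption, investment, government purchases, exports, and imports) are shorthand introduced here and do not appear in the original notes.

```latex
\begin{align*}
\text{GDP} &= C + I + G + (X - M) \\
\text{Real GDP per person} &= \frac{\text{Real GDP}}{\text{Population}} \\
\text{Labor force} &= \text{Employed} + \text{Unemployed} \\
\text{Unemployment rate} &= \frac{\text{Unemployed}}{\text{Labor force}} \times 100 \\
\text{Labor force participation rate} &= \frac{\text{Labor force}}{\text{Working-age population}} \times 100
\end{align*}
```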
References
Gujarati, D. N. (2003). Basic Econometrics. McGraw-Hill/Irwin.
Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach. Thomson South-Western.