Economics 323 Business Conditions Analysis Spring 2016
Dr. H. H. Stokes
Economics 323 (Version 9 November 2015)
Business Conditions Analysis
Dr. Houston H. Stokes
722 UH
hhstokes@uic.edu
Web Page www.uic.edu/~hhstokes
TA:
Required Text
Business Forecasting, John Hanke, Dean Wichern 9th ed Pearson / Prentice Hall, 2009
Optional References
Data Analysis Using Stata by Ulrich Kohler and Frauke Kreuter. 3rd Edition, Stata Press, 2012.
Introduction to Time Series Using Stata by Sean Becketti. Stata Press, 2013
Introduction to Econometrics by Christopher Dougherty, 4th edition, Oxford University Press, 2011.
Introductory Econometrics: A Modern Approach by Jeffrey M. Wooldridge, 5th edition, 2013, South
Western Cengage Learning. This book is a bit expensive but is available from the Library.
“Notes on the Basics of Econometric Modeling geared to Dougherty (2011)" by Houston H. Stokes, is
available on-line from the course web page and will be discussed in class. See file
Preliminary_Notes.docx
“Econometric Notes” by Houston H. Stokes contains some sections on important topics. It also provides
an introduction to Statistics. See especially sections 1-3 and 5 which we will initially discuss. This
document is available from the course web page. See file
Econometric_Notes.docx
Purpose of Course
The purpose of the course is to extend the student’s knowledge of Business Conditions Analysis by
teaching statistical methods of business forecasting. This includes forecasting micro and macro data.
Students will work on the computer in class and gain experience with a number of software systems. Students will
be introduced to OLS and GLS analysis, nonlinear estimation of GLS models, recursive residual analysis
options to test the stability of the estimated coefficients, and simple time series ARIMA models.
Students will be asked to apply their knowledge in a number of computer exercises, which will be
graded, and a final exam, which is a take home. The grading will be 2/3 computer exercises and 1/3 open
book final or an approved final project. Students can work in teams of two people but must turn in an
individual paper with their name on it. If working in a team, be sure to indicate your team member on the
cover sheet of your homework. Once a team is formed, it must stay together for the term. Students must
have selected their teams by the third week or operate as a "lone wolf" by choice. Students are free to use
software of their choosing. Stata and B34S will be discussed in the course as well as some Excel. The goal
is to both teach how to do applied economic work and to allow students to list software they can use on
their resumes. Students wanting free B34S software systems for their home machines will be allowed to
obtain them.
Objectives
The main objective of the course is to give students knowledge that will allow them to apply the
economic theory they have learned in other courses in a manner that will both be useful and will facilitate
them obtaining a job. Unless economic knowledge is able to be applied, it is of substantially less value and
will soon be forgotten. Since forecasting provides a competitive edge, a great deal of emphasis is placed on
developing systematic forecasting skills. Systematic forecasting skills are those that can be replicated by
others. Many students taking this class in prior years have used their projects in job interviews to illustrate
their skills and to differentiate themselves from other job seekers.
Jobs
Many of you will want to obtain jobs that use your economics training. The best way to
make a good impression at the job interview is to differentiate your skills from the skills of the other
applicants. It has been found that students whose resumes stress that they have completed a
successful econometric research project in the area of the job may have a "leg up" on applicants
who appear clueless at the interview stage. To achieve this end, students can mention the types of
problems they have analyzed and the software they have used on their resumes. The project alternative is
in place of the take home final and would be done by each individual student.
Computer Software
For many of you this will be your first time using the computer for serious work apart from the
basics you have learned in the required stat courses. It is important that you do not shrink from this
aspect of the course. Computer work is exacting and is necessary for research. The problem sets can be
run on your own machine or from the labs where it is possible to open the class web pages and cut and
paste commands into control files. Many students find it helpful to work together on their computer
projects since they will be able to explain to each other what they are trying to do and learn to express
themselves. What is important is to discuss results clearly and with understanding. Attaching masses of
computer output is NOT a substitute for writing in a clear fashion. In business stress is laid on being
clear and specific in what you have found. While many of the problems can be solved using Excel many
cannot be solved using this limited software. For this reason we will be using Stata for much of our
work.
The Stata examples in this document provide a quick introduction to the software and will be
discussed in class first to get you up and running. Since this document is on-line, you will be able to cut
and paste the control files and run models very quickly.
Kohler & Kreuter (2012) provides more Stata detail, if that is needed. A student version of Stata
is available from the UIC computer center for a nominal charge or can be accessed from various UIC
labs.
Stata is available in SCE408, SELE 2249F, SELE 2249, and BSB 4133.
Students can obtain a copy of Stata that expires on 6/30/2015 for $90.00 from
https://webstore.illinois.edu/Shop/search.aspx?keyword=Stata
Advanced references are available for more detail on forecasting as needed:
1. Chapter 2 of Specifying and Diagnostically Testing Econometric Models Houston H. Stokes, 2nd ed,
Quorum Books 1997. Updated version available on line from web page.
2. Chapter 7 of Specifying and Diagnostically Testing Econometric Models Houston H. Stokes, 2nd ed,
Quorum Books 1997 Updated version available on line from web page.
3. Chapter 9 of Specifying and Diagnostically Testing Econometric Models Houston H. Stokes, 2nd ed,
Quorum Books 1997. Updated version available on line from web page.
4. Stokes (200x) The Essentials of Time Series Modeling: An Applied Treatment with Emphasis on
Topics Relevant to Financial Analysis Chapter 2 "Time series modeling objectives" available on line
from web page.
5. Stokes (200x) The Essentials of Time Series Modeling: An Applied Treatment with Emphasis on
Topics Relevant to Financial Analysis Chapter 4 "Stationary Time Series Models" available on line
from web page.
6. Stokes (200x) The Essentials of Time Series Modeling: An Applied Treatment with Emphasis on
Topics Relevant to Financial Analysis Chapter 5 "Estimation of AR(p), MA(q) and ARMA(p,q) Models"
available on line from web page.
Quick Start: To get going, assume you have a file of data on the age of 6 cars and their value:
age   value
  1    1995
  3     875
  6     695
 10     345
  5     595
  2    1795
Your goal is to estimate a model of the form
$Value = \alpha + \beta \, Age$
Your results should show
$Value = 1852.9 - 178.41 \, Age, \qquad \bar{R}^2 = .672$
with t-statistics of 6.45 for the constant and -3.35 for the Age coefficient,
which has been discussed in the overview notes on statistics.
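As a check on how the pieces of this result fit together, the worked equations below recover the adjusted R-squared and the two t-statistics. This is a sketch that assumes n = 6 observations and k = 2 estimated coefficients, and it takes the R-squared (0.7376), coefficients and standard errors from the Stata log reproduced later in this section.

```latex
\begin{align*}
\bar{R}^2 &= 1 - (1 - R^2)\,\frac{n-1}{n-k}
           = 1 - (1 - 0.7376)\,\frac{6-1}{6-2} \approx 0.672 \\
t_{\hat\alpha} &= \frac{\hat\alpha}{se(\hat\alpha)} = \frac{1852.85}{287.3469} \approx 6.45 \\
t_{\hat\beta} &= \frac{\hat\beta}{se(\hat\beta)} = \frac{-178.4112}{53.20631} \approx -3.35
\end{align*}
```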
A simple Stata setup for this problem is
input x y
1 1995
3 875
6 695
10 345
5 595
2 1795
end
list
summarize
regress y x
which, if saved with the name cars.do, can be run in batch with the command
stata /e do cars.do
which produces in cars.log
  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   12.1   Copyright 1985-2011 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)

Single-user Stata perpetual license:
       Serial number:  3012042652
         Licensed to:  Houston H. Stokes
                       U of Illinois

Notes:
      1.  Stata running in batch mode

. do cars.do

. input x y

             x          y
  1. 1 1995
  2. 3 875
  3. 6 695
  4. 10 345
  5. 5 595
  6. 2 1795
  7. end

. list

     +-----------+
     |  x      y |
     |-----------|
  1. |  1   1995 |
  2. |  3    875 |
  3. |  6    695 |
  4. | 10    345 |
  5. |  5    595 |
     |-----------|
  6. |  2   1795 |
     +-----------+

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           x |         6         4.5    3.271085          1         10
           y |         6        1050    679.5219        345       1995

. regress y x

      Source |       SS       df       MS              Number of obs =       6
-------------+------------------------------           F(  1,     4) =   11.24
       Model |  1702935.05     1  1702935.05           Prob > F      =  0.0285
    Residual |  605814.953     4  151453.738           R-squared     =  0.7376
-------------+------------------------------           Adj R-squared =  0.6720
       Total |     2308750     5      461750           Root MSE      =  389.17

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |  -178.4112   53.20631    -3.35   0.028    -326.1356   -30.68683
       _cons |    1852.85   287.3469     6.45   0.003     1055.048    2650.653
------------------------------------------------------------------------------

.
end of do-file
An alternative is to place this file in the Stata do file editor and execute it!
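For readers who want to see where the numbers in the regression output come from, the following is a minimal sketch of the same least-squares calculation written out by hand. Python is an assumption here (the course itself uses Stata), and the function name `ols_simple` and the dictionary keys are illustrative, not part of any course software.

```python
import math

# Car data from the Quick Start: x = age, y = value.
age = [1, 3, 6, 10, 5, 2]
value = [1995, 875, 695, 345, 595, 1795]


def ols_simple(x, y):
    """Fit y = a + b*x by ordinary least squares and return summary statistics."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)                       # variation in x
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # co-variation
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in resid)            # residual sum of squares
    tss = sum((yi - ybar) ** 2 for yi in y)     # total sum of squares
    s2 = rss / (n - 2)                          # residual variance, 2 coefficients
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1 / n + xbar ** 2 / sxx))
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
        "r2": r2,
        "adj_r2": adj_r2,
    }


fit = ols_simple(age, value)
for name, number in fit.items():
    print(f"{name:13s} {number:12.4f}")
```

The printed values can be compared line by line with the Coef., Std. Err., t, R-squared and Adj R-squared entries in cars.log.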
Data is saved on-line in the Excel directory. These can be accessed by Stata or Excel from your PC.
APPL        Appliance Shipments Monthly 1983-1993
BEER        Monthly Beer Production 1983-1993
BOATS       Demand for Boats
CALDATA     Monthly Sales of Electronic Calculators
CAMPERS     Camp Development Decision
CAR         Monthly Deliveries of Dorf Company
CASE10_1    Weekly Restaurant Sales
CASE10_4    Lydia Pinkham Annual Data 1907-1960
CASE2_2     Mr. Tux Rental Data
CASE4_1     Monthly Sales Data 1983-1995
CASE4_3     Consumer Credit Counseling Clients
CASE5_1     Solar Alternative Company Sales 93-94
CASE6_1     Tiger Transport Company - Weight & MPG
CASE6_2     Butcher Products Production Data
CASE6_3     Ace Personnel Company
CASE6_5     Credit clients
CASE7_1     Bond Market Dataset
CASE7_3     Mr. Tux with Dummy Variables
CASE7_4     Consumer Credit Counseling Extended Data
CASE8_1     Small Engine Doctor Data
CASE8_2     Case Study 8_2 Tux Data
CASE8_4     AAA Emergency Road Call Data 1988 - 1993
CASE9_5     Full AAA Emergency Data Jun 87 - Jul 93
CENEX       Cenex Chemical Process
COMPANY     Data Not Found
DAY90       Quarterly 90-day Treasury Bills
DISINC      Disposable Income 1955 - 1985
EDPRICE     Price of Higher Education 70-93
EGGS        Production of Eggs 1961-1992
EMPLOYEE    Employee Study
EX10_3      Daily Transportation Index Close
EX10_4      Readings from Atron Process
EX10_5      Errors of Atronm Quality Control
EX10_6      Errors for Ed Jones Quality Control
EX10_7      Closing Stocks of ISC
EX10_8      Keytron Sales
EX4_1       VCR's Sold
EX4_3       40 Random Numbers
EX4_4       Sears Sales
EX4_5       Outboard Marine 1984-1996
EX5_1       Acme Tool Company
EX5_4       Weekly Movie Video Rentals
EX6_1       Milk Sales
EX6_11      Hardware Advertising and Sales
EX7_1       Milk vs Price vs Advertising
EX7_12      Washington Power Usage
EX7_5       Job Performance
EX7_7       Food Expenditure
EX7_8       Zurenko Pharmaceutical Company
EX8_10      Quarterly Sales Outboard Marine
EX8_2       New Passenger Cars in the United States 60-92
EX8_8       Monthly Registration of Cars 1986-1992
EX9_1       Reynolds Metals 1976-1996
EX9_3       Novak Corp Sales 1980-1996
EX9_4       Yearly Sears Sales 1976-1966
FARMS       Number of US Farms 1975-1993
FORREST     Forest Products Car Loadings
FURNACE     Shipments of Furnaces 1982-1990
GNP         GNP 1950 - 1991
HANKEINFO   Lists of Hanke Datasets
HOMES       Demand for Motor Homes
HOUSEHOLD   Household vs populations
IMPORTS     National Imports for the years 1967 to 1986
INVEST      Non residential Investment 1950-1989
KINSTON     Monthly Sales of Kinston
MARRIAGE    Number of Marriages 1965-1989
MEDIAN      Population Dataset
MOODY       Electric Utility Stocks Annual Ave
MOTEL       Monthly Occupancy For Model 9 1987-1996
PAPER       Monthly Demand for Paper Products
PE          Quarterly Industrial P/E Ratio
PERFUME     Monthly Demand for Perfume
PR10_10     80 obs of data
PR10_11     96 obs
PR10_12     IBM Stock Quotes
PR10_13     Daily DEF Corporation stock
PR10_14     Weekly Auto Accidents 1984-1985
PR10_15     Corn Price in Spokane Washington
PR10_7      Chips Bakery
PR10_8      126 obs of test data
PR10_9      80 Obs of test data
PR2_1       Customer Transactions
PR2_10      Books Sold vs Shelf Space
PR2_13      Random Stock Vs Temp Data
PR2_2       Housing Prices
PR2_7       Family Sizes
PR2_9       Maintenance of Buses
PR4_13      Marriages in US 85 - 92
PR4_17      Quarterly Loans Dominion Bank
PR4_20      Earning Per Share Price Company
PR5_11      Demand for Hughes Supply
PR5_12      Asset Value General American Investors
PR5_13      Revenues from Southdown Inc
PR5_14      Triton Sales per share
PR5_15      Revenues for the Consolidated Edison Company
PR5_6       Apex Mutual Fund Price
PR5_9       Davenport Bond Yield
PR6_11      Building Permits vs Interest Rate
PR6_12      Print Cost Data
PR6_13      Defective parts vs batch size
PR6_3       Sales vs Advertising
PR6_4       Checkout time vs Value of Purchase
PR6_5       Maintenance cost vs age
PR6_6       Books Sold vs Shelf Space
PR6_7       Orders vs Catalogues
PR6_8       Yearly Investment vs Interest Rate
PR6_9       Forecasting a Competitor bid
PR7_10      Sales = f(# outlets, # cars)
PR7_11      # Registered autos = f( )
PR7_13      What Makes a winning baseball team
PR7_15      Presto Sales
PR7_8       Checkout time vs Purchase & # items
PR8_11      Capital Spending 1977 - 1993
PR8_13      Spending on TV Ads
PR8_19      Goodyear Tires Sales 1985-1996
PR8_20      Retail Sales Data
PR9_10      Gas Consumption in the US
PR9_11      Resort Study
PR9_12      Revenue Data
PR9_13      Shareholder Data
PR9_14      Passengers who flew on Thompson Airline planes
PR9_15      Thomas Furniture Company sales
PR9_16      Dicksen vs Industry Sales
PR9_17      Savings in period 1935 - 1954
PRES        New Prescriptions 1990-1996
PRIME1      Quarterly Prime Rate 1985-1991
PRIME2      Monthly average prime 1945-1995
RAILROAD    Railroad Labor Annual Average Cents Per Hour
REFILL      Monthly refill data 1983-1990
RIDERSHIP   Daily Bus Ridership
SALARY      Age vs Salary Data
SEARINC     Sears Sales 1955 - 1985
STATIONS    Number of TV stations changing hands in 1991
STOCK       S & P Monthly 1945-1995
TRAVEL      US Citizen departures 1961-1991
WASH        Boats - Motor Homes - Income
WELLHEAD    Average Wellhead Price, Natural gas 72-93
WOOD        Monthly Wood Production 1992-1996
Problem Sets, Final/Project:
There are 5 problem sets. These will be due on the 4th, 6th, 9th, 11th and 13th week for problem sets
1 - 5 respectively. Late submissions will not be accepted unless there is prior written approval. Problem set
answers should be typed and carefully laid out. Since they will be 2/3 of the grade in the course, care should
be taken in their preparation. If past classes are any indication, an impressive layout can be leveraged in a
subsequent job interview to illustrate your capability to solve "real world" problems using modern methods
of analysis. The remaining 1/3 of the grade is either the student project which involves obtaining data from
the web or other sources and estimating a regression and ARIMA model OR taking the take home final.
The objective of the project is to give students a paper which illustrates their capabilities which they can
use in job interviews to further show their capability. If you have a job goal in one specific industry, it is a
good idea to select a paper topic that is related to this area. Further information on the project is contained
below.
Assignments
I. Introduction to Data Analysis
   Hanke-Wichern Chapters 1-2
   Stokes [5], [6]
II. Regression Analysis
   Hanke-Wichern Chapter 6, 7
   Stokes Econometric_Notes
   Stokes (1997) Chapter 2 listed as [2]
   Stokes (2004) Chapter 2 listed as [9]
III. Recursive Residual Analysis
   Stokes (1997) Chapter 9 listed as [4]
IV. ARIMA Model Building - Identification and estimation.
   Stokes (1997) Chapter 7 listed as [3]
   Stokes (2004) Chapter 4 "Stationary Time Series Models" listed as [10]
   Stokes (2004) Chapter 5 "Estimation of AR(p), MA(q) and ARMA(p,q) Models" listed as [11]
   Hanke-Wichern Chapter 8 and 9
Problem Sets
1. Introductory Statistical Analysis - Due 4th week
2. Estimation and Testing of Regression Models. Due 6th week.
3. Applied Econometric Analysis. Due 9th week
4. Identification of ARIMA Models using real and generated data. Due 12th week
5. Estimation of ARIMA Models. Due 14th week.
Terms and concepts to understand
OLS Model – define, state assumptions and discuss the effect on the estimates or standard errors if the
assumptions of OLS are not met
Multicollinearity
Simultaneity
Exogenous, Endogenous
Differences-in-Differences model – define and discuss interpretation
Regression discontinuity
Proxy variable
Probit, logit and tobit model – define and show how used.
Panel Data – define and show the advantages and disadvantages of the fixed effects model and the random
effects model.
Instrumental variable - Define and show how used. Be able to discuss and give examples.
Effects of serial correlation, heteroskedasticity and model specification on estimated coefficients, estimated
standard errors and the ability of a model to accurately draw inferences.
Population
Sample
Sample selection bias.
Datasets:
Datasets for the problem sets are on-line under the course web page.
Datasets for SAS are in the class FTP location in
ftp.uic.edu/pub/depts/econ/hhstokes/e323/sas_files/
Datasets for Excel / Stata are in the class FTP location in
ftp.uic.edu/pub/depts/econ/hhstokes/e323/excel_files/
Problem Set # 1 Introductory Statistical Analysis
Goals: Introduction to Computer
Use Data Sampling
1. Hanke-Wichern (2009) page 50 # 12 lists data on the number of books sold (BSOLD) and the feet of
shelf space (SHELFS). You are asked to calculate the correlation between BSOLD and SHELFS. In addition, run
a regression of BSOLD = f(constant, SHELFS). Draw a scatter diagram. Use your model to predict how
many books will be sold if the shelf space is 8.23. For software you can use any program that you
would like.
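The correlation and the point prediction asked for here both follow from the same sums of squares. The sketch below illustrates the two steps on the car data from the Quick Start rather than on the books data, so the problem itself is left to you. Python is an assumption (any package will do), and the function names `pearson_r`, `fit_line` and `predict` are illustrative.

```python
import math

# Car data from the Quick Start (age, value), used only to illustrate the steps.
age = [1, 3, 6, 10, 5, 2]
value = [1995, 875, 695, 345, 595, 1795]


def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def predict(intercept, slope, x_new):
    """Point forecast from the fitted line at x_new."""
    return intercept + slope * x_new


r = pearson_r(age, value)
a, b = fit_line(age, value)
print("correlation        :", round(r, 4))
print("r squared          :", round(r * r, 4))   # equals R-squared in a one-regressor model
print("forecast at age 4  :", round(predict(a, b, 4), 2))
```

In a regression with a single regressor the squared correlation equals the R-squared, which gives a quick consistency check between the `corr` and `regress` output.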
The means obtained should be:
Variable  #  Label                Cases  Mean     Std. Dev.  Variance  Maximum  Minimum
WEEK      1  Week Data Collected  11     6.00000  3.31662    11.0000   11.0000  1.00000
BSOLD     2  Books Sold           11     210.182  54.7153    2993.76   295.000  125.000
SHELFS    3  Shelf Space          11     4.88182  1.42816    2.03964   7.70000  3.10000
CONSTANT  4                       11     1.00000  0.00000    0.00000   1.00000  1.00000
Stata will run the problem with statements:
* Data from Hanke - Wichern Edition 9 page 50 # 12
* Data from Hanke - Wichern Edition 8 page 48
* Data from Edition 7 page 44
input week bsold shelfs
1 275 6.8
2 142 3.3
3 168 4.1
4 197 4.2
5 215 4.8
6 188 3.9
7 241 4.9
8 295 7.7
9 125 3.1
10 266 5.9
11 200 5.0
end
label variable week   "Week Data Collected"
label variable bsold  "Books Sold"
label variable shelfs "Shelf Space"
summarize
describe
set graphics on
graph twoway scatter bsold shelfs, saving(graphp1_1)
corr
regress bsold shelfs
2. Hanke-Wichern (2009) page 50 problem 13 shows a dataset of 200 weekly observations of temperature
in Spokane (Temp) and the number of shares of stock traded for Sunshine Mining (Shares). The dataset is
Pr2-9.xls. You are asked to calculate the correlation between Shares and Temp and run a model Shares =
f(constant, Temp). Code to load the data from a file is shown below. You can either
run this problem with a modified script in "batch mode" or load the datafile directly into Stata from the
web and give the appropriate commands.
import excel using "c:\master\master1\class\e323\hanke\excel\ch2\pr2-9.xls",firstrow
summ
* list
corr
regress Shares Temp
*
* Alternative ways to do a bootstrap
*
regress Shares Temp, vce(bootstrap, reps(400) seed(10101))
*
* Here we see what the coef look like using resampling techniques.
bootstrap _b _se, reps(400) seed(10101):regress Shares Temp
matrix list e(b_bs)
*
regress Shares Temp, robust
When you load the data means should be:
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          C1 |       200       100.5    57.87918          1        200
      Shares |       200       48.86    29.28461          0         99
        Temp |       200       47.75      28.176          1         99
Discuss in detail what you find in terms of coefficients and SE’s .
2. The next part of the problem is to sample the data 400 times (with replacement) and see what happens
to the SE and the coef.
3. Using the OLS results and the bootstrap coef forecast Shares for a Temp value of 63. Show your work.
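The resampling step can be sketched as follows: draw the rows with replacement, refit the line each time, and look at the spread of the refitted coefficients. This is a minimal illustration in Python (an assumption; the assignment itself uses Stata's bootstrap options), run on the six-car Quick Start data rather than on the Shares and Temp data. The function names are illustrative, and reusing the seed 10101 here does not reproduce Stata's draws.

```python
import random
import statistics

# Car data from the Quick Start, used only to illustrate the mechanics.
age = [1, 3, 6, 10, 5, 2]
value = [1995, 875, 695, 345, 595, 1795]


def fit_line(x, y):
    """Return (intercept, slope), or None if every x in the sample is identical."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        return None          # slope is undefined for a degenerate resample
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def bootstrap_coefs(x, y, reps=400, seed=10101):
    """Resample (x, y) pairs with replacement and refit the line reps times."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    while len(draws) < reps:
        idx = [rng.randrange(n) for _ in range(n)]   # rows drawn with replacement
        fit = fit_line([x[i] for i in idx], [y[i] for i in idx])
        if fit is not None:
            draws.append(fit)
    return draws


draws = bootstrap_coefs(age, value)
slopes = [b for _, b in draws]
intercepts = [a for a, _ in draws]
print("bootstrap mean slope:", statistics.mean(slopes))
print("bootstrap SE slope  :", statistics.stdev(slopes))
print("bootstrap SE const  :", statistics.stdev(intercepts))
```

The standard deviation of the resampled coefficients plays the role of the bootstrap standard error; with only six observations it is noisy, which is one reason the assignment uses the 200-observation dataset.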
Problem Set # 2. Regression Analysis
Goals: OLS Model Specification
Introduction to GLS
Pindyck & Rubinfeld [1981] page 458 - 466, which can be consulted for more detail, contains data
on a number of economic series. This data is on line in member penrub.dct and can be called from inside
Stata. The steps to load this data on PC are
infile using "c:\master\master1\class\e323\penrub.dct", clear
Means and variable names and descriptions are:
Variable   #  Label                                     Mean      Std. Dev.  Variance  Maximum   Minimum
time       1                                            1966.50   6.38065    40.7126   1977.00   1956.00
qt         2  QUARTER                                   2.50000   1.12444    1.26437   4.00000   1.00000
c          3  PERSONNAL CONSUMPTION EXP. 58 DOLLARS     415.268   98.1198    9627.50   607.500   279.100
g          4  GOV. EXP. GOODS & SERVICES 58 DOLLARS     124.123   22.2271    494.042   155.500   84.6000
gnp        5  GROSS NATIONAL PRODUCT IN 58 DOLLARS      640.481   141.650    20064.6   901.200   434.200
gnpp       6  POTENTIAL GNP IN 58 DOLLARS               696.262   208.121    43314.5   1089.40   413.400
iin        7  INVENTORY INVESTMENT IN 58 DOLLARS        5.15966   4.85589    23.5796   17.5600   -13.1600
inv        8  LEVEL OF BUSINESS INV. IN 58 DOLLARS      162.439   37.1501    1380.13   223.200   111.300
inr        9  FIXED NON RES. INVEST IN 58 DOLLARS       67.0768   17.2897    298.934   94.5500   40.4400
ir        10  FIXED RES INV NON FARM STRUCT 58 DOLLARS  28.8623   6.27482    39.3734   44.2900   19.4100
m         11  M/P IN 58 DOLLARS                         149.040   8.11765    65.8962   165.400   136.800
p         12  IMPLICIT GNP DEFLATOR                     132.515   35.3421    1249.07   218.400   93.9000
rl        13  LONG TERM INTEREST RATE                   5.05501   1.31016    1.71653   7.27000   2.88700
rs        14  SHORT TERM INTEREST RATE                  4.38733   1.66629    2.77652   8.32300   0.957000
trans     15  FED GOVERNMENT TRANSFER PAY. 58 DOLLARS   36.4505   15.7636    248.492   67.3300   16.1100
ur        16  UNEMPLOYMENT RATE                         5.39624   1.35257    1.82945   8.86700   3.40000
w         17  NOMINAL WAGE IN DOLLARS PER HOUR          3.79868   1.55299    2.41178   7.41300   1.90400
wlth      18  INDEX OF REAL HOUSEHOLD WEALTH            1.96199   0.372728   0.138926  3.01800   1.33500
yd        19  DISPOSIBLE INCOME IN 58 DOLLARS           564.101   124.761    15565.2   793.800   382.400
constant  20                                            1.00000   0.00000    0.00000   1.00000   1.00000
Data file contains 88 observations on 20 variables. Current missing value code is 0.1000000000000000E+32
Data begins on (D:M:Y) 1: 1:1956 and ends on 1:10:1977. Frequency is 4.
Assignment: Be sure that you have read and understand the material in Stokes [1997] Chapter 2 and
Hanke-Wichern [2009] Chapter 7
1. Define and discuss the following econometric problems:
a. Heteroskedasticity
b. Serial Correlation
c. Simultaneity
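Serial correlation in item (b) is commonly screened with the Durbin-Watson statistic, which the Stata command `estat dwatson` reports after `regress`. The sketch below shows the calculation itself in Python (an assumption; the course uses Stata). The function name `durbin_watson` and the two residual series are hypothetical and chosen only to show the extremes.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared changes in the residuals
    divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


# Hypothetical residuals, not taken from any course dataset.
persistent = [1.0, 1.0, 1.0, 1.0]        # residuals never change sign
alternating = [1.0, -1.0, 1.0, -1.0]     # residuals flip sign every period

print("persistent :", durbin_watson(persistent))
print("alternating:", durbin_watson(alternating))
```

Values near 2 are consistent with no first-order serial correlation, values toward 0 point to positive serial correlation and values toward 4 to negative serial correlation, since the statistic is approximately 2(1 - rho) where rho is the first-order autocorrelation of the residuals.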
2. Using Stata estimate the following models.
a. $c_t = \alpha + \beta_1 yd_t$
b. $c_t = \alpha + \beta_1 yd_t + \beta_2 c_{t-1}$
c. $c_t = \alpha + \beta_1 yd_t + \beta_2 c_{t-1} + \beta_3 yd_{t-1}$
d. $c_t = \alpha + \beta_1 [yd_t - trans_t] + \beta_2 trans_t + \beta_3 [wlth_t - wlth_{t-1}] + \beta_4 [rs_t + rs_{t-1} + rs_{t-2} + rs_{t-3}] + \beta_5 c_{t-1}$
You will have to build the variables $[yd_t - trans_t]$ and $[rs_t + rs_{t-1} + rs_{t-2} + rs_{t-3}]$. The following code will do
this task if the main dataset has been loaded.
infile using "c:\master\master1\class\e323\penrub.dct", clear
* infile using "g:\e323\penrub.dct", clear
* set a time variable
gen trend = _n
tsset trend
summ
describe
* Model a
regress c yd
estat dwatson
* GLS for models with no lagged dependent variable on right
prais c yd
* what happens if you try the next command?
* prais c yd, corc
*
* Does the rho make sense
prais c yd, ssesearch
*
newey c yd, lag(1)
*
* build data
gen lag_c      = L.c
gen lag_yd     = L.yd
gen yd_m_trans = yd-trans
gen dif_wlth   = wlth - L.wlth
gen sum_rs     = rs + L.rs +L2.rs +L3.rs
* Model b
regress c yd lag_c
* Model c
regress c yd lag_c lag_yd
* Model d
regress c yd_m_trans trans dif_wlth sum_rs
What models can be estimated using GLS?
Discuss what you have found.
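The `gen` statements in the job above rely on Stata's lag operators, which return a missing value wherever the lag reaches back before the first observation. A minimal Python sketch of the same constructions is shown below (Python is an assumption, `None` stands in for Stata's missing value, and the helper names and the short `rs` series are hypothetical).

```python
def lag(series, k=1):
    """Shift a series back k periods (0 <= k <= len(series));
    the first k entries have no lagged value and are None."""
    return [None] * k + list(series[:len(series) - k])


def first_difference(series):
    """series[t] - series[t-1], with None where the lag is missing."""
    return [None if prev is None else cur - prev
            for cur, prev in zip(series, lag(series, 1))]


def sum_current_and_lags(series, lags=3):
    """series[t] + series[t-1] + ... + series[t-lags], None until enough history exists."""
    out = []
    for t in range(len(series)):
        if t < lags:
            out.append(None)
        else:
            out.append(sum(series[t - lags:t + 1]))
    return out


# Hypothetical short series standing in for rs; not course data.
rs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(lag(rs))                    # analogue of L.rs
print(first_difference(rs))       # analogue of rs - L.rs
print(sum_current_and_lags(rs))   # analogue of rs + L.rs + L2.rs + L3.rs
```

Each lag costs observations at the start of the sample, so a model that uses `sum_rs` is estimated on three fewer quarters than a model that uses only current-period variables.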
Problem set # 3 – Applied Econometric Analysis
1. This problem is based on a modification of problem 6 on page 314 of Hanke.
Explain each of the following concepts and how it might be used.
a. Correlation matrix
b. $R^2$
c. Multicollinearity
d. Residual
e. Dummy variable
f. Stepwise regression.
2. Solve Hanke problem 12 page 317. Data load means are:
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       Sales |        11    29.60909    13.75947        3.5       52.3
     Outlets |        11    1554.364    843.9054        125       2850
        Auto |        11    12.42727    6.858585        4.1       24.6
      Income |        11    60.25455     27.1665       19.7       98.5
The following partial program shows you how to fit a tentative model and obtain the fit and the error.
import excel using "c:\master\master1\class\e323\hanke\excel\ch7\Pr7-13.xls",firstrow
summ
describe
list
corr
*
regress Sales Income
predict fit
predict error, resid
list
*
drop fit
drop error
a. Analyze the correlation matrix
b. How much error is involved in the prediction for region 1?
c. Forecast the annual sales in region 12, given 2,500 retail outlets and 20.2 million automobiles
registered.
d. Are the partial regression coefficients sensible?
3. This problem is based on problem 16 on page 320 of Hanke. The correct commands to load the data
and get means and correlation are:
import excel using "c:\master\master1\class\e323\hanke\excel\ch7\Pr7-15.xls",firstrow
summ
list
corr
ERA  = Earned run average
SO   = Strike outs
BA   = Batting average
RUNS = Runs
HR   = Home runs
SB   = Stolen bases
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        WINS |        26    80.92308    9.711532         57         98
         ERA |        26    3.904615    .3906429       3.06       4.59
          SO |        26    938.0769    73.26168        739       1033
          BA |        26    .2556538    .0096992       .241        .28
        RUNS |        26    697.1923    70.10679        576        829
-------------+--------------------------------------------------------
          HR |        26    130.1154    32.00791         68        209
          SB |        26         120    37.89776         50        221
. corr
(obs=26)

             |     WINS      ERA       SO       BA     RUNS       HR       SB
-------------+---------------------------------------------------------------
        WINS |   1.0000
         ERA |  -0.4937   1.0000
          SO |   0.0488  -0.3932   1.0000
          BA |   0.4460   0.0152  -0.0067   1.0000
        RUNS |   0.6267   0.2788  -0.2091   0.6449   1.0000
          HR |   0.2088   0.4896  -0.2150   0.1536   0.6636   1.0000
          SB |   0.1904  -0.4039  -0.0617  -0.2070  -0.1623  -0.3053   1.0000
Assume you have been retained to determine what is important for developing a winning team.
a. Discuss the importance of each variable using correlation and regression analysis.
b. What is a best equation to use to forecast wins? Give detailed reasons why you selected this
equation. The stepwise command might be useful
stepwise, pr(.2) : regress y x
c. Prepare a report to submit to the team manager.
Problem set # 4 - Identification of ARIMA Models using real and generated data
Assignment - Review Stokes [1997] Chapter 6 and Hanke-Wichern [2005] chapter 9
1. Define and discuss the use of:
a. Autocorrelation Function (ACF)
b. Partial Autocorrelation Function (PACF)
2. Discuss what you would look for to identify:
a. An AR(1) model
b. A MA(1) model
3. Estimate the ACF and PACF for the variables C, RS, and M in dataset PENRUB. It is recommended
that you investigate both the original series and the first-differenced series. From your work, what is the
correct amount of differencing? Why? What do these ACFs tell us about the series?
4. The files test_arima_1500.dct and test_arima_15000.dct generate series with the following
characteristics:
a. $(1 - .7B) X_t = e_t$
b. $X_t = (1 - .7B) e_t$
c. $(1 + .7B) X_t = e_t$
d. $X_t = (1 + .7B) e_t$
e. $(1 - .65B) X_t = (1 - .4B - .7B^4) e_t$
for 1500 and 15000 observations respectively. Means for the 1500 data are
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       ar1_a |      1500    -.000633     1.45299  -5.480957   4.640228
       ma1_b |      1500   -.0003761    1.245439  -4.938618     4.3658
       ar1_c |      1500   -5.77e-06    1.417546   -5.18021    5.23172
       ma1_d |      1500    .0000899    1.241997  -4.590242   3.586889
      arma_e |      1500    .0008087     1.34893   -5.08477   4.419892
-------------+--------------------------------------------------------
        norm |      1500   -.0001431    1.018634  -4.000103   3.422285
       trend |      1500       750.5     433.157          1       1500
a. Estimate the ACF and PACF for the first five series and discuss.
b. Next estimate the correct models and see how close you get. Contrast results for the 1500
observation series with the 15000 observation series.
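To see how series like (a) and (b) can be generated and how a sample ACF is computed from them, a minimal Python sketch follows. Python is an assumption (the course files are Stata dictionaries whose generating code is not shown here), the function names, seeds and burn-in length are illustrative, and the simulated numbers will not match the course datasets.

```python
import random


def simulate_ar1(phi, n, seed, burn=200):
    """Generate n observations of (1 - phi*B) X_t = e_t with N(0,1) shocks.
    The first `burn` values are discarded so the start-up value has little effect."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for t in range(n + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn:
            out.append(x)
    return out


def simulate_ma1(theta, n, seed):
    """Generate n observations of X_t = (1 - theta*B) e_t with N(0,1) shocks."""
    rng = random.Random(seed)
    e_prev = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        out.append(e - theta * e_prev)
        e_prev = e
    return out


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


ar = simulate_ar1(0.7, 1500, seed=1)
ma = simulate_ma1(0.7, 1500, seed=2)
print("AR(1) ACF, lags 1-5:", [round(r, 3) for r in sample_acf(ar, 5)])
print("MA(1) ACF, lags 1-5:", [round(r, 3) for r in sample_acf(ma, 5)])
```

Rerunning the sketch with a larger n shows the same effect the problem asks about: the sample ACF settles closer to its theoretical values as the number of observations grows.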
Computer help:
* infile using "c:\master\master1\class\e323\test_arima_1500.dct",clear
infile using "c:\master\master1\class\e323\test_arima_15000.dct",clear
* set a time variable
gen trend = _n
tsset trend
summ
describe
corrgram ar1_a, lags(24)
corrgram ma1_b, lags(24)
corrgram ar1_c, lags(24)
corrgram ma1_d, lags(24)
corrgram arma_e, lags(24)
corrgram norm, lags(24)
arima ar1_a, arima(1,0,0)
arima ma1_b, arima(0,0,1)
arima ar1_c, arima(1,0,0)
arima ma1_d, arima(0,0,1)
arima arma_e, ar(1) ma(1,4)
Problem set # 5 - Estimation of ARIMA Models
Goal: Be able to forecast
1. Discuss the advantages and disadvantages of ARIMA models in comparison to large scale models.
Under what conditions would an ARIMA modeling procedure be appropriate, and under what conditions
would a large scale econometric modeling procedure be appropriate?
2. Using the ACF and PACF that you estimated in case study # 2, estimate the ARIMA models for C, RS,
M. Be sure to have the correct amount of differencing. Try your models using the predict option. Discuss
your models. A sample job is shown. Two ARIMA models are shown. One appears to work better. Why?
infile using "c:\master\master1\class\e323\penrub.dct", clear
* infile using "g:\e323\penrub.dct", clear
* set a time variable
gen trend = _n
tsset trend
summ
describe
* arima c, arima(1,1,1)
arima c, ar(1,2,3,4,5)
predict arxb_c
predict ardy_c, dynamic(80)
list c D.c ar*
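The two `predict` statements in the job above differ in what they feed into the model: the first uses the actual lagged values at every date (one-step-ahead predictions), while the `dynamic(80)` option switches at period 80 to using the model's own earlier forecasts in place of actual values. The sketch below illustrates that difference in Python (an assumption) for a simple AR(1), which is a simplification of the AR(1 to 5) model in the job. The function names, coefficient and series are hypothetical.

```python
def one_step_forecasts(y, const, phi):
    """Forecast y[t] from the ACTUAL y[t-1], for t = 1 .. len(y)-1."""
    return [const + phi * y[t - 1] for t in range(1, len(y))]


def dynamic_forecasts(y, const, phi, start, steps):
    """Starting at index `start`, forecast `steps` periods ahead,
    feeding each forecast back in as the next period's lagged value."""
    prev = y[start - 1]          # last actual value that is used
    out = []
    for _ in range(steps):
        prev = const + phi * prev
        out.append(prev)
    return out


# Hypothetical series and coefficients, not estimated from course data.
y = [4.0, 2.0, 1.5]
print(one_step_forecasts(y, const=0.0, phi=0.5))
print(dynamic_forecasts(y, const=0.0, phi=0.5, start=3, steps=4))
```

For a stationary AR(1) the dynamic forecasts move geometrically toward the series mean, so they carry less and less information as the horizon lengthens, while one-step forecasts are refreshed by new data each period.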
Help on autobj.
AUTOBJ - Automatic Estimation of Box-Jenkins Model
The ARMA command estimates univariate BJ models using ML and
method of moments. Since only one AR and MA factor is allowed,
this command can be used to select relatively simple models
from inside a user selected framework. If many series are to be
filtered quickly, this command should be considered. Models
with very many terms can be estimated.
The more complex command AUTOBJ will automatically identify
models with AR, MA, SAR and SMA factors without the user having
to specify the model. This use of time series AI makes
filtering of a large number of quite different series
possible. A limit of 10 terms can be in the model but up to 6
factors can be estimated. These limits are due to the
Box-Jenkins philosophy that suggests parsimonious models be used.
The AUTOBJ command is based on the BJIDEN and BJEST routines
available as B34S commands. The underlying code is based on
the Peck Box Jenkins program that was developed under the
supervision of George Box at UW starting in the late 60's.
In addition to automatic model selection using the :autobuild
option, the AR and MA parameters can be specified in "manual"
mode of operation.
call autobj(x :options);
x
series to filter.
If the user wants to impose differencing, this should be done
outside the command or inside the command with the command
:rdif or :sdif. Other wise using automatic model building,
differencing will be selected if the AR parameter is above
the :roottol value which defaults to .8.
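The idea of differencing when the AR parameter is above :roottol can be sketched in Python. This is an illustration only and not the AUTOBJ code: it assumes the AR parameter is a least-squares AR(1) estimate on the demeaned series, which the help text does not state, and all function names are hypothetical.

```python
# Sketch of a root-tolerance differencing rule under stated assumptions.

def ar1_coefficient(x):
    """Least-squares slope of x[t] on x[t-1] after removing the mean."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    num = sum(d[t] * d[t - 1] for t in range(1, len(d)))
    den = sum(v * v for v in d[:-1])
    return num / den

def maybe_difference(x, roottol=0.8):
    """Difference once if the AR(1) estimate exceeds roottol (assumed rule)."""
    phi = ar1_coefficient(x)
    if phi > roottol:
        return [x[t] - x[t - 1] for t in range(1, len(x))], phi, True
    return x, phi, False

trend = [float(t) for t in range(1, 41)]   # strongly trending series
series, phi, differenced = maybe_difference(trend)
print(differenced, series[:3])
```

A trending series gives an AR(1) estimate close to one, so the sketch differences it. A series that oscillates around its mean would be left in levels.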
:autobuild      - Automatically selects the arima model
                  starting from a "generic" arima(1,1)
                  model on appropriately differenced
                  data.
:rawacfpacf     - Give Raw ACF and PACF prior to model
                  being fit.
:difrawacf      - Gives difference as well as raw acf
                  and pacf if :rawacfpacf set.
:assumptions    - Lists assumptions. Not usually used.
:seasonal n     - Sets the seasonal period. If this is
                  not present seasonal differencing will
                  not be attempted.
:seasonal2 n    - Sets the second seasonal period. If
                  seasonal2 is set, seasonal must be set.
                  Used with hourly and weekly data.
:longar n       - Sets initial default AR order.
                  Default=1. Range 0-2. This is not
                  allowed if seasonal2 is set.
:longma n       - Sets initial default MA order.
                  Default=1. Range 0-2. This option is
                  not allowed if seasonal2 is set.
:nodif          - Suppress automatic differencing
                  selection.
:rdif           - Forces regular differencing.
:sdif           - Forces seasonal differencing.
:trend          - Estimates a trend if there is
                  differencing.
:noest          - No estimation will be performed. This
                  option requires that the model has been
                  saved.
:cleanmod       - On the last step, the model will be
                  cleaned of parameters that have |t|
                  values less than droptol. This option
                  makes a very parsimonious model.
:forcedstart    - Forces a default starting value of .1
                  to be set. This is usually not needed.
:nosearch       - Turns off spike hunting.
:spikelimit i   - Sets limit to look for spikes.
                  Default = max(12,2*seasonal)
:spiketol r     - Sets t for spike inclusion.
                  Default = droptol. If this is set too
                  low the program will cycle, since a term
                  will be added that is then not
                  significant because its |t| does not
                  meet the droptol.
:arlimit r      - Sets a value to check for |t| of
                  adjacent ACF terms. If r is set
                  smaller, it is more likely AR terms
                  will be added. Change this value
                  with caution. Default = 1.3.
:startvalue r   - Sets default parameter start value
                  for automatic model building.
                  Default = .1
:print          - Print results.
:printres       - Print residuals.
:printit        - Print iterations.
:printsteps     - Prints model selection steps for
                  automatic model building.
:backforecast   - Use backforecasting. This option allows
                  residuals to be calculated for all data
                  points. It can result in unstable
                  estimation. This option should be used
                  with care.
:maxtry n       - Maximum tries at auto model selection.
                  Default = 4.
:roottol r      - Sets auto model differencing tolerance.
                  Default = .8
:droptol r      - Sets drop tolerance. Default = 1.7
:eps1 r         - Sets max change in relative sum of
                  squares before iteration stops.
                  Default = 0.0 => this criterion
                  not used.
:eps2 r         - Sets relative max change in each
                  parameter. Default = .004
:maxit i        - Sets maximum number of iterations
                  allowed. Default = 20
:nac i          - Sets number of autocorrelations printed.
                  Max = 999.
:npac i         - Sets number of partial autocorrelations
                  printed.
Options to override auto selection of the model.
Note: Specify AR and MA in this order if present.
:ar ivec        - Sets AR orders. Can specify up to
                  three factors. For example:
                  :ar index(1 2 3) index(12)
:ma ivec        - Sets MA orders. Can specify up to
                  three factors. For example:
                  :ma index(1 2 3) index(12)
:arparm rarray  - Initial AR values. Usually not
                  needed.
:maparm         - Initial MA values. Usually not
                  needed.
:forecast index(i1 i2)
                - Sets forecast number and origin.
                  Limit for number = 100
:smodeln        - Sets model save name. If :noest
                  is in effect, this sets the model
                  name to be used to make forecasts.
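The pruning rule that :cleanmod and :droptol describe in the option list above can be sketched in Python. This is an illustration under assumptions, not the AUTOBJ implementation: the coefficient values and standard errors are made up, the function name is hypothetical, and a real procedure would presumably re-estimate the model after dropping terms.

```python
# Sketch: keep only coefficients whose |t| is at least droptol.

def clean_model(names, coefs, ses, droptol=1.7):
    """Return (name, coefficient, t) for terms with |t| >= droptol."""
    kept = []
    for name, b, se in zip(names, coefs, ses):
        t = b / se
        if abs(t) >= droptol:
            kept.append((name, b, t))
    return kept

names = ["AR - 1", "AR - 2", "MA - 1"]
coefs = [0.62, 0.05, -0.41]      # invented estimates
ses = [0.10, 0.09, 0.12]         # invented standard errors
kept = clean_model(names, coefs, ses)
print([name for name, _, _ in kept])
```

Here the second AR term has a |t| well below 1.7, so it is the only term removed.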
Variables created if options selected:
%numar    - Number of AR factors
%numma    - Number of MA factors
%numdif   - Number of difference factors
**********************************************
Defined if %numar > 0
%arparms  - AR parameters
%arse     - SE of AR parameters
%arord    - AR orders
%narfact  - Number of parameters in each factor
**********************************************
Defined if %numma > 0
%maparms  - MA parameters
%mase     - SE of MA parameters
%maord    - MA orders
%nmafact  - Number of MA parameters in
            each factor
**********************************************
Defined if %numdif > 0
%diford   - Dif Orders (6 element array)
**********************************************
%coef     - constant, ar parameters, ma
            parameters
%se       - Coefficient Standard Errors
%t        - Coefficient t scores
%cname    - Coefficient names
            123456
            AR - 1
            AR - 2
            MA - 1
            MA - 2
            give info on the factor
%corder   - Coefficient order
Defined if Forecasting
*********************************************
%fcast    - Vector of forecasts
%foreobs  - Vector of Forecast obs
%fse      - Forecast standard error
%fpsi     - Forecast psi weights
%nres     - nob -(max(arorder, maorder)+2)
%res      - Residual vector of length %nres.
%resobs   - Observation # of residual
%y        - Y vector lined up same as %res.
%yhat     - Estimated y
%yvar     - Y variable name
%rss      - Residual sum of squares
%sumabs   - Sum of |e(t)|
%maxabs   - Maximum |e(t)|
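The residual summaries in the list above follow directly from their definitions. The Python sketch below is a minimal illustration with made-up numbers. The function names are hypothetical, and it assumes the residual is defined as y minus the estimated y, which the help text implies but does not spell out.

```python
# Sketch: residual count and summary statistics as defined in the list.

def n_residuals(nob, arorder, maorder):
    """%nres = nob - (max(arorder, maorder) + 2)."""
    return nob - (max(arorder, maorder) + 2)

def residual_summary(y, yhat):
    """Return the analogues of %rss, %sumabs and %maxabs."""
    res = [a - b for a, b in zip(y, yhat)]
    return {
        "rss": sum(e * e for e in res),        # residual sum of squares
        "sumabs": sum(abs(e) for e in res),    # sum of |e(t)|
        "maxabs": max(abs(e) for e in res),    # maximum |e(t)|
    }

y = [3.0, 5.0, 4.0, 6.0]        # invented data
yhat = [2.5, 5.5, 4.0, 5.0]     # invented fitted values
print(n_residuals(100, 2, 1))
print(residual_summary(y, yhat))
```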
Notes: If :ar or :ma is found, auto identification will not be
performed.
If auto identification is used, the beginning values
will often be close to the final values because of the
"hidden" identification estimation runs. The switch
:printsteps will show these estimations although usually
this is not needed.
The following statement will detect if the program ran:
if(kind(%res).eq.-99)then;
call print('AUTOBJ failed');
endif;
Example # 1 Identify the Gas model:
b34sexec options ginclude('gas.b34'); b34srun;
b34sexec matrix;
call loaddata;
call load(rtest);
/$
/$ This roottol setting forces no differencing
/$
/$ call autobj(gasout :print :nac 24 :npac 24
/$             :roottol .99 :autobuild );
/$ This turns off differencing
call autobj(gasout :print :nac 24 :npac 24
            :nodif :autobuild );
call rtest(%res,gasout,48);
/$ Default let program decide
call autobj(gasout :print :nac 24 :npac 24
/$          :printsteps
            :spiketol 2.0 :autobuild );
call rtest(%res,gasout,48);
b34srun;
Example # 2 Identify Retail Data
b34sexec options ginclude('b34sdata.mac')
member(retail); b34srun;
b34sexec matrix;
call loaddata;
call load(rtest);
call autobj(applance :autobuild :seasonal 12 :nac 36
:print :assumptions
/$
/$ maxtry limits model
/$ :printsteps :maxtry 2
/$
:forecast index(20,norows(applance))
);
call names(all);
call tabulate(%cname,%corder,%coef,%se,%t);
call print(%yvar,%numar,%numma,%numdif);
if(%numdif.ne.0)call print(%diford);
if(%numar.ne.0)
call print(%narfact,%arord,%arparms,%arse);
if(%numma.ne.0)
call print(%nmafact,%maord,%maparms,%mase);
b34srun;
SAS help follows
* ARIMA(0,1,2) model: first difference of c, MA terms at lags 1 and 2;
proc arima;
identify var=c(1) noprint;
estimate q=2;
forecast lead=10;
run;
* ARIMA(2,1,0) model: first difference of c, AR terms at lags 1 and 2;
proc arima;
identify var=c(1) noprint;
estimate p=2;
forecast lead=10;
run;
* ARIMA(1,0,2) model: level of c, AR term at lag 1, MA terms at lags 1 and 2;
proc arima;
identify var=c noprint;
estimate p=1 q=2;
forecast lead=10;
run;
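In the IDENTIFY statement, the list in parentheses after the variable name gives the differencing lags: c(1) takes a first difference, c(1,1) differences twice, and c(1,12) combines a regular and a lag-12 difference. The Python sketch below illustrates what such a list does to a series. It is not SAS code, the data are made up, and the function names are hypothetical.

```python
# Sketch: apply a list of differencing lags to a series, one after another.

def difference(x, lag=1):
    """Return x[t] - x[t-lag] for t = lag, ..., len(x) - 1."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

def apply_differences(x, lags):
    """Difference x once for each lag in `lags`, in order."""
    for lag in lags:
        x = difference(x, lag)
    return x

c = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]      # invented series
print(apply_differences(c, [1]))           # first difference
print(apply_differences(c, [1, 1]))        # second difference
```

Each difference shortens the series by its lag, which is one reason to avoid more differencing than the data require.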
Project
For those selecting this option, the project will consist of the individual student selecting 2 to 3
series and fitting ARIMA models to the series. These series must be related to some interesting topic
that is relevant. After fitting ARIMA models, the student should next relate these series using regression
methods. In your write-up of the project, briefly outline the economic theory you are basing your results on,
the techniques that you are using, and the results obtained. The objective of this project is to test how well
you can apply the theory that you have learned to data you have selected. A major objective is for you to
be able to show a completed econometric study at a job interview. The paper should be 15-20 typed
pages. In the past many students have used corrected copies of this paper in the job interview process with
great success. Being able to complete a research project in econometrics will really set you apart from other
job seekers. Students wishing to do this option must submit a 1/2 page proposal by the end of the 10th week.
Project papers are due at the end of the 15th week. With special approval, the paper can be a team project,
but in this case a longer and more extensive project on the order of 40 pages is required.