RCR.R User Guide
Version 0.9
Table of Contents
1 Introduction
1.1 Obtaining the program
1.2 Installation
1.3 Running the example file
1.4 Updates and technical support
2 Estimating the RCR model
2.1 The basic estimation command
2.2 Additional estimation options
Choosing a custom range for λ
Cluster-robust standard errors
Confidence interval options
3 Creating plots
Changing the range of values to plot
Plotting [λL,λH] and [θL,θH]
Including a legend
Choosing plot elements
Confidence intervals
Advanced options
4 Notes for advanced users
4.1 Class descriptions
rcr
parameter
Theta
4.2 A note on standard error estimates
5 Version History
References
1 Introduction
This file provides documentation for the downloadable computer program (written in R) implementing
the estimation method described in my paper “Bounding a linear causal effect using relative correlation
restrictions” (Krauth 2008).
The program is designed to estimate the linear model:
$$y = \theta z + X\beta + v \qquad (1)$$
where y is a scalar outcome of interest, z is a scalar variable whose effect on y we are interested in
estimating, and X is a vector of control variables such that (without loss of generality)
$\mathrm{cov}(X, v) = 0$.
The parameter of interest here is θ, which is interpreted as the effect of z on y. That effect is
identified and can be estimated by the OLS regression of y on (z,X) if z is exogenous given the control
variables, i.e., if $\mathrm{corr}(z, v) = 0$. However, there are many cases where it is unreasonable to
believe that z is absolutely exogenous, but reasonable to believe that it is “mostly” exogenous (i.e.,
that $\mathrm{corr}(z, v)$ is small).
One convenient way of modeling “mostly” exogenous is by replacing the absolute restriction that
$\mathrm{corr}(z, v) = 0$ with the weaker relative correlation restriction:
$$\mathrm{corr}(z, v) = \lambda \, \mathrm{corr}(z, X\beta) \quad \text{for some } \lambda \in \Lambda \qquad (2)$$
where Λ is some interval specified by the econometrician. This program estimates the econometric
model defined by equations (1) and (2).
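To make the role of λ concrete, the simulation below is a minimal sketch. It assumes the relative correlation restriction has the form corr(z, v) = λ·corr(z, Xβ), where v is the error term of the linear model and β is the coefficient vector on the controls. All variable names and numerical values are illustrative; none are taken from the paper or from rcr.r.

```r
# Simulate data in which z is "mostly" exogenous: z is correlated with both
# the control index X %*% beta and the unobserved term v.
set.seed(42)
n     <- 10000
X     <- cbind(1, rnorm(n))          # controls, first column is a constant
beta  <- c(1, 2)                     # hypothetical control coefficients
theta <- 3                           # hypothetical true effect of z on y
v     <- rnorm(n)                    # unobserved term
Xb    <- drop(X %*% beta)
z     <- 0.5 * Xb + 0.5 * v + rnorm(n)
y     <- theta * z + Xb + v

# Relative correlation implied by this design (about 0.5 in the population)
lambda <- cor(z, v) / cor(z, Xb)

# OLS treats z as exogenous (lambda = 0) and is therefore biased here
ols <- unname(coef(lm(y ~ z + X[, 2]))["z"])

print(c(lambda = lambda, ols = ols, theta = theta))
```

In this design the OLS estimate is too large because z and v are positively correlated. An interval Λ that contains the true λ is what lets the RCR approach bound θ rather than rely on λ = 0.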
1.1 Obtaining the program
The program distribution is available online at http://www.sfu.ca/~bkrauth/code/code.html and
contains four files:
rcr_r_userguide.pdf
This document.
rcr.r
A text file containing the R functions needed to implement RCR estimation.
example.r
An example R program that uses the functions in rcr.r to estimate a particular model on
actual data. This particular program estimates class size effects in kindergarten for Project STAR
students, i.e., the first column in Table 3 of Krauth (2008).
example_data.dta
The data set (in Stata format) used by example.r
The code uses R, an open-source statistical package based on the S language. The code will also run on
commercial packages based on S (e.g., on S-Plus). R can be obtained free of charge from http://www.r-project.org/.
1.2 Installation
To “install” the code, just unzip the distribution files into a directory of your choice.
The text file rcr.r contains the source code for all of the necessary functions. To make those
functions available during a particular R session, just execute the command:
source("pathname/rcr.r")
where pathname is the full path of the directory into which you have placed the files. For example, if
you have placed the files in C:\work, then execute the command source("c:/work/rcr.r").
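The same step can be sketched a little more defensively by building the path with file.path and checking that the file exists before sourcing it. The directory is the hypothetical c:/work from the example above, and the names rcr_dir and rcr_file are illustrative.

```r
rcr_dir  <- "c:/work"                      # hypothetical install directory
rcr_file <- file.path(rcr_dir, "rcr.r")    # forward slashes also work on Windows

if (file.exists(rcr_file)) {
  source(rcr_file)                         # defines the rcr functions in this session
} else {
  message("rcr.r not found at ", rcr_file)
}
```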
1.3 Running the example file
The distribution also includes an example file named example.r. It provides several examples using
the rcr function on the Project STAR data described in the paper. To run it, open R and execute the
command:
source("pathname/example.r",chdir=TRUE)
where pathname is the full path of the directory into which you have placed the files. For example, if
you have placed the files in C:\work, then execute the command
source("c:/work/example.r",chdir=TRUE).
1.4 Updates and technical support
I am happy to provide technical support by email at bkrauth@sfu.ca. Updates to the code and
documentation will be available at http://www.sfu.ca/~bkrauth/code/code.html.
My intention is to make this program easy to use, so I greatly appreciate any comments or suggestions.
2 Estimating the RCR model
2.1 The basic estimation command
The RCR model can be estimated by simply executing the function:
rcr(x,y,z)
where
x is an n-by-k matrix of control variables. The first column of x should be a column of ones.
y is an n-by-1 matrix (or n-vector) of outcome variables.
z is an n-by-1 matrix (or n-vector) of treatment variables.
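The inputs described above can be assembled as in the following sketch. The data are simulated and the variable names (age, income) are hypothetical. The call to rcr is guarded so that the block also runs in a session where rcr.r has not been sourced.

```r
set.seed(1)
n      <- 200
age    <- rnorm(n)                     # hypothetical control variable
income <- rnorm(n)                     # hypothetical control variable

x <- cbind(intercept = 1, age = age, income = income)  # first column of ones
z <- 0.5 * age + rnorm(n)                               # treatment variable
y <- 1 + 2 * z + 0.3 * income + rnorm(n)                # outcome variable

# Basic shape checks before estimation
stopifnot(is.matrix(x), all(x[, 1] == 1),
          nrow(x) == length(y), length(y) == length(z))

if (exists("rcr", mode = "function")) {
  print(rcr(x, y, z))
}
```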
This function returns an object of class rcr. RCR objects have methods for several generic functions,
including print, plot, and summary. Output from rcr will look something like this:
Global RCR Parameter Estimates:
           Estimate Std. Error    2.5%  97.5%
lambdaStar    12.31       2.10    8.19   16.4
thetaStar      8.17      30.61  -51.83   68.2
lambda0       28.94     108.56 -183.83  241.7
The lambda(theta) function has critical points at:
     limit   localmax thetastar- thetastar+      limit
-1.00e+100  -1.48e+01   8.17e+00   8.17e+00  1.00e+100
Estimated bounds on theta for given bounds on lambda:
lambda_L lambda_H theta_L theta_H 2.5% 97.5%
       0        0    5.20     5.2 3.91  6.49
       0        1    5.14     5.2 3.26  6.49
Confidence intervals calculated using conservative method.
The printout above has 3 sections.
1. The section titled “Global RCR Parameter Estimates” provides estimates of λ*, θ* and
λ(0) as defined in the paper, along with standard errors and confidence intervals.
2. The section titled “The lambda(theta) function has critical points at:”
identifies (to an approximation) all points at which the estimated λ(θ) function has either a
discontinuity or a change in direction.
3. The section titled “Estimated bounds on theta for given bounds on
lambda” provides estimates of [θL,θH] for a few common assumptions about λ in [λL,λH].
2.2 Additional estimation options
The RCR function has several optional inputs that may be necessary for particular applications. The
example.r file includes an example that uses all of these options to produce the results for the first
column in Table 3 of the paper.
Choosing a custom range for λ
The optional argument Lambda is a J-by-2 matrix in which each row represents a value of [λL,λH] for
which to estimate [θL,θH]. For example, if you want to estimate [θL,θH] for λ in [0,0.2] and for λ in
(-∞,0], execute the command:
rcr(x,y,z,Lambda=matrix(c(0,-Inf,0.2,0),ncol=2))
By default, Lambda is set to include a few convenient values.
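Because matrix fills column by column, the order of the values in the command above can be hard to read. The sketch below builds the same matrix both ways and checks that they agree. The name Lambda_byrow is illustrative.

```r
# Column-major fill, as in the command above:
# first column holds the lower bounds, second column the upper bounds
Lambda <- matrix(c(0, -Inf, 0.2, 0), ncol = 2)

# Equivalent row-by-row construction: one [lambda_L, lambda_H] pair per line
Lambda_byrow <- matrix(c(   0, 0.2,
                         -Inf,   0), ncol = 2, byrow = TRUE)

print(Lambda)
stopifnot(identical(Lambda, Lambda_byrow),
          all(Lambda[, 1] <= Lambda[, 2]))   # each row is a valid interval
```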
Cluster-robust standard errors
By default, standard errors are calculated under the assumption that the data are independent and
identically distributed. To use cluster-robust standard errors use the optional argument cluster. For
example, you might execute:
rcr(x,y,z,cluster=clustervar)
where clustervar is the name of an n-vector of cluster identifiers.
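To illustrate why clustering matters, the sketch below compares an i.i.d. standard error with a cluster-robust one for a simple sample mean. It illustrates the general idea only and is not the calculation performed inside rcr.r. The data, the number of clusters, and all names are hypothetical.

```r
set.seed(2)
G <- 30                                   # number of clusters
m <- 10                                   # observations per cluster
clustervar <- rep(seq_len(G), each = m)   # n-vector of cluster identifiers
n <- length(clustervar)

# Outcome with a shock shared by all observations in the same cluster
w <- rnorm(G)[clustervar] + rnorm(n)
e <- w - mean(w)

# Standard error of mean(w) treating observations as independent
se_iid <- sqrt(sum(e^2) / n^2)

# Cluster-robust version: sum residuals within clusters before squaring
cluster_sums <- tapply(e, clustervar, sum)
se_cluster   <- sqrt(sum(cluster_sums^2) / n^2)

print(c(iid = se_iid, cluster = se_cluster))
```

With positively correlated observations within clusters, the cluster-robust standard error is larger than the i.i.d. one.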
Confidence interval options
The optional argument level can be used to select an alternative asymptotic level for the calculation of
confidence intervals. For example, if you want 99% confidence intervals (the default is 95%), execute
the command:
rcr(x,y,z,level=0.99)
The level argument operates the same way as in the generic R function confint.
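For reference, this is how level behaves in confint for an ordinary lm fit on R's built-in cars data. The example does not use rcr; it only shows the convention that rcr is described as following.

```r
fit  <- lm(dist ~ speed, data = cars)
ci95 <- confint(fit)                 # default level = 0.95
ci99 <- confint(fit, level = 0.99)   # wider intervals

print(ci95)
print(ci99)
```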
The optional argument ciType is a (case-insensitive) text string indicating the method to use when
calculating confidence intervals for θ. Several alternatives are currently supported:
Conservative: Constructs a confidence interval for θ from the lower bound for the
confidence interval of θL and the upper bound for the confidence interval of θH. Imbens and
Manski (2004) show that this approach can be too conservative when θH − θL is relatively large.
Imbens-Manski: Uses the method proposed by Imbens and Manski (2004).
Stoye: Uses the method proposed by Stoye (2008).
For example, if you want to use the Imbens-Manski method, execute the command:
rcr(x,y,z,ciType="Imbens-Manski")
The default uses the conservative method.
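The difference between the conservative and Imbens-Manski intervals can be sketched as follows. This is an illustration written from the description in Imbens and Manski (2004), not the code used in rcr.r. The function names and input values are hypothetical.

```r
# Conservative interval: lower end of the CI for theta_L, upper end of the CI for theta_H
conservative_ci <- function(theta_L, theta_H, se_L, se_H, level = 0.95) {
  crit <- qnorm(1 - (1 - level) / 2)
  c(lower = theta_L - crit * se_L, upper = theta_H + crit * se_H)
}

# Imbens-Manski critical value: solves pnorm(cv + width / se_max) - pnorm(-cv) = level
im_critical_value <- function(width, se_max, level = 0.95) {
  f <- function(cv) pnorm(cv + width / se_max) - pnorm(-cv) - level
  uniroot(f, interval = c(0, 10), tol = 1e-10)$root
}

imbens_manski_ci <- function(theta_L, theta_H, se_L, se_H, level = 0.95) {
  cv <- im_critical_value(theta_H - theta_L, max(se_L, se_H), level)
  c(lower = theta_L - cv * se_L, upper = theta_H + cv * se_H)
}

# Hypothetical estimates with a wide identified set relative to the standard errors
ci_cons <- conservative_ci(1, 3, 0.5, 0.5)
ci_im   <- imbens_manski_ci(1, 3, 0.5, 0.5)
print(rbind(conservative = ci_cons, imbens_manski = ci_im))
```

When θH − θL is zero the critical value equals the usual two-sided one, and as the width grows it approaches the one-sided value, which is why the Imbens-Manski interval is narrower for wide identified sets.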
3 Creating plots
Plots of the estimated λ(θ) function can be created by calling the generic function plot on the results
of the rcr function. For example, executing the command:
plot(rcr(x,y,z))
will generate a plot. The plot produced with the default options will probably not look good, but more
informative plots can be made using the optional arguments below.
Changing the range of values to plot
xlim is a 2-vector giving the range of values to plot on the horizontal (x) axis. ylim is a 2-vector giving
the range of values to plot on the vertical (y) axis.
Plotting [λL,λH] and [θL,θH]
The optional argument Lambda is a 2-vector giving a range for [λL,λH] for which to plot the estimated
[θL,θH]. The default (Lambda=NULL) is to simply not plot such an estimate.
Including a legend
The optional argument placeLegend tells plot where (if anywhere) to place a legend for the plot.
This option is passed directly to R’s legend command. Valid options include "topleft",
"topright", etc. See the R documentation for the legend command for more details. The default
(placeLegend=FALSE) is to have no legend at all.
Choosing plot elements
These optional arguments take logical values, and are used to determine whether particular statistics
are plotted.
plotLambdaFunction indicates whether to plot the estimated λ(θ) function (default TRUE).
plotThetaStar indicates whether to plot the estimated value of θ* (default TRUE).
plotLambdaStar indicates whether to plot the estimated value of λ* (default TRUE).
Confidence intervals
The optional argument plotCI indicates whether to plot confidence intervals for the estimated Θ(Λ)
(default is FALSE).
The optional argument plotLambdaCI indicates whether to plot confidence intervals for the
estimated λ(θ) function (default is FALSE).
The optional argument level indicates the confidence level to use.
The optional argument ciType indicates the type of confidence interval to estimate for θ (see the
explanation of ciType in Section 2.2 for details).
Advanced options
The adjustYlim argument is a logical argument that tells plot whether to adjust the range of the
vertical axis to look nice. It doesn’t work very well yet. The default is (adjustYlim=FALSE).
The gridsize argument indicates the number of points at which to estimate the λ(θ) function. The
default is (gridsize=100).
The colorScheme argument is a vector of colors to be used in the plot. See the built-in R function
colors() for a list of available colors. The default is
(colorScheme=c("black","blue","gray","green","lightgreen")).
In addition to these options, many of the standard optional arguments to R’s plot function are accepted.
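A call combining several of the options above might look like the following sketch. The axis ranges and the Lambda interval are hypothetical values chosen only for illustration. The call is guarded so that the block runs even when rcr.r has not been sourced or x, y and z are not defined.

```r
x_range <- c(-20, 20)    # hypothetical horizontal axis range
y_range <- c(-5, 30)     # hypothetical vertical axis range
lambda_interval <- c(0, 1)

have_inputs <- exists("rcr", mode = "function") &&
  all(vapply(c("x", "y", "z"), exists, logical(1)))

if (have_inputs) {
  fit <- rcr(x, y, z)
  plot(fit,
       xlim = x_range, ylim = y_range,
       Lambda = lambda_interval,        # also plot the implied bounds on theta
       placeLegend = "topleft",
       plotCI = TRUE, ciType = "Imbens-Manski",
       gridsize = 200)
}
```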
4 Notes for advanced users
4.1 Class descriptions
Like many estimation methods programmed in R, this program has an object-oriented design. That is,
Data tends to be packaged into (potentially elaborate) structures called objects.
Each object is a member of one or more classes. An object’s class dictates its structure.
There are numerous “generic” functions (e.g., print) that can be called on all sorts of different
objects but whose actual behavior (“method”) depends on the class of the object that has been
passed to it.
In order to understand how this program works it is important to know the main classes, the structure
of objects in each class, and the methods available.
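The mechanism described above can be seen in a few lines. The sketch below uses a made-up class named toyEstimate, which is not part of rcr.r, to show how a generic function such as print dispatches on an object's class.

```r
# An object is a structure (here a list) tagged with a class attribute
est <- structure(list(E = 1.5, se = 0.2), class = "toyEstimate")

# A method for the generic print, used whenever print() receives a toyEstimate
print.toyEstimate <- function(x, ...) {
  cat(sprintf("Estimate: %.2f (se %.2f)\n", x$E, x$se))
  invisible(x)
}

print(est)                        # dispatches to print.toyEstimate
inherits(est, "toyEstimate")      # TRUE
```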
rcr
How they are created: Objects of class rcr are created with the rcr function.
What they are: An rcr object is a list with a particular set of elements that describe the results from
RCR estimation:
Several parameter objects (see below for explanation of parameter objects) named
o moments (the vector of sample moments from which all other parameters are calculated)
o thetaStar (the parameter θ* described in the paper)
o lambdaStar (the parameter λ* described in the paper)
o lambda0 (the parameter λ(0) described in the paper)
o thetaSegments (the set of θ values at which λ(θ) is discontinuous or switches direction).
A function named lambda (the function λ(θ) described in the paper).
etc.
What can be done with them: There are rcr-specific methods for the generic functions print,
summary, and plot. There is also a function is.rcr that tests whether an object is a (valid) rcr
object.
parameter
How they are created: Parameter objects are created with the function parameter. This function
is used mostly internally.
What they are: A parameter object is a list that describes a vector of parameter estimates, including its
covariance matrix. Specifically it is a list with elements:
A k-vector named E that gives the actual estimate.
A j-by-j matrix named V that gives the covariance matrix of moments.
A k-by-j matrix named gradient that gives the gradient of the estimate with respect to
moments.
What can be done with them: It has methods for the generic functions print, confint, and vcov.
There is also a function parameter for the creation of parameter objects, and a function
is.parameter() to test whether an object is a parameter object. Finally, two or more parameter
objects can be concatenated into a single object using a special method for the generic function c().
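The three elements fit together through the delta method: the covariance matrix of the estimate is the gradient times V times the transposed gradient. The sketch below shows this with a made-up class named toyParameter. It is an assumption about how such an object could work, not the implementation in rcr.r.

```r
# Hypothetical constructor mirroring the E / V / gradient layout described above
toy_parameter <- function(E, V, gradient) {
  structure(list(E = E, V = V, gradient = gradient), class = "toyParameter")
}

# Covariance of the estimate: gradient %*% V %*% t(gradient)  (k-by-k)
vcov.toyParameter <- function(object, ...) {
  object$gradient %*% object$V %*% t(object$gradient)
}

# Normal-approximation confidence intervals built from vcov
confint.toyParameter <- function(object, parm, level = 0.95, ...) {
  se   <- sqrt(diag(vcov(object)))
  crit <- qnorm(1 - (1 - level) / 2)
  cbind(lower = object$E - crit * se, upper = object$E + crit * se)
}

# k = 2 estimates computed from j = 3 moments
p <- toy_parameter(E = c(1, 2),
                   V = diag(3),
                   gradient = matrix(c(1, 0, 0,
                                       0, 1, 1), nrow = 2, byrow = TRUE))
print(vcov(p))
print(confint(p))
```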
Theta
How they are created: Theta objects are created by the function estimateTheta. This function is
mostly used internally.
What they are: A Theta object is a special kind of parameter object that specifically describes the pair of
parameter estimates [θL,θH]. It inherits from the parameter class.
What can be done with them: Anything that can be done for a parameter object can be done with a
Theta object. In addition, Theta objects have methods for the generic functions confint and
summary.
4.2 A note on standard error estimates
Standard error estimates for most parameters of interest are based on application of the delta method.
That is, the parameter of interest is a (usually) differentiable function h(M), where M is some easily
calculated vector of asymptotically normal summary statistics with easily-estimated covariance matrix.
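Standard errors are estimated by application of the formula
$$\widehat{\mathrm{var}}\big(h(M)\big) = \frac{\partial h(M)}{\partial M} \, \widehat{\mathrm{var}}(M) \, \left(\frac{\partial h(M)}{\partial M}\right)'$$
where the derivative in the above expression is approximated numerically using the finite difference
method. For example, if M were a scalar, the finite difference method would approximate the
derivative by:
$$\frac{dh(M)}{dM} \approx \frac{h(M + \varepsilon) - h(M)}{\varepsilon}$$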
where ε is some small positive number. Unfortunately, it is not easy to determine in advance how big ε
should be to get a good approximation. In principle, the approximation above gets better with smaller
ε, but rounding error in the calculation of h(M) becomes a problem if ε is too small.
The current value the program uses for ε is stored in bkoptions$eps, and its default value is 1e-10
(that is, 10 to the power −10). It is a good idea to try out somewhat smaller and somewhat larger values
to see if the standard error estimates are sensitive to the choice of value.
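The sensitivity check suggested above can be sketched on a toy problem where the exact answer is known. The function h, the moment vector M, and its covariance matrix V are all made up for illustration. The forward difference used here is one common finite difference scheme and is an assumption, not necessarily the exact scheme in rcr.r.

```r
h <- function(M) M[1] / M[2]          # illustrative function of the moments
M <- c(2, 4)                          # hypothetical moment estimates
V <- diag(c(0.01, 0.04))              # hypothetical covariance matrix of M

# Forward-difference approximation to the gradient of h at M
forward_gradient <- function(h, M, eps) {
  h0 <- h(M)
  sapply(seq_along(M), function(i) {
    M_eps    <- M
    M_eps[i] <- M_eps[i] + eps
    (h(M_eps) - h0) / eps
  })
}

# Delta-method standard error: sqrt(g' V g)
delta_se <- function(h, M, V, eps) {
  g <- forward_gradient(h, M, eps)
  sqrt(drop(t(g) %*% V %*% g))
}

# Exact value from the analytic gradient (1/M2, -M1/M2^2) = (0.25, -0.125)
g_exact  <- c(0.25, -0.125)
exact_se <- sqrt(drop(g_exact %*% V %*% g_exact))

eps_values <- c(1e-4, 1e-6, 1e-8, 1e-10, 1e-14)
se_by_eps  <- sapply(eps_values, function(eps) delta_se(h, M, V, eps))
names(se_by_eps) <- format(eps_values)
print(rbind(estimate = se_by_eps, exact = exact_se))
```

Comparing the rows shows the trade-off described above: truncation error dominates for large ε and rounding error for very small ε.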
5 Version History
Version 0.9 (6/29/2008)
References
Imbens, Guido and Charles F. Manski, 2004. Confidence intervals for partially identified parameters.
Econometrica 72: 1845-1857.
Krauth, Brian, 2008. Bounding a linear causal effect using relative correlation restrictions. Working
paper, Simon Fraser University. Available online at http://www.sfu.ca/~bkrauth/papers/rcr.pdf.
Stoye, Jörg, 2008. More on confidence intervals for partially identified parameters. CEMMAP Working
Paper. Available online at http://cemmap.ifs.org.uk/wps/cwp1108.pdf.