FEAR 1.15 User’s Guide
Paul W. Wilson
Department of Economics
222 Sirrine Hall
Clemson University
Clemson, South Carolina 29634 USA
pww@clemson.edu
9 November 2010
This manual is for FEAR version 1.15, a library for estimating productive efficiency, etc. using R.
Copyright © 2010 Paul W. Wilson. All rights reserved.
Contents
1 Introduction
2 License Issues
3 Downloading and Installing R
4 Adding FEAR to R
4.1 Where to get FEAR
4.2 Installing FEAR into R
5 Estimation with FEAR
5.1 Getting help
5.2 Example #1: DEA estimates of technical efficiency
5.3 Example #2: Outlier detection for frontier models
5.4 Example #3: Other estimators of technical efficiency:
5.5 Example #4: Farrell-Debreu efficiencies
5.6 Estimating other things
1 Introduction
An extensive literature concerning the measurement of efficiency in production has developed since Debreu
(1951) and Farrell (1957) provided basic definitions for technical and allocative efficiency in production. One
large section of this literature focuses on linear-programming based measures of efficiency along the lines of
Charnes et al. (1978) and Färe et al. (1985). In addition, the free disposal hull (FDH) method of Deprins
et al. (1984) is sometimes used; this estimator can be written as a linear program, although it is easier to
compute estimates using numerical methods rather than linear programming. Within this literature, the
linear-programming-based approaches that rely on convexity assumptions are known as Data Envelopment Analysis (DEA).
DEA estimators have been applied in more than 1,800 articles published in more than 490 refereed journals
(Gattoufi et al., 2004). DEA and similar non-parametric estimators offer numerous advantages, the most
obvious being that one need not specify a (potentially erroneous) functional relationship between production
inputs and outputs. Although much of the nonparametric efficiency literature has ignored statistical issues
such as inference, hypothesis testing, etc., the statistical properties of DEA estimators have recently been
established; see Simar and Wilson (2000b) for a survey of these results, and Kneip et al. (2008) for more
recent results.
Standard software packages (e.g., LIMDEP, STATA, TSP) used by econometricians do not include
procedures for DEA or other nonparametric efficiency estimators. Several specialized, commercial software
packages reviewed by Hollingsworth (1999) and Barr (2004) are available, and to varying degrees, these
are good at what they were designed to do. Each includes facilities for reading data into the program, in some
cases in a variety of formats, and procedures for estimating models that the authors have programmed into
their software. A common complaint heard among practitioners, however, runs along the lines of “package
X will not let me estimate the model I want!” The existing packages are designed for ease of use (again,
with varying degrees of success), but the cost of this is often inflexibility, limiting the user to procedures
the authors have explicitly made available. Moreover, none of the existing packages include procedures for
statistical inference. Although the asymptotic distribution of DEA estimators is now known (see Kneip
et al., 2008, for details) for the general case with p inputs and q outputs, bootstrap methods remain the only
useful approach for inference. None of the existing packages incorporate the bootstrap methods proposed by
Simar and Wilson (1998, 2000a).
FEAR 1.15 consists of a software library that can be linked to the general-purpose statistical package R.
The routines included in FEAR 1.15 allow the user to compute DEA estimates of technical, allocative, and
overall efficiency while assuming either variable, non-increasing, or constant returns to scale. The routines
are highly flexible, allowing measurement of efficiency of one group of observations relative to a technology
defined by a second, reference group of observations. Consequently, the routines can be used to compute
Malmquist indices, scale efficiency measures, super-efficiency scores along the lines of Andersen and Petersen
(1993), and other measures that might be of interest. Routines are also included to facilitate implementation
of the bootstrap methods described by Simar and Wilson (1998, 2000a). These features can be further used
to implement methods of inference for Malmquist indices as in Simar and Wilson (1999), statistical tests for
irrelevant inputs and outputs or aggregation possibilities as described in Simar and Wilson (2001b), as well
as statistical tests of constant returns to scale versus non-increasing or varying returns to scale as described
in Simar and Wilson (2001a). A routine for maximum likelihood estimation of a truncated regression model
is included for regressing DEA efficiency estimates on environmental variables as described in Simar and
Wilson (2007). In addition, FEAR 1.15 includes commands that can be used to perform outlier analysis
using the methods of Wilson (1993), and to compute FDH efficiency estimates along the lines of Deprins
et al. (1984), the robust, root-n consistent order-m efficiency estimators described by Cazals et al. (2002),
and the robust, root-n consistent order-α efficiency estimators described by Daouia and Simar (2007) and
Wheelock and Wilson (2008). Most of these features are unavailable in existing software packages.
2 License Issues
R is distributed under the terms of the GNU General Public License Version 2, June 1991. This license can
be found on the internet at http://www.gnu.org/copyleft/gpl.html. Further details on licensing of R can be
found by typing license() in the R console window described below.
FEAR 1.15 is provided under the license that appears in the file LICENSE included with the software.
This license states the following:
License for the use of the R package "FEAR:
Frontier Efficiency Analysis in R,"
copyright 2010, Paul W. Wilson
DEFINITIONS:
"OWNER" refers to the author, Paul W. Wilson, whose address is given below.
"Software" means the frontier efficiency analysis software
developed by OWNER known as the "FEAR software," "FEAR package,"
"FEAR library," or any other designation referring to the R package
"FEAR: Frontier Efficiency Analysis in R," including any source code,
binary code, or user documentation.
"You" means you, the user and licensee of the Software. If you are
employed and intend using the Software in connection with your
employment duties, then you warrant that your employer has authorised
you to accept this License on behalf of your employer.
"Academic use" includes and is limited to use of the software for
scientific, academic purposes intended to result in publication of scientific
papers in academic journals, in your role as a faculty member or student
at a university, college, or secondary school.
"Commercial use" is any use of the software that is not specifically
academic use as defined above. This includes any use by anyone working
as an employee, contractor, paid consultant, or in any other capacity
for a commercial firm, non-government organization, government agency,
or any other organization that is not a university, college, or secondary
school.
"Academic user" refers to anyone engaged in academic use of the software
as defined above.
"Commercial user" refers to anyone engaged in commercial use or other
non-academic use of the software as defined above.
LICENSE TERMS AND CONDITIONS:
1. All rights in this software are reserved by OWNER. You are only permitted
to have this Software in your possession and to make use of it if you have
agreed to the terms and conditions set forth in this license or in another
license granted by OWNER.
2. Accessing or use of the FEAR software constitutes acceptance of terms and
conditions set forth in this license.
3. The software is made available AS IS, without any warranty, either express
or implied. OWNER bears no liability for any losses or damages (including
direct, indirect, special, or consequential losses or damages or any other
losses or damages) resulting from your use of the software or otherwise in
connection with this license.
4. Academic users may use this software freely for ACADEMIC USE as
defined above. However, OWNER retains all rights to the software.
5. This license must be distributed with the SOFTWARE.
6. You are prohibited from the following activities, unless specifically
permitted by OWNER:
(i) making copies of the Software except to the extent necessary for
backup or research purposes;
(ii) distributing, selling, sub-licensing, or otherwise making the
software available for use by a third party; and
(iii) removing or altering the file containing this license, any logo,
copyright, or other proprietary notices, symbols, labels, or
documentation in the software.
7. By accessing or using this software, you agree to not
(i) reverse engineer, decompile, or deassemble the software; or
(ii) use the software to develop copycat or functionally equivalent
technology or derivative technologies based on the methods employed
in the software.
8. You must recognize OWNER’s authorship and ownership of the software
in any report, paper or publication containing references to the software
or results generated by the software by including the following citation
in any such report, paper, publication, etc.:
Wilson, Paul W. (2008), "FEAR 1.0: A Software Package for Frontier
Efficiency Analysis with R," Socio-Economic Planning Sciences 42,
247--254.
9. COMMERCIAL USE or other non-ACADEMIC USE is permitted by commercial
users after payment of a license fee of US$180.00 and receipt of
email from OWNER confirming payment of the license fee. Such payment
allows the software to be used on one central processing unit (CPU).
Commercial users desiring to run the software in parallel or grid-computing
environments or on multiple CPUs or machines should contact OWNER
for further information.
10. All enquiries for the use of this Software should be referred to OWNER:
Paul W. Wilson
Department of Economics
Clemson University
Clemson, South Carolina 29630 USA
email: pww@clemson.edu
phone: 1-864-656-2032
3 Downloading and Installing R
R is a language and environment for statistical computing and graphics. It is an implementation of the S
language developed at Bell Laboratories, but unlike the commercial version of S marketed as S-Plus by
Lucent Technologies, R is freely available under the Free Software Foundation’s GNU General Public License.
According to the R project’s web pages (http://www.r-project.org),
“R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical
tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly
extensible.... One of R’s strengths is the ease with which well-designed publication-quality plots
can be produced, including mathematical symbols and formulae where needed. Great care has
been taken over the defaults for the minor design choices in graphics, but the user retains full
control.... R is an integrated suite of software facilities for data manipulation, calculation and
graphical display. It includes (i) an effective data handling and storage facility; (ii) a suite of
operators for calculations on arrays, in particular matrices; (iii) a large, coherent, integrated
collection of intermediate tools for data analysis; (iv) graphical facilities for data analysis and
display either on-screen or on hardcopy; and (v) a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and
output facilities.”
R includes an online help facility and extensive documentation in the form of manuals included with the
package. In addition, several books describing uses of R are available (e.g., Dalgaard, 2002; Venables and
Ripley, 2002; and Verzani, 2004); see also the recent review by Racine and Hyndman (2002).
The current version of R can be downloaded from http://lib.stat.cmu.edu/R/CRAN. Pre-compiled
binary versions are available for a variety of platforms; at present, however, FEAR 1.15 is being made
available only for the Microsoft Windows (NT, 95 and later) operating systems running on Intel (and clone)
machines.
Most users will want to download the precompiled, binary version of R for Windows operating systems.
After downloading the installer from the R web pages, close all running applications, then left-click on Start
then Run, find the R installer (for R version 2.4.0, this is a file named R-2.4.0-win32.exe), and run the
installer, following instructions as they appear. If all goes well, at the end of the installation process, users
should find the R icon on their screen.
4 Adding FEAR to R
The next step is to make the FEAR package available to R.
4.1 Where to get FEAR
After downloading and installing R, FEAR 1.15 can be downloaded from
http://www.economics.clemson.edu/faculty/wilson/Software/FEAR
The FEAR package is contained in a zip file (FEAR.zip), which should be copied to some location on the
user’s machine. Do not unzip the file before installing into R. A copy of the paper announcing the FEAR
package, Wilson (2008), this manual, and the license file for FEAR 1.15 can be found at the same site.
4.2 Installing FEAR into R
Before the FEAR package can be used, it must be “installed” into R. This must be done only once, unless
upgrading to a later version of FEAR. After downloading the FEAR package, start R by clicking on its
desktop icon. This will open the R graphical user interface (GUI), shown in Figure 1.
Along the top of the R GUI, find the word “Packages,” and left-click on this; then left-click on “Install
package(s) from local zip files...” in the pop-up menu (shown in Figure 2) that appears. Next, a Windows
file-selection window will appear, as shown in Figure 3. Use the navigation buttons to find the file FEAR.zip,
highlight the file name, then left-click on the “Open” button in the Windows file-selection window.
The following should appear in the R console window within the R GUI:
> utils:::menuInstallLocal()
package ’FEAR’ successfully unpacked and MD5 sums checked
updating HTML package descriptions
>
Figure 1: The R Windows GUI.
Next, to make the commands in the FEAR package available for use, click on the R console window to
make it active, and type library(FEAR) or require(FEAR) at the prompt; this should result in the following
displayed in the R console window:
> library(FEAR)
FEAR (Frontier Efficiency Analysis with R) 1.0 installed
Copyright Paul W. Wilson 2006
See file COPYING for license and citation information
>
Note that FEAR uses another package, KernSmooth, which is included in current distributions of R. This
package is loaded automatically by FEAR, and is used by some of the commands for bandwidth-selection in
FEAR.
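The menu-driven steps above can also be carried out from the R prompt. The following is a minimal sketch, not part of the FEAR distribution: it assumes FEAR.zip has been saved in the current working directory (the path is an assumption; adjust as needed), and it uses only the standard R functions install.packages and requireNamespace.

```r
# Assumed location of the downloaded archive; adjust to wherever FEAR.zip was saved.
zip_path <- file.path(getwd(), "FEAR.zip")

# Install from the local zip file only if it is actually present.
# repos = NULL tells install.packages() to treat its argument as a local file.
if (file.exists(zip_path)) {
  install.packages(zip_path, repos = NULL)
}

# requireNamespace() returns TRUE or FALSE instead of stopping with an error,
# so a script can report a missing package gracefully.
fear_available <- requireNamespace("FEAR", quietly = TRUE)
if (fear_available) {
  library(FEAR)
} else {
  message("FEAR is not installed; see section 4.2.")
}
```

Guarding the calls in this way lets the same script run on a machine where FEAR has not yet been installed.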
Figure 2: The R Packages menu.
Figure 3: The R file selection interface.
5 Estimation with FEAR
5.1 Getting help
After successfully completing the installation of R and FEAR as described in the previous sections, one
can begin using FEAR. User-accessible commands are listed and described in the FEAR Command Reference
available on the FEAR website. One can also list the available commands in FEAR by typing
help(package=FEAR) at the prompt in the R console window (remember, to access the FEAR package, one
must type library(FEAR) or require(FEAR) first).
Alternatively, R’s HTML help facility can also be used. Left-click on “Help” at the top of the R GUI, and
then on “HTML help” in the pop-up menu that appears (see Figure 4). This will open a browser window and
display links for R manuals, packages, etc. as shown in Figure 5. In this window, click on the “Packages” link
to get a list of available packages, then click on “FEAR” to display a list of commands available in the FEAR
package as shown in Figure 6. Clicking on the command names will result in more detailed explanations.
Next, some simple examples using commands from the FEAR package are given. FEAR includes several
datasets that can be used to illustrate the package’s capabilities.
Figure 4: The R help pop-up menu.
5.2 Example #1: DEA estimates of technical efficiency
Typing data(ccr) loads the data given in Charnes et al. (1981). These data contain observations on p = 5
inputs and q = 3 outputs for n = 70 schools. The following commands create a (p × n) matrix of input
vectors and a (q × n) matrix of outputs from the Charnes et al. data:
data(ccr)
x=matrix(nrow=5,ncol=70)
x[1,]=ccr$x1
x[2,]=ccr$x2
x[3,]=ccr$x3
x[4,]=ccr$x4
x[5,]=ccr$x5
y=matrix(nrow=3,ncol=70)
y[1,]=ccr$y1
y[2,]=ccr$y2
y[3,]=ccr$y3
Alternatively, one might type
data(ccr)
x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),
nrow=70,ncol=5))
y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=3))
The second example puts the data into a long vector (using the c command), assigns this to a matrix (R
puts data column-wise into matrices), and then assigns the transpose of this matrix (obtained using R’s t
command) to either x or y.
Figure 5: The R HTML help facility.
Figure 6: The R HTML help facility showing commands in FEAR 1.15.
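A third way to build the matrices is to select columns by name and transpose the result. The sketch below is only illustrative: so that it runs without FEAR, it uses a made-up data frame named toy as a stand-in for ccr, assuming (as in the examples above) columns named x1 to x5 and y1 to y3.

```r
# Hypothetical stand-in for the ccr data: 70 rows, columns x1..x5 and y1..y3.
set.seed(1)
toy <- as.data.frame(matrix(runif(70 * 8, min = 1, max = 10), nrow = 70))
names(toy) <- c(paste0("x", 1:5), paste0("y", 1:3))

# Select columns by name, convert to a matrix, and transpose so that
# each column of x (or y) holds one observation.
x <- t(as.matrix(toy[, paste0("x", 1:5)]))
y <- t(as.matrix(toy[, paste0("y", 1:3)]))

dim(x)  # 5 rows (inputs) by 70 columns (observations)
dim(y)  # 3 rows (outputs) by 70 columns (observations)
```

Selecting by name avoids having to count elements when calling matrix, which is where dimension mistakes tend to creep in.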
Next, FEAR’s dea command can be used to estimate technical efficiency for the observations in the
Charnes et al. (1981) data, relative to the technology estimated from these observations. The command dea
computes estimates of either Shephard (1970) input or output distance functions; here, the input-orientation
is used:
dhat=dea(XOBS=x,YOBS=y)
Alternatively, output distance functions could be estimated by adding the argument ORIENTATION=2 to the
dea command above. The default is to allow variable returns to scale in the estimated technology, but constant returns or non-increasing returns to scale are also possible using the RTS argument; see documentation
for the dea command in the FEAR Command Reference.
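For reference, the input-oriented quantity discussed above can be written in the usual textbook form under variable returns to scale. This is offered as an aid to reading the output, not as a description of how FEAR computes it internally; the symbols θ, λ, X, and Y are notation introduced for this note, with X the (p × n) matrix of observed inputs and Y the (q × n) matrix of observed outputs.

```latex
\widehat{D}(x, y) \;=\;
\left[\, \min_{\theta,\, \lambda}
\left\{ \theta > 0 \;:\;
Y\lambda \ge y,\quad
X\lambda \le \theta x,\quad
\mathbf{1}'\lambda = 1,\quad
\lambda \in \mathbb{R}^{n}_{+}
\right\} \right]^{-1}
```

Under this definition, estimates equal 1 for observations on the estimated frontier and exceed 1 otherwise. Dropping the constraint 1'λ = 1 gives the constant-returns version, and replacing it with 1'λ ≤ 1 gives the non-increasing-returns version.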
The command boot.sw98 can be used to estimate confidence intervals via the homogeneous bootstrap
method described by Simar and Wilson (1998, 2000b):
tmp=boot.sw98(XOBS=x,YOBS=y,DHAT=dhat,NREP=2000)
Finally, the results from the dea and boot.sw98 commands can be manipulated to produce LaTeX code
for a table:
n=ncol(x)                       #number of DMUs
table.in=matrix(nrow=n,ncol=7)
table.in[,1]=c(1:n)
table.in[,2]=dhat
table.in[,3]=dhat-tmp$bias      #bias-corrected estimate
table.in[,4]=tmp$bias
table.in[,5]=tmp$var
table.in[,6:7]=tmp$conf.int
table.in[1:9,1]=paste(" ",table.in[1:9,1],sep="")
At this point, table.in is a (70 × 7) matrix, with each row corresponding to an observation in the Charnes
et al. (1981) data. The first column contains the observation number (1–70); the second column contains
the DEA estimates of the Shephard input distance function for each observation; column 3 contains
bias-corrected estimates of the Shephard input distance function, obtained by subtracting the bootstrap bias
estimate from the original distance function estimates in dhat; columns 4–5 contain the bootstrap bias and
variance estimates, respectively; and columns 6–7 contain estimated lower and upper bounds for 95-percent
confidence intervals obtained by bootstrapping.
Continuing to format the results for use in a LaTeX table,
table.in[,2]=ifelse(nchar(table.in[,2])==1,
paste(table.in[,2],".",sep=""),
table.in[,2])
table.in[,2:7]=paste(table.in[,2:7],"000000",sep="")
table.in[,c(2:3,5:7)]=substr(table.in[,c(2:3,5:7)],1,6)
table.in[,4]=substr(table.in[,4],1,7)
table.in=paste(table.in[,1]," & ",
table.in[,2]," & ",
table.in[,3]," & ",
table.in[,4]," & ",
table.in[,5]," & ",
table.in[,6]," & ",
table.in[,7]," \\",sep="")
Note that table.in has been converted to a vector of length 70 by these commands.
Typing table.in[1:5] at the prompt in the R console window will display the first five elements of table.in
as shown here:
 1 & 1.0393 & 1.0728 & -0.0335 & 0.0006 & 1.0424 & 1.1223 \\
 2 & 1.1098 & 1.1391 & -0.0293 & 0.0002 & 1.1139 & 1.1660 \\
 3 & 1.0697 & 1.0977 & -0.0280 & 0.0003 & 1.0732 & 1.1320 \\
 4 & 1.1091 & 1.1279 & -0.0188 & 9.3540 & 1.1132 & 1.1446 \\
 5 & 1.0000 & 1.0454 & -0.0454 & 0.0010 & 1.0033 & 1.1051 \\
The LaTeX code in table.in can be written to a file named table in.tex using the command
write(table.in,file="table in.tex")
Inserting the code into a LaTeX document, one can produce a table of results similar to Table 1.
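The padding-and-truncating approach used above operates on R's default character representation of each number, so a very small value that R prints in scientific notation would be cut off after its leading digits; this may be why an entry such as 9.3540 appears in the variance column. As an alternative, the rows can be built with sprintf, which formats each number to a fixed number of decimal places. The following is a sketch only, using illustrative values rather than actual boot.sw98 output (the variances in particular are made up), and the output file name is an arbitrary choice.

```r
# Illustrative values only; in practice these would come from dea and boot.sw98.
dhat  <- c(1.0393, 1.1098, 1.0697)
bias  <- c(-0.0335, -0.0293, -0.0280)
v     <- c(0.0006, 0.00009354, 0.0003)   # second value is tiny on purpose
lower <- c(1.0424, 1.1139, 1.0732)
upper <- c(1.1223, 1.1660, 1.1320)

# "%.4f" always prints four decimal places, never scientific notation.
# The four backslashes in the R string produce the two that LaTeX needs.
rows <- sprintf("%2d & %.4f & %.4f & %.4f & %.4f & %.4f & %.4f \\\\",
                seq_along(dhat), dhat, dhat - bias, bias, v, lower, upper)

# Write one table row per line to a file in the temporary directory.
out_file <- file.path(tempdir(), "table_in.tex")
writeLines(rows, out_file)
```

With this approach the tiny variance is written as 0.0001 rather than being truncated to its leading digits.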
5.3 Example #2: Outlier detection for frontier models
Wilson (1993) describes an influence-function approach for detecting outliers in the context of frontier models.
This is of vital importance, since conventional nonparametric estimators such as FDH or DEA are very
sensitive to outliers. The Wilson (1993) method is implemented by FEAR’s ap and ap.plot commands.
The Charnes et al. (1981) data can be analyzed for outliers using the following commands:
data(ccr)
x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),
nrow=70,ncol=5))
y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=3))
tmp=ap(x,y,NDEL=12)
ap.plot(RATIO=tmp$ratio)
Here, the ap.plot command reproduces the log-ratio plot for the Charnes et al. (1981) data that appears
in Wilson (1993). The plot is drawn on-screen, in the R GUI, as shown in Figure 7. Alternatively, R can write
Postscript code for plots to a file, which can be incorporated later into LaTeX or other document-producing
software. To write the plot to a Postscript file, one would add a postscript command before the ap.plot
command in the example shown above. Type help(postscript) or use R’s HTML help facility for further
information.
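As a rough sketch of the Postscript workflow just described, the commands below open a Postscript device, draw a plot, and close the device. So that the sketch runs without FEAR, a placeholder plot stands in for the ap.plot call; the file name and the figure dimensions are arbitrary choices, not values required by FEAR.

```r
# Write to a temporary directory; the file name is an arbitrary choice.
out_file <- file.path(tempdir(), "log_ratio_plot.eps")

# Open the Postscript device; subsequent plotting commands go to the file.
postscript(file = out_file, width = 6, height = 4,
           horizontal = FALSE, onefile = FALSE, paper = "special")

# Placeholder plot; with FEAR loaded, this line would instead be
# ap.plot(RATIO=tmp$ratio) as in the example above.
plot(1:12, log(1 + (1:12) / 12), type = "b",
     xlab = "index", ylab = "placeholder values")

# Close the device so that the file is completed and written to disk.
invisible(dev.off())
```

The call to dev.off is easy to forget; until the device is closed, the file may be incomplete.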
Units  Eff. Scores (VRS)  Eff. Bias-Corrected  Bias     Variance  Lower Bound  Upper Bound
    1  1.0393             1.0728               -0.0335  0.0006    1.0424       1.1223
    2  1.1098             1.1391               -0.0293  0.0002    1.1139       1.1660
    3  1.0697             1.0977               -0.0280  0.0003    1.0732       1.1320
    4  1.1091             1.1279               -0.0188  9.3540    1.1132       1.1446
    5  1.0000             1.0454               -0.0454  0.0010    1.0033       1.1051
    6  1.0990             1.1282               -0.0292  0.0002    1.1030       1.1566
    7  1.1218             1.1427               -0.0209  0.0001    1.1255       1.1595
    8  1.1049             1.1371               -0.0322  0.0005    1.1090       1.1855
    9  1.1647             1.1889               -0.0242  0.0001    1.1683       1.2101
   10  1.0629             1.0952               -0.0323  0.0005    1.0661       1.1402
   11  1.0000             1.0456               -0.0456  0.0010    1.0038       1.1073
   12  1.0000             1.0463               -0.0463  0.0010    1.0041       1.1021
   13  1.1596             1.1782               -0.0186  7.2837    1.1642       1.1928
   14  1.0104             1.0379               -0.0275  0.0002    1.0143       1.0690
   15  1.0000             1.0554               -0.0554  0.0020    1.0034       1.1443
   16  1.0524             1.0807               -0.0283  0.0004    1.0558       1.1234
   17  1.0000             1.0536               -0.0536  0.0020    1.0034       1.1447
   18  1.0000             1.0408               -0.0408  0.0006    1.0034       1.0839
   19  1.0498             1.0764               -0.0266  0.0002    1.0534       1.1068
   20  1.0000             1.0541               -0.0541  0.0018    1.0035       1.1411
   21  1.0000             1.0529               -0.0529  0.0017    1.0037       1.1304
   22  1.0000             1.0311               -0.0311  0.0002    1.0029       1.0554
   23  1.0258             1.0476               -0.0218  0.0001    1.0293       1.0737
   24  1.0000             1.0521               -0.0521  0.0016    1.0029       1.1289
   25  1.0217             1.0381               -0.0164  5.2518    1.0250       1.0499
   26  1.0609             1.0807               -0.0198  9.8236    1.0651       1.0978
   27  1.0000             1.0457               -0.0457  0.0010    1.0034       1.1008
   28  1.0097             1.0339               -0.0242  0.0001    1.0128       1.0592
   29  1.1321             1.1673               -0.0352  0.0004    1.1370       1.2087
   30  1.1193             1.1447               -0.0254  0.0001    1.1242       1.1651
   31  1.1949             1.2185               -0.0236  0.0001    1.1995       1.2376
   32  1.0000             1.0473               -0.0473  0.0011    1.0039       1.1073
   33  1.0503             1.0737               -0.0234  0.0001    1.0543       1.0991
   34  1.1640             1.1866               -0.0226  0.0001    1.1683       1.2082
   35  1.0000             1.0428               -0.0428  0.0008    1.0037       1.0977
   36  1.2611             1.2900               -0.0289  0.0002    1.2657       1.3195
   37  1.1914             1.2243               -0.0329  0.0003    1.1960       1.2538
   38  1.0000             1.0526               -0.0526  0.0019    1.0034       1.1430
   39  1.0621             1.0943               -0.0322  0.0004    1.0661       1.1331
   40  1.0528             1.0778               -0.0250  0.0002    1.0560       1.1068
   41  1.0500             1.0640               -0.0140  4.3406    1.0536       1.0749
   42  1.0491             1.0791               -0.0300  0.0003    1.0530       1.1137
   43  1.1564             1.1871               -0.0307  0.0003    1.1610       1.2213
   44  1.0000             1.0535               -0.0535  0.0020    1.0036       1.1478
   45  1.0000             1.0358               -0.0358  0.0006    1.0037       1.0845
   46  1.0954             1.1159               -0.0205  0.0001    1.0991       1.1369
   47  1.0000             1.0516               -0.0516  0.0015    1.0037       1.1241
   48  1.0000             1.0529               -0.0529  0.0020    1.0038       1.1445
   49  1.0000             1.0458               -0.0458  0.0010    1.0032       1.1032
   50  1.0431             1.0706               -0.0275  0.0004    1.0455       1.1118
   51  1.0871             1.1106               -0.0235  0.0001    1.0908       1.1373
   52  1.0000             1.0537               -0.0537  0.0019    1.0037       1.1430
   53  1.1498             1.1739               -0.0241  0.0001    1.1539       1.1945
   54  1.0000             1.0554               -0.0554  0.0021    1.0043       1.1506
   55  1.0006             1.0313               -0.0307  0.0003    1.0047       1.0677
   56  1.0000             1.0511               -0.0511  0.0015    1.0034       1.1255
   57  1.0788             1.1096               -0.0308  0.0005    1.0819       1.1535
   58  1.0000             1.0547               -0.0547  0.0021    1.0039       1.1461
   59  1.0000             1.0540               -0.0540  0.0020    1.0032       1.1476
   60  1.0199             1.0395               -0.0196  7.9640    1.0242       1.0543
Table 1: Estimates for Charnes et al. (1981) data (2000 bootstrap replications).
Units  Eff. Scores (VRS)  Eff. Bias-Corrected  Bias     Variance  Lower Bound  Upper Bound
   61  1.1202             1.1523               -0.0321  0.0005    1.1242       1.1988
   62  1.0000             1.0543               -0.0543  0.0020    1.0035       1.1448
   63  1.0379             1.0613               -0.0234  0.0001    1.0415       1.0834
   64  1.0749             1.0932               -0.0183  8.9526    1.0784       1.1097
   65  1.0252             1.0489               -0.0237  0.0001    1.0283       1.0703
   66  1.0687             1.0833               -0.0146  4.8927    1.0722       1.0953
   67  1.0568             1.0770               -0.0202  0.0001    1.0605       1.0940
   68  1.0000             1.0554               -0.0554  0.0020    1.0038       1.1478
   69  1.0000             1.0545               -0.0545  0.0020    1.0037       1.1425
   70  1.0373             1.0563               -0.0190  8.7786    1.0411       1.0719
Table 1: (continued).
Figure 7: Log-ratio plot for Charnes et al. (1981) data produced by ap.plot command.
5.4 Example #3: Other estimators of technical efficiency:
FEAR 1.15 also includes commands to implement FDH and order-m (Cazals et al., 2002) efficiency estimators. To remain consistent with the dea command, which estimates Shephard (1970) input or output
distance functions, the commands that implement the FDH and order-m estimators are also designed to
estimate the Shephard measures, as opposed to the reciprocal Farrell-Debreu measures.
Continuing with the Charnes et al. data from the previous examples, FDH estimators of input-efficiency
can be computed by
fdh(XOBS=x,YOBS=y)
after putting the data into matrices x and y as before. Alternatively, order-m input-efficiency estimates can
be computed using
orderm(XOBS=x,YOBS=y)
with the default value m = 25 for the trimming parameter. The resulting estimates, along with DEA
estimates obtained using FEAR’s dea command, are shown in Table 2. See the FEAR Command Reference
for details on the various options that are available with these commands.
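To make the FDH estimator concrete, the sketch below computes input-oriented Shephard-type FDH estimates in base R by direct enumeration rather than linear programming. It is an illustration of the estimator's definition under stated assumptions, not the code used by FEAR's fdh command; the function name fdh_input_shephard and the toy data are hypothetical. As in the examples above, X is a (p × n) matrix of inputs and Y a (q × n) matrix of outputs.

```r
# Hypothetical helper: input-oriented FDH estimates on the Shephard scale.
fdh_input_shephard <- function(X, Y) {
  n <- ncol(X)
  theta <- numeric(n)
  for (k in seq_len(n)) {
    # Observations producing at least as much of every output as observation k.
    dominating <- which(colSums(Y >= Y[, k]) == nrow(Y))
    # For each such observation, the largest ratio of its inputs to those of k.
    ratios <- apply(X[, dominating, drop = FALSE] / X[, k], 2, max)
    # Farrell-type input efficiency: the smallest of these ratios.
    theta[k] <- min(ratios)
  }
  # The Shephard input distance function is the reciprocal of the Farrell measure.
  1 / theta
}

# Toy data: one input, one output, three observations with equal output.
X <- matrix(c(2, 4, 3), nrow = 1)
Y <- matrix(c(1, 1, 1), nrow = 1)
fdh_input_shephard(X, Y)  # 1.0 2.0 1.5
```

In the toy data the first observation uses the least input for the common output level, so it receives an estimate of 1; the second uses twice as much input and receives 2.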
The results in Table 2 provide a useful diagnostic check. It is now well-known that DEA and FDH
estimators suffer from the curse of dimensionality; i.e., their convergence rates diminish as the number of
inputs and outputs increases (see Simar and Wilson, 2000 for discussion). In Table 2, all but 5 of the
FDH estimates—almost 93 percent of the sample—are identically equal to 1. This means that most of the
apparent inefficiency implied by the DEA estimates is due solely to the convexity assumption incorporated
by DEA estimators.1 In other words, with 8 dimensions (5 inputs and 3 outputs), 70 observations are likely
far too few to obtain statistically meaningful estimates of technical efficiency. Here, the Charnes et al. (1981)
data are used only for illustrative purposes.
It is also interesting to note that most of the order-m estimates in Table 2 are less than 1. This is to be
expected, since (i) most of the FDH estimates are equal to 1, and (ii) the input-oriented order-m estimates
converge to the corresponding FDH estimates (from below) as m → ∞ for fixed sample size n. See Cazals
et al. (2002) for details.
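The relationship between the order-m and FDH estimates can be illustrated with a simple Monte Carlo approximation: for each observation, repeatedly draw m reference observations (with replacement) from those producing at least as much output, compute the FDH-type ratio on each draw, and average. This is a sketch of the idea only. FEAR's orderm command may well compute the estimates differently, and the function name, the number of replications B, and the simulated data below are all assumptions made for illustration.

```r
# Hypothetical helper: Monte Carlo approximation to input-oriented order-m
# estimates, reported on the Shephard scale (reciprocal of the Farrell measure).
orderm_input_shephard_mc <- function(X, Y, m = 25, B = 200) {
  n <- ncol(X)
  est <- numeric(n)
  for (k in seq_len(n)) {
    # Observations producing at least as much of every output as observation k.
    dominating <- which(colSums(Y >= Y[, k]) == nrow(Y))
    # Largest input ratio of each dominating observation relative to k.
    ratios <- apply(X[, dominating, drop = FALSE] / X[, k], 2, max)
    # B replications: smallest ratio among m draws taken with replacement.
    mins <- replicate(B, min(ratios[sample.int(length(ratios), m, replace = TRUE)]))
    est[k] <- 1 / mean(mins)
  }
  est
}

# Simulated data: two inputs, one output, 30 observations.
set.seed(123)
X <- matrix(runif(2 * 30, 1, 10), nrow = 2)
Y <- matrix(runif(1 * 30, 1, 10), nrow = 1)
om <- orderm_input_shephard_mc(X, Y, m = 25, B = 200)
summary(om)
```

Because the minimum over a random subset can never be smaller than the minimum over the full set of dominating observations, each value produced by this sketch is no larger than the corresponding FDH estimate, and the two coincide as m grows.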
1 FDH efficiency estimators estimate efficiency for a given point relative to the free-disposal hull of the sample observations,
while DEA efficiency estimators estimate efficiency for a given point relative to the convex hull of the free-disposal hull of the
sample observations. Therefore, any differences between FDH and DEA estimators can only be due to the DEA estimators’
incorporation of the convexity assumption on the feasible set of inputs and outputs. See Simar and Wilson (2000b).
Obs.  DEA     FDH     order-m    Obs.  DEA     FDH     order-m
   1  1.0393  1.0000  1.0000       36  1.2611  1.0000  0.9438
   2  1.1098  1.0000  0.9512       37  1.1914  1.0000  0.9409
   3  1.0697  1.0000  0.9823       38  1.0000  1.0000  0.6387
   4  1.1091  1.0513  0.9906       39  1.0621  1.0000  0.9437
   5  1.0000  1.0000  0.6757       40  1.0528  1.0000  0.8980
   6  1.0990  1.0000  0.8892       41  1.0500  1.0000  0.9558
   7  1.1218  1.0000  0.9580       42  1.0491  1.0000  0.8602
   8  1.1049  1.0000  0.9619       43  1.1564  1.0000  0.9324
   9  1.1647  1.0000  0.9864       44  1.0000  1.0000  1.0000
  10  1.0629  1.0000  0.9971       45  1.0000  1.0000  0.6929
  11  1.0000  1.0000  0.9925       46  1.0954  1.0000  0.9962
  12  1.0000  1.0000  0.9867       47  1.0000  1.0000  0.9360
  13  1.1596  1.0135  1.0065       48  1.0000  1.0000  0.6437
  14  1.0104  1.0000  0.6968       49  1.0000  1.0000  0.6611
  15  1.0000  1.0000  0.6137       50  1.0431  1.0000  0.9944
  16  1.0524  1.0000  0.9926       51  1.0871  1.0000  0.8358
  17  1.0000  1.0000  0.7684       52  1.0000  1.0000  0.9983
  18  1.0000  1.0000  0.9385       53  1.1498  1.0129  0.8856
  19  1.0498  1.0000  0.9970       54  1.0000  1.0000  1.0000
  20  1.0000  1.0000  0.9505       55  1.0006  1.0000  0.8423
  21  1.0000  1.0000  0.9871       56  1.0000  1.0000  0.5873
  22  1.0000  1.0000  0.8533       57  1.0788  1.0000  0.9204
  23  1.0258  1.0000  1.0000       58  1.0000  1.0000  0.7303
  24  1.0000  1.0000  0.8862       59  1.0000  1.0000  1.0000
  25  1.0217  1.0000  0.9463       60  1.0199  1.0000  0.8994
  26  1.0609  1.0000  0.9794       61  1.1202  1.0000  0.8117
  27  1.0000  1.0000  0.9136       62  1.0000  1.0000  0.6601
  28  1.0097  1.0000  0.8403       63  1.0379  1.0000  0.8394
  29  1.1321  1.0000  0.8477       64  1.0749  1.0000  0.9641
  30  1.1193  1.0000  0.8375       65  1.0252  1.0000  0.7755
  31  1.1949  1.0577  1.0027       66  1.0687  1.0245  0.9654
  32  1.0000  1.0000  0.7511       67  1.0568  1.0000  0.9507
  33  1.0503  1.0000  1.0000       68  1.0000  1.0000  0.8572
  34  1.1640  1.0000  0.9822       69  1.0000  1.0000  0.5724
  35  1.0000  1.0000  1.0000       70  1.0373  1.0000  0.8774
Table 2: Various estimators of input efficiency for Charnes et al. (1981) data.
5.5 Example #4: Farrell-Debreu efficiencies
As noted previously, FEAR’s efficiency estimation commands are designed to estimate the Shephard (1970)
measures of efficiency. Researchers sometimes work in terms of the Farrell (1957) definitions, where efficiency
is measured by the reciprocals of the Shephard measures. This creates no particular difficulties. Assuming
input and output observations have been placed in matrices x and y as in section 5.2, the following commands
will produce estimates of Farrell input efficiencies as well as estimates of 95-percent confidence intervals:
tmp1=dea(XOBS=x,YOBS=y)
tmp2=boot.sw98(XOBS=x,YOBS=y,DHAT=tmp1)
dhat=1/tmp1
n=ncol(x)
ci=matrix(nrow=n,ncol=2)
ci[,1]=1/tmp2$conf.int[,2]
ci[,2]=1/tmp2$conf.int[,1]
Estimates of the Farrell input-efficiencies are contained in dhat, while corresponding estimates of
95-percent confidence intervals are contained in ci. Note that taking reciprocals of the confidence interval
estimates returned by boot.sw98 requires reversing the order of the bounds; i.e., the reciprocal of the upper
bound for the Shephard measure gives the lower bound for the Farrell measure, while the reciprocal of the
lower bound for the Shephard measure gives the upper bound for the Farrell measure.2
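The reversal of the bounds can be checked with a few numbers. The sketch below uses the estimates and interval bounds reported for units 1–3 in Table 1, typed in by hand so that it runs without FEAR; the variable names are arbitrary.

```r
# Shephard estimates and interval bounds for units 1-3, copied from Table 1.
shephard_hat <- c(1.0393, 1.1098, 1.0697)
shephard_ci  <- cbind(lower = c(1.0424, 1.1139, 1.0732),
                      upper = c(1.1223, 1.1660, 1.1320))

# Farrell estimates are the reciprocals of the Shephard estimates.
farrell_hat <- 1 / shephard_hat

# The reciprocal is a decreasing function on positive numbers, so the
# Shephard upper bound yields the Farrell lower bound, and vice versa.
farrell_ci <- cbind(lower = 1 / shephard_ci[, "upper"],
                    upper = 1 / shephard_ci[, "lower"])

round(cbind(farrell_hat, farrell_ci), 4)
```

Had the columns not been swapped, each "lower" bound would exceed its "upper" bound.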
5.6 Estimating other things
The commands implemented in FEAR 1.15 are designed to be very flexible. In addition to the dea, fdh, and
orderm commands illustrated in the previous examples, commands are provided for estimating cost efficiency
(cost.min), revenue efficiency (revenue.max), and profit efficiency (profit.max). Malmquist indices and
various decompositions may be estimated using the commands malmquist.components and malmquist; in
addition, these commands can be used to obtain bootstrap estimates of confidence intervals for Malmquist
indices, etc., as described by Simar and Wilson (1999). See the FEAR Command Reference for details on
these commands and arguments that may be passed, as well as simple examples for each command.
The commands that estimate various efficiencies are designed to estimate self-efficiency among a set of
observations (e.g., to estimate efficiency for a set of n observations relative to the technology supported
by those same n observations), or to estimate efficiency for a set of points relative to a reference set of
points. In the case of the dea command, this allows one to estimate Malmquist indices, where cross-period
efficiencies must be estimated. In addition, by a series of appropriate invocations of the dea command,
one can estimate components of the various decompositions of Malmquist indices that have appeared in the
literature, or that might appear in the future (rather than using the pre-coded facility implemented by the
commands malmquist.components and malmquist mentioned earlier).
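The distinction between self-efficiency and efficiency relative to a reference set can be sketched in base R. To avoid needing a linear-programming solver, the sketch uses the FDH estimator rather than DEA. The function fdh_input, its arguments, and the data are hypothetical and are not FEAR's interface. As with the matrices x and y used earlier, observations are assumed to be stored in columns.

```r
# Farrell input-oriented FDH efficiency of each column of (xobs, yobs)
# relative to the reference set (xref, yref). Rows are inputs or outputs,
# columns are observations. Hypothetical helper; not part of FEAR.
fdh_input <- function(xobs, yobs, xref = xobs, yref = yobs) {
  sapply(seq_len(ncol(xobs)), function(i) {
    # reference points producing at least as much of every output
    dominates <- apply(yref >= yobs[, i], 2, all)
    if (!any(dominates)) return(NA_real_)
    # largest input ratio for each such point, then the smallest over points
    ratios <- apply(xref[, dominates, drop = FALSE] / xobs[, i], 2, max)
    min(ratios)
  })
}

# Made-up data: one input, one output, three firms, two periods
x1 <- matrix(c(2, 4, 3), nrow = 1); y1 <- matrix(c(1, 2, 1), nrow = 1)
x2 <- matrix(c(1, 3, 2), nrow = 1); y2 <- matrix(c(1, 2, 1), nrow = 1)

self_eff  <- fdh_input(x1, y1)                        # period 1 vs. period 1
cross_eff <- fdh_input(x1, y1, xref = x2, yref = y2)  # period 1 vs. period 2
print(rbind(self_eff, cross_eff))
```

Omitting the reference arguments gives self-efficiency. Supplying a different reference set gives the cross-period quantities needed for Malmquist indices.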
This feature distinguishes FEAR from existing software packages, where typically only estimators that
have been explicitly programmed into the package can be computed. FEAR offers sufficient flexibility to
compute a wide variety of estimates, even of quantities that perhaps have not been estimated previously.
2 Note also that one should not merely take reciprocals of the bootstrap bias and variance estimates returned by boot.sw98
to obtain the corresponding estimates for the Farrell measures.
The homogeneous bootstrap method introduced by Simar and Wilson (1998, 2000b) is implemented by
the command boot.sw98, but the ability to estimate efficiency among observations in one group relative to
observations in another group makes it easy to program other bootstrap procedures. Using the commands in
FEAR 1.15, it is straightforward to implement the bootstrap for Malmquist indices proposed by Simar and
Wilson (1999), the heterogeneous bootstrap proposed by Simar and Wilson (2000a), the bootstrap tests of
hypotheses regarding irrelevant variables or additivity relationships proposed by Simar and Wilson (2001b),
as well as the bootstrap tests of hypotheses regarding returns to scale proposed by Simar and Wilson (2001a).
Simar and Wilson (2007) discuss a bootstrap method for making inference in two-step procedures, where
one regresses DEA efficiency estimates from a first-stage estimation on observed environmental variables
in a second-stage regression. The trunc.reg command in FEAR 1.15 can be used to estimate truncated
normal regression equations as in Simar and Wilson (2006), and the rnorm.trunc command can be used to
implement either of the bootstrap algorithms proposed by Simar and Wilson. Again, see Wilson (2008) for
details on these commands.
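As a rough illustration of what drawing from a truncated normal distribution involves, the following base-R sketch uses the inverse-CDF method for a left-truncated normal. The function name rtrunc_left, its arguments, and the truncation point of 1 are illustrative assumptions. They do not describe the interface or internals of rnorm.trunc, which are documented in the Command Reference.

```r
# Draw n values from a normal(mean, sd) distribution truncated on the
# left at `lower`, by inverting the normal CDF on (F(lower), 1).
# Hypothetical helper; not FEAR's rnorm.trunc.
rtrunc_left <- function(n, mean = 0, sd = 1, lower = 0) {
  p_lower <- pnorm(lower, mean = mean, sd = sd)  # mass below the cut-off
  u <- runif(n, min = p_lower, max = 1)          # uniform on the kept mass
  qnorm(u, mean = mean, sd = sd)                 # map back to the normal scale
}

set.seed(1)
draws <- rtrunc_left(1000, mean = 0.5, sd = 1, lower = 1)
print(summary(draws))
```

The inverse-CDF approach is simple but can lose accuracy when the truncation point lies far in the upper tail, where p_lower is numerically close to 1.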
References
Andersen, P. and N. C. Petersen (1993), A procedure for ranking efficient units in data envelopment analysis,
Management Science 39, 1261–1264.
Barr, R. S. (2004), DEA software tools and technology, in W. W. Cooper, L. M. Seiford, and J. Zhu, eds.,
Handbook on Data Envelopment Analysis, Boston: Kluwer Academic Publishers, pp. 539–566.
Cazals, C., J. P. Florens, and L. Simar (2002), Nonparametric frontier estimation: A robust approach,
Journal of Econometrics 106, 1–25.
Charnes, A., W. W. Cooper, and E. Rhodes (1978), Measuring the efficiency of decision making units,
European Journal of Operational Research 2, 429–444.
— (1981), Evaluating program and managerial efficiency: An application of data envelopment analysis to
program follow through, Management Science 27, 668–697.
Dalgaard, P. (2002), Introductory Statistics with R, New York: Springer-Verlag, Inc.
Daouia, A. and L. Simar (2007), Nonparametric efficiency analysis: A multivariate conditional quantile
approach, Journal of Econometrics 140, 375–400.
Debreu, G. (1951), The coefficient of resource utilization, Econometrica 19, 273–292.
Deprins, D., L. Simar, and H. Tulkens (1984), Measuring labor inefficiency in post offices, in M. Marchand,
P. Pestieau, and H. Tulkens, eds., The Performance of Public Enterprises: Concepts and Measurements,
Amsterdam: North-Holland, pp. 243–267.
Färe, R., S. Grosskopf, and C. A. K. Lovell (1985), The Measurement of Efficiency of Production, Boston:
Kluwer-Nijhoff Publishing.
Farrell, M. J. (1957), The measurement of productive efficiency, Journal of the Royal Statistical Society A
120, 253–281.
Gattoufi, S., M. Oral, and A. Reisman (2004), Data envelopment analysis literature: A bibliography update
(1951–2001), Socio-Economic Planning Sciences 38, 159–229.
Hollingsworth, B. (1999), Data envelopment analysis and productivity analysis: A review of the options,
Economic Journal 109, F458–F462.
Kneip, A., L. Simar, and P. W. Wilson (2008), Asymptotics and consistent bootstraps for DEA estimators
in non-parametric frontier models, Econometric Theory 24, 1663–1697.
Racine, J. and R. Hyndman (2002), Using R to teach econometrics, Journal of Applied Econometrics 17,
175–189.
Shephard, R. W. (1970), Theory of Cost and Production Functions, Princeton: Princeton University Press.
Simar, L. and P. W. Wilson (1998), Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models, Management Science 44, 49–61.
— (1999), Estimating and bootstrapping Malmquist indices, European Journal of Operational Research 115,
459–471.
— (2000a), A general methodology for bootstrapping in non-parametric frontier models, Journal of Applied
Statistics 27, 779–802.
— (2000b), Statistical inference in nonparametric frontier models: The state of the art, Journal of Productivity Analysis 13, 49–78.
— (2001a), Nonparametric tests of returns to scale, European Journal of Operational Research 139, 115–132.
— (2001b), Testing restrictions in nonparametric efficiency models, Communications in Statistics 30,
159–184.
— (2006), Estimation and inference in two-stage, semi-parametric models of productive efficiency, Journal
of Econometrics In press.
— (2007), Estimation and inference in two-stage, semi-parametric models of productive efficiency, Journal
of Econometrics 136, 31–64.
Venables, W. N. and B. D. Ripley (2002), Modern Applied Statistics with S, New York: Springer-Verlag,
Inc.
Verzani, J. (2004), Using R for Introductory Statistics, London: Chapman and Hall.
Wheelock, D. C. and P. W. Wilson (2008), Non-parametric, unconditional quantile estimation for efficiency
analysis with an application to Federal Reserve check processing operations, Journal of Econometrics
145, 209–225.
Wilson, P. W. (1993), Detecting outliers in deterministic nonparametric frontier models with multiple outputs, Journal of Business and Economic Statistics 11, 319–323.
— (2008), FEAR: A software package for frontier efficiency analysis with R, Socio-Economic Planning
Sciences 42, 247–254.