FEAR 1.15 User’s Guide

Paul W. Wilson
Department of Economics
222 Sirrine Hall
Clemson University
Clemson, South Carolina 29634 USA
pww@clemson.edu

9 November 2010

This manual is for FEAR version 1.15, a library for estimating productive efficiency, etc. using R.

Copyright © 2010 Paul W. Wilson. All rights reserved.

Contents

1 Introduction
2 License Issues
3 Downloading and Installing R
4 Adding FEAR to R
  4.1 Where to get FEAR
  4.2 Installing FEAR into R
5 Estimation with FEAR
  5.1 Getting help
  5.2 Example #1: DEA estimates of technical efficiency
  5.3 Example #2: Outlier detection for frontier models
  5.4 Example #3: Other estimators of technical efficiency
  5.5 Example #4: Farrell-Debreu efficiencies
  5.6 Estimating other things

1 Introduction

An extensive literature concerning the measurement of efficiency in production has developed since Debreu (1951) and Farrell (1957) provided basic definitions for technical and allocative efficiency in production. One large section of this literature focuses on linear-programming based measures of efficiency along the lines of Charnes et al. (1978) and Färe et al. (1985). In addition, the free disposal hull (FDH) method of Deprins et al. (1984) is sometimes used; this estimator can be written as a linear program, although it is easier to compute estimates using numerical methods rather than linear programming.
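To illustrate why FDH estimates are easy to obtain without linear programming, the following minimal sketch computes input-oriented FDH estimates by simple enumeration in base R. It is not FEAR's implementation: the function name fdh_input_sketch and the toy data are hypothetical, inputs are assumed to be strictly positive, and data are arranged with one column per observation.

```r
# Hypothetical helper (not part of FEAR): input-oriented FDH by enumeration.
# x: (p x n) matrix of strictly positive inputs, y: (q x n) matrix of outputs,
# one column per observation.
fdh_input_sketch <- function(x, y) {
  n <- ncol(x)
  farrell <- numeric(n)
  for (i in seq_len(n)) {
    # Observations producing at least as much of every output as observation i
    # (observation i always qualifies, so this set is never empty).
    dominating <- which(colSums(y >= y[, i]) == nrow(y))
    # For each such observation, the largest input ratio relative to i.
    ratios <- apply(x[, dominating, drop = FALSE] / x[, i], 2, max)
    # The smallest of these ratios is the Farrell-type FDH input efficiency.
    farrell[i] <- min(ratios)
  }
  # The Shephard input distance function is the reciprocal of the Farrell measure.
  list(farrell = farrell, shephard = 1 / farrell)
}

# Toy data (assumed for illustration): 2 inputs, 1 output, 4 observations.
x_toy <- matrix(c(2, 4,  4, 2,  4, 4,  6, 6), nrow = 2)
y_toy <- matrix(c(1, 1, 1, 1), nrow = 1)
res <- fdh_input_sketch(x_toy, y_toy)
print(res$shephard)
```

Each estimate requires only comparisons and ratios over the sample, which is why no optimizer is needed. In the toy data only the fourth observation is dominated strictly in both inputs, so it is the only one with a Shephard value above 1.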
Within this literature, the estimators that rely on convexity assumptions are known as Data Envelopment Analysis (DEA). DEA estimators have been applied in more than 1,800 articles published in more than 490 refereed journals (Gattoufi et al., 2004). DEA and similar non-parametric estimators offer numerous advantages, the most obvious being that one need not specify a (potentially erroneous) functional relationship between production inputs and outputs. Although much of the nonparametric efficiency literature has ignored statistical issues such as inference, hypothesis testing, etc., the statistical properties of DEA estimators have recently been established; see Simar and Wilson (2000b) for a survey of these results, and Kneip et al. (2008) for more recent results.

Standard software packages (e.g., LIMDEP, STATA, TSP) used by econometricians do not include procedures for DEA or other nonparametric efficiency estimators. Several specialized, commercial software packages reviewed by Hollingsworth (1999) and Barr (2004) are available, and to varying degrees, these are good at what they were designed to do. Each includes facilities for reading data into the program, in some cases in a variety of formats, and procedures for estimating models that the authors have programmed into their software. A common complaint heard among practitioners, however, runs along the lines of “package X will not let me estimate the model I want!” The existing packages are designed for ease of use (again, with varying degrees of success), but the cost of this is often inflexibility, limiting the user to procedures the authors have explicitly made available. Moreover, none of the existing packages include procedures for statistical inference. Although the asymptotic distribution of DEA estimators is now known (see Kneip et al., 2008, for details) for the general case with p inputs and q outputs, bootstrap methods remain the only useful approach for inference.
None of the existing packages incorporate the bootstrap methods proposed by Simar and Wilson (1998, 2000a).

FEAR 1.15 consists of a software library that can be linked to the general-purpose statistical package R. The routines included in FEAR 1.15 allow the user to compute DEA estimates of technical, allocative, and overall efficiency while assuming either variable, non-increasing, or constant returns to scale. The routines are highly flexible, allowing measurement of efficiency of one group of observations relative to a technology defined by a second, reference group of observations. Consequently, the routines can be used to compute Malmquist indices, scale efficiency measures, super-efficiency scores along the lines of Andersen and Petersen (1993), and other measures that might be of interest. Routines are also included to facilitate implementation of the bootstrap methods described by Simar and Wilson (1998, 2000a). These features can be further used to implement methods of inference for Malmquist indices as in Simar and Wilson (1999), statistical tests for irrelevant inputs and outputs or aggregation possibilities as described in Simar and Wilson (2001b), as well as statistical tests of constant returns to scale versus non-increasing or varying returns to scale as described in Simar and Wilson (2001a). A routine for maximum likelihood estimation of a truncated regression model is included for regressing DEA efficiency estimates on environmental variables as described in Simar and Wilson (2007).

In addition, FEAR 1.15 includes commands that can be used to perform outlier analysis using the methods of Wilson (1993), and to compute FDH efficiency estimates along the lines of Deprins et al. (1984), the robust, root-n consistent order-m efficiency estimators described by Cazals et al. (2002), and the robust, root-n consistent order-α efficiency estimators described by Daouia and Simar (2007) and Wheelock and Wilson (2008).
Most of these features are unavailable in existing software packages.

2 License Issues

R is distributed under the terms of the GNU General Public License Version 2, June 1991. This license can be found on the internet at http://www.gnu.org/copyleft/gpl.html. Further details on licensing of R can be found by typing license() in the R console window described below.

FEAR 1.15 is provided under the license that appears in the file LICENSE included with the software. This license states the following:

License for the use of the R package "FEAR: Frontier Efficiency Analysis in R," copyright 2010, Paul W. Wilson

DEFINTIONS:

"OWNER" refers to the author, Paul W. Wilson, whose address is given below.

"Software" means the frontier efficiency analysis software developed by OWNER known as the "FEAR software," "FEAR package," "FEAR library," or any other designation referring to the R package "FEAR: Frontier Efficiency Analysis in R," including any source code, binary code, or user documentation.

"You" means you, the user and licensee of the Software. If you are employed and intend using the Software in connection with your employment duties, then you warrant that your employer has authorised you to accept this License on behalf of your employer.

"Academic use" includes and is limited to use of the software for scientific, academic purposes intended to result in publication of scientific papers in academic journals, in your role as a faculty member or student at a university, college, or secondary school.

"Commercial use" is any use of the software that is not specifically academic use as defined above. This includes any use by anyone working as an employee, contractor, paid consultant, or in any other capacity for a commercial firm, non-government organization, government agency, or any other organization that is not a university, college, or secondary school.

"Academic user" refers to anyone engaged in academic use of the software as defined above.
"Commercial user" refers to anyone engaged in commercial use or other non-academic use of the software as defined above.

LICENSE TERMS AND CONDITIONS:

1. All rights in this software are reserved by OWNER. You are only permitted to have this Software in your possession and to make use of it if you have agreed to the terms and conditions set forth in this license or in another license granted by OWNER.

2. Accessing or use of the FEAR software constitutes acceptance of terms and conditions set forth in this license.

3. The software is made available AS IS, without any warranty, either express or implied. OWNER bears no liability for any losses or damages (including direct, indirect, special, or consequential losses or damages or any other losses or damages) resulting from your use of the software or otherwise in connection with this license.

4. An academic users may use this software freely for ACADEMIC USE as defined above. However, OWNER retains all rights to the software.

5. This license must be distributed with the SOFTWARE.

6. You are prohibited from the following activities, unless specifically permitted by OWNER: (i) making copies of the Software except to the extent necessary for backup or research purposes; (ii) distributing, selling, sub-licensing, or otherwise making the software available for use by a third party; and (iii) removing or altering the file containing this license, any logo, copyright, or other proprietary notices, symbols, labels, or documentation in the software.

7. By accessing or using this software, you agree to not (i) reverse engineer, decompile, or deassemble the software; or (ii) use the software to develop copycat or functionally equivalent technology or derivative technologies based on the methods employed in the software.

8.
You must recognize OWNER’s authorship and ownership of the software in any report, paper or publication containing references to the software or results generated by the software by including the following citation in any such report, paper, publication, etc.:

    Wilson, Paul W. (2008), "FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R," Socio-Economic Planning Sciences 42, 247--254.

9. COMMERCIAL USE or other non-ACADEMIC USE is permitted by commercial users after payment of a license fee of US$180.00 and receipt of email from OWNER confirming payment of the license fee. Such payment allows the software to be used on one central processing unit (CPU). Commercial users desiring to run the software in parallel or grid-computing environments or on multiple CPUs or machines should contact OWNER for further information.

10. All enquiries for the use of this Software should be referred to OWNER:

    Paul W. Wilson
    Department of Economics
    Clemson University
    Clemson, South Carolina 29630 USA
    email: pww@clemson.edu
    phone: 1-864-656-2032

3 Downloading and Installing R

R is a language and environment for statistical computing and graphics. It is an implementation of the S language developed at Bell Laboratories, but unlike the commercial version of S marketed as S-Plus by Lucent Technologies, R is freely available under the Free Software Foundation’s GNU General Public License. According to the R project’s web pages (http://www.r-project.org),

“R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.... One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control....
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes (i) an effective data handling and storage facility; (ii) a suite of operators for calculations on arrays, in particular matrices; (iii) a large, coherent, integrated collection of intermediate tools for data analysis; (iv) graphical facilities for data analysis and display either on-screen or on hardcopy; and (v) a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.”

R includes an online help facility and extensive documentation in the form of manuals included with the package. In addition, several books describing uses of R are available (e.g., Dalgaard, 2002; Venables and Ripley, 2002; and Verzani, 2004); see also the recent review by Racine and Hyndman (2002).

The current version of R can be downloaded from http://lib.stat.cmu.edu/R/CRAN. Pre-compiled binary versions are available for a variety of platforms; at present, however, FEAR 1.15 is being made available only for the Microsoft Windows (NT, 95 and later) operating systems running on Intel (and clone) machines.

Most users will want to download the precompiled, binary version of R for Windows operating systems. After downloading the installer from the R web pages, close all running applications, then left-click on Start then Run, find the R installer (for R version 2.4.0, this is a file named R-2.4.0-win32.exe), and run the installer, following instructions as they appear. If all goes well, at the end of the installation process, users should find the R icon on their screen.

4 Adding FEAR to R

The next step is to make the FEAR package available to R.
4.1 Where to get FEAR

After downloading and installing R, FEAR 1.15 can be downloaded from

    http://www.economics.clemson.edu/faculty/wilson/Software/FEAR

The FEAR package is contained in a zip file (FEAR.zip), which should be copied to some location on the user’s machine. Do not unzip the file before installing into R. A copy of the paper announcing the FEAR package, Wilson (2008), this manual, and the license file for FEAR 1.15 can be found at the same site.

4.2 Installing FEAR into R

Before the FEAR package can be used, it must be “installed” into R. This must be done only once, unless upgrading to a later version of FEAR.

After downloading the FEAR package, start R by clicking on its desktop icon. This will open the R graphical user interface (GUI), shown in Figure 1. Along the top of the R GUI, find the word “Packages” and left-click on it; then left-click on “Install package(s) from local zip files...” in the pop-up menu (shown in Figure 2) that appears. Next, a Windows file-selection window will appear, as shown in Figure 3. Use the navigation buttons to find the file FEAR.zip, highlight the file name, then left-click on the “Open” button in the Windows file-selection window. The following should appear in the R console window within the R GUI:

    > utils:::menuInstallLocal()
    package ’FEAR’ successfully unpacked and MD5 sums checked
    updating HTML package descriptions
    >

Figure 1: The R Windows GUI.

Next, to make the commands in the FEAR package available for use, click on the R console window to make it active, and type library(FEAR) or require(FEAR) at the prompt; this should result in the following displayed in the R console window:

    > library(FEAR)
    FEAR (Frontier Efficiency Analysis with R) 1.0 installed
    Copyright Paul W. Wilson 2006
    See file COPYING for license and citation information
    >

Note that FEAR uses another package, KernSmooth, which is included in current distributions of R.
This package is loaded automatically by FEAR, and is used by some of the commands for bandwidth-selection in FEAR.

Figure 2: The R Packages menu.

Figure 3: The R file selection interface.

5 Estimation with FEAR

5.1 Getting help

After successfully completing the installation of R and FEAR as described in the previous sections, one can begin using FEAR. User-accessible commands are listed and described in the FEAR Command Reference available on the FEAR website. One can also list the available commands in FEAR by typing help(package=FEAR) at the prompt in the R console window (remember, to access the FEAR package, one must type library(FEAR) or require(FEAR) first).

Alternatively, R’s HTML help facility can also be used. Left-click on “Help” at the top of the R GUI, and then on “HTML help” in the pop-up menu that appears (see Figure 4). This will open a browser window and display links for R manuals, packages, etc. as shown in Figure 5. In this window, click on the “Packages” link to get a list of available packages, then click on “FEAR” to display a list of commands available in the FEAR package as shown in Figure 6. Clicking on the command names will result in more detailed explanations.

Figure 4: The R help pop-up menu.

Next, some simple examples using commands from the FEAR package are given. FEAR includes several datasets that can be used to illustrate the package’s capabilities.

5.2 Example #1: DEA estimates of technical efficiency

Typing data(ccr) loads the data given in Charnes et al. (1981). These data contain observations on p = 5 inputs and q = 3 outputs for n = 70 schools. The following commands create a (p × n) matrix of input vectors and a (q × n) matrix of outputs from the Charnes et al.
data:

    data(ccr)
    x=matrix(nrow=5,ncol=70)
    x[1,]=ccr$x1
    x[2,]=ccr$x2
    x[3,]=ccr$x3
    x[4,]=ccr$x4
    x[5,]=ccr$x5
    y=matrix(nrow=3,ncol=70)
    y[1,]=ccr$y1
    y[2,]=ccr$y2
    y[3,]=ccr$y3

Alternatively, one might type

    data(ccr)
    x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),
        nrow=70,ncol=5))
    y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=3))

The second example puts the data into a long vector (using the c command), assigns this to a matrix (R puts data column-wise into matrices), and then assigns the transpose of this matrix (obtained using R’s t command) to either x or y.

Figure 5: The R HTML help facility.

Figure 6: The R HTML help facility showing commands in FEAR 1.15.

Next, FEAR’s dea command can be used to estimate technical efficiency for the observations in the Charnes et al. (1981) data, relative to the technology estimated from these observations. The command dea computes estimates of either Shephard (1970) input or output distance functions; here, the input-orientation is used:

    dhat=dea(XOBS=x,YOBS=y)

Alternatively, output distance functions could be estimated by adding the argument ORIENTATION=2 to the dea command above. The default is to allow variable returns to scale in the estimated technology, but constant returns or non-increasing returns to scale are also possible using the RTS argument; see documentation for the dea command in the FEAR Command Reference.
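The (p × n) layout built in the two listings above depends on R filling matrices column by column. The sketch below checks this on a small hypothetical data frame d (a stand-in for a dataset such as ccr, with made-up values) and shows two equivalent constructions that avoid hard-coding the dimensions. It uses base R only and makes no assumptions about FEAR itself.

```r
# Hypothetical data frame: one row per observation, columns x1, x2 (inputs).
d <- data.frame(x1 = c(2, 4, 4), x2 = c(4, 2, 4))

# Route used in the text: stack the columns into one long vector, fill an
# (n x p) matrix column by column, then transpose to (p x n).
x_long <- t(matrix(c(d$x1, d$x2), nrow = nrow(d), ncol = 2))

# Equivalent routes that do not hard-code the dimensions.
x_rbind <- rbind(d$x1, d$x2)
x_frame <- unname(t(as.matrix(d[, c("x1", "x2")])))

# Filling a (p x n) matrix directly from the long vector, without the
# transpose, mixes values from different inputs within a row.
x_wrong <- matrix(c(d$x1, d$x2), nrow = 2, ncol = nrow(d))

print(x_long)
print(x_wrong)
```

Comparing the two printed matrices shows why the transpose in the second listing matters: in x_long each row holds one input across all observations, while x_wrong does not.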
The command boot.sw98 can be used to estimate confidence intervals via the homogeneous bootstrap method described by Simar and Wilson (1998, 2000b):

    tmp=boot.sw98(XOBS=x,YOBS=y,DHAT=dhat,NREP=2000)

Finally, the results from the dea and boot.sw98 commands can be manipulated to produce LaTeX code for a table:

    n=ncol(x)                    #number of DMUs
    table.in=matrix(nrow=n,ncol=7)
    table.in[,1]=c(1:n)
    table.in[,2]=dhat
    table.in[,3]=dhat-tmp$bias   #bias-corrected estimate
    table.in[,4]=tmp$bias
    table.in[,5]=tmp$var
    table.in[,6:7]=tmp$conf.int
    table.in[1:9,1]=paste(" ",table.in[1:9,1],sep="")

At this point, table.in is a (70 × 7) matrix, with each row corresponding to an observation in the Charnes et al. (1981) data. The first column contains the observation number (1–70); the second column contains the DEA estimates of the Shephard input distance function for each observation; column 3 contains bias-corrected estimates of the Shephard input distance function, obtained by subtracting the bootstrap bias estimate from the original distance function estimates in dhat; columns 4–5 contain the bootstrap bias and variance estimates, respectively; and columns 6–7 contain estimated lower and upper bounds for 95-percent confidence intervals obtained by bootstrapping.

Continuing to format the results for use in a LaTeX table,

    table.in[,2]=ifelse(nchar(table.in[,2])==1,
        paste(table.in[,2],".",sep=""),
        table.in[,2])
    table.in[,2:7]=paste(table.in[,2:7],"000000",sep="")
    table.in[,c(2:3,5:7)]=substr(table.in[,c(2:3,5:7)],1,6)
    table.in[,4]=substr(table.in[,4],1,7)
    table.in=paste(table.in[,1]," & ",
        table.in[,2]," & ",
        table.in[,3]," & ",
        table.in[,4]," & ",
        table.in[,5]," & ",
        table.in[,6]," & ",
        table.in[,7]," \\",sep="")

Note that table.in has been converted to a vector of length 70 by these commands.
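As an aside, the padding-and-truncation steps above operate on the character form of each number. A possible alternative, sketched here with base R's sprintf, formats every value in fixed notation with four decimals. The numbers below are hypothetical stand-ins rather than FEAR output, and the names est, fmt, and rows_tex are introduced only for this illustration. One reason to consider this route is that very small values are printed by R in scientific notation, and truncating such a string keeps the mantissa but drops the exponent; whether this affects any particular table depends on the values involved.

```r
# Hypothetical numbers standing in for columns such as dhat, tmp$bias, tmp$var.
est <- cbind(c(1.0393, 1.1098),
             c(-0.0335, -0.0293),
             c(6.1e-04, 9.354012e-05))

# Truncating the character form of a value that R prints in scientific
# notation keeps only the leading digits and silently loses the exponent.
truncated <- substr(as.character(est[2, 3]), 1, 6)

# sprintf("%.4f") always writes fixed notation with four decimals.
fmt <- matrix(sprintf("%.4f", est), nrow = nrow(est))

# One LaTeX row per observation; "\\\\" in R source is two literal
# backslashes, the LaTeX row terminator.
rows_tex <- paste0(sprintf("%2d", seq_len(nrow(est))), " & ",
                   apply(fmt, 1, paste, collapse = " & "), " \\\\")

cat(rows_tex, sep = "\n")
```

The resulting character vector can be written to a file with write() in the same way as any other character vector.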
Typing table.in[1:5] at the prompt in the R console window will display the first five elements of table.in as shown here:

     1 & 1.0393 & 1.0728 & -0.0335 & 0.0006 & 1.0424 & 1.1223 \\
     2 & 1.1098 & 1.1391 & -0.0293 & 0.0002 & 1.1139 & 1.1660 \\
     3 & 1.0697 & 1.0977 & -0.0280 & 0.0003 & 1.0732 & 1.1320 \\
     4 & 1.1091 & 1.1279 & -0.0188 & 9.3540 & 1.1132 & 1.1446 \\
     5 & 1.0000 & 1.0454 & -0.0454 & 0.0010 & 1.0033 & 1.1051 \\

The LaTeX code in table.in can be written to a file named table in.tex using the command

    write(table.in,file="table in.tex")

Inserting the code into a LaTeX document, one can produce a table of results similar to Table 1.

Table 1: Estimates for Charnes et al. (1981) data (2000 bootstrap replications).

    Units  Eff. Scores  Eff. Bias-   Bias     Variance  Lower   Upper
           (VRS)        Corrected                       Bound   Bound
      1    1.0393       1.0728      -0.0335   0.0006    1.0424  1.1223
      2    1.1098       1.1391      -0.0293   0.0002    1.1139  1.1660
      3    1.0697       1.0977      -0.0280   0.0003    1.0732  1.1320
      4    1.1091       1.1279      -0.0188   9.3540    1.1132  1.1446
      5    1.0000       1.0454      -0.0454   0.0010    1.0033  1.1051
      6    1.0990       1.1282      -0.0292   0.0002    1.1030  1.1566
      7    1.1218       1.1427      -0.0209   0.0001    1.1255  1.1595
      8    1.1049       1.1371      -0.0322   0.0005    1.1090  1.1855
      9    1.1647       1.1889      -0.0242   0.0001    1.1683  1.2101
     10    1.0629       1.0952      -0.0323   0.0005    1.0661  1.1402
     11    1.0000       1.0456      -0.0456   0.0010    1.0038  1.1073
     12    1.0000       1.0463      -0.0463   0.0010    1.0041  1.1021
     13    1.1596       1.1782      -0.0186   7.2837    1.1642  1.1928
     14    1.0104       1.0379      -0.0275   0.0002    1.0143  1.0690
     15    1.0000       1.0554      -0.0554   0.0020    1.0034  1.1443
     16    1.0524       1.0807      -0.0283   0.0004    1.0558  1.1234
     17    1.0000       1.0536      -0.0536   0.0020    1.0034  1.1447
     18    1.0000       1.0408      -0.0408   0.0006    1.0034  1.0839
     19    1.0498       1.0764      -0.0266   0.0002    1.0534  1.1068
     20    1.0000       1.0541      -0.0541   0.0018    1.0035  1.1411
     21    1.0000       1.0529      -0.0529   0.0017    1.0037  1.1304
     22    1.0000       1.0311      -0.0311   0.0002    1.0029  1.0554
     23    1.0258       1.0476      -0.0218   0.0001    1.0293  1.0737
     24    1.0000       1.0521      -0.0521   0.0016    1.0029  1.1289
     25    1.0217       1.0381      -0.0164   5.2518    1.0250  1.0499
     26    1.0609       1.0807      -0.0198   9.8236    1.0651  1.0978
     27    1.0000       1.0457      -0.0457   0.0010    1.0034  1.1008
     28    1.0097       1.0339      -0.0242   0.0001    1.0128  1.0592
     29    1.1321       1.1673      -0.0352   0.0004    1.1370  1.2087
     30    1.1193       1.1447      -0.0254   0.0001    1.1242  1.1651
     31    1.1949       1.2185      -0.0236   0.0001    1.1995  1.2376
     32    1.0000       1.0473      -0.0473   0.0011    1.0039  1.1073
     33    1.0503       1.0737      -0.0234   0.0001    1.0543  1.0991
     34    1.1640       1.1866      -0.0226   0.0001    1.1683  1.2082
     35    1.0000       1.0428      -0.0428   0.0008    1.0037  1.0977
     36    1.2611       1.2900      -0.0289   0.0002    1.2657  1.3195
     37    1.1914       1.2243      -0.0329   0.0003    1.1960  1.2538
     38    1.0000       1.0526      -0.0526   0.0019    1.0034  1.1430
     39    1.0621       1.0943      -0.0322   0.0004    1.0661  1.1331
     40    1.0528       1.0778      -0.0250   0.0002    1.0560  1.1068
     41    1.0500       1.0640      -0.0140   4.3406    1.0536  1.0749
     42    1.0491       1.0791      -0.0300   0.0003    1.0530  1.1137
     43    1.1564       1.1871      -0.0307   0.0003    1.1610  1.2213
     44    1.0000       1.0535      -0.0535   0.0020    1.0036  1.1478
     45    1.0000       1.0358      -0.0358   0.0006    1.0037  1.0845
     46    1.0954       1.1159      -0.0205   0.0001    1.0991  1.1369
     47    1.0000       1.0516      -0.0516   0.0015    1.0037  1.1241
     48    1.0000       1.0529      -0.0529   0.0020    1.0038  1.1445
     49    1.0000       1.0458      -0.0458   0.0010    1.0032  1.1032
     50    1.0431       1.0706      -0.0275   0.0004    1.0455  1.1118
     51    1.0871       1.1106      -0.0235   0.0001    1.0908  1.1373
     52    1.0000       1.0537      -0.0537   0.0019    1.0037  1.1430
     53    1.1498       1.1739      -0.0241   0.0001    1.1539  1.1945
     54    1.0000       1.0554      -0.0554   0.0021    1.0043  1.1506
     55    1.0006       1.0313      -0.0307   0.0003    1.0047  1.0677
     56    1.0000       1.0511      -0.0511   0.0015    1.0034  1.1255
     57    1.0788       1.1096      -0.0308   0.0005    1.0819  1.1535
     58    1.0000       1.0547      -0.0547   0.0021    1.0039  1.1461
     59    1.0000       1.0540      -0.0540   0.0020    1.0032  1.1476
     60    1.0199       1.0395      -0.0196   7.9640    1.0242  1.0543
     61    1.1202       1.1523      -0.0321   0.0005    1.1242  1.1988
     62    1.0000       1.0543      -0.0543   0.0020    1.0035  1.1448
     63    1.0379       1.0613      -0.0234   0.0001    1.0415  1.0834
     64    1.0749       1.0932      -0.0183   8.9526    1.0784  1.1097
     65    1.0252       1.0489      -0.0237   0.0001    1.0283  1.0703
     66    1.0687       1.0833      -0.0146   4.8927    1.0722  1.0953
     67    1.0568       1.0770      -0.0202   0.0001    1.0605  1.0940
     68    1.0000       1.0554      -0.0554   0.0020    1.0038  1.1478
     69    1.0000       1.0545      -0.0545   0.0020    1.0037  1.1425
     70    1.0373       1.0563      -0.0190   8.7786    1.0411  1.0719

5.3 Example #2: Outlier detection for frontier models

Wilson (1993) describes an influence-function approach for detecting outliers in the context of frontier models. This is of vital importance, since conventional nonparametric estimators such as FDH or DEA are very sensitive to outliers. The Wilson (1993) method is implemented by FEAR’s ap and ap.plot commands. The Charnes et al. (1981) data can be analyzed for outliers using the following commands:

    data(ccr)
    x=t(matrix(c(ccr$x1,ccr$x2,ccr$x3,ccr$x4,ccr$x5),
        nrow=70,ncol=5))
    y=t(matrix(c(ccr$y1,ccr$y2,ccr$y3),nrow=70,ncol=3))
    tmp=ap(x,y,NDEL=12)
    ap.plot(RATIO=tmp$ratio)

Here, the ap.plot command reproduces the log-ratio plot for the Charnes et al. (1981) data that appears in Wilson (1993). The plot is drawn on-screen, in the R GUI, as shown in Figure 7. Alternatively, R can write Postscript code for plots to a file, which can be incorporated later into LaTeX or other document-producing software. To write the plot to a Postscript file, one would add a postscript command before the ap.plot command in the example shown above. Type help(postscript) or use R’s HTML help facility for further information.

Figure 7: Log-ratio plot for Charnes et al. (1981) data produced by ap.plot command.

5.4 Example #3: Other estimators of technical efficiency

FEAR 1.15 also includes commands to implement FDH and order-m (Cazals et al., 2002) efficiency estimators. To remain consistent with the dea command, which estimates Shephard (1970) input or output distance functions, the commands that implement the FDH and order-m estimators are also designed to estimate the Shephard measures, as opposed to the reciprocal Farrell-Debreu measures. Continuing with the Charnes et al.
data from the previous examples, FDH estimators of input-efficiency can be computed by

    fdh(XOBS=x,YOBS=y)

after putting the data into matrices x and y as before. Alternatively, order-m input-efficiency estimates can be computed using

    orderm(XOBS=x,YOBS=y)

with the default value m = 25 for the trimming parameter. The resulting estimates, along with DEA estimates obtained using FEAR’s dea command, are shown in Table 2. See the FEAR Command Reference for details on the various options that are available with these commands.

The results in Table 2 provide a useful diagnostic check. It is now well-known that DEA and FDH estimators suffer from the curse of dimensionality; i.e., their convergence rates diminish as the number of inputs and outputs increases (see Simar and Wilson, 2000 for discussion). In Table 2, all but 5 of the FDH estimates (almost 93 percent of the sample) are identically equal to 1. This means that most of the apparent inefficiency implied by the DEA estimates is due solely to the convexity assumption incorporated by DEA estimators.[1] In other words, with 8 dimensions (5 inputs and 3 outputs), 70 observations are likely far too few to obtain statistically meaningful estimates of technical efficiency. Here, the Charnes et al. (1981) data are used only for illustrative purposes.

It is also interesting to note that most of the order-m estimates in Table 2 are less than 1. This is to be expected, since (i) most of the FDH estimates are equal to 1, and (ii) the input-oriented order-m estimates converge to the corresponding FDH estimates (from below) as m → ∞ for fixed sample size n. See Cazals et al. (2002) for details.

5.5 Example #4: Farrell-Debreu efficiencies

As noted previously, FEAR’s efficiency estimation commands are designed to estimate the Shephard (1970) measures of efficiency. Researchers sometimes work in terms of the Farrell (1957) definitions, where efficiency is measured by the reciprocals of the Shephard measures.
This creates no particular difficulties. Assuming input and output observations have been placed in matrices x and y as in section 5.2, the following commands will produce estimates of Farrell input efficiencies as well as estimates of 95-percent confidence intervals:

    tmp1=dea(XOBS=x,YOBS=y)
    tmp2=boot.sw98(XOBS=x,YOBS=y,DHAT=tmp1)
    dhat=1/tmp1
    n=ncol(x)
    ci=matrix(nrow=n,ncol=2)
    ci[,1]=1/tmp2$conf.int[,2]
    ci[,2]=1/tmp2$conf.int[,1]

Estimates of the Farrell input-efficiencies are contained in dhat, while corresponding estimates of 95-percent confidence intervals are contained in ci. Note that taking reciprocals of the confidence interval estimates returned by boot.sw98 requires reversing the order of the bounds; i.e., the reciprocal of the upper bound for the Shephard measure gives the lower bound for the Farrell measure, while the reciprocal of the lower bound for the Shephard measures gives the upper bound for the Farrell measure.[2]

Table 2: Various estimators of input efficiency for Charnes et al. (1981) data.

    Obs.  DEA     FDH     order-m
      1   1.0393  1.0000  1.0000
      2   1.1098  1.0000  0.9512
      3   1.0697  1.0000  0.9823
      4   1.1091  1.0513  0.9906
      5   1.0000  1.0000  0.6757
      6   1.0990  1.0000  0.8892
      7   1.1218  1.0000  0.9580
      8   1.1049  1.0000  0.9619
      9   1.1647  1.0000  0.9864
     10   1.0629  1.0000  0.9971
     11   1.0000  1.0000  0.9925
     12   1.0000  1.0000  0.9867
     13   1.1596  1.0135  1.0065
     14   1.0104  1.0000  0.6968
     15   1.0000  1.0000  0.6137
     16   1.0524  1.0000  0.9926
     17   1.0000  1.0000  0.7684
     18   1.0000  1.0000  0.9385
     19   1.0498  1.0000  0.9970
     20   1.0000  1.0000  0.9505
     21   1.0000  1.0000  0.9871
     22   1.0000  1.0000  0.8533
     23   1.0258  1.0000  1.0000
     24   1.0000  1.0000  0.8862
     25   1.0217  1.0000  0.9463
     26   1.0609  1.0000  0.9794
     27   1.0000  1.0000  0.9136
     28   1.0097  1.0000  0.8403
     29   1.1321  1.0000  0.8477
     30   1.1193  1.0000  0.8375
     31   1.1949  1.0577  1.0027
     32   1.0000  1.0000  0.7511
     33   1.0503  1.0000  1.0000
     34   1.1640  1.0000  0.9822
     35   1.0000  1.0000  1.0000
     36   1.2611  1.0000  0.9438
     37   1.1914  1.0000  0.9409
     38   1.0000  1.0000  0.6387
     39   1.0621  1.0000  0.9437
     40   1.0528  1.0000  0.8980
     41   1.0500  1.0000  0.9558
     42   1.0491  1.0000  0.8602
     43   1.1564  1.0000  0.9324
     44   1.0000  1.0000  1.0000
     45   1.0000  1.0000  0.6929
     46   1.0954  1.0000  0.9962
     47   1.0000  1.0000  0.9360
     48   1.0000  1.0000  0.6437
     49   1.0000  1.0000  0.6611
     50   1.0431  1.0000  0.9944
     51   1.0871  1.0000  0.8358
     52   1.0000  1.0000  0.9983
     53   1.1498  1.0129  0.8856
     54   1.0000  1.0000  1.0000
     55   1.0006  1.0000  0.8423
     56   1.0000  1.0000  0.5873
     57   1.0788  1.0000  0.9204
     58   1.0000  1.0000  0.7303
     59   1.0000  1.0000  1.0000
     60   1.0199  1.0000  0.8994
     61   1.1202  1.0000  0.8117
     62   1.0000  1.0000  0.6601
     63   1.0379  1.0000  0.8394
     64   1.0749  1.0000  0.9641
     65   1.0252  1.0000  0.7755
     66   1.0687  1.0245  0.9654
     67   1.0568  1.0000  0.9507
     68   1.0000  1.0000  0.8572
     69   1.0000  1.0000  0.5724
     70   1.0373  1.0000  0.8774

[1] FDH efficiency estimators estimate efficiency for a given point relative to the free-disposal hull of the sample observations, while DEA efficiency estimators estimate efficiency for a given point relative to the convex hull of the free-disposal hull of the sample observations. Therefore, any differences between FDH and DEA estimators can only be due to the DEA estimators’ incorporation of the convexity assumption on the feasible set of inputs and outputs. See Simar and Wilson (2000b).

5.6 Estimating other things

The commands implemented in FEAR 1.15 are designed to be very flexible.
In addition to the dea, fdh, and orderm commands illustrated in the previous examples, commands are provided for estimating cost efficiency (cost.min), revenue efficiency (revenue.max), and profit efficiency (profit.max). Malmquist indices and various decompositions may be estimated using the commands malmquist.components and malmquist; in addition, these commands can be used to obtain bootstrap estimates of confidence intervals for Malmquist indices, etc., as described by Simar and Wilson (1999). See the FEAR Command Reference for details on these commands and arguments that may be passed, as well as simple examples for each command.

The commands that estimate various efficiencies are designed to estimate self-efficiency among a set of observations (e.g., to estimate efficiency for a set of n observations relative to the technology supported by those same n observations), or to estimate efficiency for a set of points relative to a reference set of points. In the case of the dea command, this allows one to estimate Malmquist indices, where cross-period efficiencies must be estimated. In addition, by a series of appropriate invocations of the dea command, one can estimate components of the various decompositions of Malmquist indices that have appeared in the literature, or that might appear in the future (rather than using the pre-coded facility implemented by the commands malmquist.components and malmquist mentioned earlier). This feature distinguishes FEAR from existing software packages, where typically only estimators that have been explicitly programmed into the package can be computed. FEAR offers sufficient flexibility to compute a wide variety of estimates, even of quantities that perhaps have not been estimated previously.

[2] Note also that one should not merely take reciprocals of the bootstrap bias and variance estimates returned by boot.sw98 to obtain the corresponding Farrell measures.
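The point about reciprocals can be checked with a few made-up numbers. The sketch below uses base R only; the Shephard estimates, interval bounds, and bootstrap replicates are hypothetical values chosen for illustration, not output from dea or boot.sw98. The bias formula used (mean of the replicates minus the original estimate) is a generic bootstrap bias estimate and is an assumption here; the source does not show how boot.sw98 computes its bias internally.

```r
# Hypothetical Shephard input-distance estimates and interval bounds.
shephard <- c(1.25, 1.00, 2.00)
ci_shephard <- cbind(lower = c(1.30, 1.02, 2.10),
                     upper = c(1.60, 1.25, 2.50))

# Farrell measures are reciprocals of the Shephard measures.
farrell <- 1 / shephard

# The reciprocal is a decreasing function, so the bounds swap places.
ci_farrell <- cbind(lower = 1 / ci_shephard[, "upper"],
                    upper = 1 / ci_shephard[, "lower"])

# Hypothetical bootstrap replicates for the first observation.
boot_shephard <- c(1.2, 1.3, 1.4)

# Generic bootstrap bias estimate on the Shephard scale (assumed formula).
bias_shephard <- mean(boot_shephard) - shephard[1]

# Bias on the Farrell scale: transform each replicate first, then average.
bias_farrell <- mean(1 / boot_shephard) - farrell[1]

# Taking the reciprocal of the Shephard bias gives something quite different.
naive <- 1 / bias_shephard

print(ci_farrell)
print(c(bias_farrell = bias_farrell, naive = naive))
```

With these numbers the naive reciprocal is 20, while the bias computed from transformed replicates is small and negative, which illustrates why the bias and variance need to be recomputed on the Farrell scale rather than inverted.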
The homogeneous bootstrap method introduced by Simar and Wilson (1998, 2000b) is implemented by the command boot.sw98, but the ability to estimate efficiency among observations in one group relative to observations in another group makes it easy to program other bootstrap procedures. Using the commands in FEAR 1.15, it is straightforward to implement the bootstrap for Malmquist indices proposed by Simar and Wilson (1999), the heterogeneous bootstrap proposed by Simar and Wilson (2000a), the bootstrap tests of hypotheses regarding irrelevant variables or additivity relationships proposed by Simar and Wilson (2001b), as well as the bootstrap tests of hypotheses regarding returns to scale proposed by Simar and Wilson (2001a).

Simar and Wilson (2007) discuss a bootstrap method for making inference in two-step procedures, where one regresses DEA efficiency estimates from a first-stage estimation on observed environmental variables in a second-stage regression. The trunc.reg command in FEAR 1.15 can be used to estimate truncated normal regression equations as in Simar and Wilson (2006), and the rnorm.trunc command can be used to implement either of the bootstrap algorithms proposed by Simar and Wilson. Again, see Wilson (2008) for details on these commands.

References

Andersen, P. and N. C. Petersen (1993), A procedure for ranking efficient units in data envelopment analysis, Management Science 39, 1261–1264.

Barr, R. S. (2004), DEA software tools and technology, in W. W. Cooper, L. M. Seiford, and J. Zhu, eds., Handbook on Data Envelopment Analysis, Boston: Kluwer Academic Publishers, pp. 539–566.

Cazals, C., J. P. Florens, and L. Simar (2002), Nonparametric frontier estimation: A robust approach, Journal of Econometrics 106, 1–25.

Charnes, A., W. W. Cooper, and E. Rhodes (1978), Measuring the efficiency of decision making units, European Journal of Operational Research 2, 429–444.

Charnes, A., W. W. Cooper, and E. Rhodes (1981), Evaluating program and managerial efficiency: An application of data envelopment analysis to program follow through, Management Science 27, 668–697.

Dalgaard, P. (2002), Introductory Statistics with R, New York: Springer-Verlag, Inc.

Daouia, A. and L. Simar (2007), Nonparametric efficiency analysis: A multivariate conditional quantile approach, Journal of Econometrics 140, 375–400.

Debreu, G. (1951), The coefficient of resource utilization, Econometrica 19, 273–292.

Deprins, D., L. Simar, and H. Tulkens (1984), Measuring labor inefficiency in post offices, in M. Marchand, P. Pestieau, and H. Tulkens, eds., The Performance of Public Enterprises: Concepts and Measurements, Amsterdam: North-Holland, pp. 243–267.

Färe, R., S. Grosskopf, and C. A. K. Lovell (1985), The Measurement of Efficiency of Production, Boston: Kluwer-Nijhoff Publishing.

Farrell, M. J. (1957), The measurement of productive efficiency, Journal of the Royal Statistical Society A 120, 253–281.

Gattoufi, S., M. Oral, and A. Reisman (2004), Data envelopment analysis literature: A bibliography update (1951–2001), Socio-Economic Planning Sciences 38, 159–229.

Hollingsworth, B. (1999), Data envelopment analysis and productivity analysis: A review of the options, Economic Journal 109, F458–F462.

Kneip, A., L. Simar, and P. W. Wilson (2008), Asymptotics and consistent bootstraps for DEA estimators in non-parametric frontier models, Econometric Theory 24, 1663–1697.

Racine, J. and R. Hyndman (2002), Using R to teach econometrics, Journal of Applied Econometrics 17, 175–189.

Shephard, R. W. (1970), Theory of Cost and Production Functions, Princeton: Princeton University Press.

Simar, L. and P. W. Wilson (1998), Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models, Management Science 44, 49–61.

Simar, L. and P. W. Wilson (1999), Estimating and bootstrapping Malmquist indices, European Journal of Operational Research 115, 459–471.

Simar, L. and P. W. Wilson (2000a), A general methodology for bootstrapping in non-parametric frontier models, Journal of Applied Statistics 27, 779–802.

Simar, L. and P. W. Wilson (2000b), Statistical inference in nonparametric frontier models: The state of the art, Journal of Productivity Analysis 13, 49–78.

Simar, L. and P. W. Wilson (2001a), Nonparametric tests of returns to scale, European Journal of Operational Research 139, 115–132.

Simar, L. and P. W. Wilson (2001b), Testing restrictions in nonparametric efficiency models, Communications in Statistics 30, 159–184.

Simar, L. and P. W. Wilson (2006), Estimation and inference in two-stage, semi-parametric models of productive efficiency, Journal of Econometrics, in press.

Simar, L. and P. W. Wilson (2007), Estimation and inference in two-stage, semi-parametric models of productive efficiency, Journal of Econometrics 136, 31–64.

Venables, W. N. and B. D. Ripley (2002), Modern Applied Statistics with S, New York: Springer-Verlag, Inc.

Verzani, J. (2004), Using R for Introductory Statistics, London: Chapman and Hall.

Wheelock, D. C. and P. W. Wilson (2008), Non-parametric, unconditional quantile estimation for efficiency analysis with an application to Federal Reserve check processing operations, Journal of Econometrics 145, 209–225.

Wilson, P. W. (1993), Detecting outliers in deterministic nonparametric frontier models with multiple outputs, Journal of Business and Economic Statistics 11, 319–323.

Wilson, P. W. (2008), FEAR: A software package for frontier efficiency analysis with R, Socio-Economic Planning Sciences 42, 247–254.