Data Envelopment Analysis
Robert M. Hayes
2005

Overview
Introduction
Data Envelopment Analysis
DEA Models
Extensions to include a priori Valuations
Strengths and Weaknesses of DEA
Implementation of DEA
The Example of Libraries
Annals of Operations Research 66
Annals of Operations Research 73

Introduction
Utility Functions
Cost/Effectiveness
Interpretation for Libraries

Utility Functions
A fundamental requirement in applying operations research models is the identification of a "utility function" which combines all variables relevant to a decision problem into a single variable which is to be optimized. Underlying the concept of a utility function is the view that it should represent the decision-maker's perceptions of the relative importance of the variables involved rather than being regarded as uniform across all decision-makers or externally imposed.
The problem, of course, is that the resulting utility functions may bear no relationship to each other, and it is therefore difficult to make comparisons from one decision context to another. Indeed, not only may it not be possible to compare two different decision-makers, but it may not be possible to compare the utility functions of a single decision-maker from one context to another.

Cost/Effectiveness
A traditional way to combine variables in a utility function is to use a cost/effectiveness ratio, called an "efficiency" measure. It measures utility by the "cost per unit produced". On the surface, that would appear to make comparison between two contexts possible by comparing the two cost/effectiveness ratios. The problem, though, is that two different decision-makers may place different emphases on the two variables.
It also must be recognized that effectiveness will usually entail consideration of a number of products and services, and costs a number of sources of cost.
Cost/effectiveness measurement requires combining the sources of cost into a single measure of cost and the products and services into a single measure of effectiveness. Again, the problem of different emphases of importance must be recognized. This is especially the case for the several measures of effectiveness. But it may also be the case with the several measures of cost, since some costs may be regarded as more important than others even though they may all be measured in dollars. When some costs cannot be measured in dollars, the problem is compounded.
More generally, instead of costs and effectiveness, the variables may be identified as "input" and "output". The efficiency ratio is then no longer characterized as cost/effectiveness but as "output/input", but the issues identified above are the same.

Interpretation for Libraries
This issue can be illustrated by evaluating library performance. Effectiveness here is the extent to which library services meet the expectations or goals set by the organization served. It is likely to be measured by several services which are the outputs of library operations: making a collection available for use, circulation or other uses of materials, answering of information questions, instructing and consulting. Inputs are represented by acquisitions, staff, and space, to which evident costs can be assigned, but they are also represented by measures of the populations served.
Efficiency measures the library's ability to transform its inputs (resources and demands) into production of outputs (services). The objective in doing so is to optimize the balance between the level of outputs and the level of inputs. The success of the library, like that of other organizations, depends on its ability to behave both effectively and efficiently. The issue at hand, then, is how to combine the several measures of input and output into a single measure of efficiency.
The method we will examine is that called "data envelopment analysis".

Data Envelopment Analysis
Data Envelopment Analysis (DEA) measures the relative efficiencies of organizations with multiple inputs and multiple outputs. The organizations are called the decision-making units, or DMUs. DEA assigns weights to the inputs and outputs of a DMU that give it the best possible efficiency. It thus arrives at a weighting of the relative importance of the input and output variables that reflects the emphasis that appears to have been placed on them for that particular DMU. At the same time, though, DEA then gives all the other DMUs the same weights and compares the resulting efficiencies with that for the DMU of focus.
If the focus DMU looks at least as good as any other DMU, it receives a maximum efficiency score. But if some other DMU looks better than the focus DMU, the weights having been calculated to be most favorable to the focus DMU, then it will receive an efficiency score less than maximum.

Graphical Illustration
To illustrate, consider seven DMUs which each have one input and one output, written as (input, output): L1 = (2,2), L2 = (3,5), L3 = (6,7), L4 = (9,8), L5 = (5,3), L6 = (4,1), L7 = (10,7).
[Figure: the seven DMUs plotted with Input on the horizontal axis (0 to 11) and Output on the vertical axis (0 to 9).]
DEA identifies the units in the comparison set which lie at the top and to the left, as represented by L1, L2, L3, and L4. These units are called the efficient units, and the line connecting them is called the "envelopment surface" because it envelops all the cases. DMUs L5 through L7 are not on the envelopment surface and thus are evaluated as inefficient by the DEA analysis. There are two ways to explain their weakness. One is to say that, for example, L5 could perhaps produce as much output as it does, but with less input (comparing with L1 and L2); the other is to say it could produce more output with the same input (comparing with L2 and L3).
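The graphical argument can be checked numerically. The following is a minimal Python sketch, not part of the original presentation: it flags a DMU as inefficient when some other DMU uses no more input and produces no less output. The names `DMUS` and `is_dominated` are illustrative assumptions, and a plain dominance test is only a stand-in for the envelopment surface; for these seven points it happens to single out the same four units.

```python
# Seven example DMUs as (input, output) pairs, taken from the illustration.
DMUS = {
    "L1": (2, 2), "L2": (3, 5), "L3": (6, 7), "L4": (9, 8),
    "L5": (5, 3), "L6": (4, 1), "L7": (10, 7),
}


def is_dominated(name, dmus):
    """True if another DMU uses <= input and yields >= output, one strictly."""
    x, y = dmus[name]
    for other, (ox, oy) in dmus.items():
        if other == name:
            continue
        if ox <= x and oy >= y and (ox < x or oy > y):
            return True
    return False


# Assumption: dominance is used here as a simplified efficiency check.
efficient = [name for name in DMUS if not is_dominated(name, DMUS)]
inefficient = [name for name in DMUS if is_dominated(name, DMUS)]

print("efficient:", efficient)
print("inefficient:", inefficient)
```

With more scattered data a non-dominated unit can still lie below the envelopment surface, so a full analysis would rely on the linear programming models rather than on this test.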
Thus, there are two possible definitions of efficiency depending on the purpose of the evaluation. One might be interested in possible reduction of inputs (in DEA this is called the input orientation) or augmentation of outputs (the output orientation) in achieving technical efficiency. Depending on the purpose of the evaluation, the analysis provides different sets of peer groups to which to compare.
However, there are times when reduction of inputs or augmentation of outputs is not sufficient. In our example, even when L6 reduces its input from 4 units to 2, there is still a gap between it and its peer L1 in the amount of one unit of output. In DEA, this is called the "slack", which means excess input or missing output that exists even after the proportional change in the inputs or the outputs.
This example will be used to illustrate the several forms that the DEA model can take. In each case, the results presented are based on the implementation of DEA that will be discussed later in this presentation. It is an Excel spreadsheet using the add-in Solver capability. The spreadsheet is identical for all of the forms, but the application of Solver differs in the target optimized and in the values to be varied, so for each form the target and the values to be varied will be identified.

DEA Models
The Basic DEA Concept
Variations of DEA Formulation
Formulation: Primal or Dual
Orientation: Input or Output
Returns to Scale: Fixed or Variable

The Basic DEA Concept
Assume that each DMU has values for a set of inputs and a set of outputs.
Choose non-negative weights to be applied to the inputs and outputs for a focus DMU so as to maximize the ratio of weighted outputs divided by weighted inputs.
But do so subject to the condition that, if the same weights are applied to each of the DMUs (including the focus DMU), the corresponding ratios are not greater than 1.
Do that for each DMU.
The resulting value of the ratio for each DMU is its DEA efficiency. It is 1 if the DMU is efficient and less than 1 if it is not.

Formulation
Let DMU k have outputs Y_k = (Y_ki), i = 1 to s, and inputs X_k = (X_kj), j = 1 to m, for k = 1 to n. With output weights μ and input weights ν, the problem for each focus DMU k from 1 to n is

$$\max_{\mu,\nu}\ \frac{\mu Y_k}{\nu X_k} \quad \text{subject to} \quad \frac{\mu Y_j}{\nu X_j} \le 1,\ \ j = 1,\dots,n, \qquad \mu_i \ge 0,\ \nu_i \ge 0,$$

where

$$\mu Y_j = \sum_{i=1}^{s} \mu_i Y_{ji}, \qquad \nu X_j = \sum_{i=1}^{m} \nu_i X_{ji}, \qquad j = 1,\dots,n.$$

The solution is the set of maximum values for μY_k/νX_k and the associated values for μ and ν.

Basic Linear Programming Model
For solution, this optimization problem is transformed into a linear programming problem, schematically displayed as follows:

              μ       ν              Min
    λ         Yj     -Xj     <=       0
    a        -I              <=      -1
    b                -I      <=      -1
              >=      >=
    Max       Yk     -Xk

In a moment, we will interpret this display as it is translated into alternative formulations of the optimization target and conditional inequalities.

Variations of DEA Formulation
But first, it is necessary to identify several sources of variation in the basic DEA formulation, leading to a variety of different models for implementation:
(1) Formulation: Primal Form or Dual Form
(2) Orientation: Input Minimization or Output Maximization
(3) Returns to Scale: Fixed Returns or Variable Returns
(4) Discretionary?: Discretionary Variables or Non-discretionary Variables
(5) Models: Additive or Multiplicative
We will now examine and illustrate each of those sources of variation.

(1) Formulation: Primal or Dual
The first source of variation is interpretation of the display for the linear programming model. One interpretation, called the Primal, treats the rows of the display as representing the model. The other interpretation, called the Dual, treats the columns as representing the model. Let's examine each of those in turn.
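Before turning to the two formulations, the basic ratio concept can be sketched for the seven-DMU example. This is a minimal Python illustration under a stated assumption: with exactly one input and one output, the most favorable weights for any focus DMU are the ones that scale the highest output/input ratio in the set to 1, so each efficiency is simply the DMU's own ratio divided by the best ratio. The function name `ratio_efficiency` is hypothetical, and the shortcut does not carry over to several inputs or outputs, where the linear program is needed.

```python
# Seven example DMUs as (input, output) pairs, from the graphical illustration.
DMUS = {
    "L1": (2, 2), "L2": (3, 5), "L3": (6, 7), "L4": (9, 8),
    "L5": (5, 3), "L6": (4, 1), "L7": (10, 7),
}


def ratio_efficiency(dmus):
    """Efficiency of each DMU under the ratio model, one input and one output.

    Assumption: with a single input and output, the constraint that no DMU's
    weighted ratio exceeds 1 binds at the DMU with the largest output/input.
    """
    best = max(y / x for x, y in dmus.values())
    return {name: (y / x) / best for name, (x, y) in dmus.items()}


efficiency = ratio_efficiency(DMUS)
for name, value in efficiency.items():
    print(f"{name}: {value:.2f}")
```

Only L2 reaches 1 here, because the ratio model as stated compares every unit with the single best output/input ratio.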
Primal Formulation

              μ       ν
    (1)       Yj     -Xj     <=       0
    (2)      -I              <=      -1
    (3)              -I      <=      -1
    (M)       Yk     -Xk

The rows of this display, with output weights μ and input weights ν, are interpreted as follows:
(M) Maximize W = μYk - νXk subject to
(1) μYj - νXj <= 0, j = 1 to n
(2) -μ <= -1, or μ >= 1
(3) -ν <= -1, or ν >= 1

The Dual Formulation

                                     (m)
    λ         Yj     -Xj              0
    a        -I                      -1
    b                -I              -1
              >=      >=
              Yk     -Xk

The columns of this display are interpreted as follows:
(m) Minimize W = -a - b subject to
(1) λYj - a >= Yk
(2) -λXj - b >= -Xk

The Choice of Formulation
Since the results from the two formulations are equal, though expressed differently, the choice between them is based on computational efficiency or, perhaps, ease of interpretation. The Dual form is more efficient in computation if the number of DMUs is large compared to the number of input and output variables. Note that the Primal form entails n conditions (n being the number of DMUs) which, in the Dual form, are replaced by just m + s conditions (m being the number of input variables and s the number of output variables).

Illustration
To illustrate, consider the example previously presented. The target to be minimized in the Dual form is W = -a - b. The values to be varied are (λ, a, b) in the Dual form, or (μ, ν) in the Primal form. The following table shows the solution for both forms:

               L1      L2      L3      L4      L5      L6      L7
    X           2       3       6       9       5       4      10
    Y           2       5       7       8       3       1       7
    W       -1.33    0.00   -3.00   -7.00   -5.33   -5.67   -9.67
    a, b     1.33    0.00    3.00    7.00    5.33    5.67    9.67
    λ        0.67     ...     ...     ...     ...     ...     ...
    μ           1       1       1       1       1       1       1
    ν        1.67    1.67    1.67    1.67    1.67    1.67    1.67

Graphically, the results are as follows:
[Figure: the seven DMUs with the line from the origin through L2; horizontal axis 0 to 12, vertical axis 0 to 25.]
The maximum value for W, over all cases, is at L2, where W = 0 and the ratio Y/X is a maximum. The slack for each other case is the vertical distance to the line which goes from the origin (0,0) through L2 (3,5).

(2) Orientation: Input or Output
The second source of variation, orientation, provides the means for focusing on minimizing input or on maximizing output. These represent two quite different objectives in making assessments of efficiency.
Is the objective to be minimally expensive (e.g., to save money) or is it to be maximally effective?

Orientation to Input
The linear programming display for the input orientation, with output weights μ and input weights ν, is as follows:

              μ       ν              Min
    λ         Yj     -Xj     <=       0
    a        -I              <=       0
    b                -I      <=       0
    c-1               Xk     <=       1
              >=      >=
    Max       Yk     -Xk

It adds one additional condition, νXk <= 1, to the display.
Read by columns, the resulting Dual formulation is as follows:
(m) Minimize W = c - 1 subject to
(1) λYj - a >= Yk
(2) -λXj - b + (c - 1)Xk >= -Xk, or λXj + b <= cXk
Continuing with the same example, the following table shows the solutions in both formulations. The target is W = c - 1. Values to be varied are now (λ, a, b, c) or (μ, ν).

               L1      L2      L3      L4      L5      L6      L7
    X           2       3       6       9       5       4      10
    Y           2       5       7       8       3       1       7
    W=c-1   -0.40    0.00   -0.30   -0.46   -0.64   -0.85   -0.58
    λ        0.40     ...     ...     ...     ...     ...     ...
    μ        0.30    0.20    0.10    0.07    0.12    0.15    0.06
    ν        0.50    0.33    0.17    0.11    0.20    0.25    0.10

Note that L2 still dominates the solution, but the graph is now quite different.
[Figure: Orientation to Input; horizontal axis 0 to 12, vertical axis 0 to 12.]

Orientation to Output
The linear programming display for the output orientation is as follows:

              μ       ν              Min
    λ         Yj     -Xj     <=       0
    a        -I              <=       0
    b                -I      <=       0
    1-c       Yk             <=       1
              >=      >=
    Max       Yk     -Xk

It adds one additional condition, μYk <= 1, to the display.
Read by columns, the resulting Dual formulation is as follows:
(m) Minimize W = 1 - c subject to
(1) λYj - a >= cYk
(2) -λXj - b >= -Xk, or λXj + b <= Xk
Continuing with the same example, the following table shows the solutions in both formulations. The target is W = 1 - c. Values to be varied are still (λ, a, b, c) or (μ, ν).
               L1      L2      L3      L4      L5      L6      L7
    X           2       3       6       9       5       4      10
    Y           2       5       7       8       3       1       7
    W=1-c   -0.67    0.00   -0.43   -0.87   -1.78   -5.67   -1.38
    λ        0.67     ...     ...     ...     ...     ...     ...
    μ        0.50    0.20    0.14    0.13    0.33    1.00    0.14
    ν        0.83    0.33    0.24    0.21    0.56    1.67    0.24

Note that L2 still dominates the solution. The graphical display is identical to that for the general form, though the interpretation is somewhat different (replacing efficiencies by slacks).
[Figure: Orientation to Output; horizontal axis 0 to 12, vertical axis 0 to 25.]

(3) Returns to Scale: Fixed or Variable
The third basis for variation among DEA models is "returns to scale". The examples presented to this point have each involved "constant returns to scale". That is, with output weights μ and input weights ν, the ratio condition μY/νX <= 1 can be replaced by the inequality μY - νX <= 0. These variations of the DEA model are called CCR models and reflect the requirement of constant returns to scale.
But if there are "variable returns to scale", the ratio condition must now be replaced by μY - νX + u <= 0, where u can vary to reflect the variable returns to scale. The results from that change are dramatic and make the DEA model much more interesting. The resulting models are called BCC models.

Variable Returns to Scale, Basic Model
The linear programming display for the basic DEA model is as follows:

              μ       ν       u              Min
    λ         Yj     -Xj      I      <=       0
    a        -I                      <=      -1
    b                -I              <=      -1
              >=      >=
    Max       Yk     -Xk      I

It adds the variable u to the display.

Variable Returns: Orientation to Input
The linear programming display for variable returns to scale, input orientation, is as follows:

              μ       ν       u              Min
    λ         Yj     -Xj      I      <=       0
    a        -I                      <=       0
    b                -I              <=       0
    c-1               Xk             <=       1
              >=      >=      >=
    Max       Yk     -Xk      I

It adds one additional condition, νXk <= 1, to the display.
Read by columns, the resulting Dual formulation is as follows:
(m) Minimize W = c - 1 subject to
(1) λYj - a >= Yk
(2) -λXj - b + (c - 1)Xk >= -Xk, or λXj + b <= cXk
(3) Σλ >= 1
The new, third condition makes things interesting.
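To see what the added condition changes, the seven-DMU example can be worked directly. The following Python sketch is an illustration under stated assumptions, not the spreadsheet's method: it takes the condition as a convexity requirement (the intensities sum to exactly 1) and computes only the radial input contraction, leaving slacks aside. With one input and one output, the smallest feasible input for a target output is found either at a single DMU or on the segment between two DMUs, so a solver is not needed. The function names are hypothetical.

```python
from itertools import combinations

# Seven example DMUs as (input, output) pairs, from the graphical illustration.
DMUS = {
    "L1": (2, 2), "L2": (3, 5), "L3": (6, 7), "L4": (9, 8),
    "L5": (5, 3), "L6": (4, 1), "L7": (10, 7),
}


def min_input_for_output(target, points):
    """Smallest input among convex combinations producing at least `target`."""
    best = float("inf")
    # A single DMU that already produces enough output.
    for x, y in points:
        if y >= target:
            best = min(best, x)
    # A mix of two DMUs, one below and one above the target output.
    for (x1, y1), (x2, y2) in combinations(points, 2):
        low, high = sorted([(y1, x1), (y2, x2)])
        if low[0] < target < high[0]:
            t = (target - low[0]) / (high[0] - low[0])
            best = min(best, low[1] + t * (high[1] - low[1]))
    return best


def vrs_input_efficiency(dmus):
    """Radial input efficiency assuming variable returns to scale."""
    points = list(dmus.values())
    return {name: min_input_for_output(y, points) / x
            for name, (x, y) in dmus.items()}


scores = vrs_input_efficiency(DMUS)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions L1, L3, and L4 join L2 at a score of 1, which matches the envelopment surface of the graphical illustration, whereas under constant returns only L2 is efficient.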
Orientation to Input
Continuing with the same example, the following table shows the solutions in both formulations. The target is W = c - 1. Values to be varied are now (λ, a, b, c) or (μ, ν, u).

               L1      L2      L3      L4      L5      L6      L7
    X           2       3       6       9       5       4      10
    Y           2       5       7       8       3       1       7
    W=c-1    0.00    0.00    0.00    0.00   -4.00   -5.00   -4.00
    a, b     0.00    0.00    0.00    0.00    2.00    4.00    0.00
    Σλ       1.00    1.00    1.00    1.00    1.00    1.00    1.00
    μ        0.00    0.00    0.00    0.00    0.00    0.00    0.00
    ν        0.00    0.00    0.00    0.00    0.00    0.00    0.00
    u        0.00    0.00    0.00    0.00    0.00    0.00    0.00

[Figure: Orientation to Input; horizontal axis 0 to 12, vertical axis 0 to 12.]

Orientation to Output
The linear programming display for the output orientation, with output weights μ and input weights ν, is as follows:

              μ       ν              Min
    λ         Yj     -Xj     <=       0
    a        -I              <=       0
    b                -I      <=       0
    1-c       Yk             <=       1
              >=      >=
    Max       Yk     -Xk

It adds one additional condition, μYk <= 1, to the display.
Read by columns, the resulting Dual formulation is as follows:
(m) Minimize W = 1 - c subject to
(1) λYj - a >= cYk
(2) -λXj - b >= -Xk, or λXj + b <= Xk
Continuing with the same example, the following table shows the solutions in both formulations. The target is W = 1 - c. Values to be varied are still (λ, a, b, c) or (μ, ν).

               L1      L2      L3      L4      L5      L6      L7
    X           2       3       6       9       5       4      10
    Y           2       5       7       8       3       1       7
    W=1-c   -0.67    0.00   -0.43   -0.87   -1.78   -5.67   -1.38
    λ        0.67     ...     ...     ...     ...     ...     ...
    μ        0.50    0.20    0.14    0.13    0.33    1.00    0.14
    ν        0.83    0.33    0.24    0.21    0.56    1.67    0.24

Note that L2 still dominates the solution. The graphical display is identical to that for the general form, though the interpretation is somewhat different (replacing efficiencies by slacks).
[Figure: Orientation to Output; horizontal axis 0 to 12, vertical axis 0 to 25.]

Extensions to include a priori Valuations
To this point, DEA has been essentially a mathematical process in which the data for input and output are taken as given, without further interpretation with respect to the reality of operations.
But reality needs to be recognized, so there are several extensions that can be made to the basic DEA model, applicable to any of the variations. They fall into seven categories:
(1) Discretionary and Non-discretionary Variables
(2) Categorical Variables
(3) A priori Restrictions on Weights
(4) Relationships between Weights on Variables
(5) A priori Assessments of Efficient Units
(6) Substitutability of Variables
(7) Discrimination among Efficient Units

Discretionary & Non-discretionary
In identifying input and output variables, one wants to include all that are relevant to the operation. For example, the level of output is determined not only by what the unit itself does but by the size of the market to which the output is delivered. The result, though, is that some relevant variables, such as the size of the market, are not under the control of management. Such variables, called non-discretionary, are in contrast to those that are under management control, called discretionary.
In assessing efficiency, all variables are considered, but in determining the criterion function to be maximized or minimized, only the discretionary variables are included.

Categorical Variables
In the DEA model as so far presented, the variables are treated as essentially quantitative, but sometimes one would like to identify non-quantitative variables, such as ordinal or nominal variables. For example, one might like to compare institutions of the same type, such as public or private universities. This is accomplished by introducing categorical variables containing numbers for order or identifiers for names.

A priori Restrictions on Weights
In the model as presented, the weights are limited only by the requirement that they be non-negative. However, there may be reason to require that weights be further limited. For example, it may be felt that a given variable must be included in the assessment, so its weight must have at least a minimal value greater than zero.
This might represent an output that is essential in assessment. As another example, a variable may be such that a large weight would over-emphasize its a priori importance, so that there should be an upper limit on the weight. This might represent an output variable that is counterproductive.

Relationships between Weights
Sometimes, a priori knowledge may imply that there is a necessary relationship among variables. For example, an output variable may absolutely require some level of an input variable. Such a priori knowledge may be expressed as a ratio between the weights assigned to the related variables.

A priori Assessments of Efficient Units
Some DMUs may be regarded, based on a priori knowledge, as eminently efficient or notoriously inefficient. While one might argue about the validity of such a priori judgments, frequently they must be recognized. To do so, additional conditions may be imposed upon the choice of weights. For example, the condition that a DMU's ratio of weighted outputs to weighted inputs be <= 1 may be replaced by equality for a given DMU which is regarded as eminently efficient.

Substitutability of Variables
A still unresolved issue is the means for representing substitutability of variables. For example, two input variables may represent two different types of labor which may be, to some extent, substitutable for each other. How is such substitutability to be incorporated? Let's explore this issue a bit further since, by doing so, we can illuminate some additional perspectives on the basic DEA model.
For simplicity in description, consider two input variables and a single output variable that has the same value for all DMUs. The graphic representation of the envelopment surface can now best be presented not in terms of the relationship between output and input, as previously shown, but between the variables of input. The two variables are "Professional Staff" and "Non-Professional Staff".
The assumption is that they are completely substitutable and that physicians differ only in their "styles" of providing service, represented by the mix of the two means for doing so. The "efficient" DMUs are located on the red envelopment surface, which shows the minimums in use of variables.
[Figure: Substitutability of Variables. Non-Professional Staff (vertical axis, 0 to 10) against Professional Staff (horizontal axis, -1 to 5), with the points labeled Style 1, Style 2, Style 3, and Style 4.]

Discrimination among Efficient Units

Strengths & Weaknesses of DEA
Strengths
DEA can handle multiple inputs and multiple outputs.
DEA doesn't require relating inputs to outputs.
Comparisons are directly against peers.
Inputs and outputs can have very different units.
Weaknesses
Measurement error can cause significant problems.
DEA does not measure "absolute" efficiency.
Statistical tests are not applicable.
Large problems can be computationally intensive.

Implementation of DEA
Structure
Spreadsheet implementation
Choice of Model
Spreadsheet Structure
Spreadsheet Calculations
Solver Elements in Spreadsheet
Visual Basic Program

Access to the Implementation
The data included in the spreadsheet are for ARL libraries in 1996.

Choice of Model
The spreadsheet includes means to identify the choice of model by means of three parameters:
Form: Dual represented by 0 and Primal by 1
Orientation: Addition by 0, Input by 1, Output by 2
Convexity: No by 0, Yes by 1
Given the specification, solution of the resulting model is initiated by pressing Ctrl-q. The solution is effected by a Visual Basic program that determines the model from the parameters and then launches the Excel Add-In called Solver. The program then produces the output on Sheet 3 that shows the results.
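The way the three parameters pick out a model can be sketched outside the spreadsheet. The following Python fragment is a hedged illustration only: the index formula A = 6 * Form + 2 * Orient + Convex and the target cells B6 through B9 are those used in the presentation's Visual Basic program and Solver table, while the function name `select_model` and the dictionary names are assumptions made for this example.

```python
# Parameter codes as given in the presentation's "Choice of Model" slide.
FORMS = {"dual": 0, "primal": 1}
ORIENTATIONS = {"addition": 0, "input": 1, "output": 2}

# Target cell for each orientation in the Dual form, per the Solver table.
DUAL_TARGETS = {0: "B7", 1: "B8", 2: "B9"}


def select_model(form, orientation, convex):
    """Return (model index, Solver target cell, direction) for a choice."""
    form_code = FORMS[form]
    orient_code = ORIENTATIONS[orientation]
    index = 6 * form_code + 2 * orient_code + int(convex)
    if form_code == 1:
        # Every Primal model maximizes the same target cell.
        return index, "B6", "max"
    return index, DUAL_TARGETS[orient_code], "min"


print(select_model("dual", "input", False))
print(select_model("primal", "output", True))
```

The single index keeps the later branching simple: indices 0 to 5 are the Dual models and 6 to 11 the Primal models, with odd indices marking the convex (variable returns to scale) variants.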
Spreadsheet Structure
The DEA Spreadsheet for application to ARL libraries consists of three main parts:
(1) The source data, stored in cells B16:R117
(2) The spreadsheet calculations, stored in cells D5:R15
(3) The Solver related calculations, stored in cells B1:B15, A7:A117, T12:T117
The source data consists of the 10 input and 5 output variables for each of the ARL institutions plus, in row B16:R16, a set of normalizing factors, one for each of the variables.

Spreadsheet Calculations
The Spreadsheet calculations in D5:R14 can be illustrated by D5:D14 (an input column) and N5:N14 (an output column):

    Row   C                 D                                  N
    5     Discretionary?    1                                  1
    6     Weights           0.000001                           9.99999999999265E-07
    9     Comp              =SUMPRODUCT(Mult,D17:D113)*D16     =SUMPRODUCT(Mult,N17:N113)*N16
    10    Slacks            15.2073410229378                   5.56269731722995
    11    Mod Comp          =D9+D10                            =N9-N10
    12    (see below)       (see below)                        (see below)
    13                      =D12*$B$13                         =N12*$B$13
    14                      =IF($B$2=1,D13,D12)                =IF($B$2=2,N13,N12)

Row 12 formulas:
C12 =INDEX(C17:C126,MATCH($B$12,$B$17:$B$126,0),1)
D12 =INDEX(Data,MATCH($B$12,$B$17:$B$126,0),COLUMN()-3)*D16
N12 =INDEX(Data,MATCH($B$12,$B$17:$B$126,0),COLUMN()-3)*N16

Solver Elements in Spreadsheet

    #    B1  B2  B3   Target   Min/Max   Vary
    0    0   0   0    B7       Min       $D$10:$R$10,$A$17:$A$113
    1    0   0   1    B7       Min       $D$10:$R$10,$A$17:$A$113
    2    0   1   0    B8       Min       $D$10:$R$10,$A$17:$A$113,$B$13
    3    0   1   1    B8       Min       $D$10:$R$10,$A$17:$A$113,$B$13
    4    0   2   0    B9       Min       $D$10:$R$10,$A$17:$A$113,$B$13
    5    0   2   1    B9       Min       $D$10:$R$10,$A$17:$A$113,$B$13
    6    1   0   0    B6       Max       $D$6:$R$6
    7    1   0   1    B6       Max       $D$6:$R$6,$S$6
    8    1   1   0    B6       Max       $D$6:$R$6
    9    1   1   1    B6       Max       $D$6:$R$6,$S$6
    10   1   2   0    B6       Max       $D$6:$R$6
    11   1   2   1    B6       Max       $D$6:$R$6,$S$6

Conditions for models 0 to 5 (Dual): $D$11:$R$11 = $D$14:$R$14 and $A$17:$A$113 >= 0; models 1, 3, and 5 add $B$127 = 1.
Conditions for models 6 to 11 (Primal): $T$17:$T$113 <= 0 and $T$12 = 1.

The cells referred to are defined as follows:
B7 = -Slacks, with Slacks = SUMPRODUCT(D5:R5,D10:R10)
B8 = Intensity - 1
B9 = 1 - Intensity
B6 = SUMPRODUCT($N$12:$R$12,$N$6:$R$6)-SUMPRODUCT($D$12:$M$12,$D$6:$M$6)+IF($B$3=1,$S$6,0)
$T$17:$T$113 are illustrated by $T$17:
$T$17 = SUMPRODUCT(N17:R17,$N$16:$R$16,$N$6:$R$6)-SUMPRODUCT(D17:M17,$D$16:$M$16,$D$6:$M$6)+IF($B$3=1,$S$6,0)
$T$12 = IF($B$2=0,1,IF($B$2=1,SUMPRODUCT($D$12:$M$12,$D$6:$M$6),SUMPRODUCT($N$12:$R$12,$N$6:$R$6)))
$B$127 = SUM($A$17:$A$117)

Visual Basic Program

    ' Excerpt. Form and Orient are set earlier in the program (not shown here).
    Application.Range("B3").Select
    Convex = Selection.Value
    ' Model index: 0 to 5 are Dual models, 6 to 11 are Primal models
    A = 6 * Form + 2 * Orient + Convex
    SolverReset

    ' Set Target, MaxMinVal, Change (MaxMinVal 1 = maximize, 2 = minimize)
    If A = 0 Or A = 1 Then  ' Dual, Addition
        SolverOk SetCell:="B7", MaxMinVal:=2, ValueOf:="0", _
            ByChange:="$D$10:$R$10,$A$17:$A$113"
    End If
    If A = 2 Or A = 3 Then  ' Dual, Input
        SolverOk SetCell:="B8", MaxMinVal:=2, ValueOf:="0", _
            ByChange:="$D$10:$R$10,$A$17:$A$113,$B$13"
    End If
    If A = 4 Or A = 5 Then  ' Dual, Output
        SolverOk SetCell:="B9", MaxMinVal:=2, ValueOf:="0", _
            ByChange:="$D$10:$R$10,$A$17:$A$113,$B$13"
    End If
    If A = 6 Or A = 8 Or A = 10 Then  ' Primal, Not Convex (Constant Returns to Scale)
        SolverOk SetCell:="B6", MaxMinVal:=1, ValueOf:="0", _
            ByChange:="$D$6:$R$6"
    End If
    If A = 7 Or A = 9 Or A = 11 Then  ' Primal, Convex (Variable Returns to Scale)
        SolverOk SetCell:="B6", MaxMinVal:=1, ValueOf:="0", _
            ByChange:="$D$6:$R$6,$S$6"
    End If

    ' Set Conditions (Relation 1 is <=, 2 is =, 3 is >=)
    If A = 0 Or A = 1 Or A = 2 Or A = 3 Or A = 4 Or A = 5 Then  ' Dual
        SolverAdd CellRef:="$D$11:$R$11", Relation:=2, FormulaText:="$D$14:$R$14"
        SolverAdd CellRef:="$A$17:$A$113", Relation:=3, FormulaText:="0"
        SolverAdd CellRef:="$D$10:$R$10", Relation:=3, FormulaText:="0"
    End If
    If A = 1 Or A = 3 Or A = 5 Then  ' Dual, Convex (Variable Returns to Scale)
        ' As written in the source; the Solver table lists this cell as $B$127
        SolverAdd CellRef:="$A$127", Relation:=2, FormulaText:="1"
    End If
    If A = 6 Or A = 7 Or A = 8 Or A = 9 Or A = 10 Or A = 11 Then  ' Primal
        SolverAdd CellRef:="$T$12", Relation:=2, FormulaText:="1"
        SolverAdd CellRef:="$T$17:$T$113", Relation:=1, FormulaText:="0"
    End If
    If A = 8 Or A = 9 Or A = 10 Or A = 11 Then  ' Primal, Input or Output
        SolverAdd CellRef:="$D$6:$R$6", Relation:=3, FormulaText:="$B$15"
    End If
    If A = 6 Or A = 7 Then  ' Primal, Addition
        SolverAdd CellRef:="$D$6:$R$6", Relation:=3, FormulaText:="1"
    End If
    SolverOptions MaxTime:=1000, Iterations:=1000, Precision:=0.000001, _
        AssumeLinear:=True, StepThru:=False, Estimates:=1, Derivatives:=1, _
        SearchOption:=1, IntTolerance:=5, Scaling:=False, Convergence:=0.0001, _
        AssumeNonNeg:=False

    ' Solve the model once for each of the 97 units
    For m = 1 To 97
        Application.StatusBar = "Calculating Efficiency for unit " & Str(m)
        ' Paste the unit's number to the model worksheet
        Sheets("Sheet1").Select
        Application.Goto Reference:="unit"
        Selection.Value = m
        ' Run Solver (with the dialog box turned off)
        SolverSolve (True)
        ' Paste unit's number and name to All Results sheet
        Sheets("Sheet1").Select
        Application.Range("C12").Select
        Selection.Copy
        Sheets("Sheet3").Select
        Range("A2").Offset(m, 0).Select
        Selection.PasteSpecial Paste:=xlValues
        ' Copy the optimized target value
        Sheets("Sheet1").Select
        Application.Goto Reference:="Target1"
        Selection.Copy
        Sheets("Sheet3").Select
        Range("A2").Offset(m, 1).Select
        Selection.Value = m
        Range("A2").Offset(m, 2).Select
        Selection.PasteSpecial Paste:=xlValues
        ' Copy the remaining results
        Sheets("Sheet1").Select
        Application.Goto Reference:="Results"
        Selection.Copy
        Sheets("Sheet3").Select
        Range("A2").Offset(m, 1).Select
        Selection.Value = m
        Range("A2").Offset(m, 3).Select
        Selection.PasteSpecial Paste:=xlValues
    Next m
    Application.Goto Reference:="Start"
    End Sub

The Example of Libraries
Selection of Data
Input Variables (10):
Collection Characteristics (Discretionary)
Staff Characteristics (Discretionary)
University Characteristics (Non-discretionary)
Output Variables (5):
Scaling of Data
Constraints on Weights
Results
Effects of the several Variables

Selection of Data
The
Variables
Scaling of the Variables
Constraints on Weights
Results

Efficiency Distribution
The following chart displays the efficiency distribution for the 97 U.S. ARL libraries. The input and output components for each institution have been multiplied by the size of the collection. Note the cluster of inefficient institutions below 3,000,000 volumes of holdings. There appear to be three groups of institutions:
The efficient ones, lying on the red line
The seven that are more than 4 million and mildly inefficient
Those that are less than 4 million and range in efficiency
[Figure: Collection*Output (vertical axis, 1.00 to 13.00) against Collection*Input (horizontal axis, 1.00 to 13.00).]

Sum of Projections
The following chart shows the distribution of the sum of the projections as a function of the Intensity.
[Figure: Sum of Projections (vertical axis, 0.00 to 9.00) against Intensity (horizontal axis, 0.00 to 1.20).]

Distribution of Weights
The following chart shows the magnitudes of the weights on each of the Input and Output components.
[Figure: weight magnitude (vertical axis, -0.05 to 0.25) for each component (horizontal axis, 0 to 16).]

Annals of Operations Research 66 (1996)
Preface
Part I: DEA models, methods and interrelations
Chapter 1. Introduction: Extensions and new developments in DEA. W.W. Cooper, R.G. Thompson and R.M. Thrall
Chapter 2. A generalized data envelopment analysis model: A unification and extension of existing methods for efficiency analysis of decision making units. G. Yu, Q. Wei and P. Brockett

Extensions in DEA
Covers (1) new measures of efficiency, (2) new models, and (3) new implementations.
The TDT measure of "relative" efficiency takes the criterion measure (weighted output/weighted input) relative to the maximum for that measure.
The Pareto-Koopmans measure applies the Pareto criterion (no variables can be improved without worsening others).
The BCC model (variable returns to scale) is presented.
Congestion arises when excess inputs interfere with outputs.
It thus represents relationships among variables.

Generalized DEA model
Essentially, this paper does what I have been trying to do in implementation of DEA. It does so by identifying the primal and dual (P and D), the two returns to scale (fixed and variable), and three binary parameters (δ1, δ2, δ3) in the conditions

$$\delta_1 e^{T}\lambda + \delta_1\delta_2(-1)^{\delta_3}\lambda_{n+1} = \delta_1 \ \ \text{(for the dual)}, \qquad \delta_1\delta_2(-1)^{\delta_3}\mu_0 \ge 0 \ \ \text{(for the primal)}.$$

Values of (δ1, δ2, δ3) include:
(0,-,-) the CCR model
(1,0,-) the BCC model
(1,1,0) the FG model
(1,1,1) the ST model
The relationships among the several models are discussed.

Part II: Desirable properties of models, measures and solutions (1)
Chapter 3. Translation invariance in data envelopment analysis: A generalization. J.T. Pastor
Chapter 4. The lack of invariance of optimal dual solutions under translation. R.M. Thrall
Chapter 5. Duality, classification and slacks in DEA. R.M. Thrall

Translation invariance
This paper proves that several of the DEA models are translation invariant (i.e., optimal solutions are not changed if the original variable values are "translated", that is, all values for a variable are replaced by some constant minus the values).
Specifically, the primal additive model is translation invariant.
The BCC input oriented primal model is output translation invariant.
The CCR models are not translation invariant.

Lack of invariance
This paper supplements the prior one. It shows that in neither the BCC model nor the additive model are the optimal solutions for the dual (i.e., multiplier) formulation invariant under translation.

Duality, classification and slack
This paper considers the role of slacks, especially in the context of radial measures of efficiency. The effect of alternative optima is to make slacks difficult to deal with; the theory presented resolves the difficulties. The CCT model presented eliminates the need for non-Archimedean models and permits dealing with zero values for the variables.
The concept of an “admissible virtual multiplier” is introduced and the maximizing virtual multiplier w* is the basis for categorizing efficient DMUs into 3 groups:
Extreme Efficient: all variables are included in w*
Efficient: the variables in w* are all positive
Weak efficient: w* has at least one zero variable
Similarly for non-efficient DMUs

Part II: Desirable properties of models, measures and solutions (2)
Chapter 6. On the construction of strong complementarity slackness solutions for DEA linear programming problems using a primal-dual interior-point method. M.D. González-Lima, R.A. Tapia and R.M. Thrall
Chapter 7. DEA multiplier analytic center sensitivity with an illustrative application to independent oil companies. R.G. Thompson, P.S. Dharmapala, J. Diaz, M.D. González-Lima and R.M. Thrall

Complementarity Slackness Solutions
This paper proposes use of “primal-dual interior-point methods” for solution of the DEA linear programming problem (an iterative process that generates interior points that converge to the solution). The primal form minimizes C’x; the dual form maximizes B’y. The condition for solution is that C’x = B’y (a zero duality gap), which is equivalent to the complementarity slackness condition. These methods attempt to solve the primal and dual linear programs simultaneously. Solutions are classified as radially efficient or inefficient using the CCT model.

Multiplier Sensitivity
The stability of the set E of extreme efficient DMUs is examined to determine the sensitivity to changes in the data.

Part III: Frontier shifts and efficiency evaluations
Chapter 8. Estimating production frontier shifts: An application of DEA to technology assessment. R.D. Banker and R.C. Morey
Chapter 9. Moving frontier analysis: An application of data envelopment analysis for competitive analysis of a high-technology manufacturing plant. K.K. Sinha
Chapter 10. Profitability and productivity changes: An application to Swedish pharmacies. R. Althin, R. Färe and S.
Grosskopf

Production Frontier Shifts
This paper divides the set of DMUs into two categories (representing the use or non-use of a technology). For a DMU without the technology, comparison is made only with others without the technology; for those with the technology, comparison is made with all DMUs. The result is a basis for assessment of the impact of the technology.

Moving Frontier Analysis
This paper proposes a method of assessment for cases in which some data may not be available. It uses aggregate data on “best practices”. It depends upon time series data.

Profitability & productivity changes
It is not evident how this relates to DEA.

Part IV: Statistical and stochastic characterizations
Chapter 11. Simulation studies of efficiency, returns to scale and misspecification with nonlinear functions in DEA. R.D. Banker, H. Chang and W.W. Cooper
Chapter 12. New uses of DEA and statistical regressions for efficiency evaluation and estimation - with an illustrative application to public secondary schools in Texas. V.L. Arnold, I.R. Bardhan, W.W. Cooper and S.C. Kumbhakar
Chapter 13. Satisficing DEA models under chance constraints. W.W. Cooper, Z. Huang and S.X. Li

Simulation studies
Well, so be it.

DEA and statistical regressions
Compares the two methods. It uses a Cobb-Douglas production model (in log form) and estimates the parameters by a regression on the set of DMUs. (Actually, it does a set of regressions, one for each output variable against the uniform set of input variables.) It then applies DEA to the same set of input variables (separately for each output variable in turn). It then considers the joint outputs, taken together.

Satisficing DEA models
Introduces stochastic variables (characterized by probability distributions) and the concept of “stochastic efficiency”. It distinguishes between a “rule” (which has a probability of 1) and a “policy” (which has a probability between 0.5 and 1).

Part V: Some new applications
Chapter 14.
Evaluating the efficiency of vehicle manufacturing with different products. G. Zeng
Chapter 15. DEA/AR analysis of the 1988-1989 performance of the Nanjing Textiles Corporation. J. Zhu

China vehicle manufacturing
Evaluates the efficiency of vehicle manufacturing in China. It deals with the problem of zero values for some variables.

DEA/AR analysis
Another application in China.

Annals of Operations Research 73 (1997)
Contents
Preface
Foreword
Part VI: Extending Frontiers
Extending the frontiers of Data Envelopment Analysis. A.Y. Lewin and L.M. Seiford
Weights restrictions and value judgements in Data Envelopment Analysis: Evolution, development and future directions. R. Allen, A. Athanassopoulos, R.G. Dyson and E. Thanassoulis

Extending the frontiers
See earlier in this presentation.

Weights restrictions & value judgments
See earlier in this presentation.

Part VII: Applications
DEA and primary care physician report cards: Deriving preferred practice cones from managed care service concepts and operating strategies. J.A. Chilingerian and H.D. Sherman
An analysis of staffing efficiency in U.S. manufacturing: 1983 and 1989. P.T. Ward, J.E. Storbeck, S.L. Mangum and P.E. Byrnes
Applications of DEA to measure the efficiency of software production at two large Canadian banks. J.C. Paradi, D.N. Reese and D. Rosen

Primary care physician
This paper identifies “styles” of management based on ratios of input variables, aimed at minimizing input cost. The example used is a comparison of hospital days versus office visits.

Staffing efficiency
Again, styles of management are identified, this time based on ratios of types of staffing (e.g., professional vs. non-professional). Industries are divided into types (batch vs. line processing industries) and “best practices” for each type are identified by DEA.

Software production
Input to software production is taken as cost; outputs are taken as size (measured by “function points”), quality (measured by defects or rework hours), and time to market.
The DEA is compared to performance ratio analyses, such as Cost/Function, Defects/Function, Days/Function. Then, constraints on the weights are introduced. One set of constraints consisted of bounds on ratios of weights. A second set of constraints consisted of tradeoffs between variables, again represented by bounds on ratios.

Part VII: Applications
Restricted best practice selection in DEA: An overview with a case study evaluating the socio-economic performance of nations. B. Golany and S. Thore
A new measure of baseball batters using DEA. T.R. Anderson and G.P. Sharp
Efficiency of families managing home health care. C.E. Smith, S.V.M. Kleinbeck, K. Fernengel and L.S. Mayer

Economic performance of nations
To apply DEA to evaluation of the economic performance of nations, it is necessary to recognize some constraints:
International requirements (treaties, bilateral agreements)
Externalities (e.g., mandated quotas)
Issues of equity
These constraints are then incorporated into DEA.

Baseball batters
Traditional methods for evaluating batters include fixed and variable weight statistics (homers, batting average, slugging average, RBI, etc.). The point in this article is that use of DEA allows one to determine the effect of changes over time. Another effect of interest is “noise”. To correct for noise, the DEA model “derates” the data for each player by a factor based on the player’s standard deviation for each variable.

Efficiency of families
Family home health care is assessed using a “stepped procedure” in DEA. The stepped procedure involves a series of steps in which variables are successively introduced:
Inputs Step 1 Direct Costs Medical Expense Step 2 Indirect Costs Training Step 1 Step 3 Caring Costs Hours/day Months/caregiving Medication Step 2 Outputs Family Income Patient/Caregiver Step 1 Caregiver burden Caregiver esteem

Part VII: Applications
A DEA-based analysis of productivity change and intertemporal managerial performance. E. Grifell-Tatjé and C.A.K.
Lovell
Use of Data Envelopment Analysis in assessing Information Technology impact on firm performance. C.H. Wang, R.D. Gopal and S. Zionts

Productivity & managerial performance
Examines the productivity of an organization over time.

Information Technology impact
Examines the impact of information technology on performance of firms. It divides operations into two stages: (1) accumulation of resources and (2) use of resources. (These are illustrated in banking by (1) the collection of funds from depositors and (2) use of those funds for generating income.) It examines separately the effect of information technology (represented by ATMs) on the two stages.

Part VIII: Theoretical Extensions
Comparative advantage and disadvantage in DEA. A.I. Ali and C.S. Lerme
Model misspecification in Data Envelopment Analysis. P. Smith
Dominant Competitive Factors for evaluating program efficiency in grouped data. J.J. Rousseau and J.H. Semple
DEA-based yardstick competition: The optimality of best practice regulation. P. Bogetoft

Comparative advantage & disadvantage
This paper introduces a cost function into DEA analysis as the means for calculating a comparative advantage or disadvantage as the difference between the costs of input and the income from output. It interprets the weights in each DMU’s optimum as prices for the respective inputs and outputs. The result is “virtual” cost, revenue, and profit. The profit (or loss) of the evaluated unit is then compared with the maximum profit obtained by a best practice unit. For an efficient unit, the comparison is between the virtual profit of the evaluated unit and the maximum profit across all other units.
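The reading of weights as prices can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the function name `virtual_profit`, the two units "A" and "B", and all weight and data values are hypothetical and are not taken from the paper. Evaluating both units at the same set of weights is also a simplification of this sketch.

```python
def virtual_profit(u, y, v, x):
    """Return (revenue, cost, profit) for one DMU.

    u: output weights read as output prices (hypothetical values)
    y: output quantities
    v: input weights read as input prices (hypothetical values)
    x: input quantities
    """
    revenue = sum(ur * yr for ur, yr in zip(u, y))  # virtual revenue u.y
    cost = sum(vi * xi for vi, xi in zip(v, x))     # virtual cost v.x
    return revenue, cost, revenue - cost


# Hypothetical weights for the evaluated unit "A".
u = [0.5, 0.25]    # prices on two outputs
v = [0.125, 0.25]  # prices on two inputs

# Hypothetical (outputs, inputs) data for two units.
units = {
    "A": ([4.0, 8.0], [8.0, 4.0]),
    "B": ([6.0, 8.0], [8.0, 4.0]),
}

profits = {name: virtual_profit(u, y, v, x)[2] for name, (y, x) in units.items()}

# Gap between the evaluated unit and the best of the other units:
# a negative value suggests a disadvantage, a positive one an advantage.
best_other = max(p for name, p in profits.items() if name != "A")
gap = profits["A"] - best_other
print(profits, gap)
```

In this toy case unit A has a virtual profit of 2.0 and unit B of 3.0, so A shows a gap of -1.0 relative to the best other unit.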
Comparative disadvantage
The DEA model for determining comparative disadvantage is:
\[
\begin{aligned}
\max\ & R - C + w \\
\text{subject to}\ & -uY_1 + R = -1 \\
& vX_1 - C = 1 \\
& uY - vX + \mathbf{1}w \le 0 \\
& uT_0 \le 0,\quad vT_1 \le 0 \\
& R, C \ge 0
\end{aligned}
\]
Its linear programming dual is:
\[
\begin{aligned}
\min\ & h - w \\
\text{subject to}\ & -wY_1 + Y\lambda + T_0 r = 0 \\
& hX_1 - X\lambda + T_1 r_1 = 0 \\
& \mathbf{1}\lambda = 1 \\
& h \le 1,\quad w \ge 1 \\
& \lambda, r, r_1 \ge 0
\end{aligned}
\]
Here $\lambda$ is the vector of intensity weights on the DMUs, and $h$ and $w$ in the second problem are the multipliers on the two normalizing constraints (this $w$ is distinct from the free variable $w$ of the first problem).

Comparative advantage
The DEA model for determining comparative advantage is applied to the set from which the target unit has been removed, so that $Y$ and $X$ below cover only the remaining units:
\[
\begin{aligned}
\max\ & -R + C + w \\
\text{subject to}\ & -uY_1 - R = -1 \\
& vX_1 + C = 1 \\
& uY - vX + \mathbf{1}w \le 0 \\
& uT_0 \le 0,\quad vT_1 \le 0 \\
& R, C \ge 0
\end{aligned}
\]
Its linear programming dual is:
\[
\begin{aligned}
\min\ & h - w \\
\text{subject to}\ & -wY_1 + Y\lambda + T_0 r = 0 \\
& hX_1 - X\lambda + T_1 r_1 = 0 \\
& \mathbf{1}\lambda = 1 \\
& h \ge 1,\quad w \le 1 \\
& \lambda, r, r_1 \ge 0
\end{aligned}
\]

Model mis-specification
This paper examines the effects of various types of misspecification of the DEA model. They include:
Omission of a necessary input
Inclusion of an extraneous variable
Erroneous assumption about returns to scale

Dominant Competitive Factors
This paper treats DEA as a tool in game theory. One player has control over the weights applied to the variables, the other over the weights applied to the DMUs. Each tries to optimize against the other. The solution is that of the pair of primal-dual problems:
Player 1 maximizes $v'y_0 - u'x_0$ subject to $v'y_j - u'x_j \le 0$ and $v'y_0 + u'x_0 = 1$, $u, v \ge 0$
Player 2 minimizes $a$ subject to $Y\lambda + a y_0 \ge y_0$, $X\lambda - a x_0 \le x_0$, $\lambda \ge 0$, $a$ unrestricted

Best practice regulation
The use of DEA in regulatory practice is discussed. The underlying game is represented by a series of steps:
Costs and demands for service are observed or identified
Schemes are proposed by the regulator
The schemes are rejected or accepted by the DMUs
Costs are selected by the DMUs
Data on performance are observed
Compensations are paid
The aim of the regulator is to minimize the expected costs of making the DMUs accept, fulfil, and minimize costs. The use of DEA is to determine the best practice norms.

Part VIII: Theoretical Extensions
A Data Envelopment Analysis approach to Discriminant Analysis. D.L.
Retzlaff-Roberts
Derivation of the Maximum Efficiency Ratio model from the maximum decisional efficiency principle. M.D. Troutt

Discriminant Analysis
Discriminant analysis is a means for determining group classification for a set of similar units or observations. It determines a set of factor weights which best separate the groups, given units for which membership is already known. This paper proposes the use of DEA as a means for doing DA.

Maximum Efficiency Ratio
Maximum efficiency ratio (MER) is intended to prioritize the DEA efficient DMUs by defining common weights. This paper supposes the existence of a ratio form criterion common to all the DMUs but not necessarily frontier oriented.
$\max_{u,v} \min_j \left( \sum_r u_r y_{rj} / \sum_i v_i x_{ij} \right)$, subject to $\sum_r u_r y_{rj} / \sum_i v_i x_{ij} \le 1$ for all $j$, $\sum_r u_r = 1$, $u, v \ge 0$

Part IX: Computational Implementation
Parallel and hierarchical decomposition approaches for solving large-scale Data Envelopment Analysis models. R.S. Barr and M.L. Durchholz

Part X: Abraham Charnes
Abraham Charnes remembered
Abraham Charnes, 1917-1992
A bibliography for Data Envelopment Analysis (1978-1996). L.M. Seiford

The End