# Lecture 8 ```Efficiency and Productivity Measurement:
Bootstrapping DEA Scores
School of Economics
The University of Queensland, Australia
Measures of Reliability for DEA Scores
DEA is a non-parametric, non-stochastic approach, so DEA
efficiency scores have traditionally been treated as non-stochastic.
However, there have been attempts to assess how sensitive DEA
scores are to changes in the data, mainly to gauge the effect of outliers.
Simar and Wilson have worked on the problem of generating
standard errors for DEA scores using the bootstrap technique.
A simpler alternative to the bootstrap is the jackknife.
Jackknife Technique
• Run DEA and get efficiency scores for each of the DMUs in
the data set.
• Drop one DMU at a time and use the remaining data to
compute DEA scores for the remaining DMUs.
• Repeat until every DMU has been dropped once. At this stage, we
will have M-1 efficiency scores for each of the M DMUs in
the sample.
• Compute the standard deviation of each DMU's efficiency score
from its M-1 estimates.
• This is a fairly mechanical procedure, but it indicates whether
outliers are present: when they are, dropping a single DMU may
change the scores substantially.
THE DEA BOOTSTRAP
Monte Carlo simulation experiments are often used to estimate
the sampling distributions of econometric estimators. Such
experiments typically involve several steps:
1. Specify a data generating process (DGP).
2. Use the DGP to generate data (i.e., simulate).
3. Apply the estimator to the generated data.
4. Repeat from Step 2.
The distribution of the estimates obtained in Step 3 approximates
the sampling distribution of the estimator. The bootstrap is a
form of Monte Carlo experiment in which the DGP is unknown and
must itself be estimated from the data.
Alternative DEA Bootstrap Methods
Methods for conducting a DEA bootstrap have been suggested
by
• Ferrier and Hirschberg (1997)
• Lothgren and Tambour (1997)
• Simar and Wilson (1998)
We only discuss the Lothgren-Tambour (LT) method because
• Simar and Wilson (1997) identify theoretical problems with
the Ferrier-Hirschberg (FH) method.
• Lothgren (1998) provides evidence that the LT method
outperforms the Simar-Wilson (SW) method.
• the LT method is relatively straightforward.
The DGP
Let us consider input-oriented DEA models where the output
vectors $q_1, \dots, q_I$ are treated as fixed. We need to specify a DGP
that will allow us to generate data on $x_1, \dots, x_I$.
Let $\rho_i = D_I(x_i, q_i)$. Then $x_i^* = x_i / \rho_i$ is a technically efficient
input combination capable of producing $q_i$. Suppose the process
generating the distances for all firms is $(\rho_1, \dots, \rho_I) \sim \text{iid } F$. Then
a DGP for $x_1, \dots, x_I$ is completely characterised by $x_1^*, \dots, x_I^*$,
$q_1, \dots, q_I$ and $F$.
Example
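[Figure: the unit isoquant q = 1 plotted in (x1/q, x2/q) space. The observed point $x_2 = (2, 4)$ contracts radially to the efficient point $x_2^* = (1, 2)$, so $\rho_2 = 2$ and $x_2 = \rho_2 x_2^*$.]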
Estimating the DGP
Let $\hat{\rho}_i$ denote the DEA estimate of $\rho_i$ (computed as the inverse
of the optimised value of the DEA objective function). We
estimate $x_1^*, \dots, x_I^*$ by projecting $x_i$ onto the estimated frontier:
$$\hat{x}_i^* = x_i / \hat{\rho}_i, \qquad i = 1, \dots, I.$$
We estimate $F$ using the empirical distribution function (EDF)
of the $\hat{\rho}_i$:
$$\hat{F}(\rho_i) = \begin{cases} I^{-1} & \text{if } \rho_i = \hat{\rho}_i,\ i = 1, \dots, I \\ 0 & \text{otherwise} \end{cases}$$
Example cont.
[Figure: the estimated frontier for q = 1 in (x1/q, x2/q) space. For firm 2, $\hat{\rho}_2 = 1/0.682$, so the observed point $x_2 = (2, 4)$ projects to $\hat{x}_2^* = x_2 / \hat{\rho}_2 = (1.364, 2.728)$.]
The Bootstrap Algorithm
To obtain B bootstrap samples:
1. Use the observed data to estimate the input-oriented DEA
model, and project the observed data points onto the frontier
using $\hat{x}_i^* = x_i / \hat{\rho}_i$. Set b = 1.
2. Draw $\rho_1^b, \dots, \rho_I^b$ independently from $\hat{F}$ and generate the
bootstrap sample $x_1^b, \dots, x_I^b$ using $x_i^b = \rho_i^b \hat{x}_i^*$.
3. Use the bootstrap sample to estimate the DEA frontier. Set
b = b + 1.
4. Repeat from Step 2 until b > B.
These B bootstrap samples can be used to construct confidence
intervals.
Example cont.
In the hospital example $(\hat{\rho}_1, \hat{\rho}_2, \hat{\rho}_3, \hat{\rho}_4) = (1, 1.45, 1, 1.33)$ and
$\hat{x}_1^* = (1, 2)$
$\hat{x}_2^* = (1.364, 2.728)$
$\hat{x}_3^* = (3, 1.5)$
$\hat{x}_4^* = (1.5, 3.75)$
To illustrate generation of the first bootstrap sample, suppose 4
drawings from the U(0,1) distribution happen to be 0.46, 0.76,
0.18 and 0.92. This implies $(\rho_1^1, \rho_2^1, \rho_3^1, \rho_4^1) = (1.45, 1.33, 1, 1.33)$
and
$x_1^1 = \rho_1^1 \hat{x}_1^* = (1.45, 2.90)$
$x_2^1 = \rho_2^1 \hat{x}_2^* = (1.81, 3.63)$
$x_3^1 = \rho_3^1 \hat{x}_3^* = (3, 1.5)$
$x_4^1 = \rho_4^1 \hat{x}_4^* = (2, 5)$
We then solve the DEA problem using these data.
Bias and SE’s for DEA Scores
Let $\hat{\rho}_i$ be the computed DEA score for firm i in the
sample.
Let $\hat{\rho}_i^1, \dots, \hat{\rho}_i^B$ be the scores generated by the
bootstrap procedure, which is conducted B times. Then we can
compute the bias and SE as:
$$\text{Est. Bias}(\hat{\rho}_i) = \frac{1}{B} \sum_{b=1}^{B} \hat{\rho}_i^b - \hat{\rho}_i$$
$$\text{Est. S.D.}(\hat{\rho}_i) = \left\{ \frac{1}{B-1} \sum_{b=1}^{B} \left[ \hat{\rho}_i^b - \frac{1}{B} \sum_{b'=1}^{B} \hat{\rho}_i^{b'} \right]^2 \right\}^{1/2}$$
Some remarks
• Computing bias and standard errors for DEA scores is
computationally intensive, but the idea is quite simple.
• Proving that the bootstrapped bias and standard errors are
consistent is analytically difficult. That is where much of
the work is focused.
• The model we have looked at simply generates technical
efficiency scores using simple random sampling with
replacement; this ignores any firm-specific characteristics
that may drive inefficiencies.
• It may be possible to use a second-stage regression and
bootstrap its residuals, thereby taking firm-specific
characteristics into account.
```
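The leave-one-out steps listed under "Jackknife Technique" can be sketched as follows. This is a minimal illustration, not the lecture's own code. To avoid a linear-programming solver it assumes the one-input, one-output constant-returns case, where the DEA score reduces to each DMU's output/input ratio relative to the best ratio in the sample. The function names `crs_scores` and `jackknife_sd` and the data are hypothetical.

```python
from statistics import stdev

def crs_scores(firms):
    """One-input, one-output CRS efficiency (an assumed simplification):
    each firm's output/input ratio divided by the best ratio in the sample."""
    ratios = [q / x for x, q in firms]
    best = max(ratios)
    return [r / best for r in ratios]

def jackknife_sd(firms, score_fn=crs_scores):
    """Drop one DMU at a time, rescore the rest, and return the standard
    deviation of each DMU's M-1 leave-one-out scores."""
    m = len(firms)
    loo = [[] for _ in range(m)]              # loo[i] collects scores for DMU i
    for dropped in range(m):
        kept = [i for i in range(m) if i != dropped]
        scores = score_fn([firms[i] for i in kept])
        for i, s in zip(kept, scores):
            loo[i].append(s)
    return loo, [stdev(s) for s in loo]

# Hypothetical (input, output) pairs; the third DMU has the best ratio.
firms = [(2.0, 1.0), (4.0, 3.0), (1.0, 1.0), (5.0, 2.0)]
loo, sds = jackknife_sd(firms)
```

In this toy sample the third DMU defines the frontier, so every other DMU's score moves when it is dropped, which is the outlier signal the slide describes. A full DEA model would replace `crs_scores` with an LP-based scorer.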
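The Monte Carlo steps under "THE DEA BOOTSTRAP" (specify a DGP, simulate, estimate, repeat) can be illustrated with a small sketch. The DGP here is a hypothetical normal distribution and the estimator is the sample mean. Both are chosen only because the true sampling distribution is known, and neither comes from the lecture.

```python
import random
from statistics import mean, stdev

def monte_carlo(estimator, dgp, n_obs, n_reps, seed=0):
    """Return n_reps estimates, each computed on a freshly simulated sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):                          # Step 4: repeat
        sample = [dgp(rng) for _ in range(n_obs)]    # Step 2: simulate from the DGP
        estimates.append(estimator(sample))          # Step 3: apply the estimator
    return estimates

# Step 1 (hypothetical DGP): observations drawn from N(5, 2^2).
def normal_dgp(rng):
    return rng.gauss(5.0, 2.0)

estimates = monte_carlo(mean, normal_dgp, n_obs=25, n_reps=2000)
mc_mean = mean(estimates)    # should be near the true mean, 5
mc_sd = stdev(estimates)     # should be near 2 / sqrt(25) = 0.4
```

The spread of `estimates` approximates the sampling distribution of the estimator. The bootstrap follows the same loop but replaces `normal_dgp` with a DGP estimated from the observed data.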
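The first bootstrap draw in the hospital example can be reproduced with a short sketch. The slide does not state how each U(0,1) draw is turned into a draw from the EDF. The mapping assumed here, where a draw u in ((k-1)/I, k/I] selects firm k's estimate, is one rule that is consistent with the numbers shown. The function names are hypothetical.

```python
import math

def draw_from_edf(rho_hat, uniforms):
    """Assumed mapping: u in ((k-1)/I, k/I] selects rho_hat for firm k,
    so each estimate is drawn with probability 1/I."""
    n = len(rho_hat)
    picks = []
    for u in uniforms:
        k = min(n, max(1, math.ceil(u * n)))   # 1-based firm index
        picks.append(rho_hat[k - 1])
    return picks

def bootstrap_inputs(rho_draws, x_star):
    """Bootstrap inputs: x_i^b = rho_i^b * x_hat_i^* for every firm i."""
    return [tuple(r * x for x in xs) for r, xs in zip(rho_draws, x_star)]

# Values taken from the hospital example (rounded as on the slide).
rho_hat = [1.0, 1.45, 1.0, 1.33]
x_star = [(1.0, 2.0), (1.364, 2.728), (3.0, 1.5), (1.5, 3.75)]
uniforms = [0.46, 0.76, 0.18, 0.92]

rho_1 = draw_from_edf(rho_hat, uniforms)
sample_1 = bootstrap_inputs(rho_1, x_star)
```

Because the slide reports rounded distances, the products in `sample_1` agree with the slide's bootstrap inputs only up to rounding. The same estimate can be drawn more than once (1.33 appears twice here), since the draws are made with replacement.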
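The bias and standard deviation formulas under "Bias and SE's for DEA Scores" translate directly into code. The sketch below assumes the B bootstrap scores for one firm have already been computed. The numbers and the function name `bootstrap_bias_sd` are hypothetical.

```python
from statistics import mean, stdev

def bootstrap_bias_sd(score_hat, boot_scores):
    """Estimated bias: mean of the bootstrap scores minus the original score.
    Estimated S.D.: sample standard deviation (divisor B - 1) of the
    bootstrap scores around their own mean."""
    bias = mean(boot_scores) - score_hat
    sd = stdev(boot_scores)
    return bias, sd

# Hypothetical original score and B = 5 bootstrap scores for one firm.
score_hat = 0.68
boot_scores = [0.70, 0.72, 0.69, 0.75, 0.74]
bias, sd = bootstrap_bias_sd(score_hat, boot_scores)
```

In practice B would be much larger than 5, and each entry of `boot_scores` would come from re-solving the DEA problem on one bootstrap sample.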