Measuring school efficiency in Taiwan’s remote islands: A comparison of DEA and SFA

Li-Ju Chen¹ and He-Kai Chen²
¹ Professor, Department of Education, National Kaohsiung Normal University, Taiwan, t1466@nknucc.nknu.edu.tw
² Doctoral candidate, Department of Education, National Kaohsiung Normal University, Taiwan, nio203@gmail.com

Introduction

In the field of efficiency measurement, frontiers have been estimated with many different methods over the past 40 years. The two principal methods are data envelopment analysis (DEA) and stochastic frontier analysis (SFA), which rely on mathematical programming and econometric methods, respectively (Coelli, 1996b). Both DEA and SFA model a production function that converts inputs into outputs, and both shed light on policy-making for the allocation of education expenditure.

DEA, proposed by Charnes, Cooper, and Rhodes (1978), is a non-parametric mathematical programming approach to frontier estimation and has recently been adopted extensively for measuring school efficiency in Taiwan. However, DEA results portray only the relative efficiency among schools, not the absolute efficiency of each school. Moreover, DEA ignores the effect of random shocks, so deviations caused by statistical error rather than by real inefficiency of the decision-making unit are counted as inefficiency.

In contrast, SFA, proposed by Aigner, Lovell, and Schmidt (1977) and Meeusen and van den Broeck (1977), is a parametric approach. It appeals to researchers because it allows deviations from the frontier to reflect not only inefficiency but also noise in the data, and it can separate the statistical error component from the inefficiency term (Bogetoft & Otto, 2011).

The focus of this study is to benchmark the efficiency of schools of different sizes with DEA and SFA models.
SFA is new to the measurement of primary school efficiency in Taiwan, where DEA is the prevailing methodology. The authors examine the efficiency scores generated by the two models and cluster the schools into subgroups with different features. Comparing the characteristics of the subgroups with one another indicates how feasible DEA and SFA are for schools of different scales. To ensure data integrity and homogeneity of inputs, school efficiency should be measured on one common set of school data. On this premise, Penghu, the only island county in the Taiwan area, whose 40 elementary schools are spread over 7 of 100 small islands, was chosen as the sample.

The purposes of this study are as follows:
1. Measuring school efficiency with SFA and DEA.
2. Clustering the schools based on the efficiency scores and identifying the characteristics that distinguish the subgroups.
3. Exploring the differences between the efficiency scores and ranks generated by DEA and SFA.
4. Investigating the characteristics of the parameters estimated by SFA and DEA in the different subgroups.

Research Design

1. Input and output variables
The data consist of school inputs and outputs for the 2011-2012 school year. The input variables include school expenditures and the family household expenses on education of 4th, 5th, and 6th graders. The school expenditures comprise educator wages (x1), instructional materials (x2), and awards and subsidies (x3), while the family expenses on education comprise extracurricular cram-school expenses (x4) and educational expenditure (x5). The output variable is the score on the standardized student assessment conducted annually by the Penghu Education Department.

2. SFA model specification
The stochastic frontier model combines an inefficiency term u with a random error term v; the former captures inefficiency, the latter statistical noise.
The base model after a log transformation is as follows:

$$y_k = f(x_k; \beta) + v_k - u_k, \qquad v_k \sim N(0, \sigma_v^2), \quad u_k \sim N^{+}(0, \sigma_u^2), \quad k = 1, \dots, K.$$

The SFA program FRONTIER was developed by Coelli (1996a) and runs in a DOS environment.

3. DEA model specification
The DEA model in this study is input-oriented and assumes constant returns to scale (CRS). The multiplier form of the linear programming problem for school k is as follows:

$$
\begin{aligned}
\max \quad & h_k = \sum_{r=1}^{s} u_r Y_{rk} \\
\text{s.t.} \quad & \sum_{i=1}^{m} v_i X_{ik} = 1, \\
& \sum_{r=1}^{s} u_r Y_{rj} - \sum_{i=1}^{m} v_i X_{ij} \le 0, \quad j = 1, \dots, n, \\
& u_r, v_i \ge \varepsilon > 0, \quad r = 1, \dots, s; \; i = 1, \dots, m.
\end{aligned}
$$

By duality in linear programming, the envelopment form is:

$$
\begin{aligned}
\min \quad & Z_k = \theta - \varepsilon \left( \sum_{i=1}^{m} s_i^{-} + \sum_{r=1}^{s} s_r^{+} \right) \\
\text{s.t.} \quad & \sum_{j=1}^{n} \lambda_j X_{ij} - \theta X_{ik} + s_i^{-} = 0, \quad i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j Y_{rj} - s_r^{+} = Y_{rk}, \quad r = 1, \dots, s, \\
& \lambda_j, s_i^{-}, s_r^{+} \ge 0, \quad j = 1, \dots, n; \; i = 1, \dots, m; \; r = 1, \dots, s.
\end{aligned}
$$

Here θ is the efficiency score of school k, λ_j are the peer weights, s_i^- and s_r^+ are the input and output slacks, and ε is a small positive constant.

The DEA program DEAP was developed by Coelli (1996b) and runs in a DOS environment.

4. Procedure of this research
[Flowchart: the input variables (educational expenditure provided by the school and by the family) and the output variable (scores derived from the annual achievement test) enter the DEA and SFA efficiency measures. DEA yields efficiency scores, a summary of input slacks, and a peer count summary; SFA yields efficiency scores and parameter estimates. The scores are then compared (correlation between efficiency scores, correlation between ranks) and clustered into group1 to group4, followed by a characteristics analysis of each group.]
Figure 1 Flowchart of the study

Results

1. The measurement of school efficiency by DEA and SFA

Table 1 School efficiency scores, ranks, and clusters under DEA and SFA estimation

| School | Number of classes | Number of students | DEA efficiency score | DEA rank | SFA efficiency score | SFA rank | Cluster |
|---|---|---|---|---|---|---|---|
| S11 | 6 | 86 | 0.727 | 30 | 0.905 | 26 | 1 |
| S12 | 6 | 55 | 0.605 | 36 | 0.929 | 22 | 1 |
| S20 | 6 | 60 | 0.558 | 37 | 0.816 | 37 | 1 |
| S21 | 6 | 37 | 0.502 | 38 | 0.931 | 20 | 1 |
| S23 | 6 | 39 | 0.728 | 28 | 0.855 | 33 | 1 |
| S24 | 6 | 14 | 0.182 | 40 | 0.765 | 40 | 1 |
| S30 | 6 | 39 | 0.608 | 35 | 0.894 | 29 | 1 |
| S32 | 6 | 51 | 0.728 | 28 | 0.923 | 24 | 1 |
| S33 | 6 | 29 | 0.609 | 34 | 0.925 | 23 | 1 |
| S36 | 6 | 42 | 0.858 | 22 | 0.896 | 28 | 1 |
| S37 | 6 | 38 | 0.759 | 26 | 0.820 | 36 | 1 |
| S39 | 6 | 78 | 0.843 | 24 | 0.813 | 38 | 1 |
| S09 | 6 | 74 | 0.623 | 32 | 0.950 | 10 | 2 |
| S16 | 6 | 29 | 0.469 | 39 | 0.943 | 14 | 2 |
| S17 | 6 | 58 | 0.885 | 19 | 0.964 | 7 | 2 |
| S19 | 6 | 108 | 0.852 | 23 | 0.938 | 17 | 2 |
| S22 | 6 | 30 | 0.610 | 33 | 0.966 | 5 | 2 |
| S25 | 6 | 121 | 0.886 | 18 | 0.932 | 19 | 2 |
| S26 | 6 | 51 | 0.641 | 31 | 0.944 | 13 | 2 |
| S29 | 6 | 45 | 0.860 | 21 | 0.982 | 1 | 2 |
| S31 | 6 | 33 | 0.757 | 27 | 0.973 | 2 | 2 |
| S34 | 6 | 54 | 0.776 | 25 | 0.946 | 11 | 2 |
| S38 | 5 | 13 | 0.875 | 20 | 0.935 | 18 | 2 |
| S03 | 26 | 680 | 1.000 | 1 | 0.857 | 32 | 3 |
| S04 | 11 | 223 | 1.000 | 1 | 0.923 | 25 | 3 |
| S07 | 12 | 234 | 0.927 | 13 | 0.929 | 21 | 3 |
| S08 | 6 | 126 | 1.000 | 1 | 0.902 | 27 | 3 |
| S10 | 6 | 127 | 0.910 | 15 | 0.853 | 34 | 3 |
| S13 | 6 | 94 | 1.000 | 1 | 0.892 | 30 | 3 |
| S18 | 6 | 119 | 1.000 | 1 | 0.868 | 31 | 3 |
| S27 | 6 | 42 | 0.899 | 17 | 0.844 | 35 | 3 |
| S28 | 6 | 55 | 1.000 | 1 | 0.785 | 39 | 3 |
| S01 | 33 | 491 | 0.924 | 14 | 0.941 | 16 | 4 |
| S02 | 23 | 591 | 0.910 | 15 | 0.955 | 8 | 4 |
| S05 | 26 | 681 | 1.000 | 1 | 0.970 | 3 | 4 |
| S06 | 6 | 143 | 0.940 | 12 | 0.944 | 12 | 4 |
| S14 | 5 | 12 | 1.000 | 1 | 0.968 | 4 | 4 |
| S15 | 9 | 176 | 1.000 | 1 | 0.942 | 15 | 4 |
| S35 | 6 | 116 | 1.000 | 1 | 0.953 | 9 | 4 |
| S40 | 6 | 50 | 1.000 | 1 | 0.965 | 6 | 4 |
| Mean | | 125 | 0.811 | | 0.911 | | |

※Results:
1. A k-means analysis of the DEA and SFA efficiency scores yields 4 groups, each with different features.
2. The schools are listed by cluster; the clusters contain different numbers of schools.
3. Both the ranks and the efficiency scores of the schools differ considerably between DEA and SFA. Group1 has DEA and SFA scores that are both lower than average, group2 has lower DEA scores and higher SFA scores, group3 has higher DEA scores and lower SFA scores, and group4 has DEA and SFA scores that are both higher than average.

Figure 1 The distribution of DEA and SFA efficiency scores
※Results:
1. The distributions of both the SFA and the DEA scores are skewed to the left.
2. Most schools are measured as high-performing.
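The four group descriptions above (lower or higher than average on each of the two scores) can be illustrated with a small sketch. This is not the authors' k-means procedure: it is a plain split at the sample means reported in the Mean row of Table 1, and the names `School` and `mean_split_group` are hypothetical helpers introduced only for this illustration.

```python
from dataclasses import dataclass

# Sample means taken from the Mean row of Table 1.
MEAN_DEA = 0.811
MEAN_SFA = 0.911


@dataclass(frozen=True)
class School:
    name: str
    dea: float  # DEA efficiency score
    sfa: float  # SFA efficiency score


def mean_split_group(school, mean_dea=MEAN_DEA, mean_sfa=MEAN_SFA):
    """Return a group number 1-4 from a simple split at the two means.

    Assumption: this mirrors the verbal group descriptions only;
    it does not reproduce the k-means clustering used in the study.
    """
    high_dea = school.dea >= mean_dea
    high_sfa = school.sfa >= mean_sfa
    if not high_dea and not high_sfa:
        return 1  # both scores below average
    if not high_dea and high_sfa:
        return 2  # lower DEA, higher SFA
    if high_dea and not high_sfa:
        return 3  # higher DEA, lower SFA
    return 4      # both scores above average


# Five schools copied from Table 1.
schools = [
    School("S24", 0.182, 0.765),
    School("S22", 0.610, 0.966),
    School("S03", 1.000, 0.857),
    School("S05", 1.000, 0.970),
    School("S29", 0.860, 0.982),
]

groups = {s.name: mean_split_group(s) for s in schools}
print(groups)
```

For S24, S22, S03, and S05 the mean split agrees with the cluster column of Table 1. S29 is assigned to group 4 by the mean split but belongs to cluster 2 in Table 1, which suggests that the two partitions are similar without being identical.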
Table 2 Pearson correlation coefficients among number of students, number of classes, output, inputs, and DEA and SFA efficiency scores

| Variables | DEA efficiency scores | SFA efficiency scores |
|---|---|---|
| Number of students | .406** | .129 |
| Number of classes | .309 | .092 |
| Output (Y) | .209 | .844** |
| Input (X1) | -.618** | -.191 |
| Input (X2) | -.465** | -.111 |
| Input (X3) | -.781** | -.173 |
| Input (X4) | .194 | .168 |
| Input (X5) | -.145 | .138 |
| DEA efficiency scores | 1 | .213 |
| SFA efficiency scores | .213 | 1 |

**. The correlation is significant at the 0.01 level (2-tailed).

※Results:
1. DEA efficiency scores are positively correlated with the number of students, while SFA efficiency scores are not.
2. Neither the DEA nor the SFA efficiency scores are significantly correlated with the number of classes in a school.
3. The correlation between the DEA and SFA efficiency scores is not significant (r = .213).
4. DEA efficiency scores are strongly and negatively correlated with the school inputs X1 to X3, while SFA efficiency scores are strongly correlated with the output.

[Scatter plot of the 40 schools, each labeled by school code; x-axis: DEA efficiency score (0.1 to 1), y-axis: SFA efficiency score (0.7 to 1).]
Figure 2 The scatter plot of school efficiency
※Results:
1. The X-axis represents the DEA efficiency scores and the Y-axis represents the SFA efficiency scores.
2. The lines marking the average DEA and SFA values cross and divide the 40 schools into 4 groups, similar to the result of the cluster analysis.

Table 3 Characteristics of input and output in different clusters

| Cluster | Average DEA efficiency score | Average SFA efficiency score | Mean Y | Mean X1 | Mean X2 | Mean X3 | Mean X4 | Mean X5 |
|---|---|---|---|---|---|---|---|---|
| Group1 | 0.642 | 0.873 | 66 | 401,164 | 3,386 | 778 | 9,655 | 10,491 |
| Group2 | 0.749 | 0.952 | 75 | 358,592 | 3,073 | 729 | 9,681 | 11,020 |
| Group3 | 0.971 | 0.873 | 67 | 174,320 | 1,193 | 212 | 20,334 | 11,494 |
| Group4 | 0.972 | 0.955 | 75 | 234,433 | 1,877 | 357 | 26,682 | 14,280 |
| Average | 0.811 | 0.911 | 70 | 305,071 | 2,505 | 553 | 15,470 | 11,620 |

※Results:
1. Comparing the average scores of the 4 subgroups with those of all schools reveals the characteristics of DEA and SFA estimation. Schools in group1 and group3 have similar SFA efficiency scores and outputs despite their divergent inputs. The DEA efficiency scores, in contrast, rate schools in group3 as more efficient than schools in group1. The same holds for group2 and group4.
2. The DEA efficiency scores of schools in group3 are similar to those in group4, but the SFA efficiency scores of group4 are higher than those of group3. On average, both the inputs and the output of group4 are higher than those of group3.
3. Schools in group1 and group2 are located in rural regions, so their scale is small. Per-student costs borne by the public sector (government) decline as school size increases, while per-student costs borne by the private sector (family) increase, because large schools tend to be located in the county center.

Table 4 Attributes of different clusters

| Cluster | School scale | Mean Y | Mean X1 | Mean X2 | Mean X3 | Mean X4 | Mean X5 |
|---|---|---|---|---|---|---|---|
| Group1 | Small | Low | High | High | High | Low | Low |
| Group2 | Small | High | High | High | High | Low | Low |
| Group3 | Large | Low | Low | Low | Low | High | High |
| Group4 | Large | High | Low | Low | Low | High | High |

※Results:
1. The efficiency scores measured by DEA and SFA vary with the scale of the schools: the higher the DEA score, the larger the school scale.
2. DEA is more sensitive to scale. The larger the school, the higher the measured DEA efficiency score. SFA, on the other hand, remains stable across schools of different sizes.
3. The inputs are the same in both models, but SFA is more sensitive to the output value than DEA.

Table 5 One-way ANOVA on school efficiency scores in different clusters

| Efficiency scores | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| DEA | Between Groups | .821 | 3 | .274 | 16.589 | .000 |
| DEA | Within Groups | .594 | 36 | .016 | | |
| DEA | Total | 1.415 | 39 | | | |
| SFA | Between Groups | .065 | 3 | .022 | 14.097 | .000 |
| SFA | Within Groups | .055 | 36 | .002 | | |
| SFA | Total | .120 | 39 | | | |

※Results:
1. Both the DEA and the SFA efficiency scores differ significantly among the 4 subgroups.

Table 6 Comparison of the extreme efficiency scores and parameters from each group

| School (group) | DEA score (rank) | SFA score (rank) | Number of students | Y | X1 | X2 | X3 | X4 | X5 |
|---|---|---|---|---|---|---|---|---|---|
| S18 (3) | 1 (1) | 0.868 (31) | 119 | 65 | 115,720 | 903 | 202 | 12,672 | 7,803 |
| S40 (4) | 1 (1) | 0.965 (6) | 50 | 73 | 247,797 | 1,894 | 120 | 0 | 5,655 |
| S16 (2) | 0.496 (39) | 0.943 (14) | 29 | 75 | 530,153 | 3,257 | 1,034 | 16,029 | 15,551 |
| S24 (1) | 0.182 (40) | 0.765 (40) | 14 | 57 | 1,108,315 | 8,667 | 3,000 | 17,333 | 22,900 |

※Results:
1. The 2 most efficient schools, S18 and S40, whose DEA efficiency scores equal 1, are the main peer targets, with peer counts of 17 and 20, respectively. However, S18 has a lower-than-average SFA score (see Table 3) owing to its lower output.
2. DEA rates S16 and S24 as inefficient schools. Under SFA, however, S16 performs much better than S24.

Table 7 Parameter estimation by SFA (per student)

| Parameter | Coefficient | Standard error | t-ratio |
|---|---|---|---|
| β0 | 3.884 | 0.648 | 5.993 |
| β1 | 0.132 | 0.053 | 2.476 |
| β2 | -0.088 | 0.035 | -2.513 |
| β3 | -0.008 | 0.014 | -0.559 |
| β4 | 0.011 | 0.007 | 1.567 |
| β5 | -0.062 | 0.047 | -1.322 |
| σ² | 0.017 | 0.006 | 2.714 |
| γ | 0.866 | 0.140 | 6.172 |

※Results:
1. γ = 0.866 means that 86.6% of the residual variation is attributed to production inefficiency rather than to random noise.
2. The parameters listed in Table 7 show that input (x1) and input (x4) contribute positively to school efficiency (βi > 0).
3. According to the negative values of β2, β3, and β5, input (x2), input (x3), and input (x5) show negative effects on school efficiency.

Conclusion

The empirical results show that the measurement of school efficiency with SFA is inconsistent with that of DEA. The two kinds of efficiency scores provide information for clustering schools with different input and output features. In addition, the more rural a school's location, the more closely student achievement is related to public resources; the more urbanized the location, the weaker this relation.
Comparing the DEA technical inefficiency patterns with those of SFA in each subgroup makes it possible to propose school improvement strategies. Some of the schools located in rural areas display higher efficiency; we therefore do not recommend that the authorities close those schools on account of their high educational cost. The SFA parameters indicate that the inputs and output adopted in this study are appropriate, as most of the residual variation comes from production inefficiency. The DEA parameters provide information on optimum inputs and output, while the SFA parameters provide information on the factors that truly affect the efficiency scores.

SFA is appropriate for measuring schools of various sizes, including small schools, which are likely to be underestimated by DEA because their per-student expenditure is high. DEA also labels some schools with little input and little output as efficient. However, it is difficult to convince the public that these schools perform well, and labeling them efficient runs counter to the spirit of pursuing excellence. SFA scores, on the other hand, correlate highly with the output, which makes it easy to explain why a school serves as an example for other schools.

References

Aigner, D. J., Lovell, C. A. K., & Schmidt, P. (1977). Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6, 21-37.
Bogetoft, P., & Otto, L. (2011). Benchmarking with DEA, SFA, and R. New York, NY: Springer.
Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision making units. European Journal of Operational Research, 2, 429-444.
Coelli, T. J. (1996a). A guide to FRONTIER version 4.1: A computer program for frontier production function estimation. CEPA Working Paper 96/07, Department of Econometrics, University of New England, Armidale.
Coelli, T. J. (1996b). A guide to DEAP version 2.1: A data envelopment analysis (computer) program. CEPA Working Paper 96/08, Department of Econometrics, University of New England, Armidale.
Meeusen, W., & van den Broeck, J. (1977). Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18, 435-444.