Q/ T? 5 [4 AUG 181988 3^ SCALE ECONOMIES IN NEW SOFTWARE DEVELOPMENT Banker Kemerer Rajiv D. Chris F. February 1988 CISRWPNo. 167 WP No. 1995-88 Sloan Center for Information Systems Research Massachusetts Institute of Technology Sloan School of Management 77 Massachusetts Avenue Cambridge, Massachusetts, 02139 ' SCALE ECONOMIES IN NEW SOFTWARE DEVELOPMENT Banker Kemerer Rajiv D. Chris F. February 1988 CISRWPNo. 167 Sloan WP No. 1995-88 •1988 R.D. Banker, C.F. Kemerer Center for Information Systems Research Sloan School of Management Massachusetts Institute oflechnology M.I.I, ^RARIB AUG 1 REC-T u'c ( SCALE ECONOMIES IN NEW SOFTWARE DEVELOPMENT Rajiv D. Banker Professor of Management Carnegie Mellon University Chris F. Kemerer Assistant Professor of Management Science Massachusetts Institute of Technology ABSTRACT In this research we reconcile two opposing views regarding the presence of economies or diseconomies of scale in new software development Our general approach hypothesizes a production function model of software development that allows for both increasing and decreasing returns to scale, and argues that local scale economies or diseconomies depend upon the size of projects. Using eight different data sets, including several reported in previous research on the subject. we provide empirical evidence in support of our hypothesis. Through use of the nonparametric DBA technique we also show how to identify the most productive scale size that may vary across organizations. These results are extended to include the effects of nonparametric scalerelated factors, such as project duration and the number of new staff on the project team. first author was Mellon University. The supported in part by NSF grant SES-8709044 at Carnegie 1 RESEARCH PROBLEM 1. Software development practitioners are faced with the problem of how new software appropriately size productivity Unfortunately, development much research the of so projects in as area this to maximize to has arrived at apparently contradictory conclusions, namely that either economies of scale exist or that diseconomies contradictory of results the existence of a in local most the identifying environment, paper these integrates apparently framework, and empirically demonstrates economies or diseconomies depends upon the projects.' scale effects the we addition, In productive show and This consistent scale development software exist scale for size of some provide given a other scale that size of methodology a software for development related on variables productivity. A volume production process exhibits level, (local) increasing returns to scale when average Economies increasing, and scale diseconomies prevail when average productivity provided specialization of of to scale explain labor to researchers such as and the thus present presence of that like may contribute online the organizational learning costs, area productivity. (e.g., assembly Finally, ail management overhead. language projects This is scale initial may proscribe may coding) a decreasing. from range number of factors (e.g., These tools investment, both their increase certain overhead is Software engineering debuggers or code generators. require type of productivity to economies of scale, such as Larger projects may also benefit from specialized personnel, certain of the presence of a increase productivity, but the relatively large in economies as learning curves. Boehm [16] have noted software development tools may are phenomena such new software development in a given at the marginal returns of an additional unit of input exceed the average returns. Reasons if, fixed status in purchase use on small projects. whose the expertise project's investment in in a overall project meetings and reports) 1 production economics, economies of scale larel defined at specific volume levels in a production process, and It is therefore inappropriate to limit the characterization of a production process to only global economies (or diseconomies! of scale. In dealing with single input-single output pfoduction correspondences, we shall use the terms increasing returns to scale and scale economies interchangeably. In are thus best described as local. does not increase economies of scale for In with directly project and therefore can be a source of size larger projects. many authors have pointed contrast to the reasons cited above, of diseconomies of scale on large software projects. possibility out the Brooks [17] has suggested that the number of communication paths between project team members increases an (at increasing communication overhead is with rate) clear a number the case of Somewhat al. [21] suggest that systems larger development increasing conflicts number of people the among team members. also Jones [25] notes such as planning and documentation, grow Another increases. which is likely to source possible increases be larger on a analogously, Conte projects the that will face more Boehm [16] points out chances personality complex interface problems between system components. that This and hence a cost increase, nonlinear factor that could contribute to diseconomies of scale. et members.^ team of for many overhead activities, at a faster than linear rate as project size of larger diseconomies of project and may scale is project contribute to slack, reduced productivity. how Given these contradictory hypotheses, software development production development managers appropriately process? size maximize average productivity? to organized as follows. can researchers best model the how And, can practicing new software development This organizations, We the increasing returns We integrate scale, two these but decreasing returns empirical applications even the The number of peths required )S n(n-1)/2, where n more is the set in in suggest development production process believe that one reason that this has not in and notions for first new software in most exhibits (local) that very large projects. been shown by other researchers due to the simple parametric models employed. that is Section 2 presents the empirical evidence for both the notion software to projects so as paper addresses these questions and of economies of scale and the notion of diseconomies of scale development software We flexible number show in Section 3, however, parametric forms are limited of project is team members. in their ability to estimate the returns to Envelopment Analysis as an Data most productive the identify alternative scale 2. Section 4 of nonparametnc modeling technique to Finally, Section in in Section 5 presents the results of further size analysis of other scale-related variables further research are presented This motivates our use scale. the conclusions and suggestions for 6. EMPIRICAL EVIDENCE A number of researchers have collected empirical data that support increasing The general approach of these researchers has been to returns to scale theories. estimate a function of the form: y = aix)" where / of the amount of the is project, Function Points input, typically measured typically This (FP). in function is professional work-hours, and x terms of source lines estimated by taking the of the size is code (SLOC) or of both logarithms sides and then estimating the resulting linear model using regression techniques. (1) = a + bln(x) ln(y) An estimated exponent exponent greater than value, 1 returns to scale measure less than b, 1 indicates indicates diseconomies of economies of scale. while an scale, This follows because the is X dy y dx That is, (xly) if marginal productivity (dxidyj b greater than (less than) average productivity less than (greater than) one. is One of the earliest pieces of research to estimate this function of Walston and Felix [31]. indicate Vessey [30] mild was the work They collected data on 60 projects within IBM's Federal Systems Division and estimated would is increasing a function with an exponent of .91, returns to scale. Jeffery a result that and Lawrence [24] and have also reported economies of scale on small projects, although they have not published their data Using We have extended the 1978-80 data this analysis from a to a number of other published Yourdon [22] survey of 17 data projects sets. from a we variety of firms, scale. Two Bailey's study [3] estimated an exponent of .72, indicating increasing returns to other data sets that display exponents of approximately .95 are from study [14] 25 of 19 NASA/Goddard Space of projects [26] Function Point data from from number a Kemerers Society.'' yield summary, the evidence for economies of scale In sources of Assurance Life commercial data processing consulting firm a an estimated exponent of .85. comes Equitable at Center projects and Behren's Flight representing wide a of variety application environments. However, a number of researchers have provided empirical support for the set an exponent of exhibits produce identical Lehman are Belady and Wingfield's [32] exponents of s 15 project economies of Table five scale. their Note average of Table list) tend 1 is to U.S. in Function Points. in Army Two 1.49 data for data sets showing mild decreasing returns to scale 1.06, data set a large software house and Therefore, the empirical evidence new software development is at least as compelling as scale. summarizes the 1 exhibiting estimated a high exponent of [15] 33 project data set from for diseconomies of scale that for We . 24 projects from IBM measured Albrecht's [2] that 1.11 COCOMO Boehm's [16] 63 project notion of diseconomies of scale as well. ioglinear model of the nine data sets, with analysis increasing returns to scale and four exhibiting decreasing returns to that the size." data sets One in Table 1 are listed interesting result available in the approximate from even order of a cursory examination that the data sets with smaller average projects (the first four show economies projects tend to largest projects of scale, show diseconomies on the list, set (also referred to as the is while of scale. the The data last sets with larger data set) is described average data set, which contains the the only exception to this simple analysis. ABC on the in Table This data 2. Behren's data set is not reported directly irt his paper. However, a scatter graph is provided, which was enlarged end the data directly extrapolated. Note that the graph contains only 22 of the 25 reported data-points. Behren's data is only provided in terms of Function Points. Behren's data is estimated to be approximately 10. 8K. Using Albrecht's linear model, the average SLOC for The returns about theories conflicting described Section in different data scale to of Table in thus 1 indicate economies scale the that diseconomies or matched by conflicting empirical evidence obtained for We sets presence the are 1 reported results reconcile apparent this by contradiction offering the hypothesis that for most software development "production processes" there exist returns increasing That large projects than smaller is projects projects smaller average productivity is, "most productive the are that for scale to The larger.^ is scale and decreasing for very increasing as long as the project size size" and (MPSS), MPSS may actual returns be decreasing is different for for different organizational settings. The reasons for our above hypothesis stem from the conflicting arguments presented earlier diseconomies of in Section makes it likely is is (MPSS) average decreasing returns to scale the In returns productivity being productivity, next two framework of sections less and as initially the in fixed project size generally larger manage, and the marginal productivity of the project team Increasing marginal economies both of increases But the continue remains less than marginal productivity. productivity size difficult to decline. presence average productivity spread over a larger project more to the Since most projects require a significant fixed investment scale. project management overhead, overhead for 1 equals greater prevail. we restrictive than average marginal to prevail as long as average At the most productive scale and productivity, productivity, This intuitive argument is beyond MPSS is declining depicted in and Figure 1. reexamine eight* of the nine data sets within the estimation models to provide empirical support for our hypothesis. The large, or if MPSS will tend to differ across organizations. If the fixed overhead the marginal productivity does not decline rapidly, increasing returns Banker |4| provides s rigorous definition and discussion of the concept of most productive scale si2e (MPSSI. pursue this analysis further in Section 4. e Walston and Felix I311 report their estimated loglinear nnodel, but do not present the actual data. is will We MPSS continue to prevail for larger projects and the hand, if the fixed overhead sharply, then the 3. MPSS relatively small is if large. On the other the marginal productivity declines small and decreasing returns set is in lower scale at a level. PARAMETRIC PRODUCTION FUNCTION ANALYSIS The problem with the simple it or be will does not allow for the loglinear possibility of model of the previous research some returns for increasing projects and The estimated returns to scale are determined by decreasing for others. parameter, the exponent we But b. require a more that is a single general model that allows for average productivity increases as the fixed project overhead gets spread over larger and larger projects, and after reaching the most productive scale size (MPSS), it allows for declining average productivity caused by negative factors affecting large projects such as the proliferation of communication paths. parametric approach more flexible on the simple based only forms parametric that have been employed Such other production environments. loglinear a model Rather than reject the model, in that estimates we first empirical explore research MPSS would also in be of use to software development managers because they can then identify the scale size where average One productivity method possible function for is for maximized in their generalizing new software development the organization. restrictive of previous research logquadratic term as an independent variable. We loglinear is production by simply adding a can thus estimate the following translog function:^ 12) \n(HOURS) = Letting / equal scale measure p is d\r\ y (3) p = X + fi^(\n(SIZE)) HOURS + /3^{\n{S/ZE)f and x equal SIZE, it is evident that the returns to given by = d\r\ fi^ xdy — - = fi+ ydx 2;? (In X) Chnstensen, Jorgenson and Lau 20 note that the translog is a llexibl« functional form that provides second-order approximation to an arbitrary, twice-continuously-dtfferentiabie production function. 1 1 a local Therefore, X < exp the {(1 -/? )/2/? projects, y5 < > /S, then increasing returns to scale prevail for and decreasing returns prevail for project sizes greater than } MPSS estimated estimated the estimated if here given exp {(]-^ x' = by then average productivity is )/2/i however, If, }. the estimated to be increasing for smaller and decreasing for longer projects, contrary to our arguments presented earlier. Varian [29] Hildenbrand [23], that such a To provide evidence of robustness of therefore studies For instance, forms. have argued approach imposes considerable untested structure on the parametric production function. empirical and Banker and Maindiratta [9] estimate different parametrically their specified our present context an alternative specification in several results, functional may be the following quadratic form: HOURS (4) Again is = letting /?(, + ^^(SIZE) + p^[SIZE]^ / equal HOURS and x equal SIZE, the returns to scale measure p given by: (5) = /> xdy ^(/?, ydx p> Therefore, 1 if and only X* and MPSS decreasing fi,x*2fi^x^ ^^ + p^x + y are positive then the x< + 2/?^x) if is fi^x'^> the estimated values of both If fi^. given by x* = \/ returns for decreasing returns are exhibited for x> all then increasing returns are exhibited for J3^ fi^x^ x«». 0, x> and 0. and fi^ with increasing returns for , estimated If x> all B/ B fi^ if If and J3^< estimated /^g > fi >0 and then /ff j < the estimated values of both and ^^ are negative then decreasing returns correspond to small projects and increasing returns correspond to large projects, contrary to our earlier hypothesis. The empirical (translog) results for the eight available and the quadratic models are presented data in sets Tables for the logquadratic 3 and 4 respectively. 8 Two empirical problems are encountered observed that these in practice. several researchers have First, so-called flexible parametric functional forms frequently violate reasonable regularity conditions, such as monotonicity, see for instance. Caves and and Christensen [18] and Barnett Lee [13]. our In present context, the logquadratic models estimated for the Bailey and Wingfield data sets exhibit dyldx < (decreasing requirement labor for Similar violation of the monotonicity condition models; for projects by the Bailey, smaller for projects. exhibited by the estimated quadratic is by the Albrecht and Kemerer data sets and for large projects small project size) increasing COCOMO and Belady data The second empirical problem sets. more serious is for our objective of estimating The pairs of independent returns to scale for ne\N software development variables In (SIZE) and (In (SIZEjP. and (SIZE) and (SIZE)\ tend to be highly correlated. The range of pairwise 0.915 - 0.974 for (SIZE) and (SIZE)^ for the eight (SIZE)j^ and sets. (In data available This high level of collinearity implies that the confidence about interpreting the estimates of the coefficients to a change in be to and ^^ as the change >5, the independent variables and the quadratic models.® likely was 0.967 - 0.999 for In (SIZE) and correlations unstable, will in the dependent variable due be very low for both the logquadratic Consequently, the estimates of these coefficients are see for and Pindyck instance The Rubinfeld [27]. usual econometric methods, therefore, may not be appropriate for estimating the nature of returns to scale or the most productive scale size for these eight data The high importance to loglinear likely the collinearity the between of interpretation models reported in Table (SIZE) In the results of and (In the (SIZE))^ estimation The estimated coefficient b 1. of in sets.^ is also simple the this of case is to also pick up the effect of the omitted variable (In (SIZE))^, and therefore, of interpretation b as estimated the returns to scale measure may not be appropriate. 8 The standard errors of less likely to be significant the estimated coefficrents are likely to be larger, and the corresponding i-statistics are the independent variables are highly correlated. when 9 The variance of the estimates covartance of the estimates of p of jj the and returns p . to scale or MPSS measures depend on the variance and the 4. NONPARAMETRIC PRODUCTION FUNCTION ANALYSIS Given these problems, and the limited a priori knowledge about the functional form form parametric or theoretically validate these impose the theory with from deviations individual observations process and Envelopment estimation formal 29]. [8, Analysis This by due by Production economics production a Banker, function, exhibited to use production to and Data frontier extended to a [5].'° Cooper and Charnes, in of the characteristics Rhodes [19] and what approach, propose approach substantiate apparent inefficiencies we a econometric between Therefore, framework for to differentiates Cooper Charnes, economics notion nonparametric a the in specifying to immediately [9, 23, 29]. frontier a inefficiencies. (DEA), developed axioms as difficult is not is it occurring frontier the individual production DEA does treated need to employ the indicates Also, correspondence production development, correspondence statistically hypotheses, software underlying production the for restrictions on process production the of not impose a parametric form on the production function and assumes only that a monotonic and convex relationship exists between inputs and products." More following the formally, production function f(x) 1. 2. Monotonicity : If Convexity limited assumptions made about are the frontier : ./ = y = If - y' fix], f{x') y' fix), and x ^ = x', then / ^ < and fix') X >/ < 1 then (l-X)/* \y' ^ /[(1-X)x+ Xx'] 3. Envelopment 4. Minimum Extrapolation : For each observation : If a Ar, = Ar function n, 1 gi-) satisfies convexity and envelopment conditions, then ^(x) ^ The estimation of the function programming techniques, and estimates of likelihood and consistent, see can f(x) f(x Banker [12]. } be obtained '^ y^ fix) fix^) the monotonicity, for all x. accomplished in this using linear manner are maximum The most productive scale size is 10 Recent developments MO) m stochastic data envelopment analysis simultaneously consider deviations from the production frontier due to inefficiencies and also measurement errors. Evidence on the comparative application Strauss |6|. of the DEA and tranilog models is provided by Banker, Conred and 10 estimated following the via programming model as linear in Banker [4] for the general case of multiple inputs and outputs.'^ (6) min Tj ^ subject to n n k«1 ,^.X^ > (6.3) where x ^ - output = input y.^^ j, for observation for observation /, n = number of observations The fyyA' MPSS 'K,A' MPSS = • • for the and k, (/r = 1, 2, = ...,/?). mix input-output 'y^J- ^^^ ^A k, '^A' ''^lA' given • • by "^JA'' where (Yj^^xJ 'S computed y^ = as follows: '^A X '.2.\ In our present context, correspondence, production considerably simplified. i' 'A -xJy.M A' 'A where we and are interested only in problem computational the The solution to the M = max\x/y kk'k' I k='\ linear a single input-single output program n] is in the (6) is is consequently given by simply maximum observed Alternativ« models for estimating MPSS when some inputs or outputs are fixed or uncontrollable, or when some variables are measured on a categorical rather than on a continuous scale are described by Banker and Moray (71. Banker and Maindirstta 181 discuss the estimation of other non-convex technologies. 11 average project the by given across productivity If the observation, say M' evidence of local observations. x^ x^, and observed with . input-output 5 using the perspective, MPSS correspond MPSS MPSS the is largest for all also attained for pair (x^. some then y^J, , future other there is the a From a projects. both of identification and increasing Projects larger (smaller) than the sets. decreasing (increasing) returns, to From order to maximize in new software development allows also it provides a project size goal 5 shows Table respectively. and the corresponding percentile value for the range of observed output data for each of the eight data sets. the Mj chosen by each researcher. metric size decreasing returns within these empirical data the is = is calculated for the eight available data sets, and the results are average productivity of research productivity ( size represent MPSS. practitioner's viewpoint, the the xjy^ which scale constant returns to scale, and the range of project sizes between Table in for say, x,^, maximum average MPSS was The reported all size The most productive observations. all inter-quartile range for the In of the eight cases, the five observed output thus data, MPSS indicating is within that both increasing and decreasing returns are clearly present since there exist both smaller and larger projects model loglinear than may the be MPSS an at that inadequate site. It description follows therefore many of new that the software development application environments. 5. OTHER SCALE-RELATED FACTORS In size as in the our analysis in the earlier sections, we have considered only the project measured by SLOC or Function Points as the factors explaining the professional labor requirements of factors, such as the duration of a project, assigned to a project team methodology employed for our important link may earlier also different explain analysis Other scale-related projects. or the number of these variations new staff variations. members While the remains applicable and provides an to prior research, the specific results must be interpreted cautiously to the extent that such additional factors have not been included in the analysis, in 12 we section, this consider the explicit inclusion of other scale-related factors estimation models, and we in our provide empirical evidence about the importance of these factors. In addition to size as measured by SLOC or Function Points, project scale often represented by the amount of calendar time taken by a project duration, while possibly highly correlated with a SLOC example, a project containing a given number of a relatively longer period of time might be this due to situation a result in a critical Abdel-Hamid and Madnick slack. on projects that their that are not [ 1 ] on the suggest that the project "critical necessitating there more is re- work. Also, on a longer project slack This would path" having a lot of type of effect also occurs this is likely to increase the hours expended on the project for a couple of reasons. project An example have mis-estimated the amount of time required. Increasing the duration on a project long of a number of tasks dependence. logical some piece of user-signoff on number of other tasks For where one task was required to be completed before the others could commence due to might be getting a be. could be stretched out over composed For a project A projects measure, need not size is chance of changes Boehm [16] suggests a "Parkinson's number of yvork- Over the course of a in user requirements, that with the likely Law" type ("work expanding to fill thus additional the time available for its completion") effect occurs. In order to test the significance of the project's calendar time as a scale- related explanatory variable, regression analysis where the calendar time (DURATION) was was performed on available.'^ the six data sets The regression models were of the form: (7) HOURS This SIZE = /*(, + ^^{S/ZE) + fi^[SIZE? + ^pURATION) model was chosen to represent the nonlinear aspects of relationship. The variables SIZE and are (SIZE)^ For the Bailey. YourOon, Belady, and Wingfield data sals, this data is highly availabl* in the correlated Conte (211. HOURS as : noted 13 from therefore and earlier collinearity problems. which to identify analysis variation the estimation the is However, this statistically other two, in the Yourdon data any adding DURATION see Table seem variable not scale-related with the at DURATION provided that shown 95% is a significant explanation of Table in the by Of the 6. confidence SIZE level and power. (SIZE)^ data sets, six Of the four. in highly correlated with SIZE (correlation which provides a possible reason why 7) explanatory additional Another significant suffer will does not affect the objective of our this analysis are is = .71, corresponding parameters whether DURATIOM provides DURATION coefficient the HOURS, over and above in The results of variables of Only in the Wingfield is it not does the data to add to the labor requirement factor affecting communication productivity on paths suggested is project Not by the literature dealing additional overhead be related to the number of paths, but also to the efficiency of those paths working as a project the This could be affected by A a group. team, more new (8) factor HOURS = by we /Jq the number of new Rubin [28] is only could experience the project team had likely as a to require this need not be the case. for the Kemerer data set [26]. staff, currently available only this additional suggested While a larger project productivity variation. assigned NEWSTAFF, variable, been has how much a The To staff possible members on source of more people and be NEWSTAFF identify data are the impact of estimate the following model: + fi^{SIZE] + fi^iS/ZE)^ + fi ^(DURATION) + /3JN£WSTAFF) The results for the ABC data are: Hours = 16042 - 93.93 (FP) + 0.06(FP)' + 1522 (DURATION) + 9. 06 (NEWSTAFF) (1.43) (-4.10) (6.62) (2.71) The t-statistics are reported in parentheses. R* = .90 The correlation matrix for these NEWSTAFF DURATION FP variables is. (2.10) 14 -0.104 -0.165 -0.221 DURATION FP (FP)' From these 0.338 0.203 results, 0.950 apparent that the number of is it significant predictor of the variation and above the Duration. scale-related variables important to note that is It other simply the number of staff. average staff An general model is (9) HOURS Rather in increasing and terms. for = nFP\ this by 88.9%. incremental results are consistent data set, this DEA, of extension ^^pURATION] a DEA model this specific the (Function number of new staff this is members, not only .02. production function Banker [10]. see and Points)^ IMEWSTAFF and the correlation of The parametric function ^(FP) is functional form can be following (such as the assumed only to be monotone DURATION and NEWSTAFF represent while additive linear case are coefficients of 1018 for DURATION and 1163 sum of squared residuals by when we remove Similarly, NEWSTAFF from the variable residuals increases by model DURATION from each explanatory power to the model, these two variables provides in (9) residuals the model 109.9% and the sum of absolute Thus, of the 123.1% and the sum of absolute 76.6%. by a + fi^(NEWSTAFF) The deletion of the variable sum of squared increases an assuming convex, NEWSTAFF. the is specified: The results for increases the the using than quadratic), is Function Points, nonparametric estimation of alternative by For this members staff for the Kemerer data set, over of work-months to calendar months) (the ratio accomplished work-months in new in (9), residuals substantial see Banker and Morey [11]. These not dependent upon pre-selecting a functional form but are generally with the results from the quadratic model, conclusions drawn from our earlier regression analysis. thus corroborating the 15 6. CONCLUDING REMARKS In this we research have reconciled two presence of economies or diseconomies of scale Our approach general development provides allows that for both for Through use of the DEA technique productive scale production a increasing we views opposing and regarding the new software development in model function decreasing have also shown how of returns software to scale. to identify the most These results were extended to include the effects of other size. scale-related factors. For the practitioner, our results contain a number of useful implications. terms of project estimation, loglinear have traditional algorithmic models have suggested a simple model with which to estimate eventual work-hours. some limited applicability, they ignore the possibility productivity by carefully selecting the scale of the project scale as exogeneous, as most of these simple models seek to identify the most productive scale size for suggest that this MPSS also work would be organizations' ability to successfully assess the effects do, their While these models of improving project Rather than taking the managers could organization. actively Our results varies widely across different application environments, and an interesting extension to this some In manage on productivity of calendar duration and the number of new to identify factors that contribute to larger other projects. scale-related Managers could factors, project team staff members. such as 16 Table Summary of Loglinear Models 1: t-statistic DATA SET Significant n at the •"Significant at the MEAN SLOC MEAN 5 percent FP level for a one-tailed test 10 percent level for a one-tailed test H„:b=l 17 Table Project Hours 2: Kemerer [26] Data Set Func Pts KSLOC Duration Newstaff 18 Table DATA SET 3: Sutnmary of Logquadratic Models 19 Table DATA SET /?g /?, 4: Summary of Quadratic models y5j R^ MPSS' 20 Table DATA SET 5: Most Productive Scale Sizes Using DEA MPSS" Percentile 21 Table DATA SET 6; Duration Regression Results /?3 DURATION COEFFICIENT t-STATISTlC 22 Table 7: Duration Regression: DATA SET Correlation CORRELATIONS Hours Bailey .96 .46 .63 .55 .57 .95 Size .63 Size .71 .55 .74 .92 .58 .33 .64 .53 .63 .95 .46 .31 Duration .82 .85 .40 .97 .43 .42 Size .74 (Size)' .86 .24 .95 .34 .20 Size (Size)' Duration Size (Size)' Kemerer .71 (Size)' Duration Wingfieid .38 .91 .79 .60 Duration Belady (Size) Size (Size)^ COCOMO Size (Size)^ Duration Yourdon Between Independent Variables Duration 23 References [ 1 ] [2] Abdel-Hamid. Tarek and Stuart Madnick Impact of Schedule Estimation on Software Project Behavior. IEEE Software 3.70-75, July, 1986. Albrecht, Allan J. and John Gaffney, Jr. Software Function, Source Lines of Code, and Development Effort Prediction A Software Science Validation. IEEE Transactions on Software Engineering SE-9(6);639-648, November, 1983. [3] John W. and Victor R. Basiii for Software Development Resource Expenditures. Proceedings of the 5th I nter national Conference on Software Engineering, pages 107-116,. 1981. Bailey, A Meta-model In [4] Banker, Rajiv. Estimating Most Productive Scale Size Using Data Envelopment Analysis. European Journal of Operations Research 17(1):35-44, July, 1984. [5] Banker, R. D., Some Models A. Charnes and W. W. Cooper. for Estimating Technical and Scale Inefficiencies Management Science [6] 30(9); D., R. F. Conrad and R. P. Strauss. Comparative Application of DEA and Translog Methods; Study of Hospital Production. Management Science 32(11:30-44, January, 1986. An [7] Banker, R. D. and R. C. Morey. Efficiency Analysis for Exogenously Fixed Inputs and Outputs. Operations Research 34(4);5 13-521, July-August, 1986. [8] Banker, [9] Banker, Illustrative R. D. and A. Maindiratta Piecewise Loglinear Estimation of Efficient Production Surfaces. Management Science 32(1); 126- 135, January, 1986. R. D. and A. Maindiratta Nonparametric Analysis of Technical and Allocative Efficiences 1987. Forthcoming in Econometrica in Banker, Rajiv. Stochastic Data Envelopment Analysis. 1987. Carnegie Mellon University Working Paper, 1987. [11] Banker. R. D. and R. C. Evaluating Hypotheses Morey. in Efficiency Analysis; An Application. 1987. V^/orking Paper. [12] DEA. Banker, R A [10] in 1078- 1092, September, 1984. Banker, Rajiv. Maximum Likelihood, Consistency and Data Envelopment Analysis. Carnegie Mellon University Working Paper, 1987. Production. 24 [13] W. Barnett, and A. Y. W. Lee. The Global Properties of the Minflex Laurent, Generalized Leontief and Translog Flexible Functional Forms. 142 1-1437, 1985. — Econometrica [14] : Behrens, Charles A. Measuring the Productivity of Computer Systems Development Activities with Function Points. IEEE Transactions on Software Engineering SE-9(6);648-652, November, 1983. [15] Belady, L. A. and M. M. Lehman. The Characteristics of Large Systems. In [16] Wegner, P. (editor). Research Directions in Software Technology, pages 106-138. MIT Press, Cambridge, MA, 1979. Boehm, Barry W. Software Engineering Economics. Englewood Prentice-Hall, [17] Cliffs, NJ, 1981. Brooks, Frederick P. The Mythical Man-Month. Addison-Wesley, Reading, Mass., 1975. [18] Caves, D. W. and L R. Christensen. Global Properties of Flexible Functional Forms. :422-432, 1980. American Economic Review — [19] W. W. Cooper and E. Rhodes. Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through. Charnes, A., Evaluating Management Science 27(61:668-697, [20] 1981. Christensen, L R., D. W. Jorgenson and L J. Lau Transcendental Logarithmic Production Frontiers. Review [21] June, Conte, of Economics S., H. and Dunsmore and Statistics 55:28-45, 1973. V. Shen. Software Engineering Metrics and Models. Benjamin/Cummings, Reading, MA, 1986. [22] DeMarco, Tom. Yourdon Project Survey: Final Report. Technical Report, Yourdon, [23] Inc., September, 1981. Hildenbrand, Werner. Short-Run Production Functions based on Microdata. 49(51:1095-1 125, September, 1981. £co/7o/T7er/-/ca [24] Jeffery, DR. and M. J. Lawrence. Inter-Organisational Comparison of Programming Productivity. In Proceedings of the 4th International Conference on Software An Engineering, pages 369-377. [25] Jones, Capers. Programming McGraw-Hill, Productivity. New York, 1986. 1979. [26] Kemerer, Chris F. An Empirical Validation of Software Cost Estimation Models. Communications of the ACM 30(5)416-429, May, 1987 [27] Pindyck, R.S. and D.L Rubinfeld Econometric Models and Economic Forecasts. McGraw-Hill, New York, 1981. [28] Rubin, Howard A. Macroestimation of Software Development Parameters: The Estimacs System. In SOFTFAI R Conference on Software Development Tools, Tectiniques and Alternatives. IEEE. 1983. [29] Varian. H. The Nonparametric Approach to Production Econometrica 52(3);579-597, 1984. [30] Vessey, Analysis. Iris. On Program Development Effort and Productivity. Information & Management 10:255-266, 1986. [31] Walston, C.E. and C.P. Felix. A Method IBM [32] of Programming Measurement and Estimation. Systems Journal 16(1);54-73, 1977. Wingfield, C. G. USACSC experience with SLIM. Technical Report lAWAR 360-5, U.S. Army Institute for Research Management Information and Computer Science, 1982. in Figure 1 Most Productive Scale Size (MPSS) productivity (size/hours) ^ average productivity marginal productivity MPSS project scale size hours production function MPSS project scale size INIIIIimim?'"^^ DUPi 3 TOflO 2 DDSt,?MT? b 6'iu-&^ Date Due I I '. ' 1 1U Jv Lib-26-67 BARCODE ON iMEXT TO LAST PAGE