D28 ^%1 MAR181S87 ALFRED P. WORKING PAPER SLOAN SCHOOL OF MANAGEMENT THE ECONOMICS OF SOFTV;'ARE QUALITY ASSURANCE<- A SYSTEM DYNAMICS BASED SIMULATION APPROACH Tarek K. Abdel-Hamid Stuart February 1987 E. Madnick #WP 1863-87 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 50 MEMORIAL DRIVE CAMBRIDGE, MASSACHUSETTS 02139 THE ECONOMICS OF SOFTVTARE QUALITY ASSURANCE-r A SYSTEM DYNAMICS BASED SIMULATION APPROACH Tarek K. Abdel-Hainid Stuart E. Madnick February 1987 #WP 1863-87 ^BS^rxsr^rixs^'l^, MAR 1 9 19S7. THE ECONOMICS OF SOFTWARE QOALITY ASSORANCK: A SYSTEM DYNAMICS BASED SIMULATION APPROACH Abstract function has gained, in The software quality assurance (QA) recent years, the recognition of being a critical factor in the successful development, deployment, and maintenance of software However, systems. because the utilization of QA tools and techniques does tend to add significantly to the cost of developing software, the cost-effectiveness of QA has been a pressing concern to the software quality manager. As of yet, this concern has not been adequately addressed in the though, literature. Our objective in this paper is to investigate the tradeoffs between the economic benefits and costs of QA efforts in terms software project's total development cost. To do this, we a of developed an integrative System Dynamics model of the software development process. The model is comprehensive in that it integrates the multiple functions of the software development process, including both the management-type functions (e.g., planning, control, and staffing) as well as the software production-type activities (e.g., design asid coding). The model also captures the dynaunics of error generation as well as the QA activities of error detection and correction. An important utility of the model is to serve as a laboratory vehicle to conduct controlled experiments on QA policy. Experimental results pertaining to the economics of QA are presented and discussed. Introduction: quality assurance (QA) is increasingly being perceived Software as most critical factor in the successful management of software a This is happening because more and more software managers projects. are starting customer's cost QA and has to realize satisfaction, the often that but " it QA also not only holds the key to a has a direct impact on the scheduling of a project. Failure to pay attention to resulted in budget overruns, schedule delays, and failure to meet the needs of the customer" (Chow, 1985). According to the IEEE standard P730, QA has been defined as "a systematic pattern of all actions necessary to provide and planned confidence adequate key two encompasses view comprehensive conforms established to is methodologies, (including activities approaches, QA, rather than a restrictive one. In other that QA is not restricted to say a set of but rather it includes all the necessary techniques, management administrative and the definition provides for a First, ideas. of message the technical the software the requirements" (Buckley and Poston, 1984). This definition technical words, that organizational procedures) that may contribute to quality of software during the entire lifecycle of the product. Second, the definition emphasizes the importance of a plan for QA, and its systematic implementation to achieve an organization's of desired objectives of software quality. In paper, this Specifically, economic project will we benefits cost. focus pertaining issues technical) our and on is managerial (not the to the economics of the QA activity. investigate costs the of the the tradeoffs between the QA effort in terms of total Such considerations obviously lie (implicitly if not explicitly) at the heart of the QA planning process. to The utilization the cost expended in of of QA tools and techniques adds significantly developing developing software. For example, roan-hours are test cases, running test cases, conducting structured walkthroughs, etc. This added cost is ... a source of concern to everyone associated with the program, particularly the program manager and the customer ... A (more) pressing concern to the software quality manager is . cost efficient are the QA operations during the development just as all elements of the The QA organization, cycle. development process, will and should be subject to detailed and continuing scrutiny regarding the cost of doing business (Knight, 1979). how This "pressing addressed That during is, the development the been however, not, investigating studies operations has literature. the in published concern" as cost cycle. adequately of yet, there are no efficiency Paraphrasing of QA Lawler (1985): the impact of software quality on development costs has not been explored. software costing models do not Consquently, account for the impact of software quality on development costs . . . stressing In quality, Riggs managers in of software of interesting suggestion to QA finance, or money, but that the language of is is number and hours, "things" (e.g., quantity of output, number of errors). The challenge to the QA is to bridge cost of quality and the opportunity for improved quality into staff the an gaining this commitment. He suggests that the language production labor offers (1983) management top importance of top management's commitment to the dollars and important cents step in the gap between these languages, to translate terms. gaining Such the reinterpretation of quality is an involvement and support of top management in the business of improving quality. In approach the to remaining parts of this paper we propose a new research the study of the software development process in economics of QA in particular. An overview of our the and general approach software of is first presented in the section immediately following development introduction. this modeling Dynamics System integrative discussion This followed then is by detailed more a the model's QA section, and the model's experimental of results peirtaining to the economics of QA. An Integrative System Dynamics Model o f Software Development: work research Our on economics of QA is really only one the of a much broader research effort to study the dynamics of the part software development process. The overall objective of this ongoing research effort will, to findings development development several System a project of and computer Dynamics research of mostimportantly model the software of The model is currently being used capacities, one conducting for compilation a database, management. research topic the provide of vehicle laboratory QA, comprehensive a an extensive series of field of developers, software of in completion the include date interviews in software project management process. Accomplishments the of develop a scientific base, a theory if you to is of is to serve as a which experimentation in the area of this paper. Our objective in this section is to an overview of the model. A full description of the model's structure, its experiments [(Abdel-Hamid, formulation, mathematical performed 1984) and on ( it are provided and in the validation other Abdel-Hamid and Madnick, 1987)]. reports . model The multiple both functions functions planning, control, and (e.g., as well as the software production-type activities (e.g., design, coding, model's four Management reviewing, major testing). and subsystems, Subsystem; Controlling figure in the sense that it integrates the the software development process, including of management-type the staffing) The integrative is namely: Subsystem; The (1) 1 depicts the Human Resource Software Production Subsystem; The (2) Figure and (4) (3) Planning Subsystem. The The also illustrates some of the interrelations between the four subsystems The Human Resource assimilation, training, resource. Such Figure suggests, 1 level needed completion bearing carried not project's the of out human in vacuum, but, as hiring rate is a function of the workforce complete to transfer and are Subsystem captures the hiring, they are affected by the other subsystems. For project's the example, actions Management project the on a certain planned Similarly, what workforce is available has direct date. the allocation of manpower among the different software on production activities in the Software Production Subsystem. The four development, development software. using primary quality activity As software production assurance, comprises rework, both walkthroughs, to and testing. are: The the design and coding of the the software is developed, structured activities it is also reviewed, detect any errors. e.g., Errors HUMAN RESOURCE MANAGEMENT {») WORKFORCE AVAILABLE \ PROGRESS \ STATUS ^ SOFTWARE WORKFORCE NEEDED \ PRODUCTION (S) / I \ / \ / / / SCHEDULE \ /tasks ' / / \ \ / \ \ // CONTROLLING \ \ / / ^ \ / COMPLETED t \ J_ PLANNING EFFORT REMAINING IP) IC) Figure (1) quality detected through such reworked. Not errors get detected and reworked at this phase, all assurance activities are then Some "escape" detection until the end of development e.g., however. until the system testing phase. progress As is reported. should be within A made on the software production activities, it is comparison of where the project is versus where it to plan) is a control-type activity captured (according the project's Controlling Subsystem. Once an assessment is made (using available information), status of the it becomes an important input to the planning function. at Planning the In initiation the when revised, example, to schedule, Subsystem, the of project, necessary, handle plans can and then these estimates are throughout project a initial project estimates are made that the is project's perceived to life. For be behind revised to (among other things) hire more be people, extend the schedule, or do a little of both. addition In in System the software engineering Dynamics methodology. feedback control organizational, name being integrative, our modeling approach has a characteristic feature that distinguishes it from most other second work to systems field, namely, the use of the "System Dynamics is the application of principles and techniques to managerial, and socioeconomic problems" (Roberts, 1981). As its implies, System Dynamics is a method of dealing with questions 8 about the tendencies dynamic behavioral patterns system a as they complex of over generate that is, the systems, e.g., whether the time, whole is stable or unstable, growing, declining, or in equilibrium. The philosophy Dynamics System based on several premises is [Forrester (1961) and Roberts (1981)]: 1. behavior The entity structure The physical aspects, but policies the structure. its procedures, and both intangible, that dominate decision-making in and tangible by the only not includes, importantly more caused principally is history) of an organizational time (or the organizational entity. 2. belongs that takes place in a framework decision-making Managerial to class general the known as information-feedback systems. 3. Our systems intuitive judgement is unreliable about how these change will with time, even when we have good knowledge of the individual parts of the system. 4. Model experimentation makes it possible to fill the gap where our showing can the judgement way and knowledge are weakest by in which the known separate system parts interact to produce unexpected and troublesome overall system results. Based on these philosophical beliefs, two principal . 9 for operationalizing the System Dynamics technique were foundations established. These are: use The 1 information-feedback systems to model and of understand system structure (premises use The 2. computer of behavior (premises Consider, Figure that feedback This 2. impact upon arise software higher more error errors available project manpower falls behind schedule can lead to a rate. committed, is project's progress schedule pressures, Happily, project a diverted on this later.) As more and (More larger and larger chunk of the from development work and devoted rate and drops further, leading to even higher another pass around this "vicious cycle." managers do have "escape" mechanisms to break from the grip of this positive feedback loop. For example, as schedule loop), software the in to error correction and rework duties. As this happens, the instead loose activity QA project generation are the simple feedback loop of example, loop shows that the schedule pressures that often The a simulation to understand system loop portrays some of the dynamic forces the environment. as and 2). and 4). 3 an as 1 pressures persist (e.g., after several passes around the project managers could, among other things, add more people, extend the schedule, or do a combination of both. 10 Error Generation Rate Progre«« Rate Schedule Pr essure Figure (2) 11 another In ( Abdel-Hamid and Madnick, 1986) we in detail the philosophical arguments for the applicability discuss of paper feedback the systems management project concepts of System Dynamics to software and show how they do provide a powerful lens to view and understand software project behavior. Error Generation. Detection, and Correction in this section is to present in some detail the objective Our structures model's detection pertaining correction, and : and to the QA activities of error which lie, as was mentioned above, within the model's "Software Production Subsystem." Software below are errors what come in many different "flavors." Summarized Nelson (1974) delineated and described as the most prominant software design and coding errors: - Misinterpretation of specifications - Errors in developing the logic to solve the problem - Algorithm approximations that may provide insufficient accuracy or erroneous results for certain input variables - Data structure defects either in the data structure design specifications or in the implementation of the specifications - Singular or critical input values to a formula that may yield an unexpected result not accounted for in the program code - Misinterpretation of language constructions by the programmer. . . . 12 System Dynamics model it is quite feasible, and in fact any In straight forward from a technical variable such point of view, to disaggregate a as the error variable into more than one error type. However, it is not always necessary or useful: There are two (and only two) considerations for reformulating a a sequence of two or more levels: policy level (variable) as model behavior. First, is the disaggregation analysis and order for the model to be able to address required in particular policy issues? (variable) reason for disaggregating a level The second system. Does the disaggregation of involves the dynamics of the levels the potential change has to level into two or more a significantly the behavior of the model? The final arbiter should be model-based policy analysis. If the behavior the potential to alter policy has change in conclusions, then the disaggregation is essential (Richardson and Pugh, 1981) , . . Since policies software of errors into significant focus reliability, than criterion, analysis be more policy is on the managerial-type development, as opposed to the technological software of issues are model's our . type one an explicit is, disaggregation of on the basis of the policy unnecessary. On the other hand, there clearly behavioral differences among error types that must accounted for. For example, findings in the software engineering literature errors indicate generated are generated that, at a at at different points in the lifecycle, different rates, e.g., design errors are higher rate than are coding errors (Martin, 1982). Such a factor is obviously of dynamic significance, e.g., it could have a direct bearing on the allocation of the manpower resource 13 which activities, correction error for in turn would affect the software development rate and hence the project's progress. differences are implicilty captured in the model. That Such while errors generation, vary generated lifecycle the at a development higher single variable (ERRORS), the a lifecycle. For example, rate errors design (as as and correction characteristics of ERRORS do detection, throughout are formulated are is, the earlier portions of the in and do) "ERRORS" they are, on the average, harder to detect and correct (as design errors are). Figure detection, well and sectors, testing" model's the correction of structure errors. for the generation, The figure demonstrates as interrelationships between this part of the model and two the other depicts 3 namely, Also sectors. the note "software that development" these and "system sectors together three constitute the "Software Production Subsystem" of Figure 1. System Dynamics Modeling Conventions: conventions schematic The conventions used perspective all used in Figure in System Dynamics models. systems can be From are the standard 3 a System Dynamics represented in terms of "level," "rate," and "auxiliary" variables. A flows "level" level is changes or is accumulation, an or an integration, over time of that come into and go out of the level. The term intended to invoke the image of the level of a liquid 14 Figure (3) 15 accumulating called are level that errors as "DETECTED Thus, . increased the by ERRORS" is a level of DETECTION "ERROR RATE" and error "REWORK RATE." levels and shovm rates is decreased by the Rates The flows increasing and decreasing a a container. in are represented as stylized valves and tubs, emphasizing further below, analogy the between accumulation processes and the flow of a liquid. LEVEL -> RATE RATE flows The that controlled are by the rates are usually differently, depending on the type of quantity involved. diagrammed We will use the two types of arrow designators shown below: INFORMATION FLOWS OTHER FLOWS (e.g. All are tangible either accumulations information-type concepts the (e.g., policy for PEOPLE) variables Auxiliary flowing. , either levels or rates i.e., they are of previous variables, variables in on flows the or other are presently hand, are the system, and capture things like the concept of "ERROR DENSITY") and policies (e.g., allocating "DAILY MANPOWER FOR REWORK"). Auxiliary 16 variables are represented by a circular symbol variables Finally, model are sectors shown either DEVELOPMENT SOFTWARE are arcs as "SOFTWARE the (e.g. that defined in other sectors of the emanating from their respective DEVELOPMENT SECTOR) or RATE" variable from the are represented by enclosing the variable name in parenthess as shown below. VARIABLE FROM ANOTHER SECTOR Error Generation Factors: Returning GENERATION error structured factors Figure to There RATE." generation includes: staff back rate organizational techniques (Belford [e.g., et two are in 3, a al, project 1977), variable the "ERROR sets of factors that affect the software factors (Alberts, consider [e.g., project. The first set an organization's use of 1976), the overall quality of the etc.] complexity, and project-specific-type system size, programming 17 from organization they do, is software would remain captured development constant most of the above variables project, their and PER COMMITTED e.g., effect could, TASK." therefore, be the "NOMINAL NUMBER is a measure of a unit of (Task software module, a block of 30 lines of code, etc.) a variable nominal a in modeling the behavior of a It means that, by a single nominal variable, namely, ERRORS Such our modeling viewpoint, this observation From significant. quite work, tend to remain invariant during the life of any project. single OF to organization and from one project to another, however, particular even though such factors can differ that Notice etc.]. language, would require modification only when modeling different but when experimenting with policies (such as QA policy) for a not organizational settings or different projects, particular software project (which is the scenario in this paper). a certain "NOMINAL NUMBER OF ERRORS COMMITTED PER TASK," nominal error generation rate would then simply be the product Given the of developed per COMMITTED PER different error TASK" not is JOB unit COMMITTED and In order i.e., the to how much tasks are "NOMINAL NUMBER OF ERRORS capture the generation of types, the "NOMINAL NUMBER OF ERRORS COMMITTED PER formulated that is captured in the WORKED.") RATE," time, of TASK." variable dynamic phase DEVELOPMENT "SOFTWARE the The as changes a constant over number, but rather as a the project's life. (Project modelthrough the continuous variable "% OF formulation of the "NOMINAL NUMBER OF ERRORS PER TASK" therefore serves two purposes. First, its shape ) 18 different purpose types throughout the life of a project. The second error different formulation the of generation error situations (i.e., those the of project value) characteristics organization generally parameter (The absolute (its of reflects the different project the software product's characteristics as well as would obviously, settings. life reflects the relative generation rates of project's the over which in it is developed). This, change when modeling different project profile that characterizes the software used in our experimentation, is presented in the (EXAMPLE) Appendix. to the organizational and project-specific factors addition In discussed error above, there is generation rates and which, unlike the previous set, do not second a set of factors which affect invariant during the life of a project, but rather do play a remain dynamic during role software development. These include the workforce-mix and schedule pressures. workforce The types of project typically level newly employees, members (from through pass the model hired and in is disaggregated into two experienced. Newly hired outside the organization or from within it) a project orientation phase during which they are less than fully productive. The orientation process brings them up to well as the Zawacki, speed through training that covers both the social as technical 1980). While environments not yet of fully the project (Couger and trained (during this . 19 orientation than their 1976)]. ERROR not only, on the tend to be more error-prone [(Endres, as EXPERIENCED." much That findings Madnick, WORKFORCE TO by 1975) and (Myers, it arc (In from RATE.") Figure "% OF this 3, WORKFORCE It is a variable that as a function of the "% OF WORKFORCE as 100%, is, MIX." the GENERATION "ERROR to by DUE represented is EXPERIENCED" and also are The effect of this factor is captured by the "MULTIPLIER TO multiplier research employees counterparts experienced GENERATION varies, they productive, less average, hired newly period) assumed in the model [based on the is reported in ( Abdel-Hamid, 1984) and (Abdel-Hamid 1987)] that a newly hired employee is, on the average, twice as error-prone as an experienced employee would be. The schedule second factor pressure that drive can [(Mills, 1983), error generation up is (Putnam and Fitzsimmons, 1979), 1982)]. Paraphrasing DeMarco (1982): and (Radice, People under time pressure don't work better, they just work faster In the struggle to deliver any software at all, the first casuality has been consideration of the quality of the software delivered. . Two schedule . explanations pressures Schneiderman (1980) have cause been proposed in the literature for why more suggests errors to be generated. First, that the schedule pressures increase the "anxiety levels" of programmers. A high anxiety level, then interfaces (with performance), probably by reducing the size of the short-term memory available. When programmers ... . . 20 become more anxious as deadlines tend to make even more errors . explanation Another suggest They (1980). overlapping by Thibodeau and Dodson schedule pressures often result in the activities that would have been accomplished better of sequentially, provided was that approach, they (therefore) . that and this can significantly increase the chance of errors. For example. When coding has begun before the completion of design, the designers are required to communicate their results to the programmers in a raw, unqualified state, hence significantly increasing the chance of design errors This is not to suggest that systems cannot be developed with overlapping activities. Many systems have distinct parts that can be coded before the entire design is completed ... We are concerned here with the situation where the press of the development schedule or the slippage of preceding activities overlapping activities that would have been results in accomplished better sequentially (Thibodeau and Dodson, 1980). . in . The effect of schedule pressure on error generation is captured the model Figure "ERROR to there perceived value. revealed "MULTIPLIER TO SCHEDULE PRESSURE." (In Under nominal conditions schedule pressures (e.g., when the project is be on schedule), are to RATE.") no schedule As GENERATION are to deadlines leading the this multiplier is represented by the arc from "SCHEDULE 3, PRESSURE" where through approached), higher could pressures the multiplier assumes a neutral increase in a project (e.g., as the multiplier increases exponentially error generation rates, which our field studies (under severe schedule pressures) be as much as 50% higher than nominal. 21 those within these are ERRORS" DETECTABLE point which at some reworked. then are will developed, errors are conimitted within Errors tasks. "POTENTIALLY tested, tasks software as Thus, a until developed task the is reviewed and task remain as the errors do get detected, of and Usually, though, not all errors will be detected, some subsequent phases of software development, where they might then be Error Detection Factors detection The the : errors of schematic delay. the is objective of the software 3, representation), namely, that of an exponential delay is essentially a conversion process that accepts a [A given inflow rate and delivers a resulting flow rate at the output. outflow may dynamic under of into rather non-characteristic mathematical formulation (with its a special This undetected assurance (QA) activities. The "QA RATE" shown in Figure quality The pass albeit at a relatively much higher cost. caught, has and "escape" differ detailed discussion characteristics The something of the a variable amount software to be QA'ed. For a more mathematical formulation and behavior of exponential delays in System Dynamics models see . ] "characteristic" such instant from the inflow rate implies that the delay contains quantity in transit, e.g., Forrester (1961) by circumstances where the rates are changing in value. necessarily the instant as the way rate to formulate a rate of accomplishing of developing software or correcting 22 errors, to formulate it as a product of the effort allocated to is activity and the productivity at which this effort is utilized. the what our field studies uncovered (and what the exponential However, formulation delay independent we schedule temporarily usually in the form of a fixed would to be scheduled once a week. processed. be what And we were the are relaxed the in or QA suspended of schedule pressure, no backlogs develop. For walkthroughs are suspended for a while on a project, review to behavior [This literature e.g., can activities QA develop therefore, backlogs, No because when postponed. Since we studied is this: QA "QA windows," all tasks developed since the when requirement We to be within the particular "QA window," they almost always Even pipeline. the tends rate to find was that irrespective of how many tasks needed to "processed." example, members supposed are processed were allocated, periodic one surprised be and project for these previous QA periodic group-type functions. For example, a two-hour of walkthrough During the organizations the in planned is that is allocated QA effort and its productivity! What the of happening found effort captures) of errors (invisible that almost impossible to (i.e., that all explanation an objective also was tasks reported 1982) and (Mitchell, (Hart, propose affected the the is tell QA until is bypassed, by others not in the 1980).] for how and why this happens. activity they is to detect invisible are detected), it becomes whether the QA job was completely done these invisible errors were in fact detected). By 23 same the token, it completely such circumstances oneself not convenient is what for developed tasks) incentives to there actually it are only do "exposes" secondly, the at mechanisms reward the effort QA that is in terms of available time and effort), actually expended and not more (e.g., even if is called is Furthermore, insufficient. to expend (i.e., usually more easy to rationalize both to quite becomes it to management that the QA job that was "convenient" to and was do, much later in the lifecycle). Under (except done been as difficult to tell that the job has not is due to because a there otherwise. than expected workload of larger seems Firstly, be to at no significant a psychological level, dis-incentives for working harder at QA, since more of organizational place in mistakes (Weinberg, one's 1971). And level, there are seldom any real to promote really quality or quality-related activities (Cooper and Fisher, 1979). formulation The provides, we feel, execution" of developed will QA'ed) the of a QA always the good "QA as approximation activity. QA'ed be RATE" an of exponential delay this "Parkinsonian That is, software tasks that are (or, more accurately, considered after a certain delay, which is independent of the actual QA effort allocated. Therefore, under the the rate at which tasks are considered QA'ed can, such currently adopted QA practices, proceed independently of actual QA effort allocated. However, the effectiveness of QA 24 depend obviously, will, will detected errors that on be effort. function a of That is, the amount of how much QA effort is allocated for error detection. What AN As was the case with the "ERROR GENERATION RATE," there ERROR?" organizational-type are staff, well as complexity model as programming factors such as project language. And as was explained before, how single nominal variable, namely, the "NOMINAL QA at costly to (Boehm et al, being how of detect to the as was discussed above, than coding-type but they are errors [(Alberts, 1976)]. manpower needed to detect an error, in addition of error-type, must also depend on the people work. A full-time employee's 8-hour work typically not contribution QA rate, 1975), and (Myers, function a efficiency Specifically, design-type errors are not only ) higher a actual the project progresses through its lifecycle. as Appendix. more does therefore, captured in the rather it is a dynamic variable that assumes but values generated The are, are to detect, this nominal variable is not a they number, the 1976), and costly different also project NEEDED PER ERROR." Because different error types do differ constant (See software a MANPOWER day project-specific-type as and particular to factors such as the overall quality of the such factors do tend to remain invariant during the life of any all in the determinants of the "QA MANPOWER NEEDED TO DETECT are translate project. into a fully productive 8-hour Man-hours are lost on communication 25 "MULTIPLIER PRODUCTIVITY TO man-day. other In amount to a losses member) (i.e., effort be to roan-hour loss per day (for the average project 4 Under such circumstances, the impossible = remove to generous effort unlikely that reason, be all (Shooman, 1983). Thus, even when made QA, all errors" are will errors to detected be it would still be (Boehm, 1981). One is that some errors manifest themselves, and exhibited only after system integration (Shooman, 1983). At point in time one DETECTABLE "POTENTIALLY errors, suggests that "In any sizable program, it is allocations example, for actually require (in the above would it 0.8 man-days. evidence Finally, conditions of no losses), 0.4 man-days of under detected, case) 0.4 / 0.5 the the communication and motivation if needed. That is, if a design error requires, under nominal conditions could, ERRORS" therefore, than (1982) majority others. suggest of Empirical that errors view constituting as in which some are more subtle, detect Weiss words, manpower needed to detect an error becomes twice what is QA nominaly to COMMUNICATION AND MOTIVATION TO which would be half of the nominal 8-hour value, then the , actual any DUE of the multiplier would be 0.5. value can of losses are captured in the model's types which simply represents the average productive fraction of LOSSES," a These etc.). breaks, activities (e.g., personal business, coffee non-project other and a the set of hierarchy of and therefore more expensive results reported by Basili and the distribution is pyramid like, with requiring approximately a few hours to ) 26 detect, fewer still that errors requiring approximately a day to detect, and few a more requiring errors than a day to detect. (Notice results show that these few subtle errors are an order of the magnitude more expensive to detect. assume in the model that as QA activities are performed, the We obvious more becomes then it errors will be detected first. As these are detected, more and more expensive to uncover the remaining less pervasive) errors. This is achieved in more subtle (although the model through DETECTION-EFFORT multiplier MANPOWER is DUE densities, "obvious" exponential the ERROR TO captured NEEDED formulation the by DETECT TO multiplier errors DENSITY." from arc (In "MULTIPLIER Figure 3, TO this "ERROR DENSITY" to "QA AN ERROR.") At moderate-to-large error assumes detected, are fashion, the the of neutral a multiplier the value. But as the increases in an such that the remaining few subtle errors are an order of magnitude more expensive to detect. To a recapitulate, the "QA MANPOWER NEEDED TO DETECT AN ERROR" is function the in value error some level of error-type, work efficiency, and error density. As of this needed effort increases, density, of QA the number effort, e.g., due to a decrease of errors that can be detected, decreases. At any point in at time, the "POTENTIAL ERROR DETECTION RATE" (determined simply by dividing the QA allocated effort DETECT by the value of the "QA MANPOWER NEEDED TO AN ERROR"), represents the maximum possible number of errors ) 27 could that often modest, allocated errors all of generously to generated. allocations to QA are detect. to And even when effort is a few subtle errors will just be too QA, expensive prohibitaviley manpower maximum value is seldom large enough to ensure this detection the Because detected. be As a result, some errors will "escape" and pass undetected into subsequent phases inevitably of software development, as is shown in Figure 3. Error Generatio n Factors: errors that do get detected through QA are then reworked. Those rework The rate activities, rework example, if detected ERROR" is, and rework manpower needed per error. For the project the rework a function of how much effort is allocated to is members commit 10 man-days per week to errors, and the "ACTUAL REWORK MANPOWER NEEDED PER on the average, 1 man-day, then errors will be reworked at the rate of 10 per week. components. The ERROR. in is a As " function Design-type rate rework REWORK "ACTUAL The and to first is MANPOWER the " NEEDED PER ERROR" has two NOMINAL REWORK MANPOWER NEEDED PER the case of error detection, this nominal component of error-type i.e., design versus coding errors. errors, in addition both to being generated at a higher being more costly to detect, are also more costly to [(Alberts, (See the Appendix. 1976), (Boehm et al, 1975), and (Myers, 1976)]. 28 needed man-power rework actual The correct an error, in to to being a function of error-type, must also depend on the addition efficiency how people work. That is, we need to account for the of communication "MULTIPLIER and PRODUCTIVITY TO LOSSES," which man-day, is represents 0.5, losses incurred. For example, if the motivation DUE COMMUNICATION AND MOTIVATION TO average the productive fraction of a then the actual rework manpower needed to correct an error becomes twice what is nominally needed. recapitulate, To activities, they are errors as reworked. function the are The detected rate at through which the errors QA are manpower committed to the rework reworked is activity and the rework effort needed per error. The "ACTUAL REWORK MANPOWER NEEDED a PER of ERROR" is, in turn, a function of two things, error-type and work efficiency. reworking The of software errors is not, itself, an errorless activity: Human tendency is to consider the 'fix,' or correction, to a problem to be error-free itself. Unfortunately, this is all too frequently untrue in the case of fixes to errors found by inspections and by testing (Fagan, 1976). The [e.g., and problem of bad-fixes is widely documented in the literature (Endres, (Shooman, 1975), 1983)]. (Fagan, 1976), (Jones, 1978), (Myers, 1976), Shooman and Natarajan (1977) suggested some of the ways in which bad-fixes may be generated: * . 29 1. The correction is based upon faulty analysis, thus complete bug removal is not accomplished. 2. The corrections of a bug may work locally only (i.e., the global aspects of the error still remain). 3. The correction is accomplished, however, it is accomplished by the creation of a new error. Thus, corrections together Sector" with project's the captured errors reworked, some fraction of the are will be bad-fixes. The detection and correction of such bad-fixes, during detected as other in that of errors that escape QA detection development sections of phases, are activities that are the model (the "System Testing ) further For the parameter the reader is on the model's mathematical formulations, details values, refered and the research findings supporting both, to Abdel-Hamid (1984) and Abdel-Hamid and Madnick (1987). Model Experimentation: above The detection, three is and discussion correction of the structures model's (which error is generation, only one of the sectors of the "Software Production Subsystem," which in turn only one of four major subsystems in the model) should 30 demonstrate Dynamics beyond behavior The model. such of integrative System dynamic models is complex such of an of human intuition (Roberts, capacity the 1981). To handle complexity, system dynamicists rely on a powerful set of high this complexity high the computer simulation tools. particular Simulation's modeling models engineering it remarkably both more complex models and experimentation. remarkably is difficult argued have possible its greater fidelity in is more complex systems. It also allows for less costly and of time-consuming less for making processes, advantage for testing ideas to easy Because to "... propose in software hypotheses and test them" (Weiss, 1979), several authors the desirability of having such a laboratory tool and hypotheses (Thayer, 1979). Paraphrasing Forrester (1961): The effects of different assumptions and environmental factors can be tested. In the model system, unlike the real systems, the effect of changing one factor can be observed while all other factors are held unchanged. Such experimentation will yield new insights into the characteristics of the system that By using a model of a complex system, the model represents. more can be learned about internal interactions than would ever manipulation the real system. be possible through of Internally, the model provides complete control of the system's its policies, and its sensitivities organizational structure, to various events. In to the remaining part of this section we will utilize the model conduct tradeoffs a series between the of simulation experiments to investigate the economic costs and benefits of QA. The 31 prototype project called used project software in simulation experiments, the is 64,000 delivered source instructions in EXAMPLE, size (with the QA parameter profile provided in the Appendix). important relationship to investigate, obviously, is the one An between the percentage have effort QA established detection Shooman reported detecting through (i.e., would needed be testing therefore, possible from and one phase it example, in a study by was determined that error during the design phase is one-tenth the effort that user of the and maintenance a additional inventory manuals, of etc., that in the later case. A primary goal of QA, errors "that only design a because correction is project and the detect and correct it later during the system code, require (1981), activities) QA to phase specifications, would McClure in the software For errors. of correcting and a significant cost savings gained by the early the correction and in detected during development. Several studies errors of expended be detected and corrected as early as minimal amount of problems be allowed to slip the development to the next" (Tsui and Priven, of 1976). The of relationship errors Figure 4. "diminishing beyond between QA effort expended and the percentage detected obtained from model experimentation is shown in The significant returns" of QA feature exhibited of this result is the as QA expenditures extend 20-30% of development effort. This type of behavior has been 32 % of Errors Detected 10 20 30 40 QA Effort as a Development Figure (4) % of Effort 33 out as QA expenditures extend beyond 20-305i of development flattens is evident from the cost patterns of Figure 5. This effort. flatten out rework (i.e., correcting errors of exceed expenditures QA as As can and system testing (at the end of development) development) during costs combined the seen, On the other hand, 20%. that increasing QA as a percentage of the development effort notice results effort development sequence value, global estimates are then Allocations distribution project's increases. To see followed why, planning in to QA, the consider the a project's man-days is estimated. Based on project's schedule is calculated. The two the to determine the project's average staff used are then made to the project's various lifecycle decisions schedule allocated total First, (including activities is typically activities. various size. itself steps of effort development the of this happens is that as a larger reason The man-days). (in (not a linear) increase in QA's absolute an exponential in fraction this 1983) and errors that result from the application of QA, processing of cost (Shooman, literature [e.g., the the results of Figure 4 suggest is that the savings in the What be in 1981)]. (Boehm, cost others by observed is were activity). QA the typically made come (Boehm, Notice after, 1981). same not that effort before, the Thus if two different software project (e.g., project managers project EXAMPLE), and if the only thing that differentiates them is their policies on the to run the percentage of the development effort to 34 Cost in Man-Days 3000 2500- Rework & Testing Cost 2000 15001000- QA Cost 500 10 20 30 40 QA Effort as a Development Figure (5) % of Effort 35 allocate to QA, both managers would still initiate their respective projects with and overall the scenario Thus, man-day the schedule. simulation different the estimate same. of the the total man-days exactly is experiment runs well as It i.e., this type of is designed to capture. model, the project's total as the scheduled completion date remain But since increases in the QA-effort allocation mean that manpower less project our that in same global estimates, the effort will be available for development work (e.g., design and given schedule. coding), a larger will be required to meet the team larger A team means larger training and communication overheads, and hence the larger development cost. The and perhaps more interesting, final, experiments issue we addressed in the "optimal" QA-effort expenditure. concerns our QA For our prototype project EXAMPLE, the answer is shown in Figure plots EXAMPLE'S expenditures (defined which man-days). As can be cost (in man-days) against QA-effort total in 6, terms of percentage of development the optimal QA effort expenditure for seen, this project is 16% of total development effort (in man-days). Two first, to be drawn from Figure 6. The impact on a total project cost. As can be seen from the project EXAMPLE'S cost ranges from a low of 3,770 man-days, values higher. can more generalizable conclusion, is that QA policy does have significant figure, conclusions important At in the range of 5,000 man-days i.e., low values of QA expenditures, values that are 33% this increase in cost — 36 Project Cost In Man-Days 6000- 5000- 4000 3000 — I 10 20 30 40 QA % of Effort as a Development Effort Figure (6) 37 values high at hand, expenditures cost large the from results expenditures, QA of the testing phase. On the other of the excessive QA which at high values are less productive because (and of the diminishing returns phenomenon) are themselves the culprit. expenditure level significant project, software paper's this deriving course, What, our in the opinion, optimal QA really is is not its particular value, since generalized beyond but the process of deriving it, namely, rather System experimentation this Dynamics (which experiment's simulation EXAMPLE approach. would be too costly and consuming to be practically feasible), as far as we know, this time model provides costs/benefits is result integrative controlled Beyond 16%. this be of is, of about cannot this result second The first capability to quantitatively analyze the the of QA policy for software production. And this, it encouraging to note, is generalizable, in the sense that one can models for different software development environments to customize derive environment-specific optimality conditions. Summary: QA function has, in recent years, gained the recognition of The being critical a systems. However, techniques software, to the factor in the successful development of software because the utilization of QA tools and does tend to add significantly to the cost of developing the cost-effectiveness of QA has been a pressing concern software quality manager. As of yet, though, this concern 38 has not been adequately addressed in the literature. objective Our between the economic benefits and costs of QA efforts in terms of a software project's total development cost. To do this, we developed integrative an System model process. The multiple functions both Dynamics model comprehensive is of the software devlopment in that it integrates the the software development process, including of management-type the functions (e.g., planning, control, and as well as the software production-type activities (e.g., staffing) design paper was to investigate the tradeoffs this in coding) and generation as . well The model also captures the dynamics of error the as activities of error detection and QA correction. important An vehicle to Experimental this paper specific 16% only utility conduct results of the model is to serve as a laboratory controlled experiments on QA policy. pertaining to the economics of QA reported in show how QA policy impacts total project costs. For the example analyzed, the optimal QA effort was shown to be of the total development effort. Although that particular value applies to the example system dynamics is generalizable. project, the analysis process using 2 39 APPENDIX QA Parameter Profile for Software Pro.lect EXAMPLE represent a simple way of expressing Table equations relationships, particularly nonlinear relations, between variables Table equations have the following System Dynamics model. in a format: Y-variable = TABLE ( Table-name , X-variable L , , H , I ) The above equation indicates a functional relationship between independent X-variable and a dependent Y-variable. L, H, and I describe the low end L, high end H, and interval between points in a set of values of the independent X-variable. Table-name is the or set of constant values, of the name of an associated table, dependent Y-variable that correspond to each of the values of the X-variable. Thus, = TABLE 5 1 Table-1 X Y Table-1 an { , , =3/7/9/11/13/14 , , ) would represent the following functional relationship: 12 X Y 3 9 7 3 4 11 13 14 5 shown below, is as functions are used, table Such characterize the QA parameter profile of project EXAMPLE: 1. NOMINAL NUMBER OF ERRORS COMMITTED PER TASK (NERPK) NERPK Table-1 2. to = = 100 TABLE (Table-1, "% OF JOB WORKED", ERRORS/KDSI 25/23.86/21.59/15.9/13.6/12.5 , , 20) NOMINAL QA MANPOWER NEEDED PER ERROR (NQAMPE) 100 10) NQAMPE = TABLE (Table-2, "% OF JOB WORKED", = Table-2 4/. 4/. 39/. 375/. 35/. 3/. 25/. 225/. 21/. 2/. Man -Days /ERROR , , . 3. NOMINAL REWORK MANPOWER NEEDED PER ERROR (NRWMPE) NRWMPE Table-3 = = 100 TABLE (Table-3, "% OF JOB WORKED", Man -Days /ERROR 6/ 575/ 5/. 4/. 325/. 3 , . . . , 20) . 40 Bibliography: 1. Abdel-Hamid, T.K. "The Dynamics of Software Development Project Management: An Integrative System Dynamics Perspective." Unpublished Ph.D. dissertation, Sloan School of Management, MIT, January, 1984. 2. Abdel-Hamid, T.K. and Madnick, S.E. Software Project Management Englewood Cliffs, New Jersey: Prentice-Hall, Inc., to be published in 1987, . 3. Abdel-Hamid and Madnick, S.E. "An Integrative System Dynamics Perspective to Software Project Management: Arguments for an Alternative Research Paradigm." Submitted for publication to the MIS Quarterly December, 1986. . 4. Alberts, D.S. "The Economics of Software Quality Assurance." National Computer Conference 1976. . 5. Belford, P.C, et al. "An Evaluation of the Effectiveness of Software Engineering Techniques." IEEE COMPCON Fall, . 1977. 6. Boehm, B.W. Software Engineering Economics Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1981. 7. Boehm, B.W., et al "Some Experiences with Automated Aids to the Design of Large-Scale Reliable Software." Proceedings of the International Conference on Reliable Software April, 1975. . . . 8. Buckley, F. and Poston, R. "Software Quality Assurance." IEEE (January, 1984), 36-41. Trans, on Software Engineering . 9 Chow , T S . . ( ed Approach . . Software Quality Assurance: A Practical Silver Spring, MD: IEEE Computer Society Press, ) 1985. 10. and Fisher, M.J., (eds.) Software Quality Management New York: Petrocelli Book, Inc., 1979. Cooper, J.D. . 11. Cougar, J.D. and Zawacki, R.A. Motivating and Managing Computer Personnel New york: John Wiley & Sons, 1980. . 12. DeMarco, T. Controlling Software Projects Press, Inc. 1982. , . New York: Yourdon 41 13. Endres, A.B. "An Analysis of Errors and their Causes in System Programs." IEEE Tra nsactions on So ftware Engineering June, 1975, 140-149. . 14. "Design and Code Inspections to Reduce Errors in Program Development." IBM Systems Journal Vol. 15, No. Fagan, M.E. 3, . 1976. Cambridge, Mass: The 15. Forrester, J.W. Industrial Dynamics MIT Press, 1961. 16. Hart, J.J. "The Effectiveness of Design and Code Walkthroughs." The Sixth International Computer Software and Applications Conference (COMPSAC) November, 1982. . . 17. Jones, T.C. "Measuring Programming Quality and Productivity." IBM Systems Journal Vol. 17, No. 1, 1978, 39-63. . 18. Knight, B.M. "Organizational Planning for Software Quality." In Software Quality Management Edited by J.D. Cooper and M.J. Fisher. New York: Petrocelli Books, Inc., 1979. . 19. "System Perspective on Software Quality." In Software Quality Assurance: A Practical Approach Edited by T.S. Chow. Silver Spring, MD: IEEE Computer Society Lawler, R.W. . Press, 1985. 20. McClure, C.L. Managing Software Development and Maintenance New York: Van Nostrand Reinhold Company, 1981. 21. Mitchell, J.R. "Observations on the Use of Seven Structured Programming Techniques." IEEE 1980. . . 22. Myers, G.J. Software Reliability: Principles and Practices York: John Wiley & Sons, Inc., 1976. 23. Nelson, E.C. "Software Reliability, Verification and Validation." Proceedings of the TRW Synposium on Reliable Cost Effective Secure Software Redondo Beach, CA: TRW, Inc., 1974. . 24. . New . Putnam, L.H. and Fitzsimmons, A. "Estimating Software Costs," Parts I, II, and III. Datamation Sept., Oct., and Nov., . 1979. 25. "Productivity Measures in Software." The Economics of Information Processing Volume (2): Operations. Radice, A. Programming and Software Models. Edited by R. Goldgerg and H. Lorin. New York: John Wiley & Sons, Inc., 1982. 42 26. Richardson, Q.P. and Pugh, G.L. III. Introduction to System Dynamics Modeling with Dvnamo Cambridge, Mass: The MIT Press, 1981. . 27. Riggs, H.E. Managing High-Technology Companies. Belmont, CA: Lifetime Learning Publications, 1983. 28. Roberts, E.B. (ed.). Managerial Applications of System Dynamics Cambridge, Mass: The MIT Press, 1981. . 29. Shneiderman, B. Software Psychology - Human Factors in Computer and Information Systems Cambridge, MA: Winthrop Publishers, Inc., 1980. . 30. Shooman, M.L. Software Engineering - Design. Reliabil ity and Management New York: McGraw-Hill, Inc., 1983. . 31. Shooman, M.L. and Natarajan, S. "Effect of Manpower Development and Bug Generation on Software Error Models." Rome Air Development Center, RADC-TR-76-400 Jan., 1977. , "Modeling a Software Engineering Project Managaement System." Unpublished Ph.D. dissertation, University of California, Santa Barbara, 1979. 32. Thayer, R.H. 33. Thibodeau, R. and Dodson, E.N. "Life Cycle Phase Interrelationships." Journal of Systems and Software 1, 1980, 203-211. . and Priven, L. "Implementation of Quality Control in Software Development." National Compute r Conference. 1976. 34. Tsui, F. 35. Weinberg, G.M. The Psychology of Compute r Programming. New York: Litton Educational Publishing, Inc., 1971. 36. Weiss, D.M. "Evaluating Software Development by Error Analysis." Journal of Systems and Software Vol. 57-70. . 37. Vol. 1, 1979, Weiss, D.M. A Comparison of Errors in Different Software Development Environments A Report for the Naval Research Lab, Washington, D.C., July, 1982. . ',., '^953 055 Date Due SEP 123^ HOW '88 MAR Lt l ' 1 7 ^ ^m v0. • , «^' Ol9«^ Lib-26-67 l|j||||i|||| 3 TDflD DO ||lin|||iiii 4 III iNi|i|i|ii| ?31 lb