Categories of software defects
See accompanying Word file “Software defects 2”
Software warranties
• Four general categories of software defects tend to be significant in product liability litigation:
  – Errors of commission (something is done that is wrong)
  – Errors of omission (something was left out by accident)
  – Errors of clarity and ambiguity (two people reach different interpretations of what is meant)
  – Errors of speed or capacity (the application works, but not fast enough)

Origins of defects
  – Errors in Requirements
  – Errors in Design
  – Errors in Source Code
  – Errors in User Documentation
  – Errors due to "Bad Fixes"
  – Errors in Data and Tables
  – Errors in Test Cases

Requirements defects (1)
• All four categories of defect are found in requirements.
• The two most common problems are errors of omission and errors of clarity and ambiguity.
• If requirements errors are not prevented or removed, they flow downstream into design, code, and user manuals.
• Errors that originate in requirements tend to be the most expensive and troublesome to eliminate later.
• For reducing requirements defects, prevention is usually more effective than defect removal.

Requirements defects (1) ...
• Can requirements for large systems ever be complete, given the observed rate of creeping requirements during the development cycle?
• Since requirements grow at rates between 1% and 3% per month during development, the initial requirements often describe less than 50% of the features that end up in the final version when it is delivered.
• Once deployed, applications continue to change at rates that approximate 5% to 8% new features every year, and perhaps 10% modification to existing features.

Formal requirements methods
• Requirements based on natural language will always be troublesome.
• Several formal requirements methods have been developed (but are not widely used).
• Examples: DeMarco's structured English, IBM's HIPO diagrams (Hierarchy plus Input, Processing, Output), the Problem Statement Language (PSL), Warnier-Orr diagrams, and the Structured Analysis and Design Technique (SADT).

Design defects (2)
• Design ranks next to requirements as a source of very troublesome and very expensive errors.
• All four categories of defects are found in software design and specifications, as might be expected.
• The most common forms of design defects are errors of omission, where things are left out, and errors of commission, where something is stated that later turns out to be wrong.
• Errors of clarity and ambiguity are also common, and many performance-related problems originate in the design process as well.

Design defects & omission (2)
• It is not clear whether it is technically possible to specify all of the features and functions in a large software system.
• If systems such as MS Windows 95 or IBM's MVS, in the 90,000 function point size range, were fully specified, the volume would exceed 500,000 pages.
• The completeness of software specifications declines as system size increases (over 1,000 function points).

Coding defects (3)
• All four categories of defects can be found in source code, with errors of commission being dominant while code is under development.
• Perhaps the most surprising aspect of coding defects, when they are studied carefully, is that more than 50% of the serious bugs or errors found in the source code did not truly originate in the source code.
• A majority of so-called programming errors are really due to the programmer not understanding the design, or the design not correctly interpreting a requirement.
• This is not a surprising situation. Software is one of the most difficult products to visualise prior to having to build it.
• Built-in syntax checkers and editors associated with modern programming languages can find many "true" programming errors (such as missed parentheses or looping problems).
• Even poor structure and excessive branching can now be measured and corrected automatically.
• The kinds of errors that are not easily found are deeper problems in algorithms or those associated with misinterpretation of design.

Language levels
• Defects in Low-Level Procedural Languages
• Defects in High-Level Non-Procedural Languages
  – It is interesting that there is no solid empirical data that strongly-typed languages have lower defect rates than weakly-typed languages, although there is no counter-evidence either.
• Defects in Object-Oriented Programming Languages
  – Since OO analysis and design has a steep learning curve and is difficult to absorb, some OO projects suffer from worse than average quality levels due to problems originating in the design.

Documentation defects (4)
• User documentation, in the form of both manuals and online information, can contain errors of omission and errors of commission.
• The most common kind of problem is that of errors of clarity and ambiguity.
• Performance-related errors are seldom encountered in user information.

Fix defects (5)
• The phrase "bad fixes" refers to attempts to repair an error that, although they may fix the original error, introduce a new secondary bug into the application.
• Bad fixes are usually errors of commission, and they are found in every major deliverable, although they are most troublesome for requirements, design, and source code.
• Bad fixes are very common and can be both annoying and serious.
• From about 5% to more than 20% of attempts to repair bugs may create a new secondary bug.
• For code repairs, bad fixes correlate strongly with high complexity levels, as might be expected.
• Repairs to ageing legacy applications where the code is poorly structured tend to have higher than average bad fix injection rates.
• Often bad fixes are the result of haste or schedule pressures, which cause the developers to skimp on things like inspecting or testing the repairs.

Bad fix examples
• When attempting to correct a loop problem such as going through the loop one time too often, the repair goes through the loop one time short of the correct amount.
• When correcting a branching problem that goes to the wrong subroutine, the repair goes to a different wrong subroutine.

Data defects (6)
• The topic of data quality and data defects is usually outside the domain of software quality assurance.
• Since one of the most common business uses of computers is specifically to hold databases, repositories, and data warehouses, the topic of data quality is becoming a major issue.
• Data errors can be very serious, and they also interact with software errors to create many expensive and troublesome problems.
• Many of the most frustrating problems that human beings note with computerised applications can be traced back to data problems.
• Errors in utility bills, financial statements, tax records, motor vehicle registrations, and a host of others are often data errors.

Test-case defects (7)
• Exploratory research carried out by IBM's software quality assurance group on regression test libraries noted some disturbing findings:
  – About 30% of the regression test cases were duplicates that could be removed without reducing testing effectiveness.
  – About 12% of the regression test cases contained errors of some kind.
  – Coverage of the regression test library ranged between about 40% and 70% of the code; i.e., there were notable gaps which none of the regression test cases managed to reach.

Regression testing
• For software, regression means slipping backward, and usually refers to an error made while attempting to add new features or fix bugs in an existing application.
• A regression test means a set of test cases that are run after changes are made to an application.
• The test cases are intended to ensure that every prior feature of the application still works, and that the new materials have not caused errors in existing portions of the application.
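
Example: a regression test catching a bad fix
The loop example from the bad fix slide can be sketched in Python to show how a regression test set exposes a repair that introduces a secondary bug. This is only an illustrative sketch: the function names (total_original, total_bad_fix, total_correct, run_regression) and the test data are hypothetical and do not come from any real application.

```python
# Hypothetical task: add up the first n items of a list.

def total_original(values, n):
    """Original defect: goes through the loop one time too often."""
    total = 0
    for i in range(min(n + 1, len(values))):
        total += values[i]
    return total


def total_bad_fix(values, n):
    """Bad fix: the repair goes through the loop one time too few."""
    total = 0
    for i in range(min(n - 1, len(values))):
        total += values[i]
    return total


def total_correct(values, n):
    """Correct repair: goes through the loop exactly n times."""
    total = 0
    for i in range(min(n, len(values))):
        total += values[i]
    return total


# Regression test cases: (arguments, expected result).
# They encode behaviour that must still hold after any change.
REGRESSION_CASES = [
    (([10, 20, 30, 40], 2), 30),
    (([10, 20, 30, 40], 0), 0),
    (([10, 20, 30, 40], 4), 100),
    (([5], 1), 5),
]


def run_regression(func):
    """Run every case and return the failures as (args, expected, actual)."""
    failures = []
    for args, expected in REGRESSION_CASES:
        actual = func(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


if __name__ == "__main__":
    for func in (total_original, total_bad_fix, total_correct):
        print(func.__name__, "failures:", len(run_regression(func)))
```

Running the same cases after each repair is what separates the bad fix from the correct one: the bad fix still fails some cases, while the correct repair fails none. Note that the cases must cover the boundaries (n = 0, n equal to the list length); a test set with gaps could let the bad fix through.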