Engineering "Just Right" Reliability
By Priya Singh (Assistant Professor, Dept of SE, DTU)

1. Background: Importance of quantifying reliability

• Defining what we mean by "necessary" reliability for a product in quantitative terms is one of the key steps in achieving the benefits of software reliability engineering.
• The quantitative definition of reliability makes it possible for us to balance customer needs for reliability, delivery date, and cost precisely, and to develop and test the product more efficiently.
• A failure is a departure of system behavior in execution from user needs; it is a user-oriented concept.
• A fault is the defect that causes or can potentially cause the failure when executed; it is a developer-oriented concept.
• A fault does not necessarily result in a failure, but a failure can only occur if a fault exists. To resolve a failure, you must find the fault.
• Some failures have more impact than others. For this reason, projects typically assign failure severity classes to differentiate them from each other.
• A failure severity class is a set of failures that have the same per-failure impact on users.
• Extensive experience with software-based products has shown that it is often more convenient to express failure intensity as failures per natural unit. A natural unit is a unit that is related to the output of a software-based product and hence to the amount of processing done, e.g., pages of output (1 failure/K pages printed), transactions such as reservations, sales, or deposits (1 failure/K transactions), and telephone calls (1 failure/K calls).
• Users prefer natural units because they express reliability in terms that are oriented toward and important to their business.
• The measurement of natural units is often easier to implement than that of execution time, especially for distributed systems, where we would otherwise have to deal with a complex of execution times.
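As a minimal sketch of measuring failure intensity in natural units: divide observed failures by the natural units processed over the observation period. All numbers below are hypothetical, not from the notes.

```python
# Hypothetical failure log: cumulative natural units (thousands of pages
# printed) at which each failure was observed during one observation period.
failure_times_kpages = [1.2, 3.5, 7.8, 9.1, 14.6]
total_kpages = 20.0  # total output processed during the period

failures = len(failure_times_kpages)
failure_intensity = failures / total_kpages  # failures per K pages
print(f"{failure_intensity:.2f} failures / K pages")  # prints 0.25
```

The same arithmetic applies to any natural unit (transactions, calls); only the denominator changes.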
• We have been talking about failure intensity as an alternative way of expressing reliability. Thus, the units we choose for expressing failure intensity are used for expressing reliability as well. For example, if we speak of a failure intensity of 5 failures / 1000 printed pages, we will express the reliability in terms of some specified number of printed pages.

2. Steps in engineering "just right" reliability

1. Define what you mean by "failure."
2. Choose a common reference unit for all failure intensities.
3. Set a system failure intensity objective for each associated system.
4. For any software you develop:
   a. Find the developed software failure intensity objective.
   b. Choose software reliability strategies to optimally meet the developed software failure intensity objective.

• Suppliers who are system integrators need only Steps 1, 2, and 3; they just acquire and assemble components and do no software development.
• Although system engineers and system architects traditionally performed all the foregoing activities, including testers in the activities provides a strong basis for a better testing effort.
• Similarly, you should involve users in failure definition and in setting system failure intensity objectives.

2.1 Defining "failure" for the product

• Defining failures implies establishing negative requirements on program behavior, as desired by users.
• This sharpens the definition of the function of the system by providing the perspective of what the system should not be doing.
• Traditionally we specify only positive requirements for systems, but negative requirements are important because they amplify and clarify the positive requirements.
• They indicate the product behaviors that the customer cannot accept.
• Even if negative requirements are not complete, they are still valuable.
• The degree of completeness is typically greater for legacy systems, and it increases with the age of the product as you gain more experience.
• Always remember that you must focus on your users' definition of failure, and make sure that the definition is consistent over the life of a product release.
• You also define failure severity classes with examples at this point for later use in prioritizing failure resolution (and hence fault removal), but this is not part of the core software reliability engineering process.

2.2 Choosing a common measure for all associated systems

• When you choose a common reference unit for all failure intensities, failures per natural unit are normally preferable, because a natural unit expresses the frequency of failure in user terms. Thus using natural units improves communication with your users.
• In some cases, a product may have multiple natural units, each related to an important (from the viewpoint of use and criticality) set of operations of the product. If your product has this situation, select one natural unit as a reference and convert the others to it. If this is not possible, the best solution in theory would be to use failures per unit execution time as the common measure.
• To choose among alternative natural units, consider (in order):
  1. The average (over a typical failure interval) amount of processing for the natural unit (which can be viewed as the number of instructions executed, summed over all machines in the case of distributed processing) should be reasonably constant.
  2. The natural unit should be meaningful to users.
  3. The natural unit should be easy to measure.
• It is desirable but not essential that a natural unit represent the execution of an operation. "Execution of an operation" means that it runs through to completion.
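The connection stated earlier between failure intensity and reliability can be sketched numerically. Assuming a constant failure intensity (the standard exponential model for a stable program), the reliability over a span of n natural units is R = exp(-λ·n); the 5 failures / 1000 printed pages figure is taken from the example above.

```python
import math

# Assumption: constant failure intensity (exponential reliability model).
failure_intensity = 5 / 1000.0   # 5 failures per 1000 printed pages
pages = 100                       # "mission": print 100 pages without failure

# R = exp(-lambda * n): probability of no failure over `pages` natural units
reliability = math.exp(-failure_intensity * pages)
print(f"P(no failure in {pages} pages) = {reliability:.3f}")  # prints 0.607
```

This is why the units chosen for failure intensity carry over to reliability: the "specified period" in the reliability definition is expressed in the same natural units.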
2.3 Setting system failure intensity objectives

• Our next step is to set the system failure intensity objective for each associated system.
• How we do this depends on whether your product has supersystems or not:
  1. If supersystems exist, you follow the left-hand path in the figure: you choose the failure intensity objectives of the supersystems and from them determine the failure intensity objectives of the base product and its variations.
  2. If you have no supersystems, you follow the right-hand path in the figure: you directly choose the failure intensity objective of each standalone base product or variation.
• In either case, the way in which you choose the failure intensity objectives is the same.
• The only difference when supersystems exist is that you derive the failure intensity objectives for the base product and its variations from those for the related supersystems.
• It requires three steps to choose a system failure intensity objective:
  1. Determine whether your users need reliability or availability or both.
  2. Determine the overall (for all operations) reliability and/or availability objectives.
  3. Find the common failure intensity objective for the reliability and availability objectives.

Reliability and Availability

• Reliability:
  i. The probability that a system or a capability of a system will continue to function without failure for a specified period in a specified environment.
  ii. "Failure" means the program in its functioning has not met user requirements in some way. "Not functioning to meet user requirements" is really a very broad definition.
  iii. Thus, reliability incorporates many of the properties that can be associated with the execution of the program, e.g., correctness, safety, and the operational aspects of usability and user-friendliness.
  iv. Note that safety is actually a specialized subcategory of software reliability.
  v. Reliability does not include portability, modifiability, or understandability of documentation.

Natural unit:
• A unit other than time that is related to the amount of processing performed by a software-based product, e.g., runs, pages of output, transactions, telephone calls, jobs, semiconductor wafers, queries, or API calls.
• Other possible natural units include database accesses, active user hours, and packets. Customers generally prefer natural units.
• Failure intensity, an alternative way of expressing reliability, is stated in failures per natural or time unit.

• Availability:
  i. The average (over time) probability that a system or a capability of a system is currently functional in a specified environment.
  ii. We usually define software availability as the expected fraction of operating time during which a software component or system is functioning acceptably.
  iii. Availability depends on the probability of failure and the length of downtime when a failure occurs.
  iv. Assume that the program is operational and that we are not modifying it with new features or repairs. Then it has a constant failure intensity and constant availability, and we can compute availability for software as we do for hardware.
  v. Availability is the ratio of uptime to the sum of uptime plus downtime, as the time interval over which the measurement is made approaches infinity. The downtime for a given interval is the product of the length of the interval, the failure intensity, and the mean time to repair (MTTR).
  vi. Therefore, we ordinarily determine MTTR as the average time required to restore the data for a program, reload the program, and resume execution.
  vii. If we wish to determine the availability of a system containing both hardware and software components, we find the MTTR as the maximum of the hardware repair and software restoration times.

2.4 Determining developed software failure intensity objectives

• If you are developing any software for the product or its variations, then in each case you will need to set the developed software failure intensity objective.
• You need the developed software failure intensity objectives so that you can choose the software reliability strategies you will use, and so that you can track reliability growth during system test with the failure intensity to failure intensity objective ratio.
• The software you are developing may be all or a component of the base product or its variations.
• Note that suppliers who simply integrate software components will not need developed software failure intensity objectives unless the control program that links the components is sufficiently large that we should consider it in itself as developed software.
• You first find the expected acquired failure intensity and then compute the developed software failure intensity objective for the base product and each variation.
• To find the expected acquired failure intensity, you must find the failure intensity for each acquired component of the product and its variations. The acquired components include the hardware and the acquired software components.
• The estimates should be based (in order of preference) on:
  1. Operational data
  2. Vendor warranty or specification by your customer (if your customer is purchasing these components separately and they are not under your contractual control)
  3. Experience of experts

3. Engineering software reliability strategies

• Once you have set developed software failure intensity objectives for the product and its variations, then in each case engineer the right balance among software reliability strategies so as to meet the developed software failure intensity and schedule objectives with the lowest development cost.
• The quality and success of this engineering effort depend on the data you collect about your software engineering process and fine-tune with feedback from your results.
• A software reliability strategy is a development activity that reduces failure intensity, incurring development cost and perhaps development time in doing so.
• Since the objectives for the product and its variations are often the same, the strategies will often be the same.
• You usually choose the reliability strategies when you plan the first release of a product. Fault-tolerant features are generally designed and implemented at that time and then retained through all subsequent releases; hence reliability strategies must also be chosen at that time.
• We plan the software reliability strategies for a new release of a product in its requirements phase.
• A software reliability strategy may be selectable (requirements, design, or code reviews) or controllable (amount of system test, amount of fault tolerance).
• A selectable software reliability strategy is determined in a binary fashion: you either employ it or you don't.
• You can apply a "measured amount" of a controllable strategy.
• It is not at present practical to specify the "amount" of review that you undertake; we have no reliable way to quantify the amount and relate it to the amount of failure intensity reduction that you will obtain.
• However, the foregoing does not prevent you from maximizing the efficiency of a review by allocating it among operations through use of the operational profile.
• In choosing software reliability strategies, we only make decisions that are reasonably optional (it is not optional to skip written requirements). These optional decisions currently are:
  i. use of requirements reviews,
  ii. use of design reviews,
  iii. use of code reviews,
  iv. degree of fault tolerance designed into the system, and
  v. amount of system test.
• We consider the basic failure intensity only for the new operations, because we assume that the failure intensity objective has already been achieved for the previous release.
• If the failure intensity objective changes between releases, the choice process for software reliability strategies as presented here gives only an approximation to the optimal choice. However, it is probably impractical to make the adjustments required for optimality.
• You will end up with more system test required if the failure intensity objective is lower; less, if higher.
• The prediction of basic failure intensity is possible because the basic failure intensity depends on parameters we know or can determine:
  1. Fault exposure ratio (the probability that a pass through the program will cause a fault in the program to produce a failure)
  2. Fault density per source instruction at the start of system test
  3. Fraction of developed software code that is new
  4. Average throughput (object instructions executed per unit execution time)
  5. Ratio of object to source instructions

Procedure for choosing strategies

• The procedure for choosing software reliability strategies is first to determine the required failure intensity reduction objective and then to allocate that failure intensity reduction objective among the software reliability strategies available with present technology.
• To determine the failure intensity reduction objective:
  1. Express the developed software failure intensity objective in execution time.
  2. Compute the basic failure intensity.
  3. Compute the failure intensity reduction objective.
• Usually you can identify software failure intensity objectives from the performance analysis for the system.
• To express the objective in execution time, divide the developed software failure intensity objective in natural or operating time units by the execution time per natural or operating time unit.

Preparing for Tests

1. Background

• In preparing for test, we apply the operational profile information we have developed to plan for an efficient test.
• Preparing for test includes preparing test cases and test procedures and planning for any automated tools you wish to use.
• We must prepare for each system of the product that we are testing. However, we can often take advantage of commonalities that may exist.
• Software reliability engineering helps guide feature, load, and regression test.
• Feature test occurs first. It consists of single executions of operations, with interactions between the operations minimized. The focus is on whether the operation executes properly.
• Load test attempts to represent the field use and environment as accurately as possible, with operations executing simultaneously and interacting.
• Interactions can occur directly, through the data, or as a result of conflicts for resources.
• Regression test consists of feature test that you conduct after every build involving significant change. It is often periodic; a week is a common period, although intervals can be as short as a day or as long as a month, depending on factors such as system size and volatility and the degree to which a system must be ready for rapid release when market conditions change. The focus is to reveal faults that may have been spawned in the process of change.
• Beta test does not require much preparation, except for any work needed to record results, since you expose your product directly to the field environment.

2. Some Terminologies

• Input space: The complete set of possible input states.
• Direct input variable: An input variable that is external to an operation and controls the execution of the operation in an easily traceable way. This permits you to recognize the relationship between values of the variable and the processing that results, and to use this information to optimally select values to test. It results from a deliberate design decision. Some examples are an argument, a menu selection, or entered data.
• Indirect input variable: An input variable that is external to an operation and that influences execution of the operation in a way that is not easily traceable. This makes it impractical to recognize the relationship between values of the variable and the processing that results, and hence to optimally select values to test. An indirect input variable usually involves an environmental influence. Some examples are amount of data degradation, traffic level and history, and run sequence. We "select" values of indirect input variables indirectly by testing over time with as accurate a representation of field conditions as possible.
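The distinction between direct and indirect input variables can be sketched in code. A test case fixes the direct input variables explicitly; the indirect ones (traffic level, data degradation, run sequence) are deliberately left outside it. All names and values here are hypothetical, loosely modeled on a call-forwarding product.

```python
# A test case pins down only the direct input variables: values that
# traceably control the operation's processing.
test_case = {
    "operation": "process_fax_call",      # hypothetical operation name
    "direct_inputs": {                    # easily traceable to processing
        "originator": "908-555-0100",
        "forwardee_type": "fax",
        "billing_type": "flat_rate",
    },
}
# Indirect input variables (traffic level, data degradation, run sequence)
# are NOT part of the test case; load test exercises them over time.

def invoke(test_case):
    """Placeholder test driver: sets up and runs one test case."""
    n = len(test_case["direct_inputs"])
    return f"ran {test_case['operation']} with {n} direct inputs"

print(invoke(test_case))
```

Keeping the test case free of environmental state is what lets the same specification drive both feature test (environment controlled) and load test (environment varied).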
• Test procedure: The test controller for load test; it sets up environmental conditions and invokes, at various times, test cases that it randomly selects from the test case set.
• Test operational profile: The operational profile modified to ensure that critical operations are tested adequately and to account for reused operations.
• Note: In load test, we bring the full influence of the indirect input variables to bear. We drive load test with a test procedure.
• The concept of a run recognizes any difference in functionality. Two runs within the same operation may differ in failure behavior, but this is less likely.
• A run involves the execution of a test case. A test case is a partial specification of a run, characterized by a complete set of direct input variables with values.
• Software reliability engineering differentiates "test case" from "run" because we want to be able to specify test cases independently of the environment in which we are executing. Incidentally, note that a test case is a specification and is not the same thing as a test script.
• We control the indirect input variables carefully during feature and regression test so that they have little influence. This is necessary to limit the possible causes of failure until we know that the operation itself is performing reliably; otherwise, debugging would be unnecessarily complex. For example, in the case of feature or regression test for Fone Follower, reinitialization would give us the same data state for each run. The resource state would be irrelevant because we would confine test to just one operation at a time, so there would be no conflicts or queues for resources.

3. Procedure for preparing for test

• We must, for each base product and variation:
  1. Specify the new test cases for new operations for the current release, based on the operational profile.
  2. Specify the test procedure, based on the test operational profile and traffic level.
• In addition, we must provide for:
  1. Setup and invocation of test cases.
  2. Recording of each run executed (operation, test case, invocation time) and its outputs.
  3. Recording of the number of natural or time units at frequent intervals, so we can establish when failures occurred.
  4. Cleanup.
• Note: Recording would not be necessary if we found all failures at the time of test and none were found by later examination of test results, but this rarely occurs in practice.

4. Preparing test cases

Way 1: Record field use

• In theory, one could prepare test cases by simply recording in the field all the input variables (user and other interface interactions and environmental variables) needed to initiate the runs that make up the field execution of the software.
• This might result in less effort and cost than developing the operational profile and preparing the test cases.
• Reasons to avoid recording:
  1. Recording would necessarily be of a previous release, so it would not adequately represent the load expected for the new system with its new features.
  2. You may miss some rarely occurring but critical runs not experienced in the recording period.
  3. You lose the possibility of improving test efficiency through careful test selection using equivalence classes; this is not possible with record and playback.

Way 2: Specify test cases from the operational profile

• To specify the new test cases for the new operations of the current release, you must:
  1. Estimate the number of new test cases that will be available for the current release.
  2. Allocate the new test cases among the base product and variations (supersystems use the same test cases as their base products or variations).
  3. Distribute the test cases of the base product and each variation among its new operations.
  4. Detail the new test cases for each new operation.
• Once you have done this, you add the new test cases to the test cases from previous releases, removing the test cases of operations that you removed from the current release.
• Definition of a new operation: We consider an operation "new" if we modify it in any nontrivial way. Deciding whether an operation in a new release is new or reused hinges on whether there has been any change that might introduce one or more faults. If you have any doubt, be conservative and designate it as new.

4.1 Planning number of new test cases for current release

• This involves first estimating the number of new test cases that you need, and then the number that you have the capacity to prepare. From these two figures you determine the number of new test cases you will prepare.
• You typically estimate the number of new test cases you will prepare late in the requirements phase, when the general outline of the requirements becomes clear.
• If the reliability of previous releases has been inadequate, you should consider both increasing the number of new test cases for new operations and providing additional test cases for reused operations as well.
• You will also need more time for load test of software with high reliability requirements, so that you cover invocation of more different sequences of test cases and more different values of indirect input variables.
• If most use of the product involves reused operations, then it will generally be better to estimate the number of new test cases needed by more directly taking advantage of experience from previous releases of your product.
• Let T be the total cumulative test cases from all previous releases.
• If experience from these releases indicates that reliability was too low, then increase T; if too high, decrease it. Let N be the total occurrence probability of the new operations for the current release, and R the total occurrence probability of the reused operations for the current release. Compute the new test cases needed for the current release as (N / R) x T.
• Note: We should always have N + R = 1, or something is wrong.
• Two factors limit the number of new test cases you can prepare: time and cost. To account for both, compute the number of test cases you have time to prepare and the number you can afford to prepare, and take the minimum of these two numbers as the number of test cases you can prepare.
• The Preparing for Test activity does not include any execution of tests.

4.2 Allocating new test cases

• The next step is to allocate the new test cases among the base product and its variations.
• In doing this, we focus on variations that are due to functionality rather than implementation differences. Functionality differences result in new and separate operations or modified operations; these will all need new test cases. Variations due to implementation differences will not.
• We base the allocation of new test cases on the unique new use for the associated system (base product or variation) we are currently considering. We designate the unique new use by the symbol U.
• Let F be the expected fraction of field use for the associated system we are considering. This fraction is the ratio of its use in the field to total use in the field.
• Let S be the sum of the occurrence probabilities of the new operations that are functionally different from those of previously considered associated systems.
• You can now compute the unique new use U for the base product and each variation in turn as U = F x S.
• Note that the fraction of field use F depends on the number of sites and the amount of field use per site.
• We now compute a new test case allocation fraction L for the base product and each variation by dividing its unique new use U by the total of the Us for the base product and all the variations.
• Finally, compute the new test cases C for the base product and each variation by multiplying the total new test cases (in the running example, the available staff would permit only 600 test cases to be developed, so what could be developed was limited to that number) by the allocation fraction L for the base product and each variation.

4.3 Distributing new test cases among new operations

• We will now distribute the new test cases allocated to each associated system (base product or variation) among its new operations.
• For each of these associated systems, we first determine an occurrence proportion for each new operation. An occurrence proportion is the proportion of occurrences of a new operation with respect to the occurrences of all new operations for a release.
• For an initial release, the occurrence proportions are the occurrence probabilities of the operational profile, because in the first release all operations are new operations.
• For subsequent releases, divide the occurrence probability of each new operation by the total of the occurrence probabilities of the new operations.
• We distribute the new test cases to each new operation initially by multiplying the number allocated to the system by the occurrence proportion, rounding to the nearest integer.
• We must distribute a minimum of one test case to each new operation so that we can test it.
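The estimation, allocation, and distribution arithmetic of sections 4.1 through 4.3 can be sketched end to end. Every number here is hypothetical except the 600-test-case capacity, which echoes the running example above.

```python
# 4.1 Estimate new test cases needed: (N / R) x T
T = 1000                    # cumulative test cases from all previous releases
N, R = 0.2, 0.8             # total occurrence prob.: new vs reused operations
assert abs(N + R - 1.0) < 1e-9, "N + R must equal 1, or something is wrong"
needed = (N / R) * T        # 250 new test cases needed

capacity = 600              # min(time budget, cost budget) permits 600
new_cases = min(needed, capacity)

# 4.2 Allocate between base product and a variation: U = F x S
systems = {"base": (0.7, 0.9), "variation": (0.3, 0.4)}  # (F, S) per system
U = {name: F * S for name, (F, S) in systems.items()}    # unique new use
total_U = sum(U.values())
# C = total new cases x allocation fraction L, where L = U / sum(U)
C = {name: round(new_cases * u / total_U) for name, u in U.items()}
print(C)  # {'base': 210, 'variation': 40}

# 4.3 Distribute the base product's cases by occurrence proportion,
# rounding to the nearest integer, with a minimum of one per operation.
proportions = {"opA": 0.5, "opB": 0.44, "opC": 0.06}
dist = {op: max(1, round(C["base"] * p)) for op, p in proportions.items()}
print(dist)  # {'opA': 105, 'opB': 92, 'opC': 13}
```

Note that rounding plus the one-case minimum can make the distributed total differ slightly from the allocation; the notes accept this.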
• Next we identify critical new operations (those for which successful execution adds a great deal of extra value, and failure causes a great deal of impact with respect to human life, cost, or system capability).
• To identify critical new operations, look for operations that need much higher reliabilities, and hence much lower failure intensity objectives, than the system failure intensity objective.
• Then we compute the acceleration factor A, given by A = FIO(system) / FIO(operation). We use this factor to increase the occurrence proportion, which increases the number of new test cases distributed to the operation by the same factor.
• We compute the modified number of new test cases for critical operations by first multiplying their occurrence proportions by their acceleration factors. Then we multiply the new occurrence proportions by the number of new test cases allocated to the system, rounding to the nearest integer.

5. Preparing test procedures

• To specify the test procedure, we must specify the test operational profile and the traffic level, and reproduce any other significant environmental conditions necessary to make load test represent field use.
• You specify the traffic level, indicated by the average total operation occurrence rate, as a function of time: specify a list of times from the start of execution of the system at which the average total operation occurrence rate changed, along with the new average total operation occurrence rates.
• You can simplify this list by selecting the times at a constant period such as one hour, but you should use whatever frequency is appropriate and practical for your application.
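The critical-operation acceleration of section 4.3 can be sketched numerically. All values are hypothetical: one operation is critical because its failure intensity objective is far below the system's.

```python
allocated = 210                  # new test cases allocated to this system
proportions = {"connect": 0.6, "forward": 0.34, "emergency": 0.06}

# Acceleration factor A = FIO(system) / FIO(operation), for critical
# operations only (hypothetical FIOs in failures per million calls).
fio_system = 100.0
fio_operation = {"emergency": 5.0}   # critical: much lower FIO
A = {op: fio_system / fio for op, fio in fio_operation.items()}  # A = 20

# Multiply occurrence proportions by A, then by the allocated test cases,
# rounding to the nearest integer (minimum one per operation).
modified = {op: p * A.get(op, 1.0) for op, p in proportions.items()}
cases = {op: max(1, round(allocated * p)) for op, p in modified.items()}
print(cases)  # {'connect': 126, 'forward': 71, 'emergency': 252}
```

The accelerated counts inflate the total beyond the original allocation; that is the point, since critical operations must be tested far more heavily than their occurrence probabilities alone would dictate.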
• To specify a test operational profile for an associated system, we first start with the corresponding operational profile for the release.
• We modify the occurrence probabilities of the operational profile for the critical operations by multiplying the occurrence probability for each critical operation by its acceleration factor.
• We then normalize the results (so that the modified occurrence probabilities sum to 1) by dividing each modified probability by the sum of the modified probabilities.

Executing Test
1. Background
• There are two types of software reliability engineering test: 1. reliability growth test and 2. certification test.
• These types are not related to phases of test such as unit test, subsystem test, system test, or beta test, but rather to the objectives of test.

2. Reliability Growth Test
• The main objective of reliability growth test is to find and remove faults.
• During it, you use software reliability engineering to estimate and track failure intensity. Testers and development managers apply the failure intensity information to guide development and to guide release.
• You typically use reliability growth test for the system test phase of software you develop in your own organization.
• You can also use it in beta test if you are resolving failures (removing the faults causing them) as you test.
• To obtain "good" (with moderate ranges of uncertainty) estimates of failure intensity, you need a minimum number of failures in your sample, often 10 to 20.
• You may follow reliability growth test with certification test as a rehearsal if your customer will be conducting an acceptance test.

3. Certification Test
• With certification test, you make a binary decision: accept the software, or reject the software and return it to its supplier for rework.
• Certification test requires a much smaller sample of failures. In fact, you can make decisions without any failures occurring if the period of execution without failure is sufficiently long.
• We generally use certification test only for load test (not feature or regression test).
• If we are simply integrating the product from components, we will conduct only certification test of the integrated base product and its variations and supersystems.
• Certification test does not involve debugging. There is no attempt to "resolve" failures you identify by determining the faults that are causing them and removing the faults.
• The system must be stable: no changes can be occurring, either due to new features or fault removal.

• Feature test is the test that executes all the new test cases for the new operations of the current release, independently of each other, with interactions and effects of the field environment minimized.
• Feature test can start after you implement the operation you will test, even if the entire system is not complete.
• Load test is the test that executes all valid test cases from all releases together, with full interactions. Invocations occur at the same rates and with the same other environmental conditions as will occur in the field.
• Its purpose is to identify failures resulting from interactions among the test cases, overloading of and queueing for resources, and data degradation. Among these failures are concurrency failures (deadlock, race, etc.) where different test cases are changing the same variable.
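As a rough illustration of accepting without any failures occurring: under a simplifying constant-failure-intensity (exponential) assumption, if the true failure intensity equaled the objective λ, the probability of observing no failures in a period T is e^(−λT); requiring that probability to be at most α gives a minimum failure-free period T = ln(1/α)/λ. This is only a sketch of the idea, not the full certification (reliability demonstration) chart used in practice, and the numbers are illustrative:

```python
import math

def failure_free_period(fio: float, alpha: float = 0.10) -> float:
    """Minimum failure-free period (in the FIO's natural or time units)
    needed to accept with confidence 1 - alpha, assuming a constant
    failure intensity (exponential) model -- a simplification of the
    sequential charts used in certification test."""
    return math.log(1.0 / alpha) / fio

# Illustrative: objective of 0.001 failures per transaction, 90% confidence.
needed = failure_free_period(0.001, alpha=0.10)
print(f"accept after about {needed:.0f} failure-free transactions")
```

A longer failure-free period supports a stricter objective or a higher confidence; any failure before T forces the decision back toward continue-testing or reject.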
• Acceptance tests and performance tests are types of load test.

• Regression test is the test that executes a subset of all valid test cases of all releases at each system build with significant change, independently of each other, with interactions and effects of the field environment minimized.
• Whether "significant change" occurred requires a judgment call; one criterion might be the percent of code changed.
• The subset can consist of all the test cases. When less than the total set, the sample changes for each build, and it always includes all the critical test cases.
• Its purpose is to reveal functional failures caused by faults introduced by program changes since the prior build. Note that this type of test does not check the effects of indirect input variables.

4. Steps in executing tests
• In executing test, we use the test cases and test procedures developed in the Preparing for Test activity in such a way that we realize the efficient test we desire.
• Executing test involves three main subactivities: i. determining and allocating test time, ii. invoking the tests, and iii. identifying failures that occur.
• You will use the system failure data in the Guiding Test activity.

4.1 Planning and allocating test time for the current release
What is test time?
• Test time is the time required on test units to set up, execute, record results of, and clean up after tests whose purpose is to find previously undiscovered failures or to confirm the absence thereof.
• It includes immediate preparation such as setup and immediate follow-up such as cleanup and identification of failures, but it doesn't include long-term preparation such as the development of test cases and procedures.
• It does not include the time required for failure resolution (finding and removing the faults causing the failures, and demonstrating that the failures have been resolved).

Planning test time:
• To plan the total amount of test time you will use, start by determining the test time you need.
• We can determine the failure intensity reduction we must achieve through system test, which depends on the failure intensity objective and the software reliability strategies we implement.
• We do not presently know how to predict the amount of test needed to achieve this failure intensity reduction.
• It certainly depends heavily on how closely the operational profile we use in system test matches the operational profile we will find in the field, and on how successful we are in accurately identifying equivalence classes.
• At present, we choose the amounts of time we think we need based on past experience with previous releases or similar products.
• The best indicator of similarity appears to be a similar failure intensity reduction to be achieved through system test.

Estimating test time:
• Next, estimate your test time capacity by multiplying the system test period by the number of test units, where a test unit is a facility needed and available for test.
• One test unit includes all the processors and hardware needed and available to test an associated system.
• Sometimes we reconfigure groups of processors during test, resulting in a different number of test units at different times. In this case, use an average value for the test period.
• If you have multiple test units, test can proceed at a more rapid rate.

Finalize the test time:
• We then decide on the test time we will plan for.
• If the test time we need and the test time capacity are reasonably close to one another, we plan based on the test time capacity.
• If they differ greatly, we must add test units or increase the system test period.
• You only negotiate for more test units or a longer system test period when the need is substantially greater than capacity, because the estimation algorithms are approximate and may be challenged.
• You typically estimate the amount of test time you will plan for late in the requirements phase, when the general outline of the requirements becomes clear.

Allocation of test time for a release proceeds in two steps:
1. Allocation among the associated systems to be tested (base product, variations, and supersystems of both base product and variations)
2. Allocation among feature, regression, and load test for reliability growth test of each base product or variation, after assigning time for certification test of each of these associated systems that needs it (for example, if a customer is performing acceptance tests)
Note that the test of supersystems consists entirely of certification test, and that certification test of any associated system consists entirely of load test.

4.2 Invoking test
• Software reliability engineering test starts after the units of the system being tested have been tested or otherwise verified and integrated, so that the operations of the system can execute completely.
• If this condition isn't at least approximately met, it will be difficult to drive tests with the operational profile and create a reasonable representation of field conditions.
• We can test the base product and its variations in parallel. Each is followed by test of its supersystems.
• For the base product and each of its variations, test starts with feature test followed by load test, with regression test at each build during the load test period.
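The test-time planning arithmetic described above (capacity = test units × system test period, then compare against need) can be sketched as follows; all figures, and the 20% "reasonably close" tolerance, are illustrative assumptions:

```python
# Sketch of the test-time planning decision; every figure is hypothetical.

test_units = 3            # test facilities available
test_period_hours = 400   # system test period per test unit
test_time_needed = 1500   # hours, judged from experience with similar releases

# Test time capacity = system test period multiplied by the number of test units.
capacity = test_units * test_period_hours

if test_time_needed <= capacity * 1.2:   # "reasonably close": assumed 20% tolerance
    print(f"plan based on capacity: {capacity} hours")
else:
    # Need substantially exceeds capacity: negotiate more units or a longer period.
    shortfall = test_time_needed - capacity
    print(f"negotiate more test units or a longer period; shortfall {shortfall} hours")
```

With these numbers the need (1500 h) exceeds capacity (1200 h) by more than the tolerance, so the sketch recommends negotiating rather than quietly planning to capacity.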
• Reliability growth of the developed software is tracked during the test of the base product and each variation. There may also be a period of certification test at the end if the customer performs an acceptance test.

Feature Tests
• During feature test, you invoke each test case after the previously invoked test case has completed execution.
• Each test case is selected randomly from the set of new test cases for the release. Do not replace a test case in the group available for selection after it has executed. You must provide setup and cleanup for each execution.
• Feature test will slightly overemphasize critical operations, but any resulting distortion of the failure intensity estimates will usually be negligible.
• Since we have selected test cases in accordance with the operational profile, feature test will be a true representation of field operation, except that failures due to data degradation, traffic level, environmental conditions, and interaction among operations will not occur.

Load Test
• During load test, you invoke test cases with the test procedure, pacing them at an average rate based on the traffic level specified for the current time.
• This approach flushes out failures related to traffic level as a proportion of capacity. Invocation of test cases at different times results in test cases being executed under a wide variety of states of data degradation.
• Usually you invoke test cases at random times, but you can invoke them at fixed times. We provide for test case invocation at random times to ensure that the occurrence probabilities are stationary (unchanging with time).
• We do not want to duplicate runs, because a duplicate run provides no new information about the failure behavior of the system; duplication is wasteful of testing resources.
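A minimal load-test scheduling sketch of the pacing just described: invocations arrive at random times whose exponential inter-arrival gaps average out to the specified occurrence rate (a Poisson arrival process). The test-case names and rate are made up; a real driver would dispatch the invocations rather than just build a schedule:

```python
import random

def load_test_schedule(test_cases, rate_per_hour, duration_hours, seed=42):
    """Return (invocation_time, test_case) pairs: Poisson arrivals at the
    specified average total operation occurrence rate. Test cases are drawn
    at random with replacement; differing invocation times and indirect
    input variables make repeated selections distinct runs."""
    rng = random.Random(seed)
    schedule, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_hour)  # exponential inter-arrival gap
        if t >= duration_hours:
            return schedule
        schedule.append((t, rng.choice(test_cases)))

# Illustrative traffic level: 100 invocations/hour for a 2-hour segment.
sched = load_test_schedule(["reserve", "cancel", "pay"], rate_per_hour=100,
                           duration_hours=2)
print(len(sched), "invocations in 2 hours")
```

To follow a traffic level that changes over time, you would run one such segment per entry in the rate-versus-time list, offsetting each segment's start time.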
• Invoking test cases at random times causes them to execute with different indirect input variables, resulting in the runs being different.
• The number of test cases invoked is determined automatically, based on the test time allocated and the occurrence rate.

Regression Tests
• In regression test, you invoke each test case after the previously invoked test case has completed execution.
• You choose a subset of test cases for each build, consisting of all valid critical test cases and a specified number of test cases chosen randomly from all valid noncritical test cases, with the latter not replaced until exhausted.
• The subset must be sufficiently large that all valid test cases will be executed over the set of builds for the release.
• A valid test case is one that tests an operation that is still active and considered a feature of the current release.
• You can select a subset consisting of the entire set of test cases if you wish; however, the testing that results will be inefficient and can be justified only for low failure intensity objectives.
• Since we have selected test cases in accordance with the operational profile, regression test will be a true representation of field operation, except that failures due to data degradation, environmental conditions, and interaction among operations will not occur.
• Operational profile: testing driven by an operational profile is very efficient because it identifies failures (and hence the faults causing them), on average, in the order of how often they occur. This approach rapidly reduces failure intensity as test proceeds, because the faults that cause frequent failures are found and removed first. Users will also detect failures, on average, in order of their frequency, if they have not already been found in test.
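The per-build subset selection for regression test described above can be sketched as follows. The test-case names are hypothetical; the noncritical pool is drawn without replacement across builds until exhausted, then refilled and reshuffled:

```python
import random

class RegressionSelector:
    """Select the regression subset for each build: all valid critical test
    cases, plus a fixed number of noncritical test cases drawn randomly
    without replacement until the pool is exhausted, then refilled."""

    def __init__(self, critical, noncritical, per_build, seed=1):
        self.critical = list(critical)
        self.noncritical = list(noncritical)
        self.per_build = per_build
        self.rng = random.Random(seed)
        self.pool = []          # noncritical cases not yet drawn this cycle

    def next_build(self):
        chosen = []
        while len(chosen) < self.per_build:
            if not self.pool:   # pool exhausted: refill and reshuffle
                self.pool = self.noncritical[:]
                self.rng.shuffle(self.pool)
            chosen.append(self.pool.pop())
        return self.critical + chosen

# Illustrative: one critical case, three noncritical, two noncritical per build.
sel = RegressionSelector(critical=["login"],
                         noncritical=["search", "export", "print"],
                         per_build=2)
build1 = sel.next_build()
```

Because draws are without replacement until the pool empties, every valid noncritical test case is guaranteed to execute within each cycle of builds, while every critical test case runs at every build.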
4.3 Identifying failures
• We identify system failures by analyzing test output promptly for deviations, determining which deviations are failures, and establishing when the failures occurred:
i. Analyzing test output for deviations
ii. Determining which deviations are failures
iii. Establishing when failures occurred

4.3.1 Analyzing test output for deviations
• A deviation is any departure of system behavior in execution from expected behavior. It differs from a failure in that a failure must violate user needs.
• There are two reasons why we developed the concept of "deviation" in addition to that of "failure": deviations can often be detected automatically, whereas failures usually require the judgment of a person, and deviations provide a convenient way of describing fault tolerance.
• We can usually detect several types of deviations automatically. First, many standard types are readily detected by the operating system: interprocess communication failures, illegal memory references, return codes indicating deviant behavior, deadlocks, resource threshold overruns, and process crashes or hangs.
• Then there are easily recognized application behaviors (for example, incomplete calls or undelivered messages in telecommunications systems).
• Finally, you can examine the output automatically with built-in audit code or an external checker.

4.3.2 Determining which deviations are failures
• To determine which deviations are failures, note that a deviation must violate user needs to be a failure. Manual analysis of deviations is often necessary to determine whether they violate user requirements and hence represent failures.
• For example, identifying an incorrect display is hard to automate.
However, failures of higher severity usually involve easily observable effects that would unquestionably violate user requirements.
• The deviations in fault-tolerant systems are not failures if the system takes action to prevent them from violating user needs. But intolerance of deviations by a fault-tolerant system that is supposed to be tolerant of them may be a failure.

4.3.3 Establishing when failures occurred
• All the failures for an associated system are grouped together in a common sequence in the order in which they occur, regardless of whether they occurred in feature test, regression test, or load test.
• As our measure of when a failure occurred, we use the common reference unit (natural or time units) chosen for the project to set failure intensity objectives and describe failure intensities.
• There is one exception: if you are using an operating time unit, and the average computer utilization varies from one period comparable to a failure interval to another, the time intervals measured will be influenced by this variation.
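For the operating-time exception above, one common remedy (in the style of clock-time-to-execution-time conversion; the utilization figures here are illustrative) is to weight each clock-time segment by its average computer utilization before placing failures in the common sequence:

```python
def to_execution_time(clock_intervals):
    """Convert (clock_hours, avg_utilization) segments into cumulative
    execution time, so failure intervals measured under different
    utilizations become comparable in the common failure sequence."""
    total = 0.0
    for hours, utilization in clock_intervals:
        total += hours * utilization   # weight each segment by its utilization
    return total

# Illustrative: 5 clock hours at 40% utilization, then 3 at 90%.
print(to_execution_time([(5, 0.4), (3, 0.9)]))
```

Under this adjustment, an hour of lightly loaded operation counts for less than an hour of heavily loaded operation, which removes the distortion that varying utilization would otherwise introduce into the measured failure intervals.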