Measuring the Risk Factor
By Yamini Munipalli
Jul 25, 2005
Summary: The concept of risk is inherent in any development effort, so the best way to deal with risk is to contain it. One way to contain risk is through risk management, which involves three aspects: identifying risks, analyzing exposure to those risks in a development effort, and executing the risk management plan. In this article, Yamini Munipalli details one way of assigning and managing risk in a software development plan. This version of risk analysis, drawn from many schools of thought, remains flexible enough to use within any company for any project.
Risk management is tricky because the process involves subjective thinking on the part of
individuals in the organization. Identification of risks is generally based on an individual's
experience and knowledge of the system. Since experience and knowledge are unique to
each individual, it is important to employ a wide range of individuals on the risk management
team.
Risk management also involves an assessment of the risk tolerance level in the organization.
Companies that are more tolerant of risk will be less likely to develop a risk management
approach. However, in some industries like the medical industry, there is little tolerance for
risk.
While risk management can be applied to any industry, this article discusses a software risk management technique: risk analysis.
What is Risk Analysis?
Risk analysis is part of an organization's overall risk management strategy. One definition of risk analysis is the "process of exploring risks on the list, determining, and documenting their relative importance." It is a method used to assess the probability of a bad event, and it can be done by businesses as part of disaster recovery planning as well as part of the software development lifecycle. The analysis usually involves assessing the expected impact of a bad event, such as a hurricane or tornado, along with the likelihood of that event's occurrence.
Proposed methods of risk analysis rely on different indicators. Since no single method may fit your project perfectly, I suggest pooling the insights of several experts to see whether one method, or a combination of methods, works well for you. Payson Hall recommends a risk analysis matrix that includes expected impact, probability, and surprise (the difficulty of timely detection of the risk). Rex Black proposes using the indicators of severity, priority, and likelihood of failure to complete a risk analysis.
Johanna Rothman suggests using severity and probability of occurrence as indicators in
doing a risk analysis. In the article, "Software Risk Management Makes Good Business
Sense," Steve Goodwin advises using severity as the only indicator of risk. Dr. Ingrid B.
Ottevanger multiplies "chance of failure × damage." (This is essentially the likelihood of failure multiplied by the expected impact.) James Bach also considers likelihood of failure and
impact of failure as good indicators of the magnitude of risk, while Geoff Horne proposes
using the indicators of expected impact and likelihood of failure.
The method adopted here modifies Rick Craig and Stefan Jaskiel's work in Systematic Software Testing, presenting a way to complete a software risk analysis using indicators beyond "expected impact" and "likelihood of failure." Before we can do a risk analysis, however, we must understand what is meant by the term "risk."
Definitions of Risk
Risk is the probability that a loss will occur, "a weighted pattern of possible outcomes and
their associated consequences." It indicates "the probability that a software project will
experience undesirable events, such as schedule delays, cost overruns, or outright
cancellation. Risk is proportional to size and inversely proportional to skill and technology
levels." Thus, the larger the project the greater the risk.
These definitions indicate that risk involves possible outcomes and the consequences of those outcomes. Potential outcomes can be negative or positive. When a negative outcome such as an undesirable event occurs, someone suffers a loss, whether in money, lives, or damage to property. Risk reduction strategies differ based on the organization's level of maturity: the more mature the organization, the less likely it is to take risks, and thus the more likely it is that a software team in that organization will perform a risk analysis of its software. This leads us to the justification for doing a risk analysis.
Why Perform a Risk Analysis?
In the medical industry, risk analysis is done for the following reasons:
1. Risk analysis is required by law.
2. Identification of device design problems prior to distribution eliminates costs associated
with recalls.
3. Risk analysis offers a measure of protection from product liability damage awards.
4. Regulatory submission checklists (PMA and 510(k)) used by the FDA now call for inclusion of risk analysis.
5. It is the right thing to do.
Some of these reasons also apply to software risk analysis and disaster recovery planning, in that risk analysis offers protection from product liability damages. Also, it is cheaper to fix a software defect found during development than one found by a customer.
The risk analysis process "provides the foundation for the entire recovery planning effort."
Similarly, in software development, risk analysis provides the foundation for the entire test
planning effort. It should be included as an integral part of the test plan as a method to guide
the test team in determining the order of testing. The argument here is that testing reduces
risks associated with software development and the software risk analysis allows us to
prioritize those features and requirements with the highest risk. Testing high-risk items first
reduces the overall risk of the software release significantly. Risk analysis also allows the test
team to set expectations about what can be tested in the given amount of time.
What are the risks associated with software development? Jones mentions sixty software risks in his book, Assessment and Control of Software Risks. Among these are cost overruns, canceled projects, high maintenance costs, false productivity claims, low quality, missed schedules, and low user satisfaction. Low user satisfaction stems from inadequate requirements, in the sense that software may have been built without adequately considering the needs of the user community. This leads us to the scope of risk analysis.
Scope of the Risk Analysis
The method I present is limited to assessments of software requirement specifications and
features. It does not refer to the more general software risks mentioned above.
Who performs the software risk analysis? Typically, everyone involved with the software
development lifecycle. The users, business analysts, developers, and software testers are all
involved in conducting risk analysis. However, it is not always possible to have everyone's
input, especially the users. In that case, the testers should conduct the software risk analysis
as early as possible in the software development lifecycle. Typically, risk analysis is done in
the requirements stage of the software development lifecycle.
Two indicators of risk have been proposed: the expected impact of failure and the likelihood of failure. Let's discuss each in turn.
Expected Impact Indicator
The software team should ask the question, "What would be the impact on the user if this feature or attribute failed to operate correctly?" Impact is usually expressed in terms of money, or the cost of failure. For each requirement or feature, assign a value of high, medium, or low as a measure of the expected impact of its failure. Focus only on those features and attributes that directly impact the user, not on the testing effort. If every feature or requirement ends up ranked the same, limit the number of times each rater can assign a given value. Let's look at the expected impact and likelihood of failure for a hypothetical Login system:
Table 1 – Expected Impact and Likelihood of Failure for the Login Functionality
The requirement that the "UserId shall be 4 characters" has a low expected impact of failure because there is not much impact to a user if the userID is more or less than 4 characters. The same reasoning applies to the requirement that the "Password shall be 5 characters." However, the requirement that the "System shall validate each userID and password for uniqueness" has a high impact of failure because there could be multiple users with the same userID and password. If the developer does not code for this, security is at risk.
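Since the original Table 1 is not reproduced here, the following is a minimal sketch (in Python) of how these ratings might be recorded. The expected-impact values follow the discussion above; the likelihood values are illustrative assumptions.

    # Requirements for the hypothetical Login system, rated H/M/L.
    # Impact ratings follow the article's discussion; likelihood
    # ratings are illustrative assumptions.
    login_requirements = [
        {"requirement": "UserId shall be 4 characters",
         "impact": "L", "likelihood": "L"},
        {"requirement": "Password shall be 5 characters",
         "impact": "L", "likelihood": "M"},
        {"requirement": "System shall validate each userID and password for uniqueness",
         "impact": "H", "likelihood": "H"},
    ]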
Likelihood of Failure Indicator
As part of the risk analysis process, the software team should assign an indicator for the
relative likelihood of failure of each requirement or feature. Assign H for a relatively high
likelihood of failure, M for medium, and L for low. When the software team assigns a value for
each feature, they should be answering the question, "Based on our current knowledge of the
system, what is the likelihood that this feature or attribute will fail or fail to operate correctly?"
At this point, Craig and I differ in that he argues that complexity is a systemic characteristic
and should be included as part of the likelihood indicator.
My argument is that complexity should be an indicator on its own, and that severity should also be considered. Four indicators provide more granularity and detail than the typical two. As Table 2 shows, if two different requirements receive the same prioritization, it is not possible to discern which requirement is riskier; with three or more indicators, we are in a better position to evaluate risk.
Complexity Indicator
Something that is complex is intricate and complicated. The argument here is that the greater the complexity of a feature, the greater the risk. More interfaces mean more risk, both at each interface and in the overall system.
According to Craig and Jaskiel, Tom McCabe devised a metric known as cyclomatic
complexity that is based on the number of decisions in a program. His studies have shown a
correlation between a program's cyclomatic complexity and its error frequency. "A low
cyclomatic complexity contributes to a program's understandability and indicates it is
amenable to modification at lower risk than a more complex program." He, along with others,
has shown that those parts of the system with high cyclomatic complexity are more prone to
defects than those with a lower value. Cyclomatic complexity can be used in the test planning
phase in that "Mathematical analysis has shown that cyclomatic complexity gives the exact
number of tests needed to test every decision point in a program for each outcome. Thus, the
analysis can be used for test planning. An excessively complex module will require a
prohibitive number of test steps; that number can be reduced to a practical size by breaking
the module into smaller, less-complex sub-modules." There are other measures of complexity
that can be used for risk analysis. These are the Halstead Complexity Measures, Henry and
Kafura metrics, and Bowles metrics. Assign a value of H for high, M for medium, or L for low
for each requirement based on its complexity.
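To make the idea concrete, here is a rough sketch (my illustration, not from Craig and Jaskiel) that approximates McCabe's metric for a single Python function as one plus the number of decision points:

    import ast

    def cyclomatic_complexity(source: str) -> int:
        """Approximate cyclomatic complexity for a single function:
        one plus the number of decision points (if/while/for statements,
        conditional expressions, and boolean operators). This is a
        simplification of McCabe's metric, for illustration only."""
        decisions = 0
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.If, ast.While, ast.For, ast.IfExp)):
                decisions += 1
            elif isinstance(node, ast.BoolOp):
                decisions += len(node.values) - 1  # each extra operand adds a path
        return decisions + 1

    # Two ifs plus one "or" give three decision points, so complexity 4.
    print(cyclomatic_complexity(
        "def login(user, password):\n"
        "    if not user or not password:\n"
        "        return False\n"
        "    if len(password) < 5:\n"
        "        return False\n"
        "    return True\n"
    ))  # -> 4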
Severity Indicator
My approach differs from Craig and Jaskiel's in another way: I consider the severity of failure as a separate indicator. Severity is defined as the "harshness" of the failure. Harshness indicates how much damage the failure does to the user community, and implies that users will suffer if the failure is realized. This suffering could take the form of money, emotional stress, poor health, or death. Consider the following case of a software failure that resulted in deaths.
In 1986, two cancer patients at the East Texas Cancer Center in Tyler received fatal radiation
overdoses from the Therac-25, a computer-controlled radiation-therapy machine. There were
several errors, among them the failure of the programmer to detect a race condition (i.e.,
miscoordination between concurrent tasks).
Or consider the case in which a New Jersey inmate escaped from computer-monitored house
arrest in the spring of 1992. He removed the rivets holding his electronic anklet together and
went off to commit a murder. A computer detected the tampering. However, when it called a
second computer to report the incident, the first computer received a busy signal and never
called back. These examples illustrate that software failures can both be fatal and cause
suffering to those whose lives are affected by the deaths of loved ones.
Thus, severity is different from expected impact in that expected impact does not consider the
suffering imposed on the user but merely considers the effect of the failure. Therefore, I argue
that the greater the severity, the higher the risk. Assign a value of H for high, M for medium, or
L for low for each requirement based on its severity.
Table 2 – Expected Impact, Likelihood of Failure, Complexity, and Severity for the
Login Functionality
The Method of Risk Analysis
At this point, the software team should assign a number to each high, medium, or low value
for likelihood, expected impact, complexity, and severity indicators. It is possible to use a
range of 1 to 3 with 3 being the highest or 1 to 5 with 5 being the highest. If you use the 1 to 5
range, there will be more detail. To keep the technique simple, let's use a range of 1 to 3 with
3 for high, 2 for medium, and 1 for low.
As Craig and Jaskiel state, "Once a scale has been selected, you must use that same scale throughout the entire risk analysis." Furthermore, they state that "If your system is safety-critical, it's important that those features that can cause death or loss of limb are always assigned a high priority for test even if the overall risk was low due to an exceptionally low likelihood of failure."
Next, the values assigned to likelihood of failure, expected impact, complexity, and severity should be added together. If a value of 3 for high, 2 for medium, and 1 for low is used, then nine risk priority levels are possible (i.e., 12, 11, 10, 9, 8, 7, 6, 5, 4).
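A minimal sketch of this scoring step, assuming the 1-to-3 scale chosen above:

    SCORE = {"H": 3, "M": 2, "L": 1}

    def risk_priority(likelihood, impact, complexity, severity):
        """Sum the four indicator ratings into one risk priority,
        ranging from 4 (all low) to 12 (all high)."""
        return sum(SCORE[r] for r in (likelihood, impact, complexity, severity))

    # The uniqueness requirement from the article: all four indicators high.
    print(risk_priority("H", "H", "H", "H"))  # -> 12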
Table 3 – Risk Priority Cube
Notice that the requirement "System shall validate each userID and password for uniqueness" has a relatively high likelihood of failure, high degree of complexity, high expected impact of failure, and high severity of failure, which gives it a risk priority of twelve (3 + 3 + 3 + 3). The requirement that each password be five characters long has a risk priority of seven.
The next step is to reorganize the list of requirements in order of risk priority. This sorted list provides clear insight into which requirements to test first. As Craig and Jaskiel point out, however, this technique "doesn't take into account the testing dependencies."
Table 4 – Sorted Priorities for the Login Function
After this, the software team should establish a "cut line" to indicate the line below which
features will be tested less.
Table 5 – "Cut Line" for Login Function Requirements
Table 5 indicates that the requirement, "Upon successful login, a welcome screen shall be
presented" will be tested less in the current release of the software.
A further optional step is risk mitigation. For example, the mitigation strategy for the highest-priority risk in Table 5 may be to make code reviews a mandatory part of the software development process.
Conclusion
Risk analysis should be done early in the software development lifecycle. While there are
many indicators of risk, I propose that expected impact, likelihood of failure, complexity, and
severity should all be considered as good indicators of risk. Risk analysis allows you to
prioritize those requirements that should be tested first. The process allows the test team to
set expectations about what can be tested within the project deadline. The risk analysis
method presented here is flexible and easy to adopt. Many different indicators can be used. It
is also possible to use different rankings rather than one through three. The higher the scale,
the more granular the analysis.
Further Reading
"Risk-Based Testing" by James Bach
"The Risks to System Quality" by Rex Black
"Systematic Software Testing" by Rick Craig and Stefan Jaskiel
"Waltzing with Bears: Managing Risk on Software Projects" by Tom DeMarco and Timothy Lister
"Software Risk Management Makes Good Business Sense" by Steve Goodwin
"A Calculated Gamble" by Payson Hall
"Testing in a Squeezed, Squeezed World" by Geoff Horne
"How Software Doesn't Work" by Alan Joch and Oliver Sharp
"Assessment and Control of Software Risks" by Capers Jones
"An Introduction to Risk/Hazard Analysis for Medical Devices" by Daniel Kamm
"A Risk-Based Test Strategy" by Dr. Ingrid B. Ottevanger, IQUIP Informatica B.V. (November 22, 2000)
"Risk Analysis Basics" by Johanna Rothman
"Cyclomatic Complexity" by Edmond VanDoren
"Risk Analysis Techniques" by Geoffrey H. Wold and Robert F. Shriver, Disaster Recovery Journal, Vol. 7, No. 3
About the Author
Yamini Munipalli is a CSTE-certified SQA analyst who has worked in software development for the last seven years at Landstar System, Inc., a $2 billion, non-asset-based transportation services company serving nearly 10,000 small-business owners. Yamini holds a Master's degree in Political Science and teaches evening classes in American Government at Florida Community College at Jacksonville in Florida.