Testing Metrics and Measurement (not in textbook)

Why Metrics in Software Testing?
• How would you answer questions such as:
  – Project-oriented questions
    • How long would it take to test?
    • How much will it cost to test?
  – Product-oriented questions
    • How good or bad is the product?
    • How many problems still remain in the software?
  – Test-activity-oriented questions
    • Will testing be completed on time?
    • Was the testing effective?
    • How much effort went into testing?
• All of these questions require some type of measurement and record keeping in order to be answered properly.

Some Basic Concepts on Measurement
• What do we need before we can measure something?
  – A clear understanding and definition of the attribute/characteristic that we are trying to gauge
  – The metric that may be used to gauge that attribute
  – The methodology for performing the measurement (often forgotten once the first two are done, including by yours truly)

1. Clarifying & Defining the Attribute to be Measured
• Characterizing the attribute of interest
  – Size attribute:
    • Physical height is a size sub-attribute of many items.
      – Height of a building, person, tree: not a problem
      – Height of a ball or an ocean? Not comfortable? Why?
    • Physical weight is a size sub-attribute of many items.
    • What is the size attribute for software? What does it address?
      – The source statements: with screens? With db tables?
      – The storage space that the object code occupies in memory?
  – Quality attribute:
    • For a car? How fast it can accelerate? The number of times the car stalled? The number of times the lights don’t work?
    • For software? How many times we need to “re-boot”? How good the screen looks? How many times we need to call the help line? Or the number of times it does not meet customer requirements?

2. Metric for Gauging the Attribute
• Metric: a unit used for describing or measuring an attribute
  – Inches is a metric for measuring the length attribute (simple metric)
  – Miles per hour is a metric for measuring the speed attribute (complex metric: requires 2 metrics)
  – Lines of code is a metric for measuring the size attribute of software (not a very good one)
  – Problems found per thousand lines of source code is a metric for the defect discovery rate attribute of software (or is this for the software quality attribute?)

3. Conducting the Measurement
• Once the attribute and its associated metric are defined, the actual methodology for determining the extent of the attribute using that metric has to be spelled out.
  – How do you measure the height of a person using inches?
  – How do you measure the distance from the earth to the moon using inches?
  – How do you measure the size of a computer program using bytes?
  – How do you measure the defects in a program using problems found during program testing? (Note: problems found may be counted in many ways: unique ones, accepted ones, etc.)

Some General Test Measurements
• Time is used to measure the length of the period expended on testing
  – Time to set up and conduct (run) a test or a set of tests
    • Units of measurement: minutes or hours
  – Time to design and document test cases
    • Units of measurement: minutes or hours
• Keeping track of time gives us one parameter to help plan future testing, but time must be balanced against the “size” of the test.
  – 2 seconds to run a simple query
  – 5 seconds to run a complete purchase transaction with confirmation
• The “size” of a test is needed to make “time of test” more meaningful. Or, conversely, can the amount of “test time” be used as a metric for the size-of-test attribute?

Size of Test
• The test size attribute may use different metrics:
  – Amount of time to run the test (a bit convoluted?):
    • Small size: less than or equal to 3 seconds
    • Medium size: between 3 seconds and 1 minute
    • Large size: 1 minute or above
  – Number of statements needed to document the test case:
    • Small size: less than or equal to 3 statements
    • Medium size: between 4 and 7 statements
    • Large size: 8 or more statements
Any suggestions? Number of test cases? Or type of test, such as unit test versus integration test?

Quality: # of Problems
• The quality attribute is often measured with the metric of number of problems found, but the number of problems alone does not tell the whole story. Consider:
  – Severity of problems
    • High
    • Medium
    • Low
  – Type of problems
    • UI
    • Database
    • Network outage
    • Etc.

Quality (cont.)
• Both severity and type are important:
  – # of problems found by severity
  – # of problems found by type
  – # of problems found when (when during development)
  – # of problems found when (months after release)
  – # of problems found where (UI, DB, logic, network, etc.)
• Quality information is relevant to both:
  – Software providers
  – Customers/users
Why is it important to users? What would they do with it?

Problem Find Rate
[Figure: Problem Find Rate. Y-axis: # of problems found per hour; x-axis: time, Day 1 to Day 5.]
• The Weibull probability density curve: $f(t) = \frac{m}{t}\left(\frac{t}{c}\right)^{m} e^{-z}$, where $z = \left(\frac{t}{c}\right)^{m}$, $m$ is the shape parameter, and $c$ is the scale parameter
  – for m = 1, the curve is the dotted line
  – for m = 2, the curve is the solid line and is called the Rayleigh curve
Does the severity of the problem matter here? (It should, but it is not considered here.)

Problem Fix Rate
[Figure: Problem Fix Rate During Functional Test, shown with Problem Find Rate During Functional Test. Y-axis: # of problems fixed per hour; x-axis: time, Day 1 to Day 5.]
Would this fix rate present a problem? Would you also want to keep a backlog # by day?

Problem Density
Note: Just the # of problems found by area does not normalize the measurement; we need the number per KLOC.
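The normalization in the note above can be sketched in a few lines of Python. This is only an illustration: the function name `problems_per_kloc` and the per-module counts and sizes are hypothetical, not values from the slides.

```python
def problems_per_kloc(problems_found: int, lines_of_code: int) -> float:
    """Return problems found per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return problems_found / (lines_of_code / 1000)


# Hypothetical data (not from the slides): (problems found, lines of code).
modules = {
    "Module 1": (12, 4000),
    "Module 2": (12, 2000),
    "Module 3": (5, 1000),
    "Module 4": (9, 6000),
}

# Normalize each raw count by module size.
density = {
    name: problems_per_kloc(problems, loc)
    for name, (problems, loc) in modules.items()
}

# List modules from most to least dense.
for name, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.1f} problems/KLOC")
```

In this made-up data, Module 1 and Module 2 have the same raw count of 12 problems, but Module 2 is half the size and so has twice the density. That is the distinction the raw count hides.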
[Figure: bar chart of # of problems found per KLOC (y-axis, 1 to 6) by area (x-axis: Module 1, Module 2, Module 3, Module 4).]

Test Coverage Rate
• Not all the planned test cases are actually run.
  – # of test cases executed / # of test cases planned
    • By functional areas
    • By test phases
  – # of source statements executed / total # of source statements
    • By functional areas
    • By modules

Test Activity Effectiveness
• Defect discovery and eradication activities occur at all phases of development. To see which is more effective, one may use:
  – # of problems found / total # of problems found
    • By development phase (req. review, design review, functional test, system test, etc.)
  – # of problems found / person-days of effort
    • By test activity (e.g. boundary value testing, branch testing, d-u (definition-use) testing, etc.)

Fix Effectiveness
• Not all problem fixes resolve the problems.
  – # of fixes that worked / total # of fixes
    • The first time
  – # of fixes that required more than 1 fix / total # of fixes

Fix Cost
• Fix cost is usually measured by the amount of effort expended.
  – # of person-hours expended / fix
    • By severity
    • By area
    • By phase type (including post-release)
If the post-release fix cost is higher than the fix cost in every pre-release phase, that is one justification for testing and reviews.

Problem Cost Comparison
• The effort expended in discovering a problem plus the effort expended in fixing that problem is the “test” cost during pre-release.
• The effort expended in fixing a problem and releasing the fix to the customer is the “support” (problem resolution) cost during post-release.
• Compare (effort in person-hours):
  – effort expended / problem found and “fixed” (pre-release)
  vs.
  – effort expended / problem “resolved” (post-release)
Post-release resolution usually costs more.

How “Big” Is It (Testing w/o Fix)?
How would you answer this?
1. Assume # of test cases planned by size (or complexity):
   • Large: 35 test cases
   • Medium: 200 test cases
   • Small: 40 test cases
2. Assume average effort required to design and test:
   • Large: 1 person-hour
   • Medium: 15 person-minutes
   • Small: 5 person-minutes
3. Then “How big is testing?” may be answered:
   • (35 x 60) + (200 x 15) + (40 x 5) = 5,300 person-minutes, or 88.33 person-hours
So, in this case, how big is testing?
- It is 275 test cases.
- It is 88.33 person-hours of effort.

How Long Would It Take?
• Use the same example of 88.33 person-hours of test planning and execution effort.
• You need to make some assumptions:
  – Assume 2 testers of about equal ability
  – Split the work effort evenly
  – 88.33 person-hours / 2 people = 44.17 hours each
  – Further assume that each person works 6 hours a day
  – 44.17 hours / 6 hours per day = 7.36 days
• So this will take 2 testers, each working 6 hours a day, about 7.4 days.
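The sizing and duration arithmetic above can be checked with a short Python sketch. The test-case counts, per-case effort, team size, and hours per day are the slide's stated assumptions; the names `plan`, `total_effort_minutes`, and `elapsed_days` are hypothetical helpers introduced here. Note that the three products sum to 5,300 person-minutes, which is the 88.33 person-hours quoted.

```python
# Planned test cases by size: (number of cases, person-minutes per case).
# Values are the slide's assumptions; the dictionary layout is illustrative.
plan = {
    "large": (35, 60),
    "medium": (200, 15),
    "small": (40, 5),
}


def total_effort_minutes(plan: dict) -> int:
    """Sum cases x minutes-per-case over all size classes."""
    return sum(count * minutes for count, minutes in plan.values())


def elapsed_days(effort_hours: float, testers: int, hours_per_day: float) -> float:
    """Working days needed if effort is split evenly across testers."""
    return effort_hours / testers / hours_per_day


total_cases = sum(count for count, _ in plan.values())
effort_minutes = total_effort_minutes(plan)
effort_hours = effort_minutes / 60
days = elapsed_days(effort_hours, testers=2, hours_per_day=6)

print(f"{total_cases} test cases")
print(f"{effort_minutes} person-minutes = {effort_hours:.2f} person-hours")
print(f"{days:.2f} working days for 2 testers at 6 hours/day")
```

Changing the number of testers or the hours per day shows how sensitive the schedule is to those assumptions, which is why they need to be stated alongside the estimate.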
