Test coverage
Tor Stålhane

What is test coverage
Let c denote the unit type that is considered – e.g. requirements or statements. We then have

  Cc = (number of units of type c tested) / (total number of units of type c)

Coverage categories
Broadly speaking, there are two categories of test coverage:
• Program based coverage. This category is concerned with coverage related to the software code.
• Specification based coverage. This category is concerned with coverage related to the specification or requirements.

Test coverage
For software code, we have three basic types of test coverage:
• Statement coverage – percentage of statements tested.
• Branch coverage – percentage of branches tested.
• Basis path coverage – percentage of basis paths tested.

Finite applicability – 1
That a test criterion has finite applicability means that it can be satisfied by a finite test set. In the general case, the test criteria that we will discuss are not finitely applicable. The main reason for this is the possibility of “dead code” – e.g. infeasible branches.

Finite applicability – 2
We will make the test coverage criteria that we use finitely applicable by relating them to feasible code only. Thus, when we later speak of all branches or all code statements, we will tacitly interpret this as all “feasible branches” or all “feasible code”.

Statement coverage
This is the simplest coverage measure:

  Cstat = percentage of statements tested

Path diagram
[Diagram: three predicates in sequence – P1 guards S1, P2 guards S2 and P3 guards S3, each with an empty else-branch, followed by the unconditional statement S4.]

  P1  P2  P3   Path executed
  0   0   0    S4
  0   0   1    S3, S4
  0   1   0    S2, S4
  0   1   1    S2, S3, S4
  1   0   0    S1, S4
  1   0   1    S1, S3, S4
  1   1   0    S1, S2, S4
  1   1   1    S1, S2, S3, S4

Branch coverage
Branch coverage tells us how many of the possible branches have been tested.

  Cbranch = percentage of branches tested

Basis path coverage
The basis set of paths is the smallest set of paths that can be combined to create every other path through the code. The size of this set is equal to v(G) – McCabe’s cyclomatic number.
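The branch measure can be illustrated with the three-predicate path diagram above. The sketch below is a minimal, hand-instrumented illustration (all function and predicate names are made up for this example); it records which branch outcomes a test set exercises. For this diagram v(G) = 4, so four basis paths exist, while full path coverage needs all eight predicate combinations.

```python
# Branch coverage for the three-predicate path diagram above.
# Each predicate Pi has two branch outcomes: "taken" (Si executed)
# and "not taken" (the empty else-branch), giving 6 outcomes in all.
# All names here are illustrative, not from the lecture.

def run(p1, p2, p3, executed_branches):
    """Run one test case and record the branch outcomes it exercises."""
    for name, taken in (("P1", p1), ("P2", p2), ("P3", p3)):
        executed_branches.add((name, taken))

def branch_coverage(test_cases):
    """Fraction of the 6 branch outcomes exercised by the test set."""
    executed = set()
    for p1, p2, p3 in test_cases:
        run(p1, p2, p3, executed)
    return len(executed) / 6

# One test exercises only half of the branch outcomes ...
print(branch_coverage([(True, True, True)]))
# ... while two complementary tests already give full branch coverage,
# even though only 2 of the 8 paths have been executed.
print(branch_coverage([(True, True, True), (False, False, False)]))
```

Note the contrast this makes concrete: two tests can reach 100 % branch coverage while path coverage is still only 2/8.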
  Cbasis = percentage of basis paths tested

Use of test coverage
There are several ways to use the coverage values. We will look at two of them. Coverage can be used:
• As a test acceptance criterion
• For estimation of one or more quality factors, e.g. reliability

Test acceptance criterion
At a high level this is a simple acceptance criterion:
• Run a test suite.
• Have we reached our acceptance criterion – e.g. 95 % branch coverage?
  – Yes – stop testing.
  – No – write more tests.
If we have a tool that shows us what has not been tested, this will help us in selecting the new test cases.

Avoid redundancy
If we use a test coverage measure as an acceptance criterion, we will only get credit for tests that exercise new parts of the code. In this way, a test coverage measure will help us to
• Directly identify untested code
• Indirectly identify new test cases

Fault seeding – 1
The concept “fault seeding” is used as follows:
• Insert a set of faults into the code
• Run the current test set
• One out of two things can happen:
  – All seeded faults are discovered, causing observable errors
  – One or more seeded faults are not discovered

Fault seeding – 2
The fact that one or more seeded faults are not found by the current test set tells us which parts of the code have not yet been tested – e.g. which component, code chunk or domain. This info will help us to define the new test cases.

Fault seeding – 3
Fault seeding has one problem – where and how to seed the faults. There are at least two solutions to this:
• Save and seed faults identified during earlier project activities
• Draw faults to seed from an experience database containing typical faults and their positions in the code.
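The fault-seeding procedure above reduces to simple bookkeeping: seed faults, run the tests, and report the seeded faults that no test triggered, since they point to unexercised code. The fault identifiers and component names below are invented for illustration:

```python
# Fault-seeding bookkeeping: a seeded fault that no test triggers
# points to a part of the code that has not yet been exercised.
# Fault IDs and component names are illustrative only.
seeded = {"F1": "parser", "F2": "scheduler", "F3": "report writer"}

# Assume the current test set triggered these seeded faults:
found = {"F1", "F3"}

# Seeded faults that were not triggered identify untested components,
# which in turn suggests where new test cases are needed.
not_found = set(seeded) - found
untested = sorted(seeded[f] for f in not_found)
print(untested)
```

Here the test set would be extended with cases that exercise the components listed in `untested`.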
Fault seeding and estimation – 1
[Diagram: the input domain contains both real faults and seeded faults; the test domain covers part of the input domain and thus only a subset of each kind of fault.]

Fault seeding and estimation – 2
We will use the following notation:
• N0: number of faults in the code
• N: number of faults found using a specified test set
• S0: number of seeded faults
• S: number of seeded faults found using a specified test set

Fault seeding and estimation – 3
[Diagram: as above – real and seeded faults spread over the input domain, with the test domain covering a part of both.]
Assuming that the test set finds the same proportion of real and seeded faults, we have

  N0 / N = S0 / S and thus N0 = N * S0 / S, or N0 = N * S0 / max{S, 0.5}

Capture – recapture
One way to get around the problems of fault seeding is to use whatever errors are found in a capture – recapture model. This model requires that we use two test groups.
• The first group finds M errors
• The second group finds n errors
• m defects are found by both groups

  m / n = M / N  =>  N = M * n / m

Capture – recapture

  No   Customer 1 (M)   Customer 2 (n)   Common (m)   N
  1    25               36               17           52
  2    29               30               11           79
  3    23               21               13           37
  4    0-1              0-2              0            0-4

Output coverage – 1
All the coverage types that we have looked at so far have been related to input data. It is also possible to define coverage based on output data. The idea is as follows:
• Identify all output specifications
• Run the current test set
• One out of two things can happen:
  – All types of output have been generated
  – One or more types of output have not been generated

Output coverage – 2
The fact that one or more types of output have not been generated by the current test set tells us which parts of the code have not yet been tested – e.g. which component, code chunk or domain. This info will help us to define the new test cases.

Output coverage – 3
The main challenge with using this type of coverage measure is that output can be defined at several levels of detail, e.g.:
• An account summary
• An account summary for a special type of customer
• An account summary for a special event – e.g. an overdraft

Specification based coverage – 1
Specification based test coverage is in most cases requirements based test coverage.
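The two estimators above can be sketched directly from the formulas. This is a minimal illustration, not the lecture's own code; truncating the capture–recapture estimate to whole faults is an assumption made here so that the integer values in rows 1–3 of the table are reproduced.

```python
def seeded_fault_estimate(n_found, s_seeded, s_found):
    """Fault seeding: N0 = N * S0 / max(S, 0.5).

    The max(S, 0.5) term guards against division by zero when
    no seeded fault has been found yet.
    """
    return n_found * s_seeded / max(s_found, 0.5)

def capture_recapture(M, n, m):
    """Capture-recapture: N = M * n / m, truncated to whole faults
    (truncation is an assumption to match the table above)."""
    return M * n // m

# Fault seeding, e.g.: 8 real faults found, 4 of 10 seeded faults found
# => the test set found about 40 % of the faults, so N0 is about 20.
print(seeded_fault_estimate(8, 10, 4))

# Rows 1-3 of the capture-recapture table above:
for M, n, m, expected in [(25, 36, 17, 52), (29, 30, 11, 79), (23, 21, 13, 37)]:
    assert capture_recapture(M, n, m) == expected
print("capture-recapture table rows 1-3 reproduced")
```

Note that row 4 of the table, where the two groups have no common defects, has no finite estimate: m = 0 makes N = M·n/m undefined.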
We face the same type of problem here as we do for output coverage – the level of detail considered in the requirements. In many cases, we do not even have a detailed list of requirements. This is for instance the case for the user stories frequently used in agile development.

Specification based coverage – 2
The situation is easiest for systems where there exists a detailed specification, e.g. as a set of textual use cases.

  Use case name:   (Re-)Schedule train
  Use case actor:  Control central operator

  User action                                  System action
  1. Request to enter schedule info            2. Show the scheduling screen
  3. Enter the schedule (train-ID, start       4. Check that the schedule does not
     and stop place and time, as well as          conflict with other existing schedules;
     timing for intermediate points)              display entered schedule for confirmation
  5. Confirm schedule

Quality factor estimation
The value of the coverage achieved can be used to estimate important quality characteristics like
• Number of remaining faults
• Extra test time needed to achieve a certain number of remaining faults
• System reliability

Basic assumptions
In order to use a test coverage value to estimate the number of remaining faults, we need to assume that:
• All faults are counted only once.
• Each fault will only give rise to one error.
• All test cases have the same size.

Choice of models – errors
We will use the notation
• N(n): number of errors reported after n executions
• N0: initial number of faults
There exist more than a dozen models for N(n) = f(N0, n, Q). It can be shown that when N(n) -> N0, we have

  N(n) = N0 * [1 – exp(–Q*n)]

Choice of models – coverage (1)
We will use the notation
• C(n): the coverage achieved after n tests
• C0: final coverage. We will assume this to be 1 – no “dead” code.
Furthermore, we will assume that

  C(n) = 1 / [1 + A * exp(–a*n)]

Choice of models – coverage (2)
[Plot: C(n) starts at 1 / (1 + A) for n = 0 and grows towards 1 as n increases.]

Parameters
We need the following parameters:
• For the N(n) model we need
  – N0: total number of defects
  – Q: defect detection rate per test (1/Q is the mean number of tests needed to find a defect)
• For the C(n) model we need
  – A: determines the coverage of the first test, since C(0) = 1 / (1 + A)
  – a: coverage growth parameter
All four parameters can be estimated from observations using maximum likelihood estimation.

Final model
We can use the C(n) expression to get an expression for n as a function of C(n). By substituting this into the N(n) expression we get an estimate for the number of remaining faults as a function of the coverage:

  N0 – N(n) = N0 * [(1 – C(n)) / (A * C(n))]^(Q/a)
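The substitution above can be checked numerically: solving C(n) for n and inserting the result into the error model must reproduce N0·exp(–Q·n) for every n. The sketch below does exactly that; the parameter values are arbitrary illustrations, not estimates from any real test data.

```python
import math

def coverage(n, A, a):
    """Coverage model: C(n) = 1 / (1 + A * exp(-a*n))."""
    return 1.0 / (1.0 + A * math.exp(-a * n))

def remaining_from_n(n, N0, Q):
    """Remaining faults from the error model: N0 - N(n) = N0 * exp(-Q*n)."""
    return N0 * math.exp(-Q * n)

def remaining_from_coverage(C, N0, A, Q, a):
    """Final model: N0 - N(n) = N0 * ((1 - C) / (A * C)) ** (Q / a)."""
    return N0 * ((1.0 - C) / (A * C)) ** (Q / a)

# Arbitrary illustrative parameters (not fitted to real data):
N0, A, Q, a = 100.0, 10.0, 0.01, 0.05

# The two routes must agree: via n directly, and via the coverage C(n).
for n in (10, 100, 200):
    C = coverage(n, A, a)
    assert abs(remaining_from_coverage(C, N0, A, Q, a)
               - remaining_from_n(n, N0, Q)) < 1e-9
print("final model consistent with N(n) = N0 * (1 - exp(-Q*n))")
```

In practice one would go the other way: measure the coverage C reached by the test suite, then read off the estimated number of remaining faults from `remaining_from_coverage`.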