Team Stochastic
Central Washington University
Software Test Plan

Project Name: Cluster Sampling for Tail Estimation of Probability (CSTEP)
Editor: Alan Chandler
Authors: Alan Chandler, Eric Brown, Nathan Wood, Temourshah Ahmady
Faculty Advisor: Dr. James Schwing
Client: Dr. Yvonne Chueh
Due Date: 02/07/11

Contents
1 Introduction
2 Relationship to other documents
3 System Overview
  3.1 Unit Testing
    3.1.1 Lua Functions
    3.1.2 Population and Samples Classes
    3.1.3 Significance Method
    3.1.4 Pivot Method
    3.1.5 Main Function in Data Processor
  3.2 Integration Testing
    3.2.1 Data Transfer from GUI to Data Processor
    3.2.2 Population, Samples, and LuaFunction Constructors
4 Features to be tested/not tested
  4.1 GUI Features to be Tested
  4.2 Data Processor Features to be Tested
5 Pass/Fail Criteria
6 Approach
7 Suspension and Resumption
8 Testing Materials
  8.1 Hardware
  8.2 Software
  8.3 Other Requirements
9 Test Cases
  9.1 Test Case 1
  9.2 Test Case 2
  9.3 Test Case 3
  9.4 Test Case 4
  9.5 Test Case 5
  9.6 Test Case 6
  9.7 Test Case 7
  9.8 Test Case 8
  9.9 Test Case 9
  9.10 Test Case 10
  9.11 Test Case 11
  9.12 Test Case 12
  9.13 Test Case 13
  9.14 Test Case 14
  9.15 Test Case 15
  9.16 Test Case 16
  9.17 Test Case 17
  9.18 Test Case 18
  9.19 Test Case 19
  9.20 Test Case 20
  9.21 Test Case 21
  9.22 Test Case 22
  9.23 Test Case 23
  9.24 Test Case 24
  9.25 Test Case 25
  9.26 Test Case 26
10 Testing Schedule
  10.1 Week of February 14th, 2011
  10.2 Week of February 21st, 2011
  10.3 Week of February 28th, 2011
11 Appendices
  11.1 Test case specification identifier
  11.2 Test items
  11.3 Input specifications
    11.3.1 Parameters
  11.4 Output specifications
  11.5 Environmental needs
    11.5.1 Software
    11.5.2 Hardware
  11.6 Special procedural requirements
  11.7 Inter-case dependencies

1 Introduction

This document is a test plan for the CSTEP system. CSTEP stands for Cluster Sampling for Tail Estimation of Probability. CSTEP is a desktop application that helps actuaries produce stochastic models that are more accurate and reliable, and it allows statisticians, both actuaries and academicians, to estimate and analyze the tail probability of a large block of business in a short amount of time. The output of the program, cluster samples for tail estimation of probability, helps valuation actuaries analyze and guide decisions on economic values for various lines of business.

This document describes the testing strategy and guidelines that Team Stochastic will follow in testing the functionality and quality of the CSTEP system. It also identifies the resources required for the successful completion of the CSTEP project. The main goals of the testing process are:

1. Assuring that the system meets the requirements specified in the Requirements Specification Document.
2. Achieving a quality standard that is acceptable to our client, Dr. Yvonne Chueh.
3. Finding and fixing important bugs in the system.

2 Relationship to other documents

This document is closely related to the requirements specification document, the quality assurance document, and the software design specification document of the CSTEP system. It describes how each individual requirement of the system is verified and tested. The test cases it contains are designed from the requirements specification document, and the plan follows the standards and procedures defined in the CSTEP Quality Assurance Document.

3 System Overview

Here we examine the system design in relation to unit and integration testing and how these will ensure the correctness of the components and their interactions. Unit testing thoroughly exercises the individual methods and algorithms within each component, confirming that they do what they should. Integration testing checks that each component communicates properly with all other components. Once the extensive unit and integration testing phase is finished, the developers may have relative confidence in their code. However, with each major change to the system, the developers must retest every component to ensure the integrity of the software.

3.1 Unit Testing

3.1.1 Lua Functions

We will start unit testing with the most basic pieces of code, the Lua functions. This includes the Lua function files and the LuaFunction class, since there is no way to access the Lua functions themselves except through the LuaFunction class. The Lua functions accept the C values and one or two scenarios and return a floating-point value determined by the significance or distance method formulas they represent. We can test these functions using two different techniques.
One is to rewrite each function in C#, a language we understand better, and compare the results to make sure both languages produce the same values. However, this assumes that we are implementing each formula properly. The second technique is to run each function, in both its Lua and C# versions, on specific scenarios and compare the results with ones we obtain by hand. Though this technique is more time-consuming, it is the best way to ensure that the functions give the proper results. To test the initial correctness of the Lua functions, we will use the second technique. Then we test the functions on various kinds of scenarios, including ones with negative values, to ensure the Lua functions process data in the same way C# does.

3.1.2 Population and Samples Classes

The Population class imports scenarios, collections of floating-point values, from a .CSV file, and the Samples class writes scenarios to a .CSV file. To test that the former works properly, we need to import .CSV files with various separators and various kinds of floating-point values. We will test the latter using various combinations of floating-point values. Unfortunately, the best way to confirm the data is imported and exported correctly is by inspecting each file and the data taken from or written to it, which means this process cannot be automated. Both classes also have setter and getter methods that must be tested for accurate getting and setting. Unit tests check that the value returned by a getter is the value desired and the value assigned by a setter is the value specified.

3.1.3 Significance Method

The processData function in the SigMethodAlgorithm class finds the significance rank of each scenario and sorts the scenarios according to their ranks, selecting a sample every so often. Because this process is long and complicated, testing it as a whole requires comparing the results from our SigMethodAlgorithm class to trustworthy results. Such trustworthy results will come from the client, the Salms software provided by the client, or the official software used by the American Academy of Actuaries. It may be possible to automate the process by having the trusted software generate results from a population, having the SigMethodAlgorithm class process the same data, and using a program to check both results for equality. Doing this with many different populations will ensure the significance method is as trustworthy as the other software.

3.1.4 Pivot Method

The processData function in the PivotAlgorithm class finds pivot samples that represent clusters of scenarios. Because this process is longer and more complicated than the significance method, we will need to compare the results from our PivotAlgorithm class to trustworthy results. Such trustworthy results will come from the client or the Salms software provided by the client; because this method is fairly new, the American Academy of Actuaries does not use software that implements it. Though we can trace the method through debugging to ensure it is selecting the right pivots, an automated process that can work on large populations and many samples is best. It may be possible to automate the process by having the Salms software generate results from a population, having the PivotAlgorithm class process the same data, and using a program to check both results for equality. Doing this with many different populations will ensure the pivot method is trustworthy.
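As a minimal sketch of the automated equality check described above for the significance and pivot methods, the following C# fragment compares two sample files value by value. The file layout (comma-separated floating-point values, one scenario per line), the file paths, and the tolerance are assumptions made for illustration; this is not the actual CSTEP test harness.

    // Hedged sketch: compares our algorithm's output file against a trusted output file.
    using System;
    using System.Globalization;
    using System.IO;
    using System.Linq;

    static class SampleComparer
    {
        const double Tolerance = 1e-9;  // assumed tolerance for floating-point comparison

        // Returns true when both files hold the same number of scenarios and
        // every corresponding value agrees within the tolerance.
        public static bool Matches(string ourOutputPath, string trustedOutputPath)
        {
            double[][] ours = Load(ourOutputPath);
            double[][] trusted = Load(trustedOutputPath);

            if (ours.Length != trusted.Length)
                return false;

            for (int row = 0; row < ours.Length; row++)
            {
                if (ours[row].Length != trusted[row].Length)
                    return false;
                for (int col = 0; col < ours[row].Length; col++)
                    if (Math.Abs(ours[row][col] - trusted[row][col]) > Tolerance)
                        return false;
            }
            return true;
        }

        // Reads a comma-separated file of floating-point values, one scenario per line.
        static double[][] Load(string path)
        {
            return File.ReadAllLines(path)
                       .Where(line => line.Trim().Length > 0)
                       .Select(line => line.Split(',')
                                           .Select(v => double.Parse(v, CultureInfo.InvariantCulture))
                                           .ToArray())
                       .ToArray();
        }
    }

A comparison of this kind could then be scripted over many populations, with each population processed by both programs and the two outputs checked automatically, as described above.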
3.1.5 Main Function in Data Processor

The Main function accepts the input values, but it also determines the algorithm to use from the .LUA files, so we must test that the Main function is using the correct algorithm. To do this, we can print to the command prompt which algorithm was called when the algorithm actually runs.

3.2 Integration Testing

3.2.1 Data Transfer from GUI to Data Processor

The GUI passes arguments to the DataProcessor executable and then uses a pipeline to send an array of floating-point values. We must test this process to ensure the values being passed are correct. To do this, we can pass specific values to the DataProcessor executable and print them to the command prompt to check that they are correct.

3.2.2 Population, Samples, and LuaFunction Constructors

When the Population and Samples constructors are called, the caller passes parameters that define the fields in the objects. We must test this process to ensure the values being passed are correct. To do this, we can pass specific values to the constructors and print them to the command prompt to check that they are correct.

4 Features to be tested/not tested

4.1 GUI Features to be Tested

The most critical GUI feature to be tested is its ability to process a complete data set. The GUI also needs to be able to display the time remaining. We will test its ability to report errors in various scenarios, including malformed input and general crashes in the data processor process. Additionally, we intend to test the general usability of the GUI application, along with the ability to enter parameters and to properly read and display incoming data. The GUI must be fault tolerant and must not crash regardless of invalid input. We also intend to test the output pages and ensure that output data is displayed correctly, checking both the graph output and the spreadsheet mode. Finally, we will ensure that the time estimates offered by the GUI are reasonably accurate.

4.2 Data Processor Features to be Tested

The most important part of testing the data processor is ensuring that the data produced by the three algorithms it implements is correct. We will also individually test the Lua component to ensure that its outputs are correct. We will test that the data processor returns timing information correctly so the GUI can update its progress bar, and that it recognizes malformed CSV files and returns an error. We will test its argument handling, so that if the wrong number of arguments is supplied it returns an error rather than crashing. We also intend to test the data processor's error reporting facility.

5 Pass/Fail Criteria

For most of the GUI tests, a success will be correct data output. A failure will typically be a crash, but subtle data corruption is also possible and must be checked for. When data corruption occurs, it is important to isolate the subsystem in which the problem originated: it could have started in the GUI, in the data processor component, or in one of the file parsing subsystems.

For the data processor, a failure will typically be corrupted data, and a success will be data that matches the known good data. The data processor can also crash or throw exceptions. Typically, these are also failure modes, except in tests where an error is the expected result. An abnormal termination with no error message should ideally never occur.
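To make these criteria concrete, the following C# sketch shows how a single data processor run might be classified: a crash or unexpected error is a failure, an expected error is a pass, and otherwise the output must match the known good data. It reuses the SampleComparer sketch from Section 3.1.4; the executable name, arguments, and output paths are placeholders rather than the real CSTEP interface.

    // Hedged sketch: classifies one run of the data processor against the pass/fail criteria.
    using System.Diagnostics;

    enum TestResult { Pass, Fail }

    static class RunClassifier
    {
        public static TestResult Classify(string arguments, string outputPath,
                                          string trustedPath, bool errorExpected)
        {
            var info = new ProcessStartInfo
            {
                FileName = "DataProcessor.exe",  // placeholder executable name
                Arguments = arguments,
                UseShellExecute = false
            };
            var process = Process.Start(info);
            process.WaitForExit();

            bool crashedOrErrored = process.ExitCode != 0;

            // An error is a failure unless the test expects one.
            if (errorExpected)
                return crashedOrErrored ? TestResult.Pass : TestResult.Fail;
            if (crashedOrErrored)
                return TestResult.Fail;

            // Otherwise the run passes only when the output matches the known good data
            // (SampleComparer is the comparison sketch from Section 3.1.4).
            return SampleComparer.Matches(outputPath, trustedPath)
                       ? TestResult.Pass
                       : TestResult.Fail;
        }
    }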
6 Approach

The approach we will take with the testing process is to unit test the individual parts of the program and then bring them together for integration testing. To do this we are writing test cases that ensure each method performs the task it is supposed to do. We use white box testing to determine the proper things to test for each unit, and each of these tests may contain several smaller tests that exercise the function as a whole. This will be repeated for every method in the program to ensure each part is working. For a unit to pass, the code must meet the testing criteria we have set out in this plan. For example, the Lua scripts will be unit tested separately from everything else, and then they will be integrated with the C++ code to test that the system works as a whole.

7 Suspension and Resumption

The group has decided that whether or not to suspend a test will be determined by the group as a whole. We will work through the test and choose as a group whether the test has passed and we can continue. If it does not seem that the test can be completed with the current parameters, we will suspend the test until a later date. To guide this decision, we will compile a list of factors that would lead to suspension, such as an incomplete precondition test or a lack of input data. Upon suspension of a test, we will set a time limit on how long we can wait to resume testing of that unit; the unit must be completed within this time so we can continue on to integration testing. Most tests represent a requirement that we must satisfy for our program to succeed and to meet our client's requests, so most of the tests we run will make the difference in whether or not the project succeeds. Resumption will therefore need to happen promptly whenever we decide to suspend a test, and the time limit we set must reflect how much time remains to finish the project. Another condition on resumption is whether we can resume the test at all: if a test has preconditions that are not met, we will not be able to resume until those prerequisite tests are complete.

8 Testing Materials

8.1 Hardware

The hardware needed for testing will consist of a variety of PCs considered representative of typical machines used by businesses. These will be 2002-2007 era PCs with both dual-core and single-core processors, and they should include both x86 and x64 computers. The computers should have at least 1 GB of RAM and 2 GB of free hard disk space, and they should have CD drives so that the software can be easily installed.

8.2 Software

The computers used for testing should cover a cross section of modern versions of Microsoft Windows, including Windows XP, Windows Vista, and Windows 7. Ideally, both x86 and x64 versions of these operating systems should be tested, but in practice Windows XP x64 is very uncommon and can probably be dropped if time becomes an issue. To facilitate testing, these computers should have the .NET Framework 3.5 installed. Running the application will also require the Visual C++ 2008 Runtime. The application should be tested with User Account Control enabled and disabled. Testing should also occur with high-contrast mode on and off, and the same should be done with large-font mode. Visual Studio will be used to unit test the C# component, so it will be required during unit testing.
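Because Visual Studio unit tests will drive the C# component, a minimal MSTest example of the getter/setter checks described in Section 3.1.2 might look like the sketch below. The Samples class and its SampleSize property are stand-ins for the real CSTEP classes, included only so the example is self-contained.

    // Hedged sketch of an MSTest unit test for a setter/getter pair.
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    // Stand-in for the real CSTEP Samples class; only the property under test is shown.
    public class Samples
    {
        public int SampleSize { get; set; }
    }

    [TestClass]
    public class SamplesTests
    {
        [TestMethod]
        public void SampleSize_SetThenGet_ReturnsAssignedValue()
        {
            var samples = new Samples();
            samples.SampleSize = 50;                   // the value assigned by the setter...
            Assert.AreEqual(50, samples.SampleSize);   // ...must be the value returned by the getter
        }
    }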
8.3 Other Requirements

We will need some space to set up the test machines, although the physical test environment is not particularly important. We also intend to bring in beta testers to test the program's usability, and we will need a vacant classroom for that part of the testing.

9 Test Cases

9.1 Test Case 1
Requirement(s) Tested: F1
Steps:
Load the GUI.
Select a set of data with known good output for the significance method.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.2 Test Case 2
Requirement(s) Tested: F2
Steps:
Load the GUI.
Select a set of data with known good output for the Euclidean distance method.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.3 Test Case 3
Requirement(s) Tested: F3
Steps:
Load the GUI.
Select a set of data with known good output for the present value distance method.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.4 Test Case 4
Requirement(s) Tested: F4
Steps:
Load the GUI.
Select Import to load a CSV file.
Check the view universe button.
Criteria: If the data in the spreadsheet matches the data in the file, it is a success.

9.5 Test Case 5
Requirement(s) Tested: F5
Steps:
Load the GUI.
Select a set of data with known good output for the significance method with 10, 50, and 200 results.
Select the following numbers of results: 10, 50, and 200.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.6 Test Case 6
Requirement(s) Tested: F6
Steps:
Load the GUI.
Select an input set with 500,000 records.
Run processing.
Criteria: If a message box warning about performance appears, it is a success.

9.7 Test Case 7
Requirement(s) Tested: F7
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, starting at column 10 and row 10.
Select column 10 and row 10 for the starting point.
Start processing.
Criteria: If the data matches the known good data, the test is successful.

9.8 Test Case 8
Requirement(s) Tested: F8
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with a horizon of 15.
Select 15 for the horizon.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.9 Test Case 9
Requirement(s) Tested: F9
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with a horizon of 10.
Enter view universe.
Select the first 10 records of the first row.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.10 Test Case 10
Requirement(s) Tested: F10
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with 5000 inputs.
Enter 5000 into the scenario number input box.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.11 Test Case 11
Requirement(s) Tested: F11
Steps:
Load the GUI.
Select a set of data with known good output for the Euclidean distance method, with V set to 0.5.
Enter 0.5 into the text box for V.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.12 Test Case 12
Requirement(s) Tested: F12
Steps:
Load the GUI.
Select a set of data with known good output for the present value distance method, with the following C values: 0.1, 0.2, 0.3, 0.7, 0.9, and a horizon of 5.
Enter 0.1, 0.2, 0.3, 0.7, and 0.9 as the C values.
Set the horizon to 5.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.13 Test Case 13
Requirement(s) Tested: F13
Steps:
Load the GUI.
Select any universe.
Go to custom mode and load the following Lua file:
    function getRank(a, h)
        return a[0];
    end
Start processing.
Criteria: The output should be a CSV file containing the scenarios in ascending order by the first year's rate.

9.14 Test Case 14
Requirement(s) Tested: F14
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with 4 nested samples.
Set nested samples to 4.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.15 Test Case 15
Requirement(s) Tested: F15
Steps:
Load the GUI.
Select a set of data with known good output for any method.
Start processing.
Load the results in Microsoft Excel.
Criteria: If the values in Excel match the known good data, the test is successful.

9.16 Test Case 16
Requirement(s) Tested: F16
Steps:
Load the GUI.
Select any set of data, and select any method.
Start processing.
Go to the output tab.
Criteria: If the graph matches the data in the spreadsheet, the test is successful.

9.17 Test Case 17
Requirement(s) Tested: F18
Steps:
Load the GUI.
Run several processing runs with different data sets and methods, timing each one.
Criteria: If the recorded time is within ±50% of the estimated time, the test is successful.

9.18 Test Case 18
Requirement(s) Tested: F19
Steps:
Take a good data set and blank out one field.
Load the GUI.
Load the data set.
Start processing.
Criteria: If an error message is generated specifying where the problem is and what is wrong, the test is successful.

9.19 Test Case 19
Requirement(s) Tested: F21
Steps:
Load the GUI.
Load the GUI again.
Criteria: If a warning is issued about multiple running instances, the test is successful.

9.20 Test Case 20
Requirement(s) Tested: F1
Steps:
Load the GUI.
Select a set where good output is known if the upper-leftmost field is changed to 0.75 and the data is run with the significance method.
Open the view universe window.
Modify the upper-left field to 0.75.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.21 Test Case 21
Requirement(s) Tested: NF1
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with 128 years.
Set the number of years to 128.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.22 Test Case 22
Requirement(s) Tested: NF3
Steps:
Do the following on Windows XP, Windows Vista, Windows Vista x64, Windows 7, and Windows 7 x64:
Run a cross section of the other tests.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.23 Test Case 23
Requirement(s) Tested: NF6
Steps:
Load the GUI.
For the significance method, the Euclidean distance method, and the present value distance method, do the following:
Load a data set with 10,000 records, 50 years, and 50 outputs.
Start processing.
Criteria: If any run takes longer than 5 minutes, the test is a failure.

9.24 Test Case 24
Requirement(s) Tested: NF7
Steps:
Load the GUI.
Select a set of data with known good output.
Set the number of outputs to 1024.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.25 Test Case 25
Requirement(s) Tested: NF9
Steps:
Load the GUI.
Run a dataset consisting entirely of 1E-12.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

9.26 Test Case 26
Requirement(s) Tested: NF10
Steps:
Load the GUI.
Select a set of data with known good output for the significance method, with 131,070 scenarios.
Start processing.
Criteria: If the program crashes, it is a failure. If the data does not match the known good output, it is a failure.

10 Testing Schedule

This section presents the tentative schedule used to organize the testing phase. The schedule describes the kinds of testing executed within certain weeks.

10.1 Week of February 14th, 2011

The construction phase ends, and the testing phase begins. Rigorous unit testing begins, covering all functional code. Acceptance testing also begins: the client examines the GUI and provides feedback. The developers sign up actuarial science students for user testing.

10.2 Week of February 21st, 2011

Unit testing continues, and integration testing begins. The client provides more feedback. The developers hold user testing sessions with actuarial science students. Also, colleagues of the client from a company in Seattle visit on February 24th; the developers present the software to the client's colleagues and receive feedback.

10.3 Week of February 28th, 2011

Unit testing and integration testing end. Operating system testing begins and ends. Acceptance testing continues, and the client provides more feedback. The testing phase draws to a close as the quality assurance officer and the client check that the software meets the requirements.

11 Appendices

11.1 Test case specification identifier

Test Case Number: The test case number is a two-digit identifier of the form TC ##, where TC stands for test case and ## is a two-digit number.
Title: The title or name of the test case.
Date: The date of the last revision to the test case.
Steps: The individual steps that the test goes through.
Input: The input needed for the test.
Expected-Result: The expected result of the test case.
Status: The pass/fail status of the test.

11.2 Test items

All the individual functions, components, and the system as a whole are the test items of this test plan, and each of them has to pass the designated test cases both individually and when integrated with the other components.

11.3 Input specifications

Data: The input data consists of stochastically generated scenarios saved in an Excel CSV file. The input scenarios may be one-year interest rates from a yield curve matrix covering a 30-year time horizon, or stock returns over the number of years in the research time horizon.

11.3.1 Parameters:
- scenario vector dimension
- size of the universe of all scenarios considered
- size of the sample desired
- distance formula choices

Note: The program must accept 128 years of input.
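For illustration only, the parameters listed above could be gathered into a simple type such as the following C# sketch. The names and the set of distance formula choices are assumptions and do not reflect the actual CSTEP data structures.

    // Hedged sketch: one way to group the input parameters listed above.
    enum DistanceFormula { Significance, EuclideanDistance, PresentValueDistance }

    class SamplingParameters
    {
        public int ScenarioVectorDimension;  // years per scenario; must allow up to 128
        public int UniverseSize;             // size of the universe of all scenarios considered
        public int SampleSize;               // size of the sample desired
        public DistanceFormula Formula;      // distance formula choice
    }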
11.4 Output specifications

The output of the CSTEP program is cluster samples for tail estimation of probability. It will be used by actuaries to analyze and guide decisions on economic values for various lines of business. It will consist of the following:
- Required run-time for the distance sampling process
- Selected (sampled) pivot (or representative) scenario list
- Probability list of the pivot scenarios
- Possible plots or summaries of the pivot scenarios selected

Note: each piece of data in the output has 12-digit precision.

11.5 Environmental needs

11.5.1 Software:

One of the following Windows operating systems is needed to carry out this test plan:
- Windows XP (either 32-bit or 64-bit versions; both work)
- Windows Vista (either 32-bit or 64-bit versions; both work)
- Windows 7 (either 32-bit or 64-bit versions; both work)

11.5.2 Hardware:

All the test cases within this test plan can be run on any hardware that runs the required Windows operating systems.

11.6 Special procedural requirements

One of the critical constraints of this test plan is testing the output of the system for correctness. The CSTEP system must be able to read a file of 10,000 data scenarios and process it within 5 minutes. No real-world data set this large is available to check the program's output against, and without real-world data it is challenging to test this requirement. Therefore, our team decided to divide the test data into two categories:

Real-world data: This data is obtained from our client. It is not large enough to test the performance requirement, so we use it only to test the program output for correctness.

Synthetic data: This data is randomly generated and will be large enough to test the performance of the CSTEP system.

Finding a user who uses the CSTEP system in practice is another constraint on the usability testing procedure of this plan. Therefore, we asked our client to provide one of her students, or someone else familiar with actuarial science and the domain of the CSTEP system, so that we can perform the usability testing.

11.7 Inter-case dependencies

In order to carry out this plan as a whole, all the components of the CSTEP system must be finished and integrated. Both the real-world and synthetic data used by this test plan must be provided and up to date before performing the tests.