CIS (C++) Mutation Application
CISMA USER DOCUMENTATION
J. C. Wesley

CISMA - The Mutation Program (C++ Program)

Explanation

CISMA exists solely to provide a graphical user interface through which users can drive the underlying mutation program, so it is essential to understand that program first. The mutation program (run with the UNIX command “mutate”) mutates, or alters, any *.cpp file provided as input. Alterations are applied at random, drawn from a set of standard mutation types (explained later). In principle the program can alter any file, but the file must contain C++ statements for the output to be meaningful.

Input

The program accepts input in two ways. File extensions must be specified in both.

Method 1: Invoke the “mutate” command from the UNIX console. The system then prompts for a *.cpp file to be mutated, a *.mnl (parameter list) file, and a *.mpv (parameter value) file.

*.cpp
Any file remotely resembling a *.cpp file is acceptable, including *.h, *.c, or even *.java files. Note, however, that the output becomes much less meaningful as files deviate further from the *.cpp file format.

*.mnl
This file matters to the user only because the mutation program needs it in order to run correctly. It contains a list of all the possible mutation types, which are used at runtime as entries in the probability range table (again of no real concern to the end user, as the file is typically supplied).
If the file is not supplied, its contents should be as follows:

pMut $
pStmtMut pCforM pCifM pCasM pCwhM pCswM pCdclM $
pLineMut pLdupM pLdelM pLcomM pLucomM $
pTokenMut pTopMut pTidMut pTtypeMut pTconMut $

*These mutation types are explained in depth in the attached Appendix.*

*.mpv
Apart from the *.cpp file itself, this is the file of primary concern to the user. It contains the probabilities of all of the given mutation types. In other words, the *.mpv file dictates the chance of each mutation type occurring, assuming a mutation occurs at all (values can be set to 0). The file also specifies the total number of mutations that may occur and the number of times a specific line of code may be mutated (for simplicity, it is recommended that these values be set to 1). Here is an example of a typical *.mpv file:

pMut 50
pLineMut 40
pLdupM 25
pLcomM 25
pLdelM 25
pLucomM 25
pStmtMut 30
pCforM 20
pCifM 20
pCasM 20
pCwhM 20
pCswM 10
pCdclM 10
pTokenMut 30
pTopMut 25
pTidMut 25
pTtypeMut 25
pTconMut 25
gnMut 100
gnLineMut 50
gnStmtMut 50
gnTokenMut 50
snLineMut 1
snStmtMut 1
snTokenMut 1

*All values must be present and set to some integer value.*

Method 2: The input file, the *.mnl file, and the *.mpv file may be provided directly when the mutation program is invoked. For example:

>mutate test.cpp def.mnl def.mpv

The arguments are order sensitive: mutate must be followed by the input file, the *.mnl file, and the *.mpv file, in that order.

Output

Whenever the mutation program runs successfully, the system prints a table showing all of the provided parameters (mutation types) and their corresponding values, as well as the total number of mutations performed. Two output files are also produced: mut_filename and log_filename_txt.

mut_filename: The altered file.
log_filename_txt: A log of all the mutations and attempted mutations performed on the provided file.
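
As a rough illustration of the *.mpv format described above, the following Python sketch reads name/value pairs, checks that every parameter is present with an integer value, and totals each probability group so that any group not summing to 100 can be spotted. The helper names (parse_mpv, group_sums) and the assumption that the file is a whitespace-separated sequence of name/value pairs are illustrative only; they are not taken from the mutation program itself.

```python
# Illustrative sketch only: parse_mpv and group_sums are hypothetical helpers,
# and the whitespace-separated "name value" layout is an assumption.

# Every parameter that appears in the example *.mpv file, in the same order.
REQUIRED = [
    "pMut",
    "pLineMut", "pLdupM", "pLcomM", "pLdelM", "pLucomM",
    "pStmtMut", "pCforM", "pCifM", "pCasM", "pCwhM", "pCswM", "pCdclM",
    "pTokenMut", "pTopMut", "pTidMut", "pTtypeMut", "pTconMut",
    "gnMut", "gnLineMut", "gnStmtMut", "gnTokenMut",
    "snLineMut", "snStmtMut", "snTokenMut",
]

# Probability groups whose members are expected to add up to 100.
GROUPS = {
    "classes": ["pLineMut", "pStmtMut", "pTokenMut"],
    "pLineMut": ["pLdupM", "pLcomM", "pLdelM", "pLucomM"],
    "pStmtMut": ["pCforM", "pCifM", "pCasM", "pCwhM", "pCswM", "pCdclM"],
    "pTokenMut": ["pTopMut", "pTidMut", "pTtypeMut", "pTconMut"],
}

EXAMPLE = """
pMut 50
pLineMut 40 pLdupM 25 pLcomM 25 pLdelM 25 pLucomM 25
pStmtMut 30 pCforM 20 pCifM 20 pCasM 20 pCwhM 20 pCswM 10 pCdclM 10
pTokenMut 30 pTopMut 25 pTidMut 25 pTtypeMut 25 pTconMut 25
gnMut 100 gnLineMut 50 gnStmtMut 50 gnTokenMut 50
snLineMut 1 snStmtMut 1 snTokenMut 1
"""


def parse_mpv(text):
    """Parse name/value pairs; raise ValueError if anything is missing or non-integer."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected name/value pairs")
    values = {}
    for name, raw in zip(tokens[0::2], tokens[1::2]):
        values[name] = int(raw)  # int() raises ValueError on a non-integer value
    missing = [name for name in REQUIRED if name not in values]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    return values


def group_sums(values):
    """Return the total of each probability group, keyed by group label."""
    return {label: sum(values[m] for m in members)
            for label, members in GROUPS.items()}


if __name__ == "__main__":
    parsed = parse_mpv(EXAMPLE)
    for label, total in group_sums(parsed).items():
        note = "ok" if total == 100 else "does not sum to 100"
        print(label, total, note)
```

In the example file every group totals 100, so a check of this kind would report no problems.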

How The Program Actually Runs

The program examines the provided code line by line and randomly applies mutations to each line. For instance, if the *.mpv file gives a 20% chance that a mutation is performed at all, the program generates a random number for the line it is examining. If this number is less than 20, a mutation is performed. The program then determines whether a line, statement, or token mutation occurs (only one of these may occur at a given time). If all three values are set to 33 and the randomly generated number falls between 0 and 33, a line mutation is performed. Likewise, if the random number is between 33 and 66 (33 + 33), a statement mutation is performed, and if it is between 66 and 99 (33 + 33 + 33), a token mutation is performed. It is therefore desirable that the line, token, and statement mutation values within the *.mpv file add up to 100. The same rule applies to the sub-probabilities (the probabilities that determine the explicit mutation to be performed) under each major class (line, token, and statement mutation). For further clarity, please examine the figure below:

Probability Parents    Corresponding Children
pLineMut               pLdupM, pLcomM, pLdelM, pLucomM
pStmtMut               pCforM, pCifM, pCasM, pCwhM, pCswM, pCdclM
pTokenMut              pTopMut, pTidMut, pTtypeMut, pTconMut

*Values for the probability parents and each group of corresponding children should add up to 100, respectively.*

CISMA - The Mutator Application (Java Interface)

Explanation

The Mutator Application is designed to simplify the use of the mutation program for the end user. The user need only be concerned with the input file and the *.mpv file (the *.mnl file is created automatically). When the application is invoked (>java Mutator from the UNIX console), a title screen appears offering two options: Create/Update MPV File or Run Mutator. The user may exit any window by clicking the X in its upper right-hand corner.
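
The selection procedure described under "How The Program Actually Runs" can be sketched in Python as follows. This is an illustration under stated assumptions, not the mutation program's actual code: the function names (pick_from_ranges, choose_mutation), the use of integer draws from 0 to 99, the half-open range boundaries, and the choice to return no selection when a draw falls past the last range are all assumptions.

```python
import random

# Parent classes and their child mutation types, as listed in the *.mnl file.
CHILDREN = {
    "pLineMut": ["pLdupM", "pLcomM", "pLdelM", "pLucomM"],
    "pStmtMut": ["pCforM", "pCifM", "pCasM", "pCwhM", "pCswM", "pCdclM"],
    "pTokenMut": ["pTopMut", "pTidMut", "pTtypeMut", "pTconMut"],
}


def pick_from_ranges(weights, draw):
    """Return the first name whose cumulative range contains draw, else None.

    weights is a list of (name, value) pairs. Ranges are half-open, so with
    values 33, 33, 33 the ranges are [0, 33), [33, 66), and [66, 99).
    """
    upper = 0
    for name, weight in weights:
        upper += weight
        if draw < upper:
            return name
    return None  # assumption: a draw past the last range selects nothing


def choose_mutation(values, rng):
    """Return (parent, child) for one source line, or None for no mutation."""
    # Stage 1: decide whether any mutation happens at all.
    if rng.randrange(100) >= values["pMut"]:
        return None
    # Stage 2: choose exactly one of line, statement, or token mutation.
    classes = [(name, values[name]) for name in ("pLineMut", "pStmtMut", "pTokenMut")]
    parent = pick_from_ranges(classes, rng.randrange(100))
    if parent is None:
        return None
    # Stage 3: choose the explicit mutation within the selected class.
    children = [(name, values[name]) for name in CHILDREN[parent]]
    child = pick_from_ranges(children, rng.randrange(100))
    if child is None:
        return None
    return parent, child


if __name__ == "__main__":
    thirds = [("pLineMut", 33), ("pStmtMut", 33), ("pTokenMut", 33)]
    for draw in (0, 32, 33, 66, 99):
        print(draw, pick_from_ranges(thirds, draw))
```

Under these assumptions, with all three class values set to 33, a draw of 99 falls outside every range and selects nothing, which illustrates why each group of values should add up to 100.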

Create/Update MPV File

This is an *.mpv file editor in which users may adjust the values that would normally appear in a *.mpv file. The parameters are presented in plain English for clarity (note: Mut. = Mutation, Prob. = Probability) and are listed in the same order in which they would appear in an actual *.mpv file. Users have four options: Open New MPV File, Apply To File, Reset To Default, and Run Mutator.

Open New MPV File: Lets the user search any directory for a *.mpv file or a file written in a similar format. The selected file is then loaded into the editor. The editor verifies that all values are in the appropriate range (nonnegative and not more than 100) and that the various probability sets (for example pLineMut, pStmtMut, and pTokenMut) add up to 100. If values are out of range or a probability set does not add up to 100, corrections are made automatically.

Apply To File: Lets the user save the *.mpv file, but only in the directory from which the mutator application was invoked. Before saving, the editor performs the same checks as when a file is opened: all values must be in the appropriate range (nonnegative and not more than 100) and the various probability sets must add up to 100. If they are not, corrections are made automatically.

Reset To Default: Resets the values to the system defaults.

Run Mutator: Lets the user run the mutation program.

Run Mutator

This option runs the mutation program from the graphical interface. It prompts the user for the file to be mutated (preferably a *.cpp or similar file, which must exist in the directory from which the mutator application was invoked), the *.mpv file (which must exist in the same directory), and the number of mutated programs to produce. The *.mnl file is created and applied automatically.
Users then click MUTATE to invoke the mutation program.

Outputs

Whenever the MUTATE command runs successfully, the system prints a table showing all of the provided parameters (mutation types) and their corresponding values, as well as the total number of mutations performed. Output files are also produced, two for each requested mutated program (2 x the indicated number of mutated programs in total): mut#_filename and log#_filename_txt.

mut#_filename: The altered file.
log#_filename_txt: A log of all the mutations and attempted mutations performed on the provided file.

Appendix (WIP)

===========================================
Interpretation of mutation probabilities
===========================================

NOTE: The top-level choices represent mutation modes. For each mode, multiple transformations are possible depending on context (the current source line) and the transformation probabilities.

pMut -- probability of a mutation of any type. When pMut=0, no mutation is applied to the source file.

pLineMut -- probability that a line-oriented mutation is applied. pLineMut>0 only if pMut>0. When a line mutation is applied to a source line, no other type of mutation is permitted.

pStmtMut -- probability that a construct-oriented mutation is applied. pStmtMut>0 only if pMut>0. A statement mutation permits focus on errors specific to programming constructs such as loops, conditionals, function prototypes, and function calls. Statement mutations require tokenizing one or more source lines. When a construct mutation is applied, no other type of mutation is permitted on the affected lines.

pTokenMut -- probability that a token-oriented mutation is applied. The scope of this mutation mode is the current line. Zero or more tokens are transformed.

===============
GLOBAL COUNTS
===============

gnMut -- maximum number of mutation transformations.
gnLineMut -- maximum number of line mutation transformations.
gnStmtMut -- maximum number of statement (construct) mutation transformations.
gnTokenMut -- maximum number of token mutation transformations.

=======================
SCOPE-SPECIFIC COUNTS
=======================

snLineMut -- maximum number of line mutation transformations within the current LineMutation mode.
snStmtMut -- maximum number of statement (construct) mutation transformations within the current StatementMutation mode.
snTokenMut -- maximum number of token mutation transformations within the current TokenMutation mode.