Parasoft® C++test
Comprehensive Code Quality Tools for C/C++ Development

Introduction
Parasoft® C++test™ is an integrated solution that automates a broad range of tools for improving software development, team productivity, and software quality for C and C++:
• Static analysis: static code analysis, data flow static analysis, and metrics analysis
• Peer code review process automation: preparation, notification, and tracking
• Unit testing: unit test creation, execution, optimization, and maintenance
• Runtime error detection: memory access errors, leaks, corruptions, and more

236800 - Parasoft® C++test by Alon Bialik

Introduction - Features
Some of the features:
• Static analysis of code for compliance with user-selected coding standards
• Graphical RuleWizard editor for creating custom coding rules
• Static code path simulation for identifying potential runtime errors
• Automated code review with a graphical interface and progress tracking
• Application monitoring/memory analysis
• Automated generation and execution of unit and component-level tests
• Flexible stub framework
• Full support for regression testing
• Code coverage analysis with code highlighting
• Runtime memory error checking during unit test execution
• Full team deployment infrastructure for desktop and command line usage

Introduction - Cross Platform
Supported host environments:

Host platforms
• Windows NT/2000/XP/2003/Vista/7
• Linux kernel 2.4
• Linux kernel 2.6
• Solaris 7, 8, 9, 10
• IBM AIX 5.3 and a PowerPC processor

IDEs
• Eclipse for C/C++ Developers 3.2, 3.3, 3.4, 3.5 (32-bit)
• Microsoft Visual Studio .NET 2003, 2005, 2008 with Microsoft Visual C++
• Wind River Workbench 2.6, 3.0, 3.1, 3.2
• Texas Instruments Code Composer Studio 4.x
• ARM Workbench IDE for RVDS 3.0, 3.1, 4.0
• QNX Momentics IDE 4.5 (QNX Software Development Platform 6.4)

Host compilers
• Windows
  - Microsoft Visual C++ 6.0, .NET (7.0), .NET 2003 (7.1), 2005 (8.0), 2008 (9.0)
  - GNU and MinGW gcc/g++ 2.95.x, 3.2.x, 3.3.x, 3.4.x
  - GNU gcc/g++ 4.0.x, 4.1.x, 4.2.x, 4.3.x
  - Green Hills MULTI for Windows x86 Native v4.0.x
• Linux (x86 target platform)
  - GNU gcc/g++ 2.95.x, 3.2.x, 3.3.x, 3.4.x, 4.0.x, 4.1.x, 4.2.x, 4.3.x
• Linux (x86_64 target platform)
  - GNU gcc/g++ 3.4.x, 4.0.x, 4.1.x, 4.2.x, 4.3.x

Target/cross compilers
• ARM (Windows hosted)
  - ARM RVCT 2.2, 3.x, 4.x
  - ARM ADS 1.2
• Embedded Linux (Windows, Linux, Solaris hosted)
  - GNU gcc/g++ 2.95.x, 3.2.x, 3.3.x, 3.4.x, 4.0.x, 4.1.x, 4.2.x, 4.3.x
• Green Hills (Windows, Solaris hosted)
  - Green Hills optimized compilers line 4.0.x
• IAR (Windows hosted)
  - IAR ANSI C/C++ Compiler V5.30 for ARM (C only)
• Keil (Windows hosted)
  - ARM/Thumb C/C++ Compiler, RVCT3.1 for uVision
  - C51 Compiler V8.18 (static analysis only)
• Microsoft (Windows hosted)
  - Microsoft Visual C++ for Windows Mobile 8.0, 9.0
  - Microsoft Embedded Visual C++ 4.0
• QNX (Windows hosted)
  - GCC 2.9.x, 3.3.x, 4.2.x
• STMicroelectronics (Windows hosted)
  - ST20 (static analysis only)
  - ST40 (static analysis only)
• Texas Instruments (Windows hosted)
  - TMS320C6x C/C++ Compiler v5.1
  - TMS320C6x C/C++ Compiler v6.0
  - TMS320C2000 C/C++ Compiler v4.1 (static analysis only)

Source control
• AccuRev SCM
• Borland StarTeam
• CVS
• IBM/Rational ClearCase
• Microsoft Team Foundation Server
• Microsoft Visual SourceSafe
• Perforce SCM
• Serena Dimensions
• Subversion (SVN)
• Telelogic Synergy

Embedded and Cross-Platform Development
For embedded and cross-platform development, C++test can be used in both host-based and target-based code analysis and test flows. C++test's customizable workflow allows users to test code as it is developed, then use the same tests to validate functionality in target environments.

Overview - Static analysis
What is static analysis?
"Static analysis is the term applied to the analysis of computer software that is performed without actually executing programs."
  - Wikipedia

Low-tech static analysis:
• Software inspection
• Simple syntactic standards and manual checks
High-tech static analysis:
• Enforced syntactic checks
• Well-formedness checks in specifications, designs, and code (e.g., matching connectors in design diagrams)
• Automated program analyses, often based on data flow analysis
• Finite-state verification and other "high-power" analyses of models

C++test - Static analysis
Automate code analysis for monitoring compliance
• A properly implemented coding policy can eliminate entire classes of programming errors by establishing preventive coding conventions. C++test statically analyzes code to check compliance with such a policy.
• The static code analysis tool monitors whether code follows industry-standard or customized rules, so that code meets uniform expectations around security, reliability, performance, and maintainability.
• Users can choose from over 1400 built-in rules, customize existing rules, or define new ones.

C++test - Static analysis (cont'd): example
Let's review this example:

    class A {
    public:
        A(int xval, int yval) : _x(xval), _y(yval) {}
        friend A& operator+(const A& p1, const A& p2);
    private:
        int _x, _y;
    };

    A& operator+(const A& p1, const A& p2) {
        A *result = new A(p1._x + p2._x, p1._y + p2._y);
        return *result;  // Violation
    }

Returning a reference to a dereferenced pointer initialized by new within the function may cause a memory leak, because the caller has no obvious way to delete the object. Returning a reference to a local object is no better: the reference dangles as soon as the function returns.
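For comparison, here is a minimal sketch of a version that should not trigger this kind of rule, assuming that returning the sum by value is acceptable for this class. The accessors x() and y() are hypothetical additions that exist only so the result can be inspected; they are not part of the slide's example.

```cpp
#include <cassert>  // used by the assertions in the usage note below

class A {
public:
    A(int xval, int yval) : _x(xval), _y(yval) {}
    // Hypothetical accessors, added only to make the result observable.
    int x() const { return _x; }
    int y() const { return _y; }
    friend A operator+(const A& p1, const A& p2);
private:
    int _x, _y;
};

// Returns a new object by value: nothing is allocated with new,
// so there is nothing for the caller to delete and nothing to leak.
A operator+(const A& p1, const A& p2) {
    return A(p1._x + p2._x, p1._y + p2._y);
}
```

With this version, A c = A(1, 2) + A(3, 4); yields an object whose x() is 4 and whose y() is 6, and chained expressions such as a + b + c no longer leave unreachable heap objects behind.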
C++test has a built-in rule you can select:
Never return a dereferenced local pointer initialized by new in function scope

This rule is advised by Scott Meyers in his book "Effective C++: 50 Specific Ways to Improve Your Programs and Designs".

C++test - Static analysis (cont'd)
Define your own rule sets with built-in and custom rules. This is a good way to enforce the coding standards that are customary in a programming team, such as naming conventions, variable initialization, etc.
A little anecdote for those who took MATAM before 2008: there is a rule "Prefer initialization to assignment in constructors", where

    #include <string>
    using namespace std;

    class A {
    public:
        A( const char* file, const char* path ) {
            myFile = file;  // Violation
            myPath = path;  // Violation
        }
    private:
        string myFile;
        string myPath;
    };

and

    A( const char* file, const char* path ) : myFile(file), myPath(path) {}  // OK

C++test - Static analysis (cont'd)
C++test - Resources
• Herb Sutter, Andrei Alexandrescu - C++ Coding Standards
• Scott Meyers - Effective C++ & More Effective C++
• Ellemtel Coding Standards (1990)
• MISRA-C 2004, MISRA-C++ 2008
• Motorola Coding Standards
• Meyers-Klaus Rules
• JSF Coding Standards (2005)

C++test - Static analysis (cont'd)
Rules Categories
• Coding Convention Rules
• OOP Rules
• Comments Rules
• Optimization Rules
• Exceptions Rules
• Portability Rules
• Formatting Rules
• Possible Bugs Rules
• Initialization Rules
• Physical File Org. Rules
• Metrics Rules
• Qt Best Practices
• MISRA 2004 Rules
• Security Rules
• Memory and Resource Management Rules
• STL Best-Practices Rules
• Templates Rules
• Naming Convention Rules
• Bug Detective (*)

Coding conventions, for example:
• Magic numbers
• Default in switch-case
• Const conversion
• Naming conventions

OOP rules, for example:
• Multiple inheritance (diamond inheritance)
• Avoid calling virtual methods/global data from C'tor / D'tor
• Avoid public data members
• Avoid down-casting
• If a class has virtual functions it shall have a virtual D'tor

Exception rules, for example:
• Do not throw from the D'tor (it may run during stack unwinding)

Initialization rules, for example:
• Static, globals, member in C'tor, etc.

Portability rules, for example:

    int small = 20000;
    int big = small * 2;  // 40000 or -25536?
    if (small < big) {}

Q: Will the if's body be executed?
A: Depends... (with a 16-bit int, 40000 does not fit)
Rule: use UINT8, INT16, UINT32, etc.

C++test - Static analysis (cont'd)
Define your test
Configure your static analysis test
[Screenshots: C++test configurations]
[Screenshot: Rule Editor]

C++test - Static analysis (cont'd)
Run Test
Run the static analysis test on your code
[Screenshots: test summary; static analysis output]

C++test - Static analysis (cont'd)
Define your test
Configure your static analysis test
[Screenshots: Bug Detective]

C++test - Static analysis (cont'd)
Bug Detective (cont'd)
• Resource leaks: allocation misuse of memory, pipes, file descriptors, and other system resources.
• Bugs: runtime errors such as division by zero, array bounds and indexing flaws, NULL pointer dereferencing, and data initialization errors.
• Security vulnerabilities: read, write, or indexing of potentially tainted data.

C++test - Static analysis (cont'd)
Bug Detective - Examples

Buffer overflow (security example)

    #include <cstdio>   // scanf
    #include <cstring>  // memcpy

    void example(int src[100], int dest[100]) {
        int size;
        scanf("%d", &size);
        memcpy(dest, src, size);  // VIOLATION ("size" is an arbitrary value possibly < 0 or > 100)
    }

Dereferencing a NULL pointer

    #include <cstdlib>  // atoi

    // Point is a user-defined class (not shown on the slide)
    int main(int argc, char* argv[]) {
        Point* point = 0;
        if (argc > 3) {
            point = new Point(atoi(argv[1]), atoi(argv[2]));
        }
        point->reflectAcrossX();  // VIOLATION ("point" might be NULL at this point)
        return 0;
    }

C++test - Static analysis (cont'd)
Run Test
Run the Bug Detective test on your code
[Screenshots: test summary]

C++test - Static analysis
How does it work?
An educated guess

C++test - Static analysis
How is it done?
[Diagram. Labels: C++ EBNF; C/C++ source program; C++test: lexical analyzer, syntax analyzer, semantic analyzer, symbol table, properties table, error handler; user-chosen rules; IDE]

C++test - Static analysis
Lexical analysis in a nutshell: Lex
• Lex is a program (generator) that generates lexical analyzers.
• It reads an input stream that specifies the lexical analyzer and outputs source code implementing the lexical analyzer in the C programming language.
• Lex reads patterns (regular expressions) and produces C code for a lexical analyzer that scans for them (for example, identifiers).
[Diagram: stream of characters -> lexical analyzer -> stream of tokens]

C++test - Static analysis
Lexical analysis in a nutshell (cont'd): Lex
Input:

    #include <iostream>
    #include <string>
    #include <cctype> // for std::isspace(), etc.
    void someFunc(const std::string &data);

Output of the lexical analyzer (each token also carries its position and text, e.g. Line: 4, Column: 6, Text: someFunc):

    [#] "include" [<] "iostream" [>]
    [#] "include" [<] "string" [>]
    [#] "include" [<] "cctype" [>] [/] [/] "for" "std" [:] [:] "isspace" [(] [)] [,] "etc" [.]
    "void" "someFunc" [(] "const" "std" [:] [:] "string" [&] "data" [)] [;]
    "int" "main" [(] [)] [{] "std" [:] [:] "string" etc.

C++test - Static analysis
Syntax analyzer in a nutshell: YACC
• YACC reads grammars written in Backus Normal Form (BNF)
  and generates C code; its tokens come from Lex
• A BNF grammar is used to express context-free languages
• Uses bottom-up (shift-reduce) parsing
• Generates the symbol table
• Reports syntax errors to the IDE
[Diagram: stream of tokens + BNF grammar rules -> syntax analyzer -> symbol table -> semantic analyzer]

C++test - Static analysis
Syntax analyzer in a nutshell: YACC, an example

    %%
    statement  : expression                { printf("= %g\n", $1); }
               ;
    expression : expression '+' expression { $$ = $1 + $3; }
               | expression '-' expression { $$ = $1 - $3; }
               | NUMBER                    { $$ = $1; }
               ;
    %%

According to these two productions, 5 + 4 - 3 + 2 is parsed into:
[Parse tree figure: statement at the root, nested expression nodes, and the numbers 5, 4, 3, 2 at the leaves joined by +, -, +]

C++test - Static analysis
Syntax analyzer
At this point the syntax analyzer builds the symbol table and saves all the properties of each symbol. For example:
[Diagram: class symbol. Labels: Name; Declared?; Pure virtual?; Variables list (Variable ...: Name, Object, Static?); Methods list (...: Virtual?)]

C++test - Static analysis
Semantic analyzer
The semantic analyzer tests the generated symbol table against the defined rules and matches unwanted patterns.
[Diagram: the same class symbol as on the previous slide]

C++test - Bug Detective
How does it work?
An educated guess

C++test - Static analysis
Bug Detective
[Diagram. Labels: tree structure -> Semantic Analyzer -> semantics-safe tree structure (intermediate representation tree) -> Bug Detective: data flow analysis; CFG generator -> CFG -> control flow analysis]

C++test - Static analysis
Bug Detective: Data Flow Analysis
• Compile-time reasoning about the run-time flow of values in the program
• Represent facts about the run-time behavior
• Represent the effect of executing each basic block
• Propagate facts around the control flow graph

C++test - Static analysis
Bug Detective: Data Flow Analysis (cont'd)
• Formulated as a set of simultaneous equations
  - Sets attached to the nodes and edges
  - Lattice to describe the relation between values
  - Usually represented as a bit or bit vectors
• Solve the equations using an iterative framework
  - Start with an initial guess of the facts at each node
  - Propagate until the solution stabilizes at the maximal fixed point
  - Ideally we would like the meet-over-all-paths (MOP) solution

C++test - Static analysis
Bug Detective: Data flow analysis equation properties
Data-flow analysis equations are distinguished by:
• Direction
• Gen
• Kill
• May/Must
• Merge
• Flow values (initial guess, type)

C++test - Static analysis
Bug Detective: Reaching definitions
• A definition of a variable x is a statement that may assign a value to x
• A definition may reach a program point p if there exists some path from the point immediately following the definition to p such that the assignment is not killed along that path
  - A definition of a variable x is killed if there is any other definition of x anywhere along the path
• Concept: relationship between definitions and uses

C++test - Static analysis
Bug Detective: Reachability Analysis, Step 1
For each block, compute local (block-level) information
  - DEDef(B): the set of downward-exposed definitions in B
    o Those for which the defined name is not subsequently redefined before the exit from B
  - DEFKill(B): the set of definitions that are obscured by a definition of the same name in B
    o Also consider definition points outside B
This information does not take control flow between blocks into account

C++test - Static analysis
Bug Detective: Reaching Definitions Example

    Block  Definitions                                 DEDef    DEFKill
    B1     d1: i = m - 1   d2: j = n   d3: a = u1      1,2,3    4,5,6,7
    B2     d4: i = i + 1   d5: j = j - 1               4,5      1,2,7
    B3     d6: a = u2                                  6        3
    B4     d7: i = u2                                  7        1,4

DEFKill needs to consider the set of all definition points: {1,2,3,4,5,6,7}
[CFG figure: the edges between B1..B4 are not recoverable from the extracted text]

C++test - Static analysis
Bug Detective: Reachability Analysis, Step 2
Compute the REACHES set for each block in a forward direction
  - REACHES(b): the set of definitions that reach the entry to a block b
  - Start with $\mathrm{REACHES}(n_0) = \emptyset$
  - $\mathrm{REACHES}(b) = \bigcup_{x \in \mathrm{pred}(b)} \left( \mathrm{DEDef}(x) \cup \left( \mathrm{REACHES}(x) - \mathrm{DEFKill}(x) \right) \right)$
    o Each term of the union is the set of definitions that reach the exit point of predecessor x
    o DEDef(x): locally defined in x
    o REACHES(x) - DEFKill(x): propagated into x and not killed by any definition in x
Iterative algorithm: keep computing REACHES sets until a fixed point is reached

C++test - Static analysis
Bug Detective: Reachability Analysis, Step 2 (cont'd)
The same computation, split into two parts:
  - Information propagated across blocks: $\mathrm{REACHES}(b) = \bigcup_{x \in \mathrm{pred}(b)} \mathrm{OUT}(x)$
  - Information propagated within blocks: $\mathrm{OUT}(x) = \mathrm{DEDef}(x) \cup \left( \mathrm{REACHES}(x) - \mathrm{DEFKill}(x) \right)$
    o OUT(x) is the set of definitions that reach the exit from a block x, which includes definitions that
      - are either generated within the block (DEDef(x)), or
      - reach the entry to x and are not killed by any definition in x (REACHES(x) - DEFKill(x))

C++test - Static analysis
Bug Detective: Array out-of-bounds detection using reaching definitions

    int a[10]

    Block  Definitions                               DEDef    DEFKill
    B1     d1: i = 0   d2: j = n   d3: k = a[i]      1,2,3    4,5,6,7
    B2     d4: i = i + 1   d5: j = j - 1             4,5      1,2,6
    B3     d6: i = 0                                 6        1,4
    B4     d7: k = a[i]                              7        3

C++test - Static analysis
Bug Detective: Array out-of-bounds detection using reaching definitions (cont'd)
• For every use of operator[], i.e. a[i] in block b, we check all paths leading to block b for possible definitions of i that are outside a's bounds
• May: true on some path (set union)
• Disadvantage: searching all paths can yield infeasible paths
• Advantage: in most cases the reaching definitions that set index values are very short and uncomplicated

C++test - Static analysis
Bug Detective: Detecting "conditions that always evaluate the same" using reaching definitions
In a similar way we could go
over the CFG, look for Boolean conditions, and check whether each one always evaluates the same (on all paths)
• Must: true on all paths (set intersection)
• Reaching definitions can be used the same way to detect division by zero, unreachable switch branches, and more

C++test - Static analysis
Bug Detective: Control Flow Analysis
The CFG contains all function calls, uses of global variables, uses of parameter pointer variables, and optionally uses of all local variables and concurrency operations. The CFG includes the symbolic information for these objects, such as their names, types, whether an access is a read or a write, whether a variable is a parameter or not, whether a function or variable is static or not, the line number, etc.

C++test - Static analysis
Bug Detective: Control Flow Analysis
Traversing the whole system CFG can find:
  - Dead code
  - Resources not freed
  - Access to unallocated/uninitialized memory
  - and more

C++test - Static analysis
Comparing models

    Tool      Static/dynamic    Completeness   Soundness           Customizable   OS
    Blast     Static            Yes            No (false alarms)   No             Windows
    CBMC      Static            No             Yes                 No             Windows
    C++test   Static+dynamic    No (1)         Yes (2)             Highly         Windows/Linux/Solaris and more

    (1) C++test does not prove your code
    (2) All reported violations can happen

Conclusion
Disadvantages:
• Slow
• Not open source
• No "Quick-Fix"
• Expensive
• Does not prove your code
Advantages:
• Easy to operate
• Highly customizable
• Can verify that your code meets coding standards
• Prevents errors that compromise security, reliability, and performance
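Supplement: the reaching-definitions iteration from the Bug Detective slides (Step 1 and Step 2) can be sketched as follows. This is only an illustration of the textbook algorithm, not C++test's actual implementation, which the slides themselves call an educated guess. The names Block, outOf, computeReaches, and slideExample are hypothetical. The DEDef/DEFKill sets are those of the first reaching-definitions example; the CFG edges (B1->B2, B2->B3, B2->B4, B3->B4, B4->B2) are an assumption, since the figure's edges are not part of the slide text.

```cpp
#include <cassert>  // used by the assertions in the usage note below
#include <cstddef>
#include <set>
#include <vector>

using DefSet = std::set<int>;  // a set of definition numbers (d1 -> 1, ...)

struct Block {
    DefSet deDef;                    // DEDef(B): downward-exposed definitions
    DefSet defKill;                  // DEFKill(B): definitions obscured by B
    std::vector<std::size_t> preds;  // indices of predecessor blocks
};

// OUT(x) = DEDef(x) U (REACHES(x) - DEFKill(x))
DefSet outOf(const Block& x, const DefSet& reachesX) {
    DefSet out = x.deDef;
    for (int d : reachesX) {
        if (x.defKill.count(d) == 0) {
            out.insert(d);
        }
    }
    return out;
}

// REACHES(b) = union of OUT(x) over all predecessors x of b,
// recomputed until no set changes (a fixed point).
std::vector<DefSet> computeReaches(const std::vector<Block>& blocks) {
    std::vector<DefSet> reaches(blocks.size());  // initial guess: all empty
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 0; b < blocks.size(); ++b) {
            DefSet in;
            for (std::size_t p : blocks[b].preds) {
                const DefSet out = outOf(blocks[p], reaches[p]);
                in.insert(out.begin(), out.end());
            }
            if (in != reaches[b]) {
                reaches[b] = in;
                changed = true;
            }
        }
    }
    return reaches;
}

// Blocks B1..B4 at indices 0..3. The predecessor lists encode the
// ASSUMED edges B1->B2, B2->B3, B2->B4, B3->B4, B4->B2.
std::vector<Block> slideExample() {
    return {
        {{1, 2, 3}, {4, 5, 6, 7}, {}},  // B1: entry block, no predecessors
        {{4, 5}, {1, 2, 7}, {0, 3}},    // B2: reached from B1 and B4
        {{6}, {3}, {1}},                // B3: reached from B2
        {{7}, {1, 4}, {1, 2}},          // B4: reached from B2 and B3
    };
}
```

Under the assumed edges, computeReaches(slideExample()) gives REACHES(B1) = {}, REACHES(B2) = {1,2,3,5,6,7}, REACHES(B3) = {3,4,5,6}, and REACHES(B4) = {3,4,5,6}. std::set keeps the sketch short; the bit vectors mentioned on the slides would be the usual choice when the number of definitions is large.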