Control Flow Resolution in Dynamic Language
Author: Štěpán Šindelář
Supervisor: Filip Zavoral, Ph.D.
3/19/2016

Outline
▪ The aim of the project
▪ Our Approach
▪ Type Analysis
▪ Type Analysis: Data Flow Values
▪ Type Analysis: Interprocedural Analysis
▪ Evaluation
▪ Conclusion

The Aim: Static Analysis for Phalanger
▪ Phalanger: PHP compiler for .NET, written in C#, compiles PHP to MSIL
  ▪ Started at the Department of Software Engineering in 2004
▪ PHP is a dynamic language → no static type information; one variable can hold values of multiple types
▪ Static analysis: analyse the code without executing it
▪ Aim: static analysis framework for Phalanger
  ▪ optimizations for the compiler
  ▪ integrated development environments
  ▪ implement analyses:
    ▪ dead code elimination, constant propagation, type inference

Our Approach: Static Analysis
▪ Data-flow Analysis (DFA): the de facto standard for optimizing compilers and the basis for other approaches
  ▪ control flow graph
  ▪ data-flow equations
  ▪ iterative algorithm to solve the equations
▪ Our task
  ▪ Design and implement the framework
  ▪ Implement the desired analyses
    ▪ Dead code elimination, constant propagation, type inference

Our Approach: Type Analysis
▪ The most complex analysis implemented
“When developers are given a dynamically typed programming language, it does not mean that they will write dynamically typed programs.”
▪ Our aim:
  ▪ Type inference for local variables and global elements
    ▪ Static and global variables, static and instance fields
  ▪ Support for PHPDoc
▪ Our task
  ▪ choose the data flow values (domain)
  ▪ deal with interprocedural analysis
  ▪ deal with a real-world programming language: PHP

Type Analysis: Data Flow Values
▪ Map from variable names to subsets of types
▪ Must form a finite lattice
▪ Support for “type hints”
▪ Inheritance
▪ Efficient representation
  ▪ memory consumption
  ▪ fast meet operator

Type Analysis: Interprocedural Analysis
▪ Analysis of global and static variables, static and instance fields
▪ Return types of routines
▪ Heap memory
▪ Modularity

Evaluation
▪ PHP open source projects
  ▪ Zebra_Image, PHPUnit, Nette
▪ Run the analysis on their source code
  ▪ fix bugs where possible
  ▪ record actual errors discovered by the tool
  ▪ categorize found issues

Evaluation: performance
Q: I have managed to improve memory consumption since the submission. May I include the new chart?
▪ PHP open source projects

Conclusion
▪ Generic data-flow analysis framework for Phalanger
▪ Type Analysis
  ▪ Efficient data-flow value representation with bit-vectors, PHPDoc support
  ▪ Modular approach to interprocedural analysis
▪ Evaluation:
  ▪ capable of discovering several real issues with a good ratio of false positives
  ▪ without expensive context-sensitive analysis → scalability
▪ Future Work:
  ▪ integration with the compiler, performance evaluation of the emitted code
  ▪ array support

Thank you for your attention
Q&A

PHPDoc
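
Backup: Data Flow Values as bit-vectors (illustrative sketch)
The data-flow values described earlier (a map from variable names to subsets of types, with a fast meet operator) and the iterative algorithm can be sketched roughly as follows. This is only a minimal illustration in Python, not Phalanger's C# implementation: the names `T`, `meet`, `transfer` and `solve` are hypothetical, the meet is assumed to be a set union implemented as a bitwise OR, and class types, inheritance and PHPDoc hints are left out.

```python
from enum import IntFlag


class T(IntFlag):
    """Hypothetical type set: one bit per PHP type (assumption, not Phalanger's encoding)."""
    NONE = 0
    INT = 1
    FLOAT = 2
    STRING = 4
    BOOL = 8
    ARRAY = 16
    OBJECT = 32


def meet(a, b):
    """Combine two data-flow values (variable name -> type bit-vector).

    Assumes a 'may' analysis, so the meet is a per-variable union,
    which on bit-vectors is a single bitwise OR.
    """
    return {v: a.get(v, T.NONE) | b.get(v, T.NONE) for v in a.keys() | b.keys()}


def transfer(assignments, value):
    """Apply a basic block's assignments: each one overwrites the variable's type set."""
    out = dict(value)
    for var, ty in assignments:
        out[var] = ty
    return out


def solve(cfg, assigns):
    """Iterate over the control flow graph until the OUT values stop changing."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)

    out = {n: {} for n in cfg}
    worklist = list(cfg)
    while worklist:
        n = worklist.pop()
        in_val = {}
        for p in preds[n]:
            in_val = meet(in_val, out[p])
        new_out = transfer(assigns[n], in_val)
        if new_out != out[n]:
            out[n] = new_out
            worklist.extend(cfg[n])  # successors must be re-evaluated
    return out


# Models the PHP fragment:  $x = 1; if (...) { $x = "a"; } echo $x;
cfg = {"entry": ["then", "join"], "then": ["join"], "join": []}
assigns = {"entry": [("x", T.INT)], "then": [("x", T.STRING)], "join": []}
result = solve(cfg, assigns)

# At the join point, x may be an int or a string.
print(result["join"]["x"] == (T.INT | T.STRING))
```

Because each type set is a single machine integer here, the meet costs one OR per variable, which is the kind of property the "memory consumption" and "fast meet operator" bullets ask for. The lattice is finite as long as the set of tracked types is fixed, so the iteration terminates.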