Runtime Feedback in a Meta-Tracing JIT for Efficient Dynamic Languages
Author: Carl Friedrich Bolz
Presented by Ryotaro IKEDA, 2011/09/06

Overview
• This paper describes how to make applying a JIT compiler with PyPy more efficient.
• PyPy is well known as a fast Python implementation. In fact, it is a framework for implementing interpreters with a JIT and GC. (The Python implementation is just a demo!)

What is PyPy?
• A framework that lets you write an interpreter implementation in Restricted Python (RPython).
• The project mainly aims to provide an environment for implementing dynamic-language interpreters much more efficiently.

PyPy’s JIT: Automatic Implementation Architecture
Layers, from top to bottom:
• Target code written in any language
• Any interpreter written in RPython (implemented by the PyPy user, who gives some “hints” so that the JIT compiler runs efficiently)
• PyPy’s RPython interpreter, which runs the layer above
The bottom layer performs JIT compilation and optimization on the middle layer. As a result, a JIT compiler suitable for any language is implemented automatically.

How to treat non-language-specific JIT compilation
• Typical JIT compiler: uses language-specific features, because each JIT compiler is dedicated to compiling only one language.
• PyPy’s JIT compiler: it works on RPython, so it cannot use any language-specific feature of the language the PyPy user wants to implement. This is what is called “meta-tracing”.
• Objective: how can meta-tracing be made much faster by applying efficient methods?

What is the merit of using PyPy rather than the JIT of another implementation?
It widens the area covered by compilation and optimization.
• Typical JIT implementation: targeting data structure operations is too challenging for the JIT compiler.
• PyPy’s JIT implementation: it traces, and looks only at, the whole RPython code, so it can also target data structure operations written in RPython by the developer.

Hinting Mechanism
Main concept (figure labels: Code, Hint, RPython, PyPy): giving hints that let the JIT compiler compile efficiently is the most important point.

MAIN HINTS
☆ A hint to turn arbitrary variables into constants in the trace by feeding back runtime information into compilation.
☆ A way to annotate operations which the constant folding optimization then recognizes and exploits.
☆ General techniques for refactoring code to expose constant folding opportunities of likely runtime constants.

PyPy’s Meta-Tracing JIT Compilers
Tracing: checks and determines which control path to compile.
Example (loop cycle: Cond -> Op):
    x = 100
    y = 200
    Op: x = x + y
• Optimizations are also performed on this trace form; here y is constant-folded as well.
• Trace (the cycle to be compiled): Cond -> x = x + 200 -> Cond ….

PyPy’s Tracer
Trace area
• By default, PyPy traces only “hot” paths, so tracing is invoked for frequently executed paths.
• When the counter crosses a threshold, the path is regarded as “hot”.
• Counter: 1000. It indicates how many times the loop has been executed.
    for x in sequence :
        t = x + ….
        …
        …
☆ As mentioned before, PyPy’s tracer does not trace the user program directly; it traces the interpreter implementation written in RPython instead.

Optimization Passes
• Remove/simplify operations in the trace
– Constant folding
– Common subexpression elimination
– Allocation removal
– Store/load propagation
– Loop invariant code motion
These can be applied because traces are strictly linear. They operate on the RPython-level form.

Running Example
Setup for the examples shown: a simple, bare-bones object model.
• Supports only classes and instances
• No inheritance
• A class contains methods and variables
• An instance has a class; if a requested method or variable is not found in the instance, it is looked up in the class.

Example Implementation
• A dictionary manages the class’s methods.
• A dictionary manages the instance’s attributes (variables/methods).
• One method searches for a requested method; another registers a given method.
• Problem: the dictionary’s “get” operation costs too much.
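The running example described above can be sketched in plain Python as follows. This is a minimal illustration, not the slide's original code: the class and method names (Class, Instance, find_method, change_method, getfield, write_attribute, getattr) are assumptions chosen for readability.

```python
class Class(object):
    """A class: a name plus a dictionary of methods (no inheritance)."""

    def __init__(self, name):
        self.name = name
        self.methods = {}          # dictionary managing the class's methods

    def instantiate(self):
        return Instance(self)

    def find_method(self, name):
        # Search for the requested method; None if it does not exist.
        return self.methods.get(name, None)

    def change_method(self, name, value):
        # Register (or replace) the given method.
        self.methods[name] = value


class Instance(object):
    """An instance: a reference to its class plus its own attributes."""

    def __init__(self, cls):
        self.cls = cls
        self.attributes = {}       # dictionary managing instance attributes

    def getfield(self, name):
        return self.attributes.get(name, None)

    def write_attribute(self, name, value):
        self.attributes[name] = value

    def getattr(self, name):
        # Look in the instance first, then fall back to the class.
        result = self.getfield(name)
        if result is None:
            result = self.cls.find_method(name)
            if result is None:
                raise AttributeError(name)
        return result
```

Every getattr call here performs up to two dictionary lookups, which is the cost the slide points at.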
To solve this problem, the lookup must be made a target of JIT compilation. (How to do this is discussed later.)

Hints for Controlling Optimization
The hints apply only to the interpreter written in RPython, not to the user program.
• Two hints increase the optimization opportunities for constant folding:
– Promotion: enables constant propagation by using a trace guard to find “constant-foldable” variables.
– Trace-elidable: an annotation that tells the JIT which results may be assumed constant.
☆ Promotion never breaks the code’s behavior, but using it in the wrong place makes execution slower. A trace-elidable annotation must be applied only to code that really qualifies; otherwise constant folding may reuse a result that is no longer valid.

What a “Guard” is
• Dynamic language: test = x + y; is fine whether x and y are both numbers or both strings.
• Static language: test = x + y; x and y are fixed as either numbers or strings, and their types cannot be changed.
• “Guard”: to compile a dynamic language into a static one, the types of the variables must be guaranteed to stay the same. Native code is a kind of static language, so guards are needed.

How a “Guard” works
A guard ensures that the interpreter runs a compiled trace under the same conditions as when the trace was first compiled.
Source code (now it becomes hot! Counter: 100):
    y = 10
    z = 100
    for x in sequence:
        x = y + z
        y += 1
        …. = func(x)
The conditions are checked so that the code can be compiled to machine code.
Trace result:
    guard(type(x) == int)
    guard(type(y) == int)
    guard(z == 100)
    x = y + z
    y += 1
    …. = func(x)
During execution of the compiled machine code:
• If the condition described in a guard is true, execution continues.
• If the condition described in a guard is false, execution of the trace stops and switches back to the interpreter.
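The guard behavior described above can be imitated in plain Python. The sketch below is only a model under stated assumptions: guard, GuardFailed, compiled_trace, interpret and run are hypothetical names, and a real tracing JIT emits machine code rather than Python functions.

```python
class GuardFailed(Exception):
    """Raised when a condition assumed at compile time no longer holds."""


def guard(condition):
    # A guard lets execution continue only while its condition is true.
    if not condition:
        raise GuardFailed()


def interpret(y, z):
    # Generic slow path: works for any operands that support "+".
    return y + z


def compiled_trace(y, z):
    # Specialized path, valid only under the conditions seen while tracing.
    guard(type(y) is int)
    guard(z == 100)
    return y + 100            # z has been constant-folded to 100


def run(y, z):
    # Try the compiled trace first; on guard failure fall back to the interpreter.
    try:
        return compiled_trace(y, z), "trace"
    except GuardFailed:
        return interpret(y, z), "interpreter"
```

Calling run(10, 100) stays on the specialized path, while run(10, 5) or run("a", "b") fails a guard and is handled by the generic path.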

Promotion
• A technique that uses a guard to enable constant folding.
Source code:
    x = somefunc()
    y = func(x)
Trace result:
    x = somefunc()
    guard(x == 200)
    y = func(x)
Trace tree:
    x = somefunc()
    guard(x == 200)
      FALSE: y = func(x)    normal execution route (interpreter)
      TRUE:  y = func(200)  traced route (will be / already compiled)
Result after promotion:
    x = somefunc()
    guard(x == 200)
    y = func(200)

How to “promote”
• Use promote(), a built-in function provided by PyPy’s RPython, to give the “hint” that promotion can be applied at this point.
• The assumption is that, in this trace, self and val do not vary frequently, so guard failures are expected to be rare. (Discussed later, soon!)
• The guards should therefore add little overhead, and constant folding can be expected to bring a large improvement.

“Trace-Elidable” helps to apply “Promote”
• In fact, promotion does not pay off in the example without the @elidable annotation.
• Trace-elidable: guarantees that a specific method always returns the same result for the same arguments.
• The tracer wants to “promote” method “f”, but it does not know whether self.c() always returns the same value. It therefore cannot fold the call on the promoted value, and the promotion brings no benefit.
• The @elidable annotation states that the given method always yields the same result for the same arguments. This “hint” enables the tracer to exploit the promotion in f()!

Resulting trace after these 2 hints are applied
• Before: the trace is created without any hints given.
• After: constant folding is applied via @elidable and promote.

Techniques to increase “trace-elidable” code: Putting It All Together
• Increasing the number of trace-elidable methods increases the chances to apply constant folding and helps promotion.
• Prepare a dedicated “Map” class to manage an instance’s attributes instead of using a dictionary.
• Add @elidable annotations to the methods of the Map (the index map is described on the next slide).
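The interplay of the two hints in a method f that calls self.c() can be sketched as below. This is an illustration under assumptions, not the slide's exact code: the real hints are provided by RPython's JIT library, and the promote and elidable defined here are no-op stand-ins so that the example runs on an ordinary Python interpreter. The class A and its fields x and y are hypothetical.

```python
def promote(value):
    # Stand-in (assumption): in RPython this would ask the tracer to insert
    # a guard on the concrete value and treat it as a constant afterwards.
    return value


def elidable(func):
    # Stand-in (assumption): in RPython this would declare that the function
    # returns the same result whenever it is called with the same arguments.
    return func


class A(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def f(self, val):
        # self and val are expected to vary rarely, so they are promoted.
        promote(self)
        promote(val)
        # With self constant and c() elidable, self.c() + val can be folded.
        self.y = self.c() + val

    @elidable
    def c(self):
        # Depends only on self.x, which is assumed not to change.
        return self.x * 2 + 1


a = A(4, 0)
a.f(6)
```

With both hints in place, one would expect the optimized trace to shrink to two guards (on self and on val) followed by a store of the already computed constant into self.y. Without @elidable, the call to self.c() would have to stay in the trace.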

Index map
• An efficient data structure that suits PyPy.
• Map: manages the location (index) of the data
    “v1” : 0
    “string” : 1
    “x” : 3
• List: stores the actual data
    1234
    Hello,world!
    3.141592
    ….
• getindex is implemented on top of this structure. Because a map is immutable, getindex can be marked trace-elidable!

How does an Instance use the “Map”?
• The class used to manage instances no longer uses a dictionary.
• All methods that belong to the “map” are trace-elidable, so promotion works correctly!

Versioning of Classes
• Using trace-elidable alone does not satisfy the requirements.
• In Python, even though an @elidable annotation is given, the method may not yield the same value, because any attribute can be changed.
    class A:
        def __init__(self):
            self.x = 100
        @elidable
        def X(self):
            return self.x
    inst = A()
• What happens if “inst.x = -1” is executed? Such possible changes have to be handled.
• The authors propose “versioning”.

Using the Guard Feature for Versioning
• A dummy class (VersionTag) is introduced in order to use the guard feature.
• The promote creates a value-specific guard on the current “version”, so the lookup is still trace-elidable but can handle methods being changed.
• When one of the methods is changed, a new VersionTag is created and saved to self.version.

Evaluations
Environment: Intel Core2 Duo P8400 processor with 2.26 GHz and 3072 KB of cache, on a machine with 3 GB RAM running Linux 2.6.35.
Notes on the results figure:
• Uses many OOP features
• No hints given
• Benchmarks: algorithm for a board game, BZ2 decoder, OS kernel simulation, decimal floating point calculations

Conclusions
• Two hints that can be used in the source code of an interpreter written with PyPy.
• They give control over runtime feedback and optimization to the language implementor.
• They are expressive enough for building well-known virtual machine optimization techniques, such as maps and inlining.
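The maps mentioned in the conclusions, together with class versioning, can be sketched as follows. This is a minimal sketch under assumptions: promote and elidable are no-op stand-ins for the RPython hints so the code runs on plain Python, and all names (Map, getindex, add_attribute, EMPTY_MAP, VersionTag, Class, Instance) are chosen for illustration.

```python
def promote(value):
    return value               # stand-in (assumption) for the RPython hint


def elidable(func):
    return func                # stand-in (assumption) for the RPython hint


class Map(object):
    """Maps attribute names to indexes into an instance's storage list."""

    def __init__(self):
        self.indexes = {}      # never modified after the map is published
        self.other_maps = {}   # cache of successor maps

    @elidable
    def getindex(self, name):
        return self.indexes.get(name, -1)

    @elidable
    def add_attribute(self, name):
        # Same name always yields the same successor map (cached).
        if name not in self.other_maps:
            newmap = Map()
            newmap.indexes.update(self.indexes)
            newmap.indexes[name] = len(self.indexes)
            self.other_maps[name] = newmap
        return self.other_maps[name]


EMPTY_MAP = Map()


class VersionTag(object):
    """Dummy object; a new one is created whenever a class changes."""


class Class(object):
    def __init__(self, name):
        self.name = name
        self.methods = {}
        self.version = VersionTag()

    def find_method(self, name):
        promote(self)
        version = promote(self.version)   # guard on the current version
        return self._find_method(name, version)

    @elidable
    def _find_method(self, name, version):
        # version is unused in the body; it only keys the elidable call.
        return self.methods.get(name, None)

    def change_method(self, name, value):
        self.methods[name] = value
        self.version = VersionTag()       # invalidates older guards


class Instance(object):
    def __init__(self, cls):
        self.cls = cls
        self.map = EMPTY_MAP
        self.storage = []                 # actual attribute values

    def getfield(self, name):
        index = promote(self.map).getindex(name)
        return self.storage[index] if index != -1 else None

    def write_attribute(self, name, value):
        index = promote(self.map).getindex(name)
        if index != -1:
            self.storage[index] = value
            return
        self.map = self.map.add_attribute(name)
        self.storage.append(value)
```

Instances that receive the same attributes in the same order end up sharing one Map object, which is what makes promoting the map worthwhile.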

Effects on My Study
• Use PyPy as infrastructure
– It can emit C source code from an RPython implementation
• Applying P.T seems easy
– Parallelized Template for RPython
• This paper performs its optimizations on the RPython form. What do you think about my plan to implement the template code in RPython?