HipHop: High-Performance PHP
Ali-Reza Adl-Tabatabai
HipHop Team
Facebook

Facebook: Move fast & build things

PHP
• General-purpose scripting language tailored for web development
• Interactive
• Weakly typed & dynamic

    <?php
    var_dump(6 + 7);                  // int(13)
    var_dump('6' + '7');              // int(13)
    var_dump('six' + 'seven');        // int(0)
    var_dump((1<<63) + 0);            // int(-9223372036854775808)
    var_dump((1<<63) + (1<<63));      // float(-1.844674407371E+19)

    function foo($x) { echo "foo: " . $x . "\n"; }
    foo("hello");   // prints "foo: hello"
    foo(10);        // prints "foo: 10"

    if (...) {
      class B { ... }
    } else {
      class B { ... }
    }
    class C extends B { ... }

    $a = 'f';
    $b = 'c';
    $c = 'oo';
    $func = $a . $$b;
    $func();

    class C {
      public $declProp = 1;
    }
    $obj = new C;
    $obj->dynProp = 2;             // This is OK!
    echo $obj->declProp . "\n";    // prints "1"
    echo $obj->dynProp . "\n";     // prints "2"

    if (function_exists('foo')) { ... }
    if (class_exists($c)) { ... }

Memory management
• PHP is reference counted
• Precise destruction semantics

    class C {
      function __destruct() { echo "bye!"; }
    }
    $x = new C;
    $x = 1;   // prints "bye!"

Concurrency
• Single-threaded programming model
• Multiple requests run in parallel
• No shared memory, synchronization, or direct communication

Performance...
[Bar chart: CPU time (y-axis, 0 to 45) by language: C++, Java, C#, Ocaml, PHP, Ruby, Python]
Source: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all

Implications for Facebook
1. Bad performance can limit user features
2. Poor efficiency requires lots of re$ource$
[Diagram: INTERNET, webservers, storage]

What have we done?
[Diagram labels: Facebook (PHP), PHP/Zend, Apache, Facebook (binary)]

HipHop compilation flow
[Diagram: Facebook (PHP) → hphpc → Facebook (C++) → Gcc → Facebook (binary), with PHP Runtime and Webserver]

HipHop compiler (hphpc)
[Diagram: PHP → Parser → AST → Optimize, Infer Types → C++ Code Generator → C++]
[Types: Variant, Double, String, Array, Object, Integer, Boolean]

Representing PHP data

    type            data
    KindOfString    & "Lorem ipsum."
    KindOfInt       13

Type inference: fewer tags!

    type            data
                    & "Lorem ipsum."
                    13

Basic operations

    $a + $b

Basic operations: type dispatch

    $a + $b

    switch (a->m_type) {
      case KindOfInt:
        switch (b->m_type) { … }
      case KindOfArray:
        switch (b->m_type) { … }
      …
    }

Type inference: avoiding dispatch

    $a + $b    given $a :: Int, $b :: Int

    add %rcx, %rdx

HipHop compiler: performance
[Bar chart: CPU time (y-axis, 0 to 45) by language: C++, Java, C#, Ocaml, PHP, Ruby, Python]
Disclaimer: estimated based on running Facebook

HipHop compiler: pros & cons
• Good for production servers
• Inadequate for development
  – Solution: the HipHop interpreter (hphpi)
• Leverages HipHop runtime & webserver
• Open problem: can we get the best of both worlds?

HipHop Virtual Machine (hhvm)
• Ambitious goal: replace both the HipHop Interpreter and Compiler
[Diagram: PHP → Parser → AST → Optimize, Infer Types → Bytecode Generator → HHBC → HHVM Interpreter, HHVM JIT]

HipHop bytecode (hhbc)
• In-house design
• Stack-based VM
• Closer to PHP than machine code

    function lookup($cache, $key) {
      if (isset($cache[$key])) {
        echo "Hit! " . $cache[$key];
        return true;
      } else {
        echo "Miss!";
        return false;
      }
    }

     96: Loc 0
    101: Loc 1
    106: IssetM <H E>
    113: JmpZ 32

    118: String "Hit! "
    123: Loc 0
    128: Loc 1
    133: CGetM <H E>
    140: Concat
    141: Print
    142: PopC

    143: True
    144: RetC

    145: String "Miss!"
    150: Print
    151: PopC

    152: False
    153: RetC

hhvm JIT
• Beyond static type inference: dynamic type specialization
1. Observe types
2. Generate specialized code
3. Guards to check types

    $n = 3 * $n + 1;

    224: Loc 0
    229: Int 3
    238: Loc 0
    243: CGetH
    244: Mul
    245: Int 1
    254: Add
    255: SetH
    256: PopC

    ;; Typecheck: int($n)?
    cmpl $4, -4(%rbp)
    jne __retranslate
    ;; Type-spec xlation
    mov $3, %r12d
    mov -16(%rbp), %r13
    mov %r13, %r14
    imul %r14, %r12
    add $1, %r12
    mov %r12, %r13
    mov $0x40000000, %r8
    mov %r8, -8(%rbp)
    mov %r13, -16(%rbp)

Translation cache: Reuse & specialization

    $n = 1.5;
    ...
    $n = 3 * $n + 1;

    Translator
      ...
      __retranslate:
      ...

    Translation Cache
      T1:
        ;; Typecheck: INT($n)?
        cmpl $4, -4(%rbp)
        jne __retranslate       ;; → T2
        ;; Type-spec INT
        ;; translation
        . . .
      T2:
        ;; Typecheck: DOUBLE($n)?
        cmpl $8, -4(%rbp)
        jne __retranslate
        ;; Type-spec DOUBLE
        ;; translation
        . . .

Current state
• hphpc
  – Runs www.facebook.com
  – Paper to appear in SPLASH '12
• hhvm
  – www.facebook.com works
  – Developers using it
  – ~27% slower than hphpc
• Download from github: https://github.com/facebook/hiphop-php/

Perf progress 6/11-7/14
[Chart]

Ongoing & future work
• Performance
  – Profile-guided, SSA-based 2nd gear JIT
  – Type prediction
  – Tuning for the HW
  – Array shapes: turn hash tables into structs
• Tracing garbage collection
  – Copy-on-write arrays
  – Precise destruction semantics
• Language extensions

Summary
• PHP enables us to move fast
• Performance suffers because of the interpreter
• HipHop compiler
  – Compiles PHP to C++ offline
  – Significantly improves user experience & data center efficiency
• HipHop virtual machine
  – A new language VM tailored to PHP
  – Brings dynamic JIT compilation & optimization to PHP
• Both open sourced on github

Thanks! Questions?
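
Supplementary sketch for the "Representing PHP data", "Basic operations: type dispatch", and "Type inference: avoiding dispatch" slides. The C++ below is a minimal illustration of a tagged value, the nested type dispatch behind $a + $b, and a guarded fast path in the spirit of the JIT's type checks. It is not HipHop's actual runtime code. Only KindOfInt and m_type come from the slides; Value, KindOfDouble, make_int, make_double, add_generic, add_int_int, and add_guarded are hypothetical names, only integers and doubles are modeled, and PHP's integer-overflow-to-float behavior is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Type tags. KindOfInt is named on the slides; KindOfDouble is assumed here.
enum DataType { KindOfInt, KindOfDouble };

// Hypothetical tagged value: a type tag plus the data it describes.
struct Value {
  DataType m_type;
  union {
    int64_t num;
    double dbl;
  } m_data;
};

Value make_int(int64_t n) {
  Value v;
  v.m_type = KindOfInt;
  v.m_data.num = n;
  return v;
}

Value make_double(double d) {
  Value v;
  v.m_type = KindOfDouble;
  v.m_data.dbl = d;
  return v;
}

// Read a value as a double, converting from int when needed.
double as_double(const Value& v) {
  return v.m_type == KindOfInt ? static_cast<double>(v.m_data.num)
                               : v.m_data.dbl;
}

// Generic $a + $b: both tags are inspected at run time.
// (Overflow handling is omitted in this sketch.)
Value add_generic(const Value& a, const Value& b) {
  switch (a.m_type) {
    case KindOfInt:
      switch (b.m_type) {
        case KindOfInt:
          return make_int(a.m_data.num + b.m_data.num);
        case KindOfDouble:
          return make_double(static_cast<double>(a.m_data.num) + b.m_data.dbl);
      }
      break;
    case KindOfDouble:
      return make_double(a.m_data.dbl + as_double(b));
  }
  throw std::logic_error("unhandled type tag");
}

// When both operands are known to be integers, no tags are needed
// and the addition can compile down to a single machine add.
int64_t add_int_int(int64_t a, int64_t b) { return a + b; }

// Guarded fast path: check the tags once, take the specialized code if the
// guard holds, otherwise fall back. The fallback stands in for the
// __retranslate path on the JIT slides.
Value add_guarded(const Value& a, const Value& b) {
  if (a.m_type == KindOfInt && b.m_type == KindOfInt) {
    return make_int(add_int_int(a.m_data.num, b.m_data.num));
  }
  return add_generic(a, b);
}
```

Under these assumptions, add_generic(make_int(6), make_int(7)) yields a KindOfInt value holding 13, while add_guarded(make_double(1.5), make_int(3)) fails the integer guard and falls back to the generic path, yielding a KindOfDouble value holding 4.5.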