TDDA69 Data and Program Structure
Virtual Machines and Bytecode
Cyrille Berger

List of lectures
1 Introduction and Functional Programming
2 Imperative Programming and Data Structures
3 Parsing
4 Evaluation
5 Object Oriented Programming
6 Macros and decorators
7 Virtual Machines and Bytecode
8 Garbage Collection and Native Code
9 Distributed Computing
10 Declarative Programming
11 Logic Programming
12 Summary

How is a program interpreted?
[Diagram labels: Source code, Parser, Abstract Syntax Tree, Tree visitor, Generator, Bytecode, Virtual Machine, Assembler, Assembly, ..., Operating System, CPU]

Lecture content
- Virtual Machines
  - Types of virtual/hardware machines
- Bytecode
  - Simple instruction set
  - From AST to Bytecode
  - Bytecode Interpreter
- Meta-circular evaluator

Virtual Machines

What is a Virtual Machine?
- A Virtual Machine is a hardware or software emulation of a real or hypothetical computer system
- A system virtual machine emulates a complete system and is intended to execute a complete operating system
  - Examples: VirtualBox, VMware, Parallels...
- A process/language virtual machine runs a single program in a single process
  - Examples: JVM, CPython, V8, Dalvik...

Native Environment vs Emulated Environment
[Diagram labels: Native Program, Emulated Program, Libraries, Bindings, Input/Output, Emulated Input/Output, Emulated Hardware Interface, Virtual Machine, Operating System, Hardware]

Language Virtual Machine
[Diagram labels: Source Code, Compiler, ByteCode, Virtual Machine (Windows), Virtual Machine (Linux), Virtual Machine (Mac)]

Benefits of Virtual Machines
- Portability: Virtual Machines are compatible with various hardware platforms and Operating Systems
- Isolation: Virtual Machines are isolated from each other
  - For running incompatible applications concurrently
- Encapsulation: computation is separated from the operating system
  - Beneficial for security

Types of virtual/hardware machines
- Register Machine
- Stack Machine

Register Machine: Formal definition
- A (in)finite set of registers, each of which holds a single non-negative integer
- An instruction set, which defines the operations on registers
  - Arithmetic, control, input/output...
- A state register, which holds the current instruction and its index
- A sequential list of labeled instructions, which defines the program to be executed

Stack Machine: Formal definition
- An (almost) infinite stack, which holds integers
- An instruction set, which defines the operations on the stack
  - Arithmetic operations are always applied to the top two elements, and the result is stored on top
- A state register, which holds the current instruction and its index
- A sequential list of labeled instructions, which defines the program to be executed

Stack Machines vs Register Machines
- Stack Machines have more compact code
  - Arithmetic instructions are smaller
- Stack Machines have simpler compilers and interpreters, and a minimal processor state
- Stack Machines have a performance disadvantage
  - More memory references, less caching of temporaries
  - Higher cost of factoring out common subexpressions
    - The shared value has to be stored as a temporary variable
- Most common hardware machines are register machines

Virtual Machines for Dynamic Typing
- Remember:
  - With static typing, types are checked during compilation
  - With dynamic typing, types are checked during execution
- Implication
  - Stack/Registers contain pointers to objects
  - Function call convention

Bytecode
- For a virtual stack machine
  - Instruction set
  - Generate the bytecode
  - Interpreting the bytecode

Simple instruction set

Stack management
- PUSH [constant_value]
  - Push the constant on the stack
- POP [number]
  - Pop a certain number of values from the stack

Arithmetic operators
- ADD, MUL...
  - Pop two arguments from the stack
  - Push the result

Variables
- LOAD [varname]
  - Push the value of the variable
- DCL [varname]
  - Declare the variable
- STORE [varname]
  - Get the value, store the result and push the value

Jumps
- JMP [idx]
  - Jump to execute the instruction at the given index
- IFJMP [idx]
  - Pop the value and, if it is true, jump to [idx]

Functions
- CALL [arguments]
  - Pop the function object and call it with the given number of arguments
- RET
  - Return

Objects
- LOAD_MEMBER [varname]
  - Pop the object and push the value of the variable
- STORE_MEMBER [varname]
  - Pop the object and the value, store the value and push it
- To simplify the bytecode, this can be implemented as a function call

From AST to Bytecode
- With a tree visitor...
- It can take several passes:
  - Find the variables
  - Compute the jumps
  - Generate the code
- The real challenge is to map a high-level language to instructions

Arithmetic operation
    1+2
    a-2
    var a
    a=1
    var a=1
    a+=2
    a=b*c+2
    a.b=c

Functions
    func(1)
    func(1,2)
    console.log('Hello world!')
    function func(a,b) { return a+b; }

Controls
    if(a) { console.log('test') }
    if(a) { console.log('hello') } else { console.log('world') }
    var a = 10; while(a) { a -= 1; }
    for(var a = 0; a < 10; a += 1) { console.log(a); }
    for(var a = 0; a < 10; a += 1) { console.log(a); if(a + b) { break; } }

Bytecode Interpreter

Components of bytecode interpreter
- Verifier
  - Verifier checks correctness of bytecode
- Dynamic checking of types, array bounds, function arguments...
- Instruction executer

Verification
- Bytecode may come from a buggy compiler or a malicious source
- Every instruction must have a valid operation code
- Every instruction must have valid parameters
- Every branch instruction must branch to the start of some other instruction, not the middle of an instruction

Bytecode Interpreter
- A standard virtual machine interprets instructions
- Performs run-time checks such as array bounds, types and function arguments
- Possible to compile bytecode to native code (JIT: Just-In-Time)
- Call native methods
  - Typically functions written in C

Interpreter loop
    instruction_index = 0
    instructions = [...]
    stack = []
    current_env = Environment()
    while instruction_index < len(instructions):
        next_instruction = instructions[instruction_index]
        switch (next_instruction.opcode):
            case ADD:
                ...
            case JMP:
                ...

Interpreter Example
- Print the absolute value of 4-1:
    1 PUSH 4
    2 PUSH 1
    3 SUB
    4 DUP
    5 PUSH 0
    6 SUP
    7 IFJMP 9
    8 NEG
    9 LOAD 'print'
    10 CALL

Function Call
- Recursive call
  - For a function call, instantiate a new interpreter loop
- Stack
  - Push on the stack some information on how to restore the interpreter when returning
  - More complex code, but higher performance and flexibility
  - Allows infinite recursion

Function Call - Stack
    instruction_index = 0
    instructions = [...]
    stack = []
    current_env = Environment()
    while instruction_index < len(instructions):
        next_instruction = instructions[instruction_index]
        switch (next_instruction.opcode):
            ...
            case CALL:
                env = Environment()
                func = stack.pop()
                for arg in func.args:
                    value = stack.pop()
                    env.set(arg.name, value)
                # Save what is needed to resume the caller on RET
                stack.push([instruction_index, instructions, current_env])
                instruction_index = 0
                instructions = func.instructions
                current_env = env
            case RET:
                retval = stack.pop()
                info = stack.pop()
                stack.push(retval)
                # Restore the caller's state saved by CALL
                instruction_index = info[0]
                instructions = info[1]
                current_env = info[2]

Handling Exceptions (1/2)
- Use the implementation language exceptions
- Add exception information to the stack

Handling Exceptions (2/2)
    instruction_index = 0
    instructions = [...]
    stack = []
    current_env = Environment()
    while instruction_index < len(instructions):
        next_instruction = instructions[instruction_index]
        switch (next_instruction.opcode):
            ...
            case TRY_PUSH:
                stack.push(Exception(next_instruction.rescue_index, instructions))
            case TRY_POP:
                stack.pop()
            case THROW:
                # Unwind the stack until the nearest exception record
                while True:
                    info = stack.pop()
                    if isinstance(info, Exception):
                        instructions = info.instructions
                        instruction_index = info.rescue_index
                        break

Meta-circular evaluator
- An evaluator that is written in the same language that it evaluates is said to be metacircular (SICP 4.1).

RPython
- RPython is a subset of Python
  - Variables contain values of at most one type
  - Module globals are constants
  - Restrictions on control flow, objects, exceptions...
- Easy to extend the language
- Makes it easier to build sophisticated debuggers
- More suited for building new languages
- Meant to be easy to interpret

RPython and PyPy
- Usually interpreters are written in a target platform language such as C
- But C is a complex and unsafe language, and interpreters tend to get complicated
- PyPy uses RPython:
  - to provide a set of tools for implementing interpreters for any language
  - a Python implementation using those tools

PyPy Virtual Machine
[Diagram labels: Source Code, Compiler, ByteCode, PyPy VM (Windows), PyPy VM (Linux), PyPy VM (Mac)]

Benefits of RPython over host language
- Easier to develop
  - If you know Python, you know RPython
- Easier to debug
  - Since you can run an RPython program with a regular Python interpreter
- Factors out the port to different platforms

Conclusion
- Virtual Machines for interpreting programs
- Stack machines are easier to implement but slower than register machines
- Virtual Machines introduce compilation overhead, but are faster to execute
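As a recap, the path from AST to bytecode to interpreter loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's reference implementation. It uses Python's own `ast` module as a stand-in parser. The names `compile_expr`, `run` and `absolute_difference` are hypothetical. Only a subset of the instruction set is covered (PUSH, LOAD, ADD, SUB, MUL, JMP, IFJMP, plus DUP, NEG and SUP from the interpreter example). SUP is assumed to mean "greater than", and jump targets are 0-based here, whereas the slide numbers its instructions from 1.

```python
import ast
import operator

# Binary opcodes. ADD, SUB and MUL follow the lecture; treating SUP as
# "greater than" is an assumption based on the absolute-value example.
BINARY_OPS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "SUP": operator.gt,
}
AST_TO_OPCODE = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(node):
    """Tree visitor: emit (opcode, argument) pairs in post-order."""
    if isinstance(node, ast.Expression):
        return compile_expr(node.body)
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.Name):
        return [("LOAD", node.id)]
    if isinstance(node, ast.BinOp) and type(node.op) in AST_TO_OPCODE:
        # Operands first, then the operator that consumes them.
        return (compile_expr(node.left) + compile_expr(node.right)
                + [(AST_TO_OPCODE[type(node.op)], None)])
    raise NotImplementedError(type(node).__name__)


def run(instructions, variables=None):
    """Interpreter loop: execute the instructions and return the final stack."""
    variables = dict(variables or {})
    stack = []
    index = 0
    while index < len(instructions):
        opcode, argument = instructions[index]
        index += 1  # fall through to the next instruction unless a jump fires
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "DUP":
            stack.append(stack[-1])
        elif opcode == "LOAD":
            stack.append(variables[argument])
        elif opcode == "NEG":
            stack.append(-stack.pop())
        elif opcode in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[opcode](left, right))
        elif opcode == "JMP":
            index = argument
        elif opcode == "IFJMP":
            if stack.pop():
                index = argument
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


def absolute_difference(a, b):
    """The lecture's |a - b| program with 0-based indices, minus the print call."""
    return [
        ("PUSH", a), ("PUSH", b), ("SUB", None),
        ("DUP", None), ("PUSH", 0), ("SUP", None),
        ("IFJMP", 8),   # value is already positive: skip NEG
        ("NEG", None),
    ]


bytecode = compile_expr(ast.parse("b * c + 2", mode="eval"))
print(bytecode)
print(run(bytecode, {"b": 3, "c": 4}))   # [14]
print(run(absolute_difference(4, 1)))    # [3]
print(run(absolute_difference(1, 4)))    # [3]
```

Emitting operands before their operator is what lets the stack machine run the code without any register allocation. It is also why every intermediate value passes through the stack, which is the performance cost noted in the comparison with register machines.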