TABLE OF CONTENTS

1. Microprocessor Based on x86
2. x86 – 16 bits
3. x86 – 32 bits
4. x86 – 64 bits
5. Arithmetic Logic Unit (ALU)
6. Random Access Memory (RAM)
7. Registers
8. Accessing the Hardware
9. Registers
10. General Purpose
11. Pointers
12. Segment Registers
13. Flag Register
14. Offset
15. Programming Issues
16. Assembly Instructions
17. OPCODES or “Binary Code”
18. INT
19. PUSH
20. POP
21. CMP
22. Understanding the Jumps
23. Unconditional Jump (JMP)
24. Conditional Jumps
25. Jump If Equal (JE and JZ)
26. Jump If Not Equal (JNE and JNZ)
27. Jump If Above and Jump Not Below or Equal (JA and JNBE)
28. Jump If Below and Jump Not Above or Equal (JB and JNAE)
29. Jump If Greater (JG)
30. Jump If Less (JL)
31. Other Conditional Jumps
32. Disassembling High Level Languages
33. Analyzing the Program
34. Changing the Binary
35. Number of Bytes
36. Sum of Bytes
37. References

Software Reverse Engineering Techniques Level 1
Alexandre Beletti Ferreira

1. Microprocessor Based on x86

When we hear about x86, we have to remember where it comes from. The first x86 was the 8086 (designed by Intel between 1976 and 1978). Its close variant, the 8088, was used in the IBM PC, and the most common computer that used it was the IBM PC XT. This microprocessor has 16 bits, which means that all the registers, the arithmetic logic unit (ALU) and most instructions work with 16 bit values. The explanation is the same for the 32 and 64 bit processors.

Another good thing to point out is that we are going to mention here just processors made by Intel, but the architecture is the same for processors made by AMD or other vendors that follow the x86 architecture. For example, if we consider the old AMD K6, the architecture was the same as the Intel Pentium II, which means that the registers, the ALU and the instructions are the same.

1.1. x86 – 16 bits

There are a lot of differences between all the 16 bit processors, if you consider the design.
But as far as the instructions that we need to know to debug low level software are concerned, there is no need to understand these details. The following Intel processors were made with 16 bit architecture: 8086, 8088, 80188, 80186 and 80286. As you can see, some of them have an “88” at the end of their names, and this is because of their 8 bit external data bus. You can find a lot of books explaining this architecture, and you can extend some explanations to the 32 and 64 bit x86 processors.

Figure 1 – 8086 Processor (Source: Wikipedia)

1.2. x86 – 32 bits

The following Intel processors were made with 32 bit architecture: 80386, 80486, Pentium (Pro, MMX, II, III, M), Pentium IV (some versions), Itanium IA-32, Core and Pentium Stealey.

Figure 2 – 80386 Processor (Source: Wikipedia)

1.3. x86 – 64 bits

Nowadays, we use 64 bit processors most of the time for personal computers, desktops, notebooks, laptops, servers, etc. Usually, when you are looking for a driver for a specific device on the Internet, the x86 64 bit version is named just 64 bit. On the other hand, the x86 32 bit version is named just x86, which is not totally accurate, because that name also includes the old 16 bit processors. The following Intel processors were made with 64 bit architecture: Pentium IV (some versions), Core (2, i3, i5, i7), Atom, Sandy and Ivy Bridge, Xeon Phi and Haswell.

1.4. Arithmetic Logic Unit (ALU)

The Arithmetic Logic Unit, or just ALU, is the component that we use to make calculations (here we are not going to explain the differences between integer and floating point operations). Every time you have a calculation, like a variable that receives the sum of two other variables, you are using the ALU. Also, when you have a decision in your program, like an “if” or a “while” instruction, you are also using the ALU.
The integer ALU has always been inside the microprocessor. In the first x86 processors, the floating point unit used to be a separate chip outside the microprocessor (e.g. 8087), but nowadays it is also within the microprocessor and it is not possible to see this component on your motherboard.

1.5. Random Access Memory (RAM)

The Random Access Memory, or just RAM, is the component where we put our programs and data when we want to run something. For example, your program could reside on your hard disk, then the operating system loads the program into RAM and then you can run it. Without the RAM, you cannot execute most of your programs, and you will address the bytes in the RAM using the registers that we will explain later on.

1.6. Registers

There are a lot of ways to explain what a register is. If you have some background in electronics, you can think about flip-flops, but for us, let's think of registers as a way to save values inside the microprocessor. The size of the register depends on the architecture (e.g. x86 16 bit has 16 bit registers). For example, you will need to put a value in a register if you want to make some computation using the ALU or if you want to access a region of memory that is only accessible under segmentation (it will be explained later in this material).

2. Accessing the Hardware

Here we are going to discuss and explain some components that you should know about before trying to understand how a program really works at a low level.

2.1. Registers

Let's see each group of registers and its use inside the x86 architecture.

2.1.1. General Purpose

This group is formed by registers that can be used in different situations, such as moving data from/to memory, counting loop iterations, etc. In the following table we have the description and the name of the register for each x86 architecture.

Table 1 – General purpose registers.
Register Name   16 bits   32 bits   64 bits
Accumulator     AX        EAX       RAX
Base            BX        EBX       RBX
Counter         CX        ECX       RCX
Data            DX        EDX       RDX

As you can see, they are really easy to memorize: starting from the 16 bit name, you just add a first letter for each architecture (E for 32 bits and R for 64 bits). The other thing is that, considering a 16 bit register, you can access the High or the Low part (8 bits each) of each one, as you can see in Table 2.

Table 2 – High and low parts of a 16 bit general purpose register.

Register Name   High   Low
Accumulator     AH     AL
Base            BH     BL
Counter         CH     CL
Data            DH     DL

As you may know, you can usually run 32 bit programs inside a 64 bit architecture. Also, it is possible to run 16 bit programs inside 32 and 64 bits (it depends on how the code was written or on whether a 16 bit environment is being emulated). The point is that you should know all the architectures, so that you will be ready to understand all the programs that eventually run on your machine.

2.1.2. Pointers

You will always need pointers to point to some data in the main memory, to know which instruction will run next or to know which region of the stack you are accessing. Take a look at the next table.

Table 3 – Pointer Registers.

Register               16 bits   32 bits   64 bits
Instruction Pointer    IP        EIP       RIP
Base Pointer (Stack)   BP        EBP       RBP
Stack Pointer          SP        ESP       RSP
Source Index           SI        ESI       RSI
Destination Index      DI        EDI       RDI

One of the most important registers is the Instruction Pointer. Its function is to point to the next instruction that should run. When you debug a program, you will see that this register advances after each instruction that you run (jumps and calls load it with a new address instead). Also, when you have to manipulate the stack (which is located in the RAM), you will have to use the Base Pointer and the Stack Pointer.
Every time you put a value on the stack, the Stack Pointer is decremented, and every time you take a value off the stack, the Stack Pointer is incremented. We also have the Source and the Destination Index, both used to address a region of a segment of the RAM or a character in a string, for example. We are going to discuss the segment registers in the next section.

2.1.3. Segment Registers

When you want to access your program, data or stack, you will need to specify where each piece of information is. That is why we need the segment registers (actually it is a legacy from the 8086 and 8088; we are not going to discuss this issue here). The point is that we have to use them to access each part of our program.

Table 4 – Segment Registers.

Register          16 bits   32 bits   64 bits
Code Segment      CS        CS        CS
Data Segment      DS        DS        DS
Stack Segment     SS        SS        SS
Extra 1 Segment   ES        ES        ES
Extra 2 Segment   -         FS        FS
Extra 3 Segment   -         GS        GS

As we can see, the only difference between the three architectures is that in 16 bits, we do not have the Extra 2 and Extra 3 registers. Also, the registers have the same name (and size) in these architectures. When we start debugging our programs, you will see that your code will always be inside the Code Segment and your data (e.g., your variables) will always be inside the Data Segment.

2.1.4. Flag Register

This register is used by the ALU to show what happened after a certain operation. For example, if you do some calculation and the result is zero, one specific bit of this register is changed to one. The same thing happens when you compare two equal values. Basically, you will note that both arithmetic and comparison instructions change this register, allowing the next conditional jump instruction to make the right decision in your code.
You will understand later in this document how important this knowledge is, because you can manipulate the behavior of a program just by making small changes related to the result of an ALU operation. Table 5 has a list of the flags within the Flag Register.

Table 5 – Flags used by ALU.

Flag             Description
Carry (CF)       Set if an arithmetic carry/borrow has been generated out of the highest bit.
Auxiliary (AF)   Same as Carry but concerning the lower nibble.
Parity (PF)      Set if the number of bits with value one in the lowest byte of the result is even.
Zero (ZF)        Set if the result is zero.
Sign (SF)        Set if the value is negative.
Trap (TF)        If set, executes one instruction and stops.
Interrupt (IF)   If set, hardware interrupts will be handled.
Direction (DF)   Sets the direction (forward or backward) when manipulating bytes.
Overflow (OF)    Set if an overflow happens in the value.

There are also other flags that you can see in Table 6, but we do not commonly use them, thus we are just going to mention their names and abbreviations.

Table 6 – Extended Flags used by ALU.

Flag                  Abbreviation
I/O privilege level   IOPL
Nested task           NT
Resume                RF
Virtual 8086          VM
Alignment Check       AC
Virtual Interrupt     VIF and VIP
Identification        ID

2.1.5. Offset

This is one of the issues that is sometimes difficult to understand, but we are going to make it easier by using the example of the first x86 (8086). On this machine, it is possible to address 1048576 bytes (2^20), but a single register can address just 65536 (2^16) bytes. The solution is to use a segment register (e.g., DS) together with a general purpose register or a pointer register. Thus, the segment register holds the address of the segment and the other register (general purpose or pointer) holds the displacement inside the segment.
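As a minimal sketch of how the two registers combine on the 8086, the following Python function shifts the segment left by four bits (the same as multiplying it by 16) and adds the displacement. The function name real_address and the sample values are only illustrative assumptions, and the wrap-around at 2^20 models the 20 address lines of the original 8086.

```python
def real_address(segment: int, displacement: int) -> int:
    """Combine a 16 bit segment and a 16 bit displacement into a 20 bit address."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= displacement <= 0xFFFF:
        raise ValueError("displacement must fit in 16 bits")
    # Shifting left by 4 bits multiplies the segment by 16.
    # The mask keeps 20 bits (assumption: 8086 style wrap-around at 1 MB).
    return ((segment << 4) + displacement) & 0xFFFFF


if __name__ == "__main__":
    # Two different segment/displacement pairs can name the same byte.
    print(f"{real_address(0x1234, 0x0010):05X}h")
    print(f"{real_address(0x1235, 0x0000):05X}h")
```

Both calls print 12350h, which shows that segments overlap: many segment/displacement pairs refer to the same byte of RAM.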
Take a look at this equation:

    Real Address = (Segment * 16) + Displacement

Let's suppose that you want to access the address FA69Ah, but you have the limit of FFFFh in a 16 bit register (the 'h' is just to indicate a hexadecimal value). Using the previous equation (16 is 10h, so multiplying by 16 appends one hexadecimal zero):

    Real Address = (F000 * 16) + A69A
    Real Address = F0000 + A69A = FA69Ah

Just to clarify things, let's show some Assembly code to solve this situation. Take a look at Table 7 with the code.

Table 7 – Offset example.

Line   x86 16 bit Instruction
1      MOV BX, F000
2      MOV DS, BX
3      MOV SI, A69A
4      MOV BYTE PTR DS:[SI], 8F

We cannot move an immediate value directly to a segment register, that is why we use the BX register (line 1) to receive the value and then transfer it to the segment register (line 2). In line 3, you can see that we are using a pointer register, in this case SI, to receive the displacement. The last instruction shows that we put the byte value 8F inside the segment DS, at the displacement pointed to by SI (BYTE PTR tells the assembler that a single byte is stored).

3. Programming Issues

In this chapter, we are going to explain what we should understand about low level programming to be able to change our binary programs.

3.1. Assembly Instructions

There are a lot of instructions in Assembly language for x86, and there are instructions that only exist in 32 bits or 64 bits. Thus, we are going to show some of the most important ones and then explain how they work. In Figure 3, we can see some instructions that have been available since the 16 bit microprocessors.
AAA    AAD    AAM    AAS    ADC    ADD    AND    CALL   CBW    CLC
CLD    CLI    CMC    CMP    CMPSB  CMPSW  CWD    DAA    DAS    DEC
DIV    HLT    IDIV   IMUL   IN     INC    INT    INTO   IRET   JA
JAE    JB     JBE    JC     JCXZ   JE     JG     JGE    JL     JLE
JMP    JNA    JNAE   JNB    JNBE   JNC    JNE    JNG    JNGE   JNL
JNLE   JNO    JNP    JNS    JNZ    JO     JP     JPE    JPO    JS
LAHF   LDS    LEA    LES    LODSB  LODSW  LOOP   LOOPE  LOOPNE LOOPNZ
LOOPZ  MOV    MOVSB  MOVSW  MUL    NEG    NOP    NOT    OR     OUT
POP    POPA   POPF   PUSH   PUSHA  PUSHF  RCL    RCR    REP    REPE
REPNE  REPNZ  REPZ   RET    ROL    ROR    SAHF   SAL    SAR    SBB
SCASB  SCASW  SHL    SHR    STC    STD    STI    STOSB  STOSW  SUB
TEST   XCHG   XLAT   XOR

Figure 3 – Some x86 Assembly instructions.

We are not going to explain all the instructions here. Instead, we are just going to explain those that are usually related to things that we want to modify in a binary program. For example, when you have a message that appears after a certain period of usage of the software, we will need to understand the COMPARE instruction and a conditional JUMP. In addition, as we have more than one COMPARE and JUMP instruction, we are going to pick one of each to explain, because the others have almost the same explanation. But before we do this, let's speak a little bit about the OPCODES, usually called BINARY CODE.

3.2. OPCODES or “Binary Code”

When you started learning programming, you probably heard that computers only understand binary code, and that at the end of the process of generating a new program (e.g. compile, link), you will have a binary. This is true to some extent, leaving aside all the details that are related to this process. The point is that, in the end, you will have a binary file, running locally or on a web server inside a program that can interpret your code (e.g. PHP, Python). But what is a binary instruction for us? We really need to understand this approach to do what we want to do.
Well, sometimes it helps to understand what is going on in a program when we do not have the source code, but usually the Disassembler will convert the code back into Assembly language for us. If you want to make a permanent change to the code, then sometimes you will need to know the “binary code” of the instruction(s) that you want to change.

In fact, we do not deal with the “binary code” of the instruction, because it would be difficult to deal with so many “symbols”. Thus, the best approach is to deal with the instruction as a hexadecimal code. In fact, if you consult the manual of a microprocessor, you will see that it uses the hexadecimal code of the instruction instead of the binary code.

3.2.1. INT

I'm going to give a small example just to make things easy to understand. We have an instruction called INT that we use to call an interruption (if you want to understand more about interruptions, read our references). Let's take a look at Table 8.

Table 8 – Opcode and binary of INT instruction.

Instruction   Opcode   Binary
INT 14        CD 14    11001101 00010100
INT 16        CD 16    11001101 00010110
INT 20        CD 20    11001101 00100000
INT 21        CD 21    11001101 00100001

As you can see in the first column, we have calls to different interruptions, each one usually representing some hardware (again, this is not in the scope of this material). But the point is to see that the opcode for an INT instruction is CD (second column) plus the number of the interruption. Maybe you could ask: and the binary code? Well, we just convert the opcode, which is a hexadecimal value, to a binary value using a calculator or manually. This is the binary code that the computer understands, but it is not so easy to deal with binary code when you can choose between opcodes or Assembly language.
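The conversion described above can be sketched in a few lines of Python. This is only an illustration: the helper names encode_int, as_hex and as_binary are assumptions made for this example, and the only fact taken from the text is that INT is encoded as the byte CD followed by the interruption number.

```python
INT_OPCODE = 0xCD  # opcode byte of INT followed by an 8 bit interruption number


def encode_int(number: int) -> bytes:
    """Return the two bytes that encode INT <number>."""
    if not 0 <= number <= 0xFF:
        raise ValueError("interruption number must fit in one byte")
    return bytes([INT_OPCODE, number])


def as_hex(code: bytes) -> str:
    """Show each byte as two hexadecimal digits."""
    return " ".join(f"{byte:02X}" for byte in code)


def as_binary(code: bytes) -> str:
    """Show each byte as eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in code)


if __name__ == "__main__":
    for number in (0x14, 0x16, 0x20, 0x21):
        code = encode_int(number)
        print(f"INT {number:02X}   {as_hex(code)}   {as_binary(code)}")
```

Running it reproduces the rows of Table 8, for example CD 21 and 11001101 00100001 for INT 21.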
Each register also has a value that you have to add to the opcode of some instructions, like MOV, PUSH, POP, etc. Take a look at Table 9 to see these values.

Table 9 – Opcode value of 16 bit registers.

Register     Opcode Value
AL, AX, ES   0
CL, CX, CS   1
DL, DX, SS   2
BL, BX, DS   3
AH, SP       4
CH, BP       5
DH, SI       6
BH, DI       7

3.2.2. PUSH

To show an example of how to use this value, we are going to explain the instruction PUSH, which is used to put values on the stack. In Table 10, we can see the opcodes for each use of PUSH, and we need the values from Table 9 to build them.

Table 10 – Opcodes of PUSH.

Instruction                  Opcode
PUSH “16 bit register”       50 + Opcode value from Table 9
PUSH ES                      06
PUSH CS                      0E
PUSH SS                      16
PUSH DS                      1E
PUSH “1 byte → b1”           6A b1
PUSH “2 bytes → b1 b2”       68 b1 b2

As you can see, when you push a segment register (ES, CS, SS and DS), you have to memorize the opcode. But when you push a 16 bit register, you know that the value of the opcode will be 50 plus the register value shown in Table 9. Also, if you want to put an 8 bit or 16 bit value directly on the stack, the opcode will be 6A followed by the 8 bit value, or 68 followed by the 16 bit value (stored with the low byte first). All these examples can be seen in Table 11.

Table 11 – Opcode examples of PUSH.

Instruction   Opcode
PUSH AX       50
PUSH CX       51
PUSH DX       52
PUSH BX       53
PUSH SP       54
PUSH BP       55
PUSH SI       56
PUSH DI       57
PUSH 77       6A 77
PUSH 6677     68 77 66

3.2.3. POP

As we have an instruction to put a value on the stack (PUSH), we also have another instruction to remove a value from the stack. This instruction is called POP. As shown before, we also have to use the values from Table 9 to construct Table 11 with all the opcodes of POP.

Table 11 – Opcodes of POP.
Instruction                                  Opcode
POP “16 bit register”                        58 + Opcode value from Table 9
POP ES                                       07
POP SS                                       17
POP DS                                       1F
POP “memory operand, ModR/M byte → b1”       8F b1

Again, if you are going to pop a segment register (ES, SS and DS), you have to memorize the opcode. Also, there is no opcode for CS, because you cannot set a value for CS this way (you can find more information in our references). Using a 16 bit register, you have to add the values from Table 9 to 58. Let's see some examples in Table 12.

Table 12 – Opcode examples of POP.

Instruction   Opcode
POP AX        58
POP CX        59
POP DX        5A
POP BX        5B
POP SP        5C
POP BP        5D
POP SI        5E
POP DI        5F

3.2.4. CMP

The CMP instruction is very useful when you want to compare values in registers or memory. After you compare two values, the ALU changes the value of the flags and then you can use a CONDITIONAL JUMP to go to a specific place in your code. At this point, we are going to stop showing every opcode, because we do not have space here for this and you can look up all the opcodes in the official documents of Intel processors.

But let's see an example of how the instruction CMP works. Let's suppose the following code:

ABSOLUTE MEMORY ADDRESS   INSTRUCTION
0100                      MOV AL, 02
0102                      MOV BL, 02
0104                      CMP AL, BL
0106                      JE 010A
0108                      INT 20
010A                      MOV AH, 02
010C                      MOV DL, 31
010E                      INT 21
0110                      INT 20

Figure 4 – Example code of CMP.

It is a really common example that puts the same value (02) in two different registers (AL and BL). After this, we have the instruction CMP AL, BL that, of course, will set the ZERO FLAG. Why? Because from the ALU perspective, the CMP instruction is a SUB (subtract) instruction that does not save the result in the end but still changes the value of the flags (depending on the result). Here the result of 02 minus 02 is zero, so the ZERO FLAG is set. Continuing with the code, we have a JE 010A (which means JUMP EQUAL – more explanation in the next sections).
This instruction will check if the ZERO FLAG is set. If it is TRUE, then the IP (Instruction Pointer) will go to the address 010A within the current code segment (CS) and continue executing each instruction from there. If it is FALSE, then the code will continue and will reach an INT 20 instruction to TERMINATE THE PROGRAM.

3.3. Understanding the Jumps

We have three different kinds of jumps:

- Unconditional,
- Conditional,
- Calls.

These three kinds of jumps can be divided into two groups:

- Direct: jump/call with address,
- Indirect: jump/call without address.

In “Direct”, we have the destination address where the code should go together with the OPCODE of the JUMP/CALL. In “Indirect”, we have to obtain the address using a register or another memory address.

In the “Direct” group, we have three different sub-groups:

- Short: the distance is stored in 1 byte,
- Near: the distance is stored in 2 bytes,
- Far: the destination is stored in 4 bytes.

In the “Indirect” group, we have three different sub-groups:

1 – Register: the register has the destination address,
2 – 2 bytes variable: the memory has the destination address (2 bytes),
3 – 4 bytes variable: the memory has the destination address (4 bytes).

Unconditional jumps (JMP instruction) allow you to use all the above combinations (direct and indirect). However, on the 16 bit processors, conditional jumps (e.g., JE, JNE, etc.) just allow you to use the Direct Short mode; the Direct Near mode for conditional jumps only exists from the 32 bit processors on.

3.3.1. Unconditional Jump (JMP)

Sometimes you are in a certain position of your code and you need to jump to another address, for example when you are trying to avoid a region of memory where a conditional jump would execute. Thus, we need to use the unconditional jump, or just JMP. If you know a couple of different high level programming languages, you have probably seen the instruction named GO or GOTO.
This is the same thing: you have a specific label where you want to go, then you usually say GO/GOTO “xyz”. In Assembly language, you are going to use JMP “xyz”. Let's take a look at Table 13.

Table 13 – Opcodes of JMP.

Instruction           Opcode      Group      Sub-Group
JMP 1 byte            EB #bytes   Direct     Short
JMP 2 bytes           E9 #bytes   Direct     Near
JMP pointer:address   EA #bytes   Direct     Far
JMP register/memory   FF **       Indirect   Near
JMP memory:memory     FF **       Indirect   Far

In Table 13, the “#bytes” means the number of bytes you are going to jump forward or backward in your code, counted from the end of the jump instruction and stored as a signed value, using the same approach used to store a signed char or int in C language (please, take a look at our references at the end of this material). Furthermore, the “**” means that you have to consult the Intel Manual to see the values that you will have here.

3.3.2. Conditional Jumps

We use the conditional jumps more often, because every time you have an IF/ELSE or a LOOP (e.g., while, for, etc.) in a high level programming language, if you look into the disassembled code, you will find one or more types of conditional jumps. All these instructions check the flags to decide whether or not they should jump to a specific address. Now we are going to see some of them to understand how they work.

3.3.2.1. Jump If Equal (JE and JZ)

These instructions check if the Zero Flag (ZF) is set. If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 14 to see the instructions and opcodes.

Table 14 – Opcodes of JE and JZ.

Instruction     Opcode        Group – Sub-Group
JE/JZ 1 byte    74 #bytes     Direct - Short
JE/JZ 2 bytes   0F84 #bytes   Direct - Near

Basically, if both values are equal (e.g., registers, memory), then the code will jump to the destination address.

3.3.2.2. Jump If Not Equal (JNE and JNZ)

These instructions check if the Zero Flag (ZF) is not set.
If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 15 to see the instructions and opcodes.

Table 15 – Opcodes of JNE and JNZ.

Instruction       Opcode        Group – Sub-Group
JNE/JNZ 1 byte    75 #bytes     Direct - Short
JNE/JNZ 2 bytes   0F85 #bytes   Direct - Near

Basically, if the values are different (e.g., registers, memory), then the code will jump to the destination address.

3.3.2.3. Jump If Above and Jump Not Below or Equal (JA and JNBE)

These instructions check if the Zero Flag (ZF) and the Carry Flag (CF) are both clear. If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 16 to see the instructions and opcodes.

Table 16 – Opcodes of JA and JNBE.

Instruction       Opcode        Group – Sub-Group
JA/JNBE 1 byte    77 #bytes     Direct - Short
JA/JNBE 2 bytes   0F87 #bytes   Direct - Near

Basically, if the first value is above the second value (e.g., registers, memory), both taken as unsigned numbers, then the code will jump to the destination address.

3.3.2.4. Jump If Below and Jump Not Above or Equal (JB and JNAE)

These instructions check if the Carry Flag (CF) is set. If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 17 to see the instructions and opcodes.

Table 17 – Opcodes of JB and JNAE.

Instruction       Opcode        Group – Sub-Group
JB/JNAE 1 byte    72 #bytes     Direct - Short
JB/JNAE 2 bytes   0F82 #bytes   Direct - Near

Basically, if the first value is below the second value (e.g., registers, memory), both taken as unsigned numbers, then the code will jump to the destination address.

3.3.2.5. Jump If Greater (JG)

These instructions check if the Zero Flag (ZF) is clear and the Sign Flag (SF) is equal to the Overflow Flag (OF).
If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 18 to see the instructions and opcodes.

Table 18 – Opcodes of JG.

Instruction   Opcode        Group – Sub-Group
JG 1 byte     7F #bytes     Direct - Short
JG 2 bytes    0F8F #bytes   Direct - Near

Basically, if the first value is greater than the second value (e.g., registers, memory), both taken as signed numbers, then the code will jump to the destination address.

3.3.2.6. Jump If Less (JL)

These instructions check if the Sign Flag (SF) is different from the Overflow Flag (OF). If it is true, then the Instruction Pointer will be adjusted to the destination given by the displacement that comes right after the conditional jump opcode. Let's take a look at Table 19 to see the instructions and opcodes.

Table 19 – Opcodes of JL.

Instruction   Opcode        Group – Sub-Group
JL 1 byte     7C #bytes     Direct - Short
JL 2 bytes    0F8C #bytes   Direct - Near

Basically, if the first value is less than the second value (e.g., registers, memory), both taken as signed numbers, then the code will jump to the destination address.

3.3.2.7. Other Conditional Jumps

As we are not going to discuss all the conditional jumps that we have, we decided to make a small table (Table 20) with some of them and their opcodes for short and near jumps. However, as we are not going to show every jump, we suggest that you take a look at the official Intel documents if you do not find what you are looking for.

Table 20 – Conditional jumps and their opcodes.
Instruction   Opcode (1 byte)   Opcode (2 bytes)   Jump If
JE / JZ       74 #bytes         0F84 #bytes        Equal
JNE / JNZ     75 #bytes         0F85 #bytes        Not Equal
JA / JNBE     77 #bytes         0F87 #bytes        Above
JB / JNAE     72 #bytes         0F82 #bytes        Below
JG / JNLE     7F #bytes         0F8F #bytes        Greater
JL            7C #bytes         0F8C #bytes        Less
JBE           76 #bytes         0F86 #bytes        Below or Equal
JGE / JNL     7D #bytes         0F8D #bytes        Greater or Equal
JLE           7E #bytes         0F8E #bytes        Less or Equal

4. Disassembling High Level Languages

Every binary that you have in your system was usually made in a high level language, but you don't always know the language. Sometimes it's possible to revert or decompile your binary into the original source code. This is not an easy task; with some specific codes it is possible to do this, but you will not recover the names of the variables or the comments inside the code (for example). But you can always see the binary, convert it to hexadecimal and then interpret each byte, or group of bytes, as an instruction (as you learned in the last section). You can do it manually or use a specific tool to do this. In Table ZZ we have a list of tools that you can use for disassembling x86 code:

Table ZZ – Some disassemblers for x86.

Tool                             x86 Architecture   Operating System
Interactive Disassembler (IDA)   32 and 64          Linux / Win
OllyDbg                          32                 Win
Hack                             16                 DOS / Win
NDISASM                          32 and 64          DOS / Win / Mac / Linux

Of course there are many more than these; you can take a look on the Internet and check some free, open source and commercial tools that you can use for this purpose. In this course, most of the time we are going to use OllyDbg or IDA (32 bit free version), because they are free and easy to use. The bad thing is that they only run on Windows. In Linux examples, we are going to use NDISASM, which comes with NASM.

One of the best disassemblers is the Interactive Disassembler, or just IDA. This software is able to disassemble code from various architectures, not just x86.
There is a free x86 32 bit version for Windows. The full version, which runs on Linux and supports a bunch of different architectures, is very expensive.

4.1. Analyzing the Program

There are a lot of things that you should think about before doing reverse engineering on a binary program. Usually, you want to modify something, like removing a message that appears at a certain moment. Sometimes you just want to change the value of a constant or variable, other times you may need to fix a bug in old legacy code. It is really difficult and impractical to do reverse engineering on the whole code, because the binary is usually coded in a high level language, and when you disassemble the code, you will have many more lines in Assembly language compared to the original source code in a high level language. One good example to clarify this is a LOOP coded in a high level language. Let's suppose that you have a C code like the following (Figure 4):

    for (i = 0; i < 0x1212; i++) { }

Figure 4 – A loop coded in C language.

The above code is really simple; as you can see it does nothing, it just repeats 0x1212 times (the 0x indicates a hexadecimal value; 0x1212 is 4626 in decimal). Just remember the syntax of the “for” loop: the first parameter (i = 0) is the initial condition, the second parameter (i < 0x1212) is the stop condition and the third parameter is the increment (in this case the same as i = i + 1).

To understand what happens with the code after you compile it, we did the compiling process in two different environments. The first one was done with the old Borland Turbo C (3.0). After we had compiled and linked the source code, we got the binary. Thus, it was possible to disassemble the binary and analyze the Assembly code. Let's take a look at Figure 5.
                XOR  SI, SI
                JMP  SHORT loc_1029A
    loc_10299:  INC  SI
    loc_1029A:  CMP  SI, 1212h
                JL   SHORT loc_10299

Figure 5 – Disassembly of the binary loop compiled in Turbo C 3.

To disassemble the code, I used the free 32 bit version of IDA. Do you think you can understand the relationship between the original C code and the disassembled code? Let's walk through it.

You can read the instruction XOR SI, SI as "i = 0" (the first parameter of the original C loop). How do we know this? It is easy: when you XOR (Exclusive OR) a value with itself, the result is always zero (if you don't know how XOR works, please look at the references).

Let's jump to the instruction INC SI. This one corresponds to the third parameter, which is "i++". It is really easy to identify, because the Assembly instruction INC just increments the register that you specify by one. In this case, we are using the pointer register SI, which was previously set to zero.

Going back to the second parameter ("i < 0x1212") and looking at its disassembly, you will observe that more than one Assembly instruction represents it: JMP SHORT loc_1029A, CMP SI, 1212h and JL SHORT loc_10299.

The first Assembly instruction (JMP SHORT loc_1029A) is an unconditional jump, meaning that execution simply continues at the specified label (observe that the label is just a reference to a memory address, but the IDA disassembler helped us by putting a name on it). When the code jumps to "loc_1029A", you will see the instruction CMP SI, 1212h (our second Assembly instruction related to "i < 0x1212"). This is an ALU instruction, which compares the register SI with 1212h and sets the flag bits in the Flag Register (section 2.1.4). Actually, CMP acts like a SUB (subtract); the only difference is that CMP does not save the result, it just changes the flags.
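To make the relationship between CMP and SUB concrete, here is a minimal Python sketch that models a 16 bit CMP as a subtraction whose result is thrown away, keeping only the flags. The function names cmp16 and jl_taken are hypothetical and exist only for this illustration; the flag rules used (ZF, SF, CF, OF, and JL being taken when SF differs from OF) are the usual x86 definitions.

```python
def cmp16(a, b):
    """Model CMP a, b on 16-bit operands: subtract, keep only the flags."""
    a &= 0xFFFF
    b &= 0xFFFF
    result = (a - b) & 0xFFFF          # same arithmetic as SUB, but never stored
    return {
        "ZF": result == 0,              # zero flag: the operands are equal
        "SF": bool(result & 0x8000),    # sign flag: top bit of the result
        "CF": a < b,                    # carry flag: unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & 0x8000),  # signed overflow
    }


def jl_taken(flags):
    """JL (jump if less, signed comparison) is taken when SF != OF."""
    return flags["SF"] != flags["OF"]
```

With this model, jl_taken(cmp16(0, 0x1212)) is true, so the jump back to the INC is taken, while jl_taken(cmp16(0x1212, 0x1212)) is false and the loop ends.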
Now that we have the flags updated, we can analyze them using a conditional jump, in this case JL. As you can see in the third instruction related to "i < 0x1212", we have JL SHORT loc_10299, which means "jump if less" and corresponds to the "<" symbol originally present in our C code.

Note that parameters one and three of our "for" loop were really fast and easy to identify; only parameter number two takes a while to understand, but its logic is also easy.

Now, let's see what we get when we compile the same C code using the "gcc" compiler (Figure 6).

                 JMP  SHORT loc_401354
    loc_401350:  INC  [ESP+10h+var_4]
    loc_401354:  CMP  [ESP+10h+var_4], 1211h
                 JLE  SHORT loc_401350

Figure 6 – Disassembly of the binary loop compiled in GCC.

As you can see, the Assembly code is not the same: the compiler made a lot of different decisions that resulted in a different OPCODE combination (which we are seeing here as Assembly instructions). Let's start thinking about our loop parameters: i = 0, i < 0x1212 and i++.

What happened to the first parameter (i = 0)? Well, here it is difficult to see, but the "gcc" compiler decided to deal with the variable "directly" in memory, without moving it into a pointer register (e.g. SI, as we saw before).

To clarify this situation, let's jump again to the third parameter (i++), which will also help us explain the first one. The Assembly instruction INC [ESP+10h+var_4] is the one responsible for incrementing the variable "i" that controls the loop. The first thing to observe here is the fact that the compiler generated 32 bit code, as we can see from the ESP register (section 2.1.2).
The other thing is that the brackets "[ ]" indicate that we want the value stored at the address "ESP+10h+var_4", which means that this is the place where the original variable "i" is located. This answers our question about the first parameter: "i = 0" is presumably a write of zero to this same memory location, performed before the fragment shown in Figure 6.

The second parameter, "i < 0x1212", should one more time be analyzed as a group of Assembly instructions: JMP SHORT loc_401354, CMP [ESP+10h+var_4], 1211h and JLE SHORT loc_401350. The first Assembly instruction (JMP SHORT loc_401354) has the same purpose mentioned before: it just jumps to a label and executes the second Assembly instruction (CMP [ESP+10h+var_4], 1211h). As you can see in this instruction, again we are looking at the address pointed to by [ESP+10h+var_4] to read the value of the variable.

The interesting thing here is that we are comparing the variable "i" with 1211h. Why is this happening? In our original C code, we compared the variable with 1212h. Well, the compiler "decided" to change the value, and to deal with this change it also changed the next instruction. In the third instruction (JLE SHORT loc_401350), instead of the JL mentioned before, we have a JLE, which means "jump if less or equal". This is how the compiler dealt with the comparison made by the second instruction. Instead of comparing with 0x1212 and "jump if less", it decided to compare with 0x1211 and "jump if less or equal", which for integers means "the same thing" in the end.

4.2. Changing the Binary

Let's suppose that you want to change some binary code that you have already located, for example, the comparison that you found before (section 4.1). As we know, the loop will continue running until the second condition (i < 0x1212) becomes false. Imagine that you do not want to enter this loop anymore, but you do not have the source code; then you have to do something with the binary.
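The loop behaviour recalled above can be checked with a small Python model of both disassemblies (Figure 5 and Figure 6). This is only a sketch: the function names are hypothetical, and the initial store of zero in the "gcc" version is an assumption, since that instruction is not part of Figure 6.

```python
def run_turbo_c_loop(limit=0x1212):
    """Follow Figure 5: XOR SI,SI / JMP / INC SI / CMP SI,1212h / JL."""
    si = 0                  # XOR SI, SI
    increments = 0
    # JMP SHORT: execution goes straight to the comparison
    while si < limit:       # CMP SI, 1212h followed by JL (jump if less)
        si += 1             # INC SI
        increments += 1
    return increments


def run_gcc_loop(limit_minus_one=0x1211):
    """Follow Figure 6: the counter stays in memory and JLE is used."""
    i = 0                   # assumed initial store of zero (not in Figure 6)
    increments = 0
    while i <= limit_minus_one:   # CMP [...], 1211h followed by JLE
        i += 1                    # INC [ESP+10h+var_4]
        increments += 1
    return increments
```

Both functions return 0x1212 (4626), which is the sense in which the two compiled loops do "the same thing".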
One of the easiest ways is to change the target of the unconditional jump (JMP) so that it points to the instruction right after the conditional jump (JL or JLE). Another possibility is to change the value of the variable before the comparison (CMP) is made. In the code compiled with "gcc", you would have to change the value pointed to by [ESP+10h+var_4].

In some cases, we may want to remove a message that appears, or maybe a window. In these cases, we usually have a CALL (the Assembly instruction that calls a subroutine) that we should "remove". Actually, we will need to change it into another instruction, as we will see in the next subsection.

4.2.1. Number of Bytes

When we want to change something in our binary, we have to pay attention to the fact that we cannot change the number of bytes. Inserting or removing bytes shifts everything that follows, so the addresses and offsets used by jumps and calls no longer match, but we are not going to discuss this in detail here.

The fact is that we have to pay attention to this if we want to make permanent changes in our binary. In section 4.2, we mentioned the JMP example. That case is really easy, because we are just going to jump to the next "label" or instruction. But when we want to omit a CALL, we have to put something there to replace the old bytes, and it must not change the behavior of the other parts of our program. Thus, it is very useful to use the instruction NOP, which does nothing! This instruction has one byte (0x90) and we can put in as many as we need without changing the number of bytes of our original file. Supposing that the original CALL has three bytes, we just replace these three bytes with three 0x90, which means executing NOP three times.

4.2.2. Sum of Bytes

Another issue that we sometimes have to deal with is the fact that some binary files have a mechanism called CHECKSUM. This kind of thing is really common in computer network protocols, for example.
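A same-length patch such as the NOP replacement from section 4.2.1 still changes the sum of the file's bytes, which is why a checksum can notice it. The Python sketch below illustrates both points. The helper names nop_out and byte_sum16, the five-byte fragment, and the "plain sum truncated to 2 bytes" scheme are all assumptions made for this example, not the format of any particular file.

```python
NOP = 0x90  # one-byte x86 instruction that does nothing


def nop_out(data, offset, length):
    """Return a copy of data with `length` bytes at `offset` replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(data):
        raise ValueError("patch range is outside the data")
    patched = bytearray(data)
    patched[offset:offset + length] = bytes([NOP]) * length  # same size in, same size out
    return bytes(patched)


def byte_sum16(data):
    """Sum of all bytes, truncated to 2 bytes (an assumed checksum scheme)."""
    return sum(data) & 0xFFFF


# Hypothetical 16-bit fragment: PUSH AX, a three-byte near CALL (E8 + offset), POP AX
original = bytes([0x50, 0xE8, 0x34, 0x12, 0x58])
patched = nop_out(original, 1, 3)   # overwrite only the three CALL bytes
```

The patched fragment has the same length as the original, but its byte sum changes from 0x01D6 to 0x0258, so any stored checksum over these bytes would have to be updated as well.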
The CHECKSUM is a sum of all the bytes of the binary file, truncated to a fixed size (e.g. 2 bytes). Let's suppose we have code with the following bytes: 0x10, 0x22, 0x35. In this case, the CHECKSUM would be 0x10 + 0x22 + 0x35 = 0x0067 (stored in 2 bytes). Usually, this field is somewhere at the end of the binary file, and if you change one of your bytes (e.g. 0x10 to 0x11), then you will also have to change the CHECKSUM field to the new value, in this case 0x0068.

References

Granlund, T. Instruction latencies and throughput for AMD and Intel x86 processors. CSC, KTH, 2009.
Tanenbaum, A.S., Woodhull, A.S. Operating Systems Design and Implementation, 3rd Edition. Prentice Hall, 2006.
Hoglund, G., McGraw, G. Exploiting Software: How to Break Code. Pearson Makron Books, 2006.
Tanenbaum, A.S. Structured Computer Organization, 5th Edition. Prentice Hall, 2005.
Schildt, H. C: The Complete Reference, 4th Edition. McGraw-Hill Osborne Media, 2000.