IT2104
Computer Organization Basics

Languages and Compilers
A computer programming language expresses a set of detailed instructions, or a standard form of commands with its own keywords and syntax, which are translated into a form that the central processing unit (CPU) of a computer can execute to perform a specific task. Numerous programming languages have been developed, and each one serves a specific purpose (JavaTpoint, n.d.). Programs and applications are written in programming languages to control the behavior and output of a computing machine through different algorithms.

Computer programming languages can be classified by their level of abstraction as either low-level or high-level. Low-level languages require programmers to manage in detail all the computer's distinctive features relative to data storage and operations. In contrast, high-level languages provide notations that programmers can easily write and read (Hemmendinger, 2021).
• Low-Level Languages – These are programming languages that deal with a computer's hardware components and constraints, and work to manage a computer's operational semantics. A low-level language is also referred to as the computer's native language; hence, it is usually represented in the form of zeros and ones. This category of programming language provides little to no abstraction, and its primary function is to operate and manage the computing hardware and components. Thus, programs and applications written in a low-level language are executable on the computing hardware with little to no interpretation or translation (Techopedia, n.d.).
   o Assembly language – It is a low-level language that uses short human-readable instructions, also known as mnemonic codes, and allows programmers to introduce names for blocks of memory that hold data. Assembly language is designed to be easily translated into machine language.
However, it still requires detailed knowledge of the internal computer architecture, and does not provide more sophisticated means of organizing complex information (Hemmendinger, 2021).
      Example: mov, add, jmp, and clr
   o Machine language – It is a low-level language composed of a set of executable instructions in binary format, also known as numeric code. Machine language is also referred to as machine code. It is difficult to read and write since it resembles neither conventional mathematical notation nor human language.
      Example: 101100111110
01 Handout 1 student.feedback@sti.edu
• High-Level Languages – These are programming languages that enable program development through a user-friendly programming context and are generally independent of the computer's hardware architecture. A high-level language has a higher level of abstraction from the computer's native language semantics and focuses on the programming logic rather than the underlying hardware components. The style and context of languages in this category are easier to understand and implement; thus, they are considered close to human language. However, every program written in a high-level language must still be translated into machine language before the computer can execute it (Techopedia, n.d.).
   Example: Java, Python, PHP, and Ruby

A compiler is computer software that translates (compiles) source code written in a high-level language into a set of low-level language instructions that a computer's CPU can understand. The formal output of the compilation process is called object code or object module. The term compiler was coined by Grace Hopper, an American computer scientist who designed one of the first compilers in the early 1950s (Encyclopedia Britannica, 2021).

Major phases in a compiler (Chakraborty, 2021):
• Scanning – The scanner reads the characters in the source code from left to right and groups them into units that have a collective meaning.
• Lexical Analysis – The compiler converts the groups of characters into a series of units known as tokens. Tokens are characterized by regular expressions that the lexical analyzer can recognize. The lexical analyzer uses a symbol table to store the words in the source code that correspond to the generated tokens.
• Syntactic Analysis – The compiler checks whether the tokens are arranged properly according to the grammar of the source language. Syntax pertains to the correct order of a set of tokens, sometimes referred to as keywords, that leads to the desired results. The process of syntax analysis is also known as parsing, which generates a parse tree.
• Semantic Analysis – This phase involves several intermediate processes. In general, it examines whether the constructed parse tree complies with the rules of the language. The meaning of the structured tokens is then interpreted by the interpreter and analyzer components to generate intermediate code, from which the object code is produced.
*Property of STI Page 1 of 4

Assembler and Object Code
Assembler
• It is a program that translates assembly language into machine code that can be executed by the computer (Stallings, 2019).
• It bridges the symbolically coded instructions written in assembly language and the computer processor, including memory and other computational components, by assembling and converting assembly code into object code composed of zeros and ones.
• It generally allows software and application developers to access, operate, and manage a computer's hardware architecture and components.
• Assemblers are classified based on the number of times they read the source code before translating it.
There are single-pass and multi-pass assemblers, and some high-end assemblers provide enhanced functionality by enabling the use of control statements, data abstraction services, and support for object-oriented programming structures (Techopedia, n.d.).
Object Code
• It is the machine language representation of program source code: a sequence of statements in binary form generated by compiling a particular source program.
• It is the output file or program of a compiler or an assembler, which a linker can then transform into an executable form (Stallings, 2019).
• Object code is stored in object files, which include instructions to be executed by the computer. Note that object files may require intermediate processing by the operating system (OS) before the hardware executes the instructions. Some examples of object files are files in the Common Object File Format (COFF) and executable files with a .exe file extension.

Linker and Executable Code
Linker
• It is a utility program that combines one (1) or more object files from separately compiled programs into a single file containing loadable or executable code (Stallings, 2019).
• The major tasks of a linker are to search for and locate referenced modules or routines in a program, and to determine the memory location where the code will be loaded.
Executable Code
• It is the output file, or program, of a linker, containing encoded instructions that the central processing unit (CPU) can directly execute (Stallings, 2019). Moreover, the code generated by a source code language processor such as an assembler or a compiler can also be considered executable code.
• Note that machine language files can be directly executable as a runnable program, or may require linking with other object code, such as libraries, to produce a complete executable program.
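
The two linker tasks described above, locating referenced routines and deciding where the code will be loaded, can be sketched with a toy model in Python. Everything named here (the `ObjectModule` class, the `link` function, the file names, and the load address 0x1000) is a hypothetical choice for illustration; real object files carry far more information, and this is not how any particular linker is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectModule:
    """Toy stand-in for an object file (hypothetical, heavily simplified)."""
    name: str
    size: int                                        # bytes of code in the module
    defines: dict = field(default_factory=dict)      # symbol -> offset inside module
    references: list = field(default_factory=list)   # symbols defined elsewhere


def link(modules, load_address=0):
    """Lay modules out in memory, then resolve every referenced symbol."""
    base = {}            # module name -> address where the module is placed
    symbol_table = {}    # symbol -> final address
    next_free = load_address

    # Pass 1: place modules one after another and record symbol addresses.
    for module in modules:
        base[module.name] = next_free
        for symbol, offset in module.defines.items():
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = next_free + offset
        next_free += module.size

    # Pass 2: look up each reference; a missing definition is a link error.
    resolved = {}
    for module in modules:
        for symbol in module.references:
            if symbol not in symbol_table:
                raise ValueError(f"undefined symbol: {symbol}")
            resolved[(module.name, symbol)] = symbol_table[symbol]

    return base, symbol_table, resolved


# Two separately "compiled" modules: main.o calls square, which math.o defines.
main_o = ObjectModule("main.o", size=32, defines={"main": 0}, references=["square"])
math_o = ObjectModule("math.o", size=16, defines={"square": 4})

base, symbols, resolved = link([main_o, math_o], load_address=0x1000)
print(hex(base["math.o"]))     # 0x1020
print(hex(symbols["square"]))  # 0x1024
```

The sketch places `math.o` directly after the 32 bytes of `main.o`, so the reference to `square` in `main.o` resolves to the base address of `math.o` plus the symbol's offset.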

Loader and Translator
Loader
• A loader is a major part of an operating system that is responsible for loading executable files, including libraries, into memory and executing them. It calculates the size of the program, creates memory space for it, and initializes various registers to initiate execution (Tutorialspoint, n.d.).
• The loading process for small programs is almost instantaneous, but for large and complex applications with multiple and/or large libraries, such as games and Computer-Aided Design (CAD) software, it takes a considerable amount of time (Techopedia, n.d.).
Translator
• It is the general term for a programming language processor that converts a computer program from one (1) language to another. It takes a program written in source code and converts it into machine code (Teach Computer Science, n.d.).
• There are three (3) types of translator: the compiler, the interpreter, and the assembler. Table 1 summarizes the comparison among the three (3) translators.

| Compiler | Interpreter | Assembler |
|---|---|---|
| Translates high-level language to low-level language | Translates high-level language to low-level language | Translates assembly language to machine language |
| Completes the translation of the whole program at a time | Completes the translation of the program line by line | Completes the translation in two (2) passes |
| Reports the errors detected after the conversion | Reports the errors detected while doing the conversion | Reports the errors during the assembly |
| The execution process is fast | The execution process is slow | The execution process is efficient |
| Translated programs are machine independent | Translated programs are machine independent | Translated programs are machine dependent |
| Utilized by C, C++, COBOL, and Pascal | Utilized by PHP, Python, Ruby, and Perl | Utilized by assembly language |

Table 1. Summary of comparison of the different types of translator.

Processor Structure
A processor is an integrated electronic circuit that performs the calculations that run a computer. Processors can be found in different electronic devices, such as smartphones, personal computers, printers, and even routers. The purpose of a processor is to receive input as program instructions and execute millions of calculations. For every operation performed on a computing device, such as opening an application or duplicating a file, the processor interprets program instructions (Techopedia, n.d.).

Logically distinct functional components of a processor (Ledin, 2020):
• Control Unit (CU) – This manages the overall operation of the processor. It controls the movement of data and instructions in and out of the device, including the operation of the arithmetic logic unit. The control unit of a modern processor is a synchronous sequential digital circuit.
• Arithmetic Logic Unit (ALU) – This is a combinational circuit that executes the actual computations and bit manipulation operations in a processor under the direction of the control unit. The ALU requires input data values, called operands, and a specific code indicating the operation to be performed.
• Register Set – This is temporary internal storage that serves as the source and destination locations for instruction operations. Registers provide the fastest data access in a processor, but are limited to a very small number of locations. Note that the width of a register in bits is generally the same as the processor word size.

Operations that are performed by a processor (Stallings, 2019):
• Fetch instruction: The processor reads the instruction from memory, which can be a register, cache, or the main memory.
• Interpret instruction: The instruction is decoded to determine the required actions.
• Fetch data: The execution of an instruction may require reading data from memory or an input/output (I/O) module.
• Process data: The execution of an instruction may require some arithmetic or logical operation on data.
• Write data: The result of an execution may require writing data to memory or an I/O module.

Figure 1. A simplified internal structure of a processor.
Source: Computer Organization and Architecture: Designing for Performance 11th edition, 2019 p. 656

Figure 1 shows the major components of a processor and specifies the data transfer and logic control paths. The component labeled as Internal CPU bus is needed to transfer data between the various registers and the ALU. In addition, the figure contains some basic elements of an ALU.

Number Systems Review
• A bit is the smallest unit of information in a digital computer. It represents a discrete data element containing the value zero (0) or one (1). Bits are numbered individually within a binary number, with bit zero as the rightmost and least significant bit (LSB).
• A byte is composed of eight bits placed together to form a single value. It is the smallest unit of information that can be read from or written to computer memory by most modern processors. The following illustration contains an example of a bit and a byte.
• In a positional number system, the value that each digit contributes is determined by the position the digit holds, and not by the digit alone. The placement of digits works through the use of a base number with a series of exponents applied to the base. Note that the decimal number system has a base of 10; the binary number system has a base of 2; and the hexadecimal number system has a base of 16 (Coughlan, n.d.).
• The binary number system, also known as the base-2 number system, is a system that represents counting numbers by using the numerals 0 and 1. Any number of bits, $n$, can take on $2^n$ values. Thus, 1 byte, which contains 8 bits, can take on $2^8$ or 256 different combinations of values (Ledin, 2020). As an example, the binary number $10011_2$ is equivalent to $19_{10}$ in decimal: $1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 16 + 2 + 1 = 19$.
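
The positional expansion described above, where each digit is multiplied by the base raised to the power of its position, can be sketched in Python. The function name `positional_value` and the `DIGITS` constant are illustrative assumptions rather than part of any library; Python's built-in `int(text, base)` performs the same conversion and is used here only as a cross-check.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def positional_value(numeral: str, base: int) -> int:
    """Return the value of `numeral` written in `base` (2 to 16)."""
    total = 0
    # Position 0 is the rightmost (least significant) digit.
    for position, symbol in enumerate(reversed(numeral.upper())):
        digit = DIGITS.index(symbol)  # raises ValueError for unknown symbols
        if digit >= base:
            raise ValueError(f"digit {symbol!r} is not valid in base {base}")
        total += digit * base ** position
    return total


print(positional_value("10011", 2))   # 19
print(positional_value("2B9F", 16))   # 11167
print(2 ** 8)                         # 256 values fit in one byte

# Cross-check against the built-in conversion.
assert positional_value("10011", 2) == int("10011", 2)
```

Numbering the positions from the right mirrors the convention stated earlier that bit zero is the least significant bit.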
• The hexadecimal number system, also known as the base-16 number system, represents binary numbers that are separated into groups of 4 bits. Since there are 4 bits in a hexadecimal group, the number of possible binary combinations that a group can take on is $2^4$ or 16. The first 10 of these 16 values are assigned to the digits 0 to 9, while the last six (6) are assigned to the letters A to F (Ledin, 2020). As an example, the hexadecimal number $\text{2B9F}_{16}$ is equivalent to $11167_{10}$ in decimal: $2 \times 16^3 + 11 \times 16^2 + 9 \times 16^1 + 15 \times 16^0 = 8192 + 2816 + 144 + 15 = 11167$. As an additional example, the binary number $11101010_2$ can be represented more compactly by breaking it into two 4-bit groups (1110 and 1010) and writing them as the hexadecimal digits $\text{EA}_{16}$. This illustrates why raw binary values are less common in programming and why the hexadecimal number system is favored for its compactness.
• Binary complements are used by digital computers for logical manipulation and to simplify the addition and subtraction operations. There are two (2) types of complements (Reddy, 2019):
   o 1's complement – The only process involved in this operation is the inversion of every bit of the given binary number. It is seldom used for representing signed (positive or negative) binary numbers since $0_{10}$ has two representations in 1's complement: $-0_{10}$, which is represented with all 1s (e.g., $11111_2$ in a 5-bit register), and $+0_{10}$, which is represented with all 0s (e.g., $00000_2$ in a 5-bit register).
      Example: The 1's complement of 1001 0011 is 0110 1100.
   o 2's complement – The processes involved in this operation are the inversion of the given binary number and the addition of 1 to the LSB. It is widely used for representing signed (positive or negative) binary numbers. The most significant bit (MSB) of a 2's complement data value is the sign bit, which can be either 0, representing a positive value (+), or 1, representing a negative value (–).
Note that in 2's complement, $0_{10}$ has only one (1) representation, which is always considered positive (e.g., $00000_2$ in a 5-bit register).

References:
Britannica, T. Editors of Encyclopedia. (2021, May 4). Compiler. Retrieved on June 30, 2021 from Encyclopedia Britannica: https://www.britannica.com/technology/compiler
Chakraborty, K. (2021, February 22). Compiler. Retrieved on July 6, 2021 from https://www.techopedia.com/definition/3912/compiler
Coughlan, D. (n.d.). Positional number systems. Retrieved on July 12, 2021 from https://courses.lumenlearning.com/collegesuccess2x48x115/chapter/positional-number-systems/
Encyclopedia.com. (n.d.). Binary number system. Retrieved on July 12, 2021 from https://www.encyclopedia.com/computing/news-wireswhite-papers-and-books/binary-number-system
Encyclopedia.com. (n.d.). Hexadecimal notation. Retrieved on July 12, 2021 from https://www.encyclopedia.com/computing/dictionariesthesauruses-pictures-and-press-releases/hexadecimal-notation
Hemmendinger, D. (2021, January 29). Computer programming language. Retrieved on June 30, 2021 from Encyclopedia Britannica: https://www.britannica.com/technology/computer-programming-language
JavaTpoint. (n.d.). What is a programming language? Retrieved on June 30, 2021 from https://www.javatpoint.com/classification-ofprogramming-languages
Ledin, J. (2020). Modern computer architecture and organization. Packt Publishing.
Reddy, A. (2019, February 21). 1's complement vs 2's complement. Retrieved on July 13, 2021 from https://www.tutorialspoint.com/1-scomplement-vs-2-s-complement
Stallings, W. (2019). Computer organization and architecture: Designing for performance (11th ed.). Pearson Education, Inc.
Teach Computer Science. (n.d.). Translators. Retrieved on July 8, 2021 from https://teachcomputerscience.com/translators/
Techopedia. (n.d.). Assembler. Retrieved on July 6, 2021 from https://www.techopedia.com/definition/3971/assembler
Techopedia. (n.d.). High-level language (HLL). Retrieved on June 30, 2021 from https://www.techopedia.com/definition/3925/high-levellanguage-hll
Techopedia. (n.d.). Low-level language. Retrieved on June 30, 2021 from https://www.techopedia.com/definition/3933/low-level-language
Techopedia. (n.d.). Processor. Retrieved on July 9, 2021 from https://www.techopedia.com/definition/28254/processor
Tutorialspoint. (n.d.). Compiler design – overview. Retrieved on July 7, 2021 from https://www.tutorialspoint.com/compiler_design/compiler_design_overview.htm