IB Comp Sci HL1

Chapter 01 - The Big Picture

1.1 Computing Systems (pg 5-8)
Computing System - dynamic entity that solves problems and interacts with its environment; comprised of hardware, software, and data
Computer Hardware - the collection of physical components that make up the machine: boxes, circuit boards, chips, wires, disk drives, keyboards, monitors, printers
Computer Software - programs that provide the instructions that a computer carries out

Layers of a Computing System - a computing system is like an onion
- Innermost layer - the data in the computer, stored in binary
- Hardware - the physical hardware of the system, which uses gates and circuits
- Software - can take many forms; used to solve problems and make things happen
  - Operating System - allows users to interact with and manage the other parts of the computer - ex. macOS, Windows, Linux
  - Applications - used to solve real-world problems rather than to make the system itself work

Abstraction
Abstraction - a way to think about something that omits complex details
Miller's Law - humans can only hold about seven pieces of information in short-term memory at a time
Information Hiding - eliminates the need for one part of a program to access information in another part

1.2 The History of Computing (pg 9-29)

A Brief History of Computing Hardware
Early History
- Jacquard's loom - for weaving cloth; introduced input through punched cards
- Charles Babbage's analytical engine - vision of memory so values did not have to be re-entered
- Ada Lovelace - extended Babbage's work and conceived the loop
- Alan M. Turing's Turing machine - the foundation for computing theory
- Significant early computers - the Colossus (1943), the first all-electronic programmable digital computer; Harvard Mark I (1944); ENIAC (1946); EDVAC (1950); UNIVAC 1 (1951), the first commercial computer, used to predict the outcome of a presidential election

First Generation (1951-1959)
- Used vacuum tubes to store information
  - Generated a lot of heat; not very reliable
  - Required heavy-duty air conditioning, large rooms, and frequent maintenance
- Magnetic drums for memory - rotated under a read/write head
- Input device - punched IBM card
- Output device - magnetic tape drive
  - Data on tape is stored linearly, so it must be accessed linearly
- Storage devices - auxiliary storage devices (ex. magnetic tape)
- Peripheral devices - input devices, output devices, and auxiliary storage

Second Generation (1959-1965)
- Transistor - replaced the vacuum tube as the main component
  - Smaller, faster, cheaper, and more durable
- Immediate Access Memory - magnetic cores
  - Tiny donut-shaped devices, each storing one bit of information
  - Strung together to form cells; cells combined to form a memory unit
  - Motionless and electronic, so information access was instantaneous
- Magnetic Disk - new auxiliary storage device
  - Faster than magnetic tape because each item is accessed directly by referring to its location on the disk
  - Each piece of data has its own identifier (called an address)

Third Generation (1965-1971)
- Transistors and other components had been assembled by hand onto PCBs
- Integrated circuits (ICs) - solid pieces of silicon that contained transistors, other components, and their connections
- Moore's Law - the number of circuits on an integrated circuit would double each year
- Transistors used for memory construction - each transistor represents one bit; IC technology allowed memory boards built of transistors
- Auxiliary storage is still needed because transistor memory is volatile (information disappears when power is lost)
- Terminal - input/output device with keyboard and screen; gave the user direct access to the computer and an immediate response

Fourth Generation (1971-?)
- Large-scale integration - from thousands of transistors to whole microcomputers on chips; main memory devices remained off-chip
- Moore's Law revised to compound every 18 months
- PCs became common - almost anyone could have a microcomputer
  - New names - Apple, Tandy/Radio Shack, Atari, Commodore, and Sun - joined older ones: IBM, Remington Rand, NCR, DEC, Hewlett-Packard, Control Data
  - IBM PC (1981) - followed by Dell, Compaq, and Apple (1984)
- Workstations - more powerful, meant for business use; networked by cables
  - Often RISC (reduced-instruction-set computer) machines - a computer understands only its own machine language, and RISC machines use a small set of simple, fast instructions
  - Called UNIX workstations because they run UNIX
- New Moore's law: "Computers will either double in power at the same price or halve the cost for the same power every 18 months"

Parallel Computing
- Relies on interconnected CPUs - either all processors use the same memory unit, or each processor has its own memory and communicates over an internal network
- Increased speed - steps in a program can be separated into pieces and executed simultaneously
- SIMD (single-instruction, multiple-data-stream) - the same instruction is applied to multiple data streams at once
- MIMD (multiple-instruction, multiple-data-stream) - processors work on different parts of the program simultaneously; must be programmed differently than sequential programs

Networking
- Ethernet - cheap coaxial cable to connect machines
- Using a server rather than storing locally allowed for central control with autonomy
- Networked workstations or PCs are known as LANs (local area networks)
- Internet - descended from ARPANET, a government-sponsored network in the late 1960s; uses packet switching
- Global networks use TCP/IP (Transmission Control Protocol/Internet Protocol)

Cloud Computing - use of computing resources over the internet instead of on physical devices on site
- Data center - a separate business providing computing hardware that another business uses
- Offloads managing infrastructure, but communication can take time

A Brief History of Computing Software
First Generation Software (1951-1959)
- Programs written in machine language, using binary
- Machine language was tedious to write, so programmers began writing in assembly language instead
assembly language - uses mnemonic codes to represent machine-language instructions
- Software translators (ex. the assembler) were used to translate assembly language into machine code
- The people who wrote these tools were the first systems programmers

Second Generation Software (1959-1965)
- High-Level Languages - allowed instructions to be written in English-like statements
  - FORTRAN - numerical applications; grew over the years after release
  - COBOL - business applications; designed first, then implemented
  - Lisp - AI applications
- HLLs allowed the same program to run on multiple machines, through use of a compiler
  - The compiler checks the syntax of the HLL
- Applications programmers wrote programs

Third Generation Software (1965-1971)
- The first two generations used loaders to load programs and linkers to link programs
- Third generation: utility programs were put under the control of the OS
- Systems software - utility programs, OS, assemblers, and compilers
- Time sharing - splitting central processing time amongst multiple users
- General-purpose application programs - ex. SPSS (statistical package for the social sciences) - allowed the user to describe data and compute statistics
- Systems programmers were now writing software for nonprogrammers

Fourth Generation (1971-1989)
- Structured programming - logical, disciplined approach - Pascal, Modula-2, and updated BASIC
- C and C++
- UNIX, MS-DOS, and other OSes became further developed
- New application types introduced - spreadsheets, word processors, and database management systems

Fifth Generation (1990-Present)
- Microsoft became a dominant player in computer software; object-oriented design/programming and the WWW took hold
- Application programs bundled together into office suites
- Object-oriented design - structured hierarchy of data objects
- Tim Berners-Lee - technical rules for the World Wide Web; created HTML and a rudimentary text-only browser
- Mosaic - first graphics-capable browser
- Netscape Navigator vs. IE browser war and lawsuit - ended with IE winning, and other browsers being released later
- The fifth generation brought recreational web browsing and everyday computer usage

Chapter 02 - Binary Values and Number Systems

2.1 Numbers and Computing
Number - unit belonging to an abstract mathematical system, subject to laws of succession, addition, and multiplication
Natural number - 0 or any number obtained from 0 in increments of 1
Negative numbers - numbers less than 0
Integers - the natural numbers and their negatives
Rational numbers - integers or the quotient of two integers (fractions)

2.2 Positional Notation
Base - specifies the number of digits used throughout the system; digits always begin with 0 and continue through one less than the base
Positional notation - each digit has a place value, and the number is the sum of each digit multiplied by the base raised to the power of its place (ex. the three-digit number 943 in base x can be represented as 9 * x^2 + 4 * x^1 + 3 * x^0)

Binary, Octal, and Hexadecimal
Binary (base-2, 2 digits), Octal (base-8, 8 digits), Decimal (base-10, 10 digits), Hexadecimal (base-16, 16 digits)
Digit positions start with index 0
Hex digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Octal 754 = decimal 492
Hexadecimal ABC = decimal 2748
Binary, Octal, and Decimal conversion chart (figure)
Octal-binary conversion - convert each octal digit to its three-bit binary equivalent and string the results together
Algorithm - logical sequence of steps that solves a problem
Example: algorithm to convert from decimal to hexadecimal (repeatedly divide by 16 and collect the remainders)

Binary Values and Computers
- Storage location - binary digit, or bit - must be a 0 or 1, no empties
- 8 bits make a byte; bytes are combined to make words
- The number of bits in a word is the word length
- Modern computers are 32- or 64-bit
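A minimal sketch (in Python, my own illustration rather than anything from the textbook) of the decimal-to-hexadecimal algorithm mentioned above: repeatedly divide by the base and collect the remainders, which come out least significant first.

    HEX_DIGITS = "0123456789ABCDEF"

    def decimal_to_base(value, base):
        """Convert a non-negative decimal integer to a string in the given base
        (2-16) by repeatedly dividing by the base and collecting remainders."""
        if value == 0:
            return "0"
        digits = []
        while value > 0:
            value, remainder = divmod(value, base)
            digits.append(HEX_DIGITS[remainder])
        return "".join(reversed(digits))  # remainders appear least significant first

    # Examples matching the notes: 492 decimal is 754 octal, 2748 decimal is ABC hex
    print(decimal_to_base(492, 8))    # 754
    print(decimal_to_base(2748, 16))  # ABC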
Chapter 03 - Data Representation

3.1 Data and Computers
Data - basic values or facts
Information - data that has been organized or processed in a useful way
multimedia devices - deal with multiple types of data (numbers, text, audio, images and graphics, video)
data compression - reducing the storage needed for a piece of data
bandwidth - the maximum amount of data that can be transmitted from one place to another in a fixed amount of time
compression ratio - how much compression occurs (compressed size divided by original size)
lossless compression - the data can be retrieved without losing any of the original information
lossy compression - information is lost in the process of compaction

Analog and Digital Data
analog data - a continuous representation, analogous to the actual information
digital data - the information broken into separate elements
digitize - break data into pieces
pulse-code modulation (PCM) - a digital signal that jumps between two extremes
reclocking - periodically reasserting a digital signal so it regains its original shape before degradation makes it unreadable

3.2 Representing Numeric Data
Binary can be used to represent data (00 = park, 01 = drive, 10 = neutral, 11 = reverse)
signed magnitude representation - a +/- sign denotes the sign of the number
fixed-size numbers - half of the values represent negative numbers (ex. with a maximum of 2 digits, 1-49 are positive and 50-99 stand for negatives)
ten's complement - (-3) = 10^2 - 3 = 97
two's complement - the binary version of this idea; with an 8-bit number, the leftmost bit carries the sign and the remaining seven carry the value
overflow - the computed value can't fit into the number of bits allocated for the result
radix point - the "decimal point" for binary numbers
floating point - the number of digits is fixed but the radix point floats: a value is stored as sign * mantissa * 10^exp (or, in binary, sign * mantissa * 2^exp)
scientific notation - an alternative floating-point representation

3.3 Representing Text
character set - a list of characters and the codes used to represent each one
ASCII (American Standard Code for Information Interchange)
- Originally used 7 bits to represent each character (128 max)
- The 8th bit was a check bit that ensured proper data transmission
- A later evolution (the Latin-1 ASCII extended character set) allowed for 256 characters, the first 32 of which are nonprintable control characters
Unicode - extended the ASCII idea to international use, usually using 16 bits to represent a much wider variety of characters

Encoding Text
- Keyword encoding - replaces frequently used words with single characters - but there is no good way to distinguish the actual use of those symbols in the text from the compressed words
- Run-length encoding - compresses runs of repeating characters using a flag character, making each long run smaller
  - nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee = *n5*x9ccc*h6 some other text *k8eee
- Huffman encoding - assigns shorter bit strings to more frequently used characters, so no unused bits are wasted and the text takes up less space
  - DOORBELL = 1011110110111101001100100
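A small Python sketch of the run-length encoding scheme above, using '*' as the flag character followed by the repeated character and a single-digit count; runs of three or fewer are left alone because compressing them would not save space. The function name and the single-digit cap are my own assumptions for illustration, not the textbook's code.

    def run_length_encode(text, flag="*"):
        """Compress runs of repeated characters as flag + character + count."""
        result = []
        i = 0
        while i < len(text):
            run = 1
            # count the run, capping at 9 so the count stays a single digit
            while i + run < len(text) and text[i + run] == text[i] and run < 9:
                run += 1
            if run > 3:                        # compressing shorter runs would not save space
                result.append(flag + text[i] + str(run))
            else:
                result.append(text[i] * run)
            i += run
        return "".join(result)

    print(run_length_encode("nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee"))
    # *n5*x9ccc*h6 some other text *k8eee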
3.4 Representing Audio Data
- Sound is perceived when waves of compressed air hit the eardrum
- A digital representation measures the voltage of the sound wave at a set sampling rate (around 40,000 Hz, high enough that the human ear interprets the result as continuous sound)
- An analog representation of a sound wave is a vinyl record (the grooves vary to trace the sound wave they represent)
- A digital representation of a sound wave is a CD; the surface of the CD has microscopic pits that represent binary digits, read by a laser

Audio Formats - MP3 - MPEG-2, audio layer 3 file
- Favoured due to its high compression ratio
- Discards audio that is not hearable by humans
- Compressed using a form of Huffman encoding

3.5 Representing Images and Graphics
Representing Color
- Color is the perception by the human eye of frequencies of light
- RGB - each of the 3 values represents the contribution of one primary color, on a 0-255 scale
- color space - a three-dimensional model used to represent color
- color depth - the amount of data used to represent a color
- high color - 16-bit color depth: 5 bits for each color value, with the extra bit for transparency
- true color - 24-bit color depth: 8 bits for each color value

Digitized Images and Graphics
pixels - picture elements, each a single color
resolution - the number of pixels used to represent an image
raster graphics - images stored pixel by pixel; have a maximum useful resolution, beyond which the human eye begins to see the individual pixels that make up the image (ex. JPEG, PNG, GIF)
vector graphics - the image is stored in terms of geometric shapes, so it can be scaled up or down as much as needed without losing resolution (ex. SVG)
metadata - describes the details of the image: size, resolution, location where it was taken, creator of the image, copyright status

Image formats:
- GIF - only 256 colors, but those 256 can change (indexed color - a changing set of colours); optimal for line art and small animations
- JPEG - averages the color hues over short distances; superior for photographic color
- PNG - created to improve on GIF; APNG allows for animation
- SVG - represented in plaintext; well supported by browsers
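Relating the RGB values and color depth above to bits: a Python sketch that packs three 8-bit components into one 24-bit true-color value. The bit layout (red in the most significant byte) is a common convention I'm assuming, not something these notes specify.

    def pack_true_color(red, green, blue):
        """Pack three 0-255 components into a 24-bit true-color value
        (8 bits per component, red in the most significant byte)."""
        assert all(0 <= c <= 255 for c in (red, green, blue))
        return (red << 16) | (green << 8) | blue

    def unpack_true_color(value):
        """Recover the three 8-bit components from a 24-bit value."""
        return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

    magenta = pack_true_color(255, 0, 255)
    print(hex(magenta))                 # 0xff00ff
    print(unpack_true_color(magenta))   # (255, 0, 255)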
3.6 Representing Video
codec - COmpressor/DECompressor
video codec - shrinks the size of the movie; uses lossy compression
temporal compression - removes information that remains constant between frames, so space isn't wasted on duplicated data
spatial compression - combines identical regions within a frame into larger chunks, similar to run-length encoding

Chapter 04 - Gates and Circuits (93-113)

4.1 Computers and Electricity
gate - device that performs a basic operation on electrical signals
circuit - gates combined to perform more complicated tasks
Boolean algebra - variables can only be 1 or 0
logic diagram - graphical representation of a circuit
truth table - lists all possible input combinations of a gate

4.2 Gates
NOT Gate - accepts one input value and produces one output value; inverts its single input value; represented by ' (the complement)
AND Gate - expressed with a dot (∙), an asterisk (*), or just assumed (AB); produces 1 only if both input values are 1
OR Gate - expressed with a plus (+); produces 1 if one or the other or both input values are 1
XOR Gate - expressed with a circled plus (⊕); produces 1 if one or the other (but not both) input values are 1
NAND Gate - no Boolean symbol of its own; produces the opposite result of an AND gate
NOR Gate - no Boolean symbol of its own; produces the opposite result of an OR gate
Gates with More Inputs
- Gates can accept more than two input values
- A 3-input AND produces 1 only if all inputs are 1
- n input values give 2^n possible input combinations

4.3 Constructing Gates
Transistors
transistor - device that acts as either a conductor or a resistor depending on the input signal
semiconductor - neither a good insulator nor a good conductor; often silicon
- three terminals: the source (produces a high voltage value, ~5 volts), the base, and the emitter (connected to a ground wire)
- when the source is grounded, the output is 0 volts

4.4 Circuits
combinational circuit - the input values explicitly determine the output
sequential circuit - the input values and the existing state determine the output, usually involving storage of information

Combinational Circuits
- Gates combined into circuits
- Can be represented by truth tables, with the algebraic operations as column titles, ex. for A(B + C):

A  B  C  B+C  A(B+C)
0  0  0   0     0
0  0  1   1     0
0  1  0   1     0
0  1  1   1     0
1  0  0   0     0
1  0  1   1     1
1  1  0   1     1
1  1  1   1     1

circuit equivalence - different circuits that produce the same result (ex. A(B + C) = AB + AC)
Boolean Algebra Laws (table)

Adders
adder - circuit that performs addition on binary values
half adder - adds two bits and produces the correct carry bit
full adder - also takes into account the carry-in value
Half adder truth table and circuit diagram (figure)
Full adder truth table and circuit diagram (figure)
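A Python sketch of the half adder and full adder described above, written as Boolean expressions instead of a circuit diagram: the sum bit is the XOR of the inputs and the carry bit is the AND, and a full adder chains two half adders with an OR for the carries. This is an illustration of the truth tables, not the textbook's own code.

    def half_adder(a, b):
        """Add two bits: sum = A XOR B, carry = A AND B."""
        return a ^ b, a & b

    def full_adder(a, b, carry_in):
        """Add two bits plus a carry-in using two half adders and an OR."""
        s1, c1 = half_adder(a, b)
        s2, c2 = half_adder(s1, carry_in)
        return s2, c1 | c2

    # Print the full-adder truth table
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                s, cout = full_adder(a, b, cin)
                print(a, b, cin, "->", s, cout)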
Multiplexers
multiplexer - (mux) circuit that produces a single output signal equal to one of several input signals, selected by select signals on select control lines
- the binary values on the n select control lines determine which of the 2^n input lines is passed to the output
demultiplexer - (demux) takes a single input and routes it to one of 2^n outputs, based on the values of n control lines
Figure: a multiplexer with three select control lines; the 3 binary select digits choose among the eight inputs numbered 0-7

4.5 Circuits as Memory
- Gates can be used in a sequential circuit so that the output of one serves as the input of the next
- S-R latch - has two outputs, X and Y, which are always opposites (X = 0, Y = 1, or X = 1, Y = 0)
  - X is the current state of the circuit
  - X is set to 1 by momentarily setting S to 0 while R is 1, and set to 0 by doing the reverse

4.6 Integrated Circuits
integrated circuit - aka chip; a piece of silicon with multiple gates embedded in it; each pin connects to the input of a gate, the output of a gate, power, or ground
SSI chip with 4 NAND gates (figure)

4.7 CPU Chips
- The most important IC is the CPU
- An advanced circuit with input and output lines
- Contains pins that communicate with many other advanced circuits such as memory and I/O devices
- Leads to component architecture, bringing back abstraction

Chapter 05 - Computing Components (93-113)

5.1 Individual Computer Components
- CPU
  - Clock - rapid pulse used to coordinate all the actions of a computer
  - Clock speed - (Hz) the rate at which the clock pulses
  - FSB (front-side bus) - the connection through which the processor communicates with the world outside the chip; only used when the processor needs something not found in its cache
  - Cache - small, fast memory built into the processor chip; accessible by the processor without using the FSB
- GPU (graphics processing unit) - can be more powerful than the main processor; keeps the data for the screen in its own memory
- RAM (random access memory)/main memory - SDRAM (synchronous, dynamic RAM); each byte of RAM can be accessed directly rather than one at a time
- HDD (hard disk drive)
- SSD (solid-state disk) - similar to RAM, but data isn't lost when power is cut
- Tradeoffs are everywhere
  - faster processor = more power, leading to overheating and possible shutdown
  - faster FSB = faster (and more expensive) circuitry in the other devices it interfaces with
  - bigger cache = slower access to that cache = slower processor
The prefixes of computing (table)

5.2 The Stored-Program Concept
von Neumann Architecture - named for John von Neumann
- The realisation in 1944-45 that the data and the instructions that manipulate the data are logically the same and could be stored in the same place (John Mauchly and J. Presper Eckert)
- The units that process information and those that store it are different
- Five components
  - Memory unit - holds data and instructions
    - addressability - the number of bits in each addressable location of memory
  - Arithmetic/logic unit - performs arithmetic and logic operations on data
    - register - small storage area in the CPU used to store values or special data that is needed immediately
      (ex. in One * (Two + Three), the result of Two + Three needs to be stored temporarily)
  - Input unit - moves data from the outside world into the computer - punch cards, keyboard, mouse, scanner
  - Output unit - moves results from inside the computer to the outside world - printer, display
  - Control unit - the stage manager that ensures all the other components act in concert
    - instruction register (IR) - contains the instruction that is being executed
    - program counter (PC) - contains the address of the next instruction to be executed
- Connected through a set of wires called the bus
  - bus width - the number of bits that can be transferred through the bus simultaneously
- cache memory - a small amount of fast-access memory where frequently used data is stored
- pipelining - splitting instructions into smaller steps so they can be overlapped, speeding up the fetch-execute cycle
- motherboard - the PCB that holds all of the von Neumann machine components, with connections for other components to attach to the bus
- "n-bit processor" refers to the number of bits in the CPU's general registers

The Fetch-Execute Cycle - 4 main steps
1. Fetch the next instruction
   - the program counter contains the address of the next instruction
   - a copy of the instruction is made and placed in the instruction register
2. Decode the instruction
   - the instruction is turned into control signals
   - the logic of the circuitry determines which operations need to be performed
3. Get data if needed
   - if the instruction needs additional data from memory, it is retrieved
4. Execute the instruction
   - signals are sent to the arithmetic/logic unit to carry out the processing

RAM and ROM
ROM - read-only memory; permanent and can't be altered
burning - storing a bit pattern on a ROM
RAM - random-access memory; the contents of each location are changeable
RAM is volatile, ROM is not (its contents don't disappear when power is cut)

Secondary Storage Devices
secondary/auxiliary storage devices - storage other than the main memory
mass storage device - a secondary storage device that can store large quantities of data
- Magnetic tape - often used to back up data on a disk
  - to access data in the middle, all the data before it must be read and discarded
- Magnetic disks
  - disk drive - a cross between a compact disk player and a tape recorder
  - track - concentric circles around the surface of the disk
  - sector - a divided section of each track
  - block - the continuous sequence of bits held in each sector
  - seek time - the time for the read/write head to get into position over the track
  - latency - the time for the specified sector to spin to the read/write head
  - hard disk - consists of several platters stacked together
- CDs and DVDs
  - CD - compact disk - the track spirals from the inside to the outside; rotation speed varies because the data is evenly packed along the track
  - CD-DA - compact disk-digital audio
  - CD-ROM - same format as CD-DA but stores data permanently (read-only)
  - CD-R - recordable; writable but not rewritable (CD-RW is rewritable)
  - DVD - digital versatile disk - higher storage capacity
    - DL - dual layer, doubled capacity; R - recordable; RW - rewritable
  - Blu-ray - read with a blue laser instead of a red one; higher storage capacity
- Flash Drives
  - Introduced in 1998 as a floppy disk alternative
  - Faster and consumes less power than a hard disk

Touch Screens
- resistive touch screen - two layers of conductive material, one with vertical lines and one with horizontal lines, separated by a thin air gap; when pressed, the top and bottom layers make contact and current flows
- capacitive - a conductive laminate over the whole screen; a small current flows across the whole screen and into the finger from each corner when touched
- infrared - crisscrossing beams of infrared light; breaking a beam registers the touch
- surface acoustic wave (SAW) - projects high-frequency sound waves across the screen in a manner similar to the infrared screen; a touch interrupts the waves

5.3 Embedded Systems
embedded system - any computer that is preprogrammed to perform a dedicated or narrow range of functions as part of a larger system
- often programmed in assembly language to reduce the space needed

5.4 Parallel Architectures
If a problem can be solved in n time units on a computer with one processor, can it be solved in n/2 time units on a computer with two processors, or n/3 on a computer with three processors?

Parallel Computing
- bit level - on an 8-bit processor, a 16-bit operation would require two operations; increasing the word size reduces the number of operations on data values - leads to 64-bit processors
- instruction level - some instructions can be carried out independently, in parallel
  - superscalar processor - able to recognise these situations and take advantage of them - not multiple processors but multiple execution resources (execution units)
- data level/synchronous processing - a single set of instructions runs on different data sets at the same time
  - SIMD - single instruction, multiple data - one control unit directs multiple ALUs
- task level - different processors execute different tasks on the same or different data sets - each processor does one step and hands off its result to the next stage
- shared-memory parallel processor - each processor has a local memory, and processors communicate through shared memory

Classes of Parallel Hardware
- Symmetric multiprocessor (SMP) - multiple identical cores with shared memory and a bus to connect them
- Distributed computer - multiple machines with their own memory units, connected through a network
- Cluster - a group of stand-alone machines connected through a commercial off-the-shelf (COTS) network
- Massively parallel processor - many processors (more than 1,000) connected through a specialized network
These levels are mixed more and more in modern computers.

06 - Low-Level Programming Languages and Pseudocode (151-182)

6.1 Computer Operations
computer - programmable electronic device that can store, retrieve, and process data
programmable - the instructions that manipulate data are stored within the machine
process - performing arithmetic and logical operations on data values

6.2 Machine Language
machine language - the actual instructions used by the computer, in binary
assembly language - a slightly more abstract programming language which uses mnemonics
mnemonics - used to represent the individual parts of machine-language instructions
- each machine-language command performs one very specific, low-level task
ex. adding two numbers together (each number represented in binary):
1. enter the first number into the accumulator
2. add the second number to it
3. save the result
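A toy Python illustration of the three machine-language steps above, using a dictionary as 'memory' and a variable as the accumulator; the addresses and values are made up, and a real machine would encode each step in binary.

    memory = {0: 7, 1: 5, 2: 0}   # two operands and a slot for the result
    accumulator = 0

    # 1. enter the first number into the accumulator
    accumulator = memory[0]
    # 2. add the second number to it
    accumulator = accumulator + memory[1]
    # 3. save the result
    memory[2] = accumulator

    print(memory[2])  # 12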
Pep/9: A Virtual Computer
VMs - can be used to emulate the functions of real computers without requiring the full hardware and software to support them; often used by developers or manufacturers to test software or hardware
Pep/9
- 65,536 bytes of storage, numbered in decimal from 0 to 65,535
- 2-byte/16-bit word length
- 7 registers, 3 of which we focus on:
  - PC (program counter) - contains the address of the next instruction to be executed
  - IR (instruction register) - contains a copy of the instruction being executed
  - A (accumulator) - holds data and the results of operations
Instruction format - two parts of an instruction
- 8-bit instruction specifier
- 16-bit operand specifier (optional)
- opcodes (operation codes) vary from 4 to 8 bits
Immediate addressing mode - the operand specifier itself contains the value to be used
Direct addressing mode - the operand specifier contains the memory address of the value to be accessed

Sample Instructions
Opcode  Meaning of Instruction
0000    Stop execution
1100    Load word into the A register
1101    Load byte into the A register
1110    Store word from the A register
1111    Store byte from the A register
0110    Add the operand to the A register
0111    Subtract the operand from the A register

0000 Stop execution - halt the program - unary instruction, no operand specifier
1100 Load word into the A register - loads one word (two bytes); behavior changes depending on the addressing mode
1101 Load byte into the A register - similar to 1100, but loads 1 byte instead of 2
1110 Store word from the A register - stores the A register to the memory location given by the operand
1111 Store byte from the A register - similar to 1110, but stores 1 byte instead of 2
0110 Add the operand to the A register - uses the addressing mode specifier, so it gives different results for different modes
0111 Subtract the operand from the A register - like add, except the operand is subtracted from the A register

Input and Output in Pep/9
memory-mapped I/O - wires input and output devices to specific, fixed addresses in main memory
Pep/9 uses ASCII to represent characters

6.3 A Program Example
Pep/9 Simulator
1. Input the hexadecimal code for the program into a window, with two z's to end the program
2. Load the program into memory through the menu
3. Execute the program
(Example input and output shown in figures)
Another Machine-Language Example - examples of simple machine-language programs include writing text to the screen or manipulating inputted text
(Example input and output shown in figures)

6.4 Assembly Language
1. The program is written in assembly language
2. It is run through the assembler, which translates the assembly language into machine code
3. The machine code can then be run by the computer
assembler - a program that translates an assembly-language program into machine code
assembler directives - instructions to the assembler (ex. signalling the end of the program)
instructions - used to manipulate data (ex. storing a word in memory)
Pep/9 Assembly Language
Example program - print "Hi":

LDBA 0x0048,i   ; Load 'H' into accumulator
STBA 0xFC16,d   ; Store accumulator to output device
LDBA 0x0069,i   ; Load 'i' into accumulator
STBA 0xFC16,d   ; Store accumulator to output device
STOP
.END

- The operand is specified by 0x and a hexadecimal value, then a comma, then the addressing mode (i for immediate, d for direct)
- LDWA = load word; LDBA = load byte
- .END = end of program
- .ASCII, with an operand such as "banana\x00", represents a string of ASCII characters
- .WORD, with an operand such as 0x008B, reserves a word in memory and stores a value in it
- .BLOCK, with the number of bytes as its operand, reserves that many bytes in memory
- Comments start with a semicolon

Numeric Data, Branches, and Labels
- Pep/9's output instructions write only individual characters, so separate instructions are provided for decimal (numeric) input and output
- branch instructions - let the programmer decide which instruction executes next
- DECI - decimal input; reads a value and stores it in the location given by the operand
- DECO and STRO - write data to output (DECO a decimal number, STRO a full string of characters)
- BR - unconditional branch; causes the program counter to be set to the memory address given by the operand (jumps to another place in the program)
- BRLT and BREQ - branch if the tested value is less than zero (BRLT) or equal to zero (BREQ)
- CPWA - compare the operand to the value in the accumulator
- labels - placed at the beginning of a line and followed by a colon; used as a point to refer back to instead of a specific memory address
Example program showing labels and addresses in the symbol table (figure)

Loops in Assembly Language
Example: a loop that adds to a counter until the BREQ instruction's condition is true and the branch is taken (figure)

6.5 Expressing Algorithms
algorithm - a plan or outline of a solution; a logical sequence of steps that solves a problem
pseudocode - a language that allows algorithms to be expressed in a clearer form

Pseudocode Functionality
Variables - names that refer to places in memory where values are stored
- the name should reflect the role of its contents - ex. sum holds the summation of other values
Assignment - give a variable a value - ex. set sum to 0, or sum ← 0
Input/Output - input data from the outside world and output results to the screen
  Write "Enter the number of values to read and sum"
  Read num
- Display = print = write, and get = input = read (the specific words don't matter, as long as the intention is understood)
Selection - choice between two courses of action
  IF (sum < 0)
    Print error message
  ELSE
    Print sum
Repetition - allows instructions to be repeated
  Set limit to number of values to sum
  WHILE (counter < limit)
    Read num
    Set sum to sum + num
    Set counter to counter + 1

Writing a Pseudocode Algorithm
defer details - giving a task a name and filling in the details of how to accomplish it at a later time
desk checking - sitting at a desk and working through an algorithm to trace what happens to the input values

6.6 Testing
test plan - a document that specifies how many times, and with which data, a program must be run in order to thoroughly test it
code coverage/clear-box testing - ensures that each statement in the code works
data coverage/black-box testing - uses test inputs to check for correct outputs with no regard to the code that is actually running; in practice, testing usually involves a combination of both
test plan implementation - running each of the test cases in the test plan and recording the results; if a result is unexpected, go back and find the errors

07 - Problem Solving and Algorithms (191-231)

7.1 How to Solve Problems
Ask Questions - figure out what your task is
- Polya step 1 - understand the problem
Look for Familiar Things - simplify the process by finding preexisting solutions; don't reinvent the wheel
- Polya step 2 - connect the unknown of the problem to something that is already known - find similar solutions and reuse them
Divide and Conquer - break down the problem into subtasks to allow a solution to be developed in manageable chunks
- Polya step 3 - carry out the plan
Algorithms - produce an algorithm to allow for future applicability
- Polya step 4 - check the result and ensure it is the correct solution

Computer Problem-Solving Process
1. analysis and specification phase - results in a clearly written problem statement
2. algorithm development phase - results in a plan for a general solution to the problem specified in the first phase
3. implementation phase - results in a working computer program that implements the algorithm/solution to the problem
4. maintenance phase - no output unless changes need to be made

7.2 Algorithms with Simple Variables
Algorithms with Repetition
Count-Controlled Loops
loop control variable - the value the loop reads to determine whether the task is complete
1. initialize the LCV
2. test the LCV (has it reached the predetermined value?)
3. increment the LCV by 1
pretest loop - the test takes place before the loop body is executed (ex. a while loop; if the condition is false to begin with, the body never executes)
infinite loop - a loop that never terminates
Event-Controlled Loops
event-controlled loops - controlled by an event that occurs within the body of the loop itself
1. initialize the event
2. test the event
3. update the event
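The read-and-sum pseudocode from 6.5 above, written as a count-controlled loop in Python so the three loop-control-variable steps (initialize, test, increment) are explicit; a sketch of the algorithm, not code from the textbook.

    limit = int(input("Enter the number of values to read and sum: "))
    total = 0
    counter = 0                     # 1. initialize the loop control variable
    while counter < limit:          # 2. test it against the limit (pretest loop)
        number = int(input("Enter a value: "))
        total = total + number
        counter = counter + 1       # 3. increment it so the loop terminates
    print("Sum:", total)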
nested structure - one control structure embedded in another
Square Root (example)
abstract step - an algorithmic step that needs to be expanded further
concrete step - a step that does not need to be expanded; its details are fully specified

7.3 Composite Variables
Composite data types can hold multiple individual simple data types
Arrays - a collection of like items accessed through an index
- a named collection of homogeneous items
- index - the position of each individual item in the array
Records - collections that do not have to contain all like data types; often used to link multiple characteristics of a single object
- a heterogeneous group of items
- items are accessed by name
Classes - the basis of OOP; a class describes a group of objects that are linked through it

7.4 Searching Algorithms
Sequential Search - look through each item and compare it to the desired item; stop when
- the item is found, or
- all items have been checked and no match was found
Sequential Search in a Sorted Array - if it is known that the array is sorted, the search can stop as soon as the position where the desired item would be has been passed
Binary Search - assuming a sorted array, begin in the middle of the array and determine whether the desired item is greater or less than the middle item; continue halving the array until the desired item or its position is found

Average Number of Comparisons
Length   Sequential   Binary
10       5.5          2.9
100      50.5         5.8
1000     500.5        9.0
10000    5000.5       12.0

7.5 Sorting
unsorted array - an array in its raw, unsorted form
sorted array - an array that has been ordered by some quality of the data, such as alphabetical or numerical value
Selection Sort
1. go through the array and find the value at one extreme of the sort
2. move it to a new array
3. repeat on the remaining original array
- similar to how a list would be sorted by hand: find the extreme value for the end you are sorting toward and form a new pile, working toward the other extreme
Bubble Sort - compares adjacent items, starting from one end of the array, and swaps any pair that is out of order; each pass bubbles the extreme value toward its end of the array, and the process repeats until no more swaps are needed
Insertion Sort - requires only one array; moves from one end of the array to the other, inserting each new value into its correct position with respect to the values already sorted before it

7.6 Recursive Algorithms
recursion - the ability of an algorithm to call itself
base case - the case for which the answer is known, so no further recursive call is needed
general case - expresses the solution in terms of a call to itself on a smaller version of the problem
size factor - a measure of the size of each recursive problem
Subprogram Statements - pause the current flow and execute a named piece of code, then resume just after the point where it was called
calling unit - the place where the name of the piece of code appears
two main types of subprograms
1. void subprograms - named code that performs a particular task; used as a statement in the calling unit
2. value-returning subprograms - named code that performs a task and returns a single value to the calling unit; used in an expression, where the returned value takes part in the evaluation of the expression
subprograms allow for abstraction - the reader sees the task without reading the details of how it is done

Recursive Factorial
a factorial is an example of a recursive algorithm: N! = N * (N - 1)!
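The recursive factorial above as a short Python sketch, showing the base case that keeps the recursion from running forever (the infinite-recursion problem noted next):

    def factorial(n):
        """N! = N * (N - 1)!, with 0! = 1 as the base case."""
        if n == 0:                        # base case: answer known, no further recursive calls
            return 1
        return n * factorial(n - 1)       # general case: a smaller version of the problem

    print(factorial(5))  # 120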
infinite recursion - a subprogram calls itself until the run-time system runs out of memory (the equivalent of an infinite loop)
Quicksort - splits the original array into smaller subgroups to break the sorting into more manageable chunks, breaking it down to its smallest groups of one, then recombining them back up from the bottom

7.7 Important Threads
Information Hiding
defer details multiple times - give a name to a task without worrying about how it will be implemented
- the designer sees the details that are relevant to that level
- makes details at lower levels inaccessible during the design of the higher levels
Abstraction - the result, with the details hidden; only the details essential to the reader are shown
data abstraction - keeps the view of data separate from its implementation
procedural abstraction - separation of the logical view of an action from its implementation
control abstraction - separation of the logical view of a control structure from its implementation
control structure - alters the sequential flow of control of the algorithm (ex. FOR and WHILE)
Naming Things
identifiers - shorthand phrases that stand for the tasks and data being used

08 - Abstract Data Types and Subprograms (241-271)

8.1 What Is an Abstract Data Type?
abstract data type (ADT) - a container whose properties are specified independently of any particular implementation
data structures - the implementation of a composite data field in an abstract data type
containers - ADTs whose sole purpose is to hold other objects

8.2 Stacks
stack - only accessible and editable from one end of the structure (LIFO)
- ex. lunch trays in a cafeteria: you can only take the top tray
- the item removed is the item that has been in the stack the shortest time
LIFO - last in, first out
push - adding an item to the stack
pop - removing an item from the stack (use isEmpty first to ensure that the stack is not empty)

8.3 Queues
queue - accessible from both ends, one for adding and one for removing (FIFO)
FIFO - first in, first out
many names for adding - enqueue, enque, enq, enter, insert - and removing - dequeue, deque, deq, delete, remove

8.4 Lists
unsorted list - stored in the order in which the data was inserted
sorted list - sorted in terms of one characteristic
lists are characterized by three properties
- items are homogeneous
- items are linear - each item except the first has a unique component that comes before it, and each item except the last has a unique component that comes after it (ex. in a three-item list, the second comes after the first and before the third)
- lists have varying lengths
lists provide operations to
- insert - insert an item
- delete - delete an item
- isThere - check whether an item is there
- getLength - report the number of items in the list
- reset, getNext, moreItems - allow the user to view each item in sequence
a list and an array are not the same - an array is a structure that is part of a programming language; a list is an abstract structure, though a list may be implemented using an array
linked structure - each node holds two pieces of information: the user's data and a link indicating where the next node is
unsorted linked list / sorted linked list (figures)
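A minimal Python sketch of the stack (LIFO) and queue (FIFO) behavior described above, using a list for the stack and collections.deque for the queue; the variable names are my own.

    from collections import deque

    stack = []              # LIFO: last in, first out
    stack.append("tray1")   # push
    stack.append("tray2")
    print(stack.pop())      # pop -> tray2 (the most recently added item)

    queue = deque()         # FIFO: first in, first out
    queue.append("job1")    # enqueue
    queue.append("job2")
    print(queue.popleft())  # dequeue -> job1 (the item waiting the longest)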
8.5 Trees
tree - hierarchical structure with more classification the further you move down the structure
binary trees - each node in the tree can have no more than two children

Binary Trees
binary tree - each node can have at most two successor nodes (children)
root - the first (top) node in a binary tree
leaf - a node at the end of a chain, with no children
binary search trees - similar to binary trees, but with an additional property that determines each value's place in the tree: everything in a node's left subtree must have a value less than the node's, and everything in its right subtree must have a value greater than the node's

8.6 Graphs
graph - a structure that does not have the restriction of one parent node
vertices - the nodes in a graph
edges/arcs - the lines that connect the nodes
undirected graph - edges have no direction (ex. a two-way road linking cities)
directed graph/digraph - edges have one direction (ex. a flight to a city with no return flight)
adjacent vertices - two vertices connected by an edge

Creating a Graph
- a stack returns the item that has been in it the least amount of time, and a queue returns the item that has been in it the longest; lists and trees simply return the information requested of them; a graph, by contrast, has algorithms defined upon it that actually solve classic problems

Graph Algorithms
Depth-First Search
1. Use a stack to store vertices as they are visited while trying to find a path between startVertex and endVertex
2. Examine the first vertex adjacent to startVertex; if it is endVertex, the search is complete
3. Otherwise, examine all the vertices that can be reached in one step from that first vertex
4. Store the other vertices adjacent to the starting vertex on the stack for later
5. If no path exists from the first vertex, go back to the previous vertex and try again with its other adjacent vertices
Breadth-First Search
1. Instead of going back just one vertex when a path fails, go back all the way to the beginning and try the earliest untried vertex - implemented with a queue instead of a stack, so all vertices one step away are examined before any that are two steps away
Single-Source Shortest-Path Search
- Uses a priority queue
- Repeatedly retrieves the vertex closest to the current one, decreasing the remaining distance to the target as much as possible

8.7 Subprograms
Subprograms - available in most programming languages - ex: a math package
Parameter Passing
parameter list - the identifiers or values the subprogram works with; appears in parentheses beside the subprogram name
parameters - the variables used as placeholders inside the subprogram
arguments - the actual values passed into the subprogram
Value and Reference Parameters
value parameter - the parameter takes a copy of the actual value passed in
reference parameter - the parameter gets the address of the value passed in
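A Python sketch of the two graph searches above: depth-first search keeps the vertices still to be explored on a stack, while breadth-first search keeps them in a queue, so swapping the container changes the order in which paths are tried. The adjacency-list graph and the city names are assumptions for illustration, not the textbook's example.

    from collections import deque

    graph = {                      # adjacency list: vertex -> adjacent vertices
        "Austin": ["Dallas", "Houston"],
        "Dallas": ["Denver", "Chicago"],
        "Houston": ["Atlanta"],
        "Denver": [], "Chicago": [], "Atlanta": [],
    }

    def path_exists(start, goal, breadth_first):
        to_visit = deque([start])
        visited = set()
        while to_visit:
            # popleft() gives queue (breadth-first) order, pop() gives stack (depth-first) order
            vertex = to_visit.popleft() if breadth_first else to_visit.pop()
            if vertex == goal:
                return True
            if vertex not in visited:
                visited.add(vertex)
                to_visit.extend(graph[vertex])
        return False

    print(path_exists("Austin", "Atlanta", breadth_first=True))   # True
    print(path_exists("Austin", "Atlanta", breadth_first=False))  # True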
09 - Object-Oriented Design and High-Level Programming Languages (282-320)

9.1 Object-Oriented Methodology
Object Orientation
Object-oriented programming uses self-contained objects that store and manipulate data, rather than a more linear top-down approach in which each task just operates on the data passed to it through parameters.
- top-down: transforming input into output
- OOP: data objects are transformed
"Read the specification of the software you want to build. Underline the verbs if you are after procedural code, the nouns if you aim for an object-oriented program." - Grady Booch
Objects - entities that represent one piece necessary to solving a problem (ex. a student)
Classes - groups of similar objects (ex. students with common actions)
Fields - the properties of each class (ex. student name and student ID)

Design Methodology
1. Brainstorming - determine what classes are necessary to outline and solve the problem
2. Filtering - go over the classes listed in brainstorming and attempt to combine/simplify any, or add any that are missing
3. Scenarios - determine the behavior of each class; go through all of the 'what ifs' regarding the situations that will be presented to the program
   - encapsulation - bundling of data and actions so that logical properties are separated from implementation details
   - Top-down: the task is to calculate a GPA given the data; OOP: the student class is responsible for calculating its own GPA
4. Responsibility algorithms - algorithms are written to carry out the responsibilities of each class

9.2 Translation Process
Translating - converts the programmer's code into usable instructions for the computer
Assemblers - convert simple assembly-language instructions into machine code
compiler - like a more powerful assembler; converts the programmer's high-level-language code into machine code
Interpreters - translate a statement and then immediately execute it, unlike assemblers and compilers
Translators - produce an equivalent program that still needs to be run, whereas an interpreter executes the input program immediately
Bytecode - a standard machine language into which Java programs are compiled before being run

9.3 Programming Language Paradigms
paradigm - a pattern or model; a set of assumptions, concepts, values, and practices that constitute a way of programming

Imperative Paradigm
imperative model - sequential instructions that operate on values in memory, accessed through memory locations named by variables, using assignment statements to change those values
procedural programming - statements are grouped into subprograms, each performing a specific task necessary to solve the problem; data objects are passive and are acted on by the program
object-oriented programming - each object is responsible for its own actions; data objects are active, and the code that manipulates an object is bundled with the object, so it is responsible for its own manipulation

Declarative Paradigm
declarative paradigm - results are described, but the steps to accomplish those results aren't stated
functional model - based on the mathematical concept of a function; computation is expressed in terms of function calls (ex. (+ 30 40) would apply the first item, the function, to the rest of the list)
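A small Python sketch contrasting the two paradigms above, using the GPA example from 9.1: in the procedural version a subprogram is handed passive data, while in the object-oriented version the Student object is responsible for calculating its own GPA. The class and field names are my own illustration.

    # Procedural style: passive data acted on by a subprogram
    def calculate_gpa(grades):
        return sum(grades) / len(grades)

    print(calculate_gpa([4.0, 3.0, 2.0]))  # 3.0

    # Object-oriented style: the object bundles its data with its behavior
    class Student:
        def __init__(self, name, grades):
            self.name = name
            self.grades = grades

        def gpa(self):
            # the Student is responsible for calculating its own GPA
            return sum(self.grades) / len(self.grades)

    jane = Student("Jane", [4.0, 3.0, 2.0])
    print(jane.gpa())  # 3.0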
9.4 Functionality in High-Level Languages
Boolean Expressions
boolean expression - a sequence of identifiers, separated by compatible operators, that evaluates to true or false
legal expressions
- a boolean variable
- an arithmetic expression, then a relational operator, then an arithmetic expression
- a boolean expression, then a boolean operator, then a boolean expression
relational operator - compares two values

Data Typing
strong typing - variables are assigned a type, and only values of that type can be stored in the variable
data type - a description of the set of values and the operations that can be applied to values of that type
simple/atomic data types - values that are distinct and cannot be subdivided into parts
- integers - a range of integer values, the range depending on the number of bytes assigned to each value
- reals - smallest to largest value with a given precision, depending on the number of bytes used to represent them
- characters - stored using ASCII (one byte) or Unicode (two bytes)
- boolean - true or false
composite data types - made up of a collection of values (ex. string)
declaration - a language statement associating an identifier with a variable
reserved word - a word in the language that has a special meaning; can't be used as an identifier
case sensitive - uppercase and lowercase letters are not considered the same

Control Structures
control structures - instructions that determine the order in which other instructions in a program are executed
asynchronous - not occurring at the same time as another operation

9.5 Functionality of Object-Oriented Languages
Encapsulation
encapsulation - a language feature that enforces information hiding (ex. a module hides code in functions that are not immediately visible)
Classes
instantiate - create an object of a class, ex. using the new operator
Inheritance
inheritance - allows classes to inherit data and methods from other classes
superclass - the class being inherited from
derived class - the class doing the inheriting
Polymorphism
polymorphism - the ability of a language to determine which of several methods should be executed for a given invocation (ex. jane.execute vs john.execute)

10 - Operating Systems (329-349)

10.1 Roles of an Operating System
application software - written to address specific needs (ex. word processors, games, inventory control systems, missile guidance systems)
system software - provides the tools and environments for application software to run, connecting directly to the hardware to provide more functionality
operating system - manages all of the computer's resources (memory, I/O, processors, etc.) and provides the user an interface for accessing the capabilities of those resources.
Memory, Process, and CPU Management
multiprogramming - keeping multiple programs in main memory at the same time, competing for access to the CPU to do their work
memory management - keeping track of which programs are in memory and where they are
process management - tracking the progress of a process and its current state
CPU scheduling - determines which process is executed by the CPU at any given moment

Batch Processing
batch processing - loading multiple jobs into memory as a batch; each of those jobs competes for CPU and resource time, and as resources become available the jobs are scheduled onto the CPU (used infrequently nowadays)

Timesharing
Timesharing - allows multiple users to interact with a computer at one time, using multiprogramming and multiple processes to appear to give each user full, individual access to the computer, even though they are truly sharing its resources
mainframe - a large multiuser computer used in early timesharing systems

Other OS Factors
real-time system - a system that must guarantee a response to the user within a certain (usually very short) amount of time (important for systems like robots, nuclear reactors, missiles, etc.)
response time - the delay between receiving the stimulus and producing the response

10.2 Memory Management
logical address - the general location of a value relative to the program it belongs to, not to main memory as a whole
physical address - the true location of the value in the main memory device
address binding - mapping a logical address to a physical address

Single Contiguous Memory Management
Single contiguous memory management - divides the memory into two sections (operating system and application) and uses logical addresses that are relative to the starting point of the program

Partition Memory Management
Partition memory management - multiple programs use main memory at the same time
Fixed partitions - the memory is divided into a set number of chunks when the OS first boots
Dynamic partitions - chunks of memory are created and allocated to applications as needed, adjusting to the unique needs of each application
partition algorithm - used to determine how memory is allocated to the different programs

Paged Memory Management
Paged memory - divides the memory into fixed-size chunks, called frames, and assigns fixed-size pieces of programs (pages) to them
page-map table - the structure the system uses to keep track of which pages are in which frames
demand paging - not all parts of a program need to be in memory at the same time (the CPU only accesses one part at a time), so pages are brought into memory on demand
page swap - bringing a page from secondary memory into main memory, often moving another page from main memory back to secondary memory
virtual memory - the illusion that a program has no size restriction, because the entire process doesn't have to be in memory at the same time
thrashing - excessive page swapping; can degrade system performance
Virtual memory can be created by demand paging because demand paging brings pages into memory only when they are needed; since only the pages currently in use must be in main memory, it appears as though there is no limit on program size.
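A sketch of the address-binding idea in paged memory management above: a logical address is split into a page number and an offset, the page-map table gives the frame where that page currently lives, and the physical address is frame * frame size + offset. The page size and table contents below are made-up values for illustration.

    PAGE_SIZE = 1024                      # assumed page/frame size in bytes

    page_map_table = {0: 5, 1: 2, 2: 7}   # page number -> frame number (made up)

    def to_physical(logical_address):
        page, offset = divmod(logical_address, PAGE_SIZE)
        frame = page_map_table[page]      # a missing entry would mean a page fault / page swap
        return frame * PAGE_SIZE + offset

    print(to_physical(1050))  # page 1, offset 26 -> frame 2 -> 2074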
10.3 Process Management
The Process States
New - the process is being created (ex. a user logs on or hits a submit button, and the system creates a process to accomplish the task)
Ready - a process with no barriers to execution that is waiting for a chance to use the CPU, but not waiting for an event to occur or for data to be brought from secondary memory
Running - currently being executed by the CPU; its instructions are being processed in the fetch-execute cycle
Waiting - currently waiting for resources other than the CPU (ex. waiting for a page of memory to be brought in from secondary memory, or for another process to send a signal so it can continue)
Terminated - has completed execution and is no longer active

The Process Control Block
process control block (PCB) - the structure used by the OS to manage the information and data about a process
- each state is represented by a list of PCBs, one for each process in that state
- when a process moves from one state to another, its PCB moves between the state lists as well
context switch - the exchange in which the register values of the currently running process are saved into its PCB and the register values of the new running process are loaded onto the CPU

10.4 CPU Scheduling
nonpreemptive scheduling - the need for a new CPU process emerges from the currently executing process (the scheduling decision happens when a process switches from running to waiting or terminated)
preemptive scheduling - the process is preempted by the operating system (a process moves from the running or waiting state to the ready state)
turnaround time - the metric measuring the amount of time between when a process arrives in the ready state and when it leaves the running state for the last time (lower is more desirable)

First Come, First Served
FCFS - processes are moved to the CPU in the order in which they arrive in the ready state; nonpreemptive
- once a process is given CPU access, it stays on the CPU unless it makes a request that forces it to wait
- easy to implement, but takes no account of factors such as turnaround time
Average turnaround time is (140 + 215 + 535 + 815 + 940) / 5, or 529

Shortest Job Next
SJN - looks at all processes in the ready state and dispatches the one with the shortest service time
- if the estimates of service time are wrong, efficiency deteriorates
Average turnaround time is (75 + 200 + 340 + 620 + 940) / 5, or 435

Round Robin
round robin - distributes the processing time equally among all ready processes
time slice - the amount of time each process receives before it is preempted and returned to the ready state so another process can have its turn; the preempted process gets another time slice later
- the most widely used and considered the most fair
- not necessarily the best or the worst algorithm; it depends on the use case
Average turnaround time is (515 + 325 + 940 + 920 + 640) / 5, or 668
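A sketch that reproduces the FCFS and SJN turnaround averages above, assuming (as the examples imply) that all five processes arrive at time 0 with service times 140, 75, 320, 280, and 125, so each process's turnaround time is simply its completion time.

    service_times = [140, 75, 320, 280, 125]   # assumed arrival time 0 for all

    def average_turnaround(order):
        clock, turnarounds = 0, []
        for service in order:
            clock += service            # nonpreemptive: the process runs to completion
            turnarounds.append(clock)   # turnaround = completion time when arrival is 0
        return sum(turnarounds) / len(turnarounds)

    print(average_turnaround(service_times))          # FCFS order: 529.0
    print(average_turnaround(sorted(service_times)))  # SJN order:  435.0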
11 - File Systems and Directories (361-378)

11.1 File Systems
file - a named collection of related data; the smallest unit of data written to secondary memory by the user
file system - the logical view of its files that the operating system provides to the user, allowing easy management
directories - named groups of files, which make up file systems

Text and Binary Files
text file - bytes of data organized as ASCII or Unicode characters*
binary file - requires interpretation from raw binary into a usable form
*still stored in binary, but in chunks of 8 or 16 bits (characters)
source file - a text file that stores a program written in a high-level language

File Types
file type - the kind of information contained in a file (Java program, Word document, Premiere project, etc.)
file extension - the part of the file name that indicates the file type

File Operations - operating systems help you perform these actions:
- Create a file
- Delete a file
- Open a file
- Close a file
- Read data from a file
- Write data to a file
- Reposition the current file pointer in a file
- Append data to the end of a file
- Truncate a file (delete its contents)
- Rename a file
- Copy a file

File Access
sequential file access - views a file as a linear structure; data is processed in order
direct file access - data is accessed directly by specifying a record number

File Protection
Users can be given permissions that restrict access to a particular file, with different levels of access allowing different amounts of use.

11.2 Directories
Directory Trees
parent directory - a directory containing another directory
subdirectory - a directory contained in another directory
directory tree - a way to view the file system, showing each directory and subdirectory in a nested structure
root directory - the directory at the highest level of a directory tree
working directory - the currently active subdirectory
Windows directory tree / Unix directory tree (figures) - between the two, naming conventions and directories are different; UNIX was developed as a programming and system-level environment, so its names are more abbreviated and cryptic

Path Names
path - text designation of the location of a file or subdirectory in a file system; the series of directories followed to find a file
absolute path - specifies each step down the tree until reaching the desired file, always starting at the root
relative path - names the path to a file starting at the current working directory

ABSOLUTE PATHS IN WINDOWS AND UNIX
C:\Program Files\MS Office\WinWord.exe
C:\My Documents\letters\applications\vaTech.doc
C:\Windows\System\QuickTime
/bin/tar
/etc/sysconfig/clock
/usr/local/games/fortune
/home/smith/reports/week1.txt

RELATIVE PATHS IN WINDOWS AND UNIX
..\landscape.jpg
..\csc111\proj2.java
..\..\WINDOWS\Drivers\E55IC.ICM
..\..\Program Files\WinZip
utilities/combine
../smith/reports
../../dev/ttyE71
../../usr/man/man1/ls.1.gz

11.3 Disk Scheduling
disk scheduling - the technique by which the operating system determines which pending requests to satisfy first

First-Come, First-Served Disk Scheduling
- same concept as FCFS CPU scheduling: the read/write heads move to satisfy requests in the order they were made
- ex. 49, 91, 22, 61, 7, 62, 33, 35 (the order in which they were requested)

Shortest-Seek-Time-First Disk Scheduling
- moves the heads to whichever pending request requires the minimum amount of movement
- doesn't guarantee the smallest overall head movement, but is usually an improvement over FCFS
- ex. 49, 61, 62, 91, 7
- starvation - when new requests keep coming in and prevent requests on the far side of the disk from ever being processed

Scan Disk Scheduling
- the read/write heads sweep toward the spindle, then back out toward the edge of the disk
- circular SCAN treats the disk as a ring, not a disk
- ex. 33, 35, 49, 61, 62, 91
- new requests are not given any special treatment; they are serviced depending on where they fall relative to the current position and direction of the heads
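A sketch comparing total read/write head movement for first-come-first-served and shortest-seek-time-first on the request list above (49, 91, 22, 61, 7, 62, 33, 35); the starting head position is an assumption, since the notes don't give one.

    requests = [49, 91, 22, 61, 7, 62, 33, 35]

    def fcfs_movement(reqs, start):
        total, position = 0, start
        for track in reqs:                       # service requests in arrival order
            total += abs(track - position)
            position = track
        return total

    def sstf_movement(reqs, start):
        pending, total, position = list(reqs), 0, start
        while pending:                           # always service the closest pending track
            track = min(pending, key=lambda t: abs(t - position))
            total += abs(track - position)
            position = track
            pending.remove(track)
        return total

    print(fcfs_movement(requests, start=50))   # total head movement under FCFS
    print(sstf_movement(requests, start=50))   # usually much smaller, but far requests can starve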