FIT 1047 Computer Systems, Networks, and Security

Computer Systems

Topics Covered
  Representing Numbers and Characters
  Boolean Logic and Gates
  CPUs
  Memory and I/O
  BIOS/UEFI and Firmware
  Operating Systems

Representing Numbers and Characters

Binary Numbers
Binary numbers are a base two system, so the only digits are zero and one. We can still do operations on them just as with base ten numbers.
● Addition
  ○ Adding two binary numbers works just like in the decimal system
  ○ Start from the rightmost digit
  ○ Add the matching digits
  ○ If the result doesn’t fit in one digit, carry it over to the next position

Hexadecimal
These are numbers represented in base 16, where the values 10-15 are written as the digits A-F. Each group of 4 binary bits can be easily interchanged with its hexadecimal equivalent.

HEX  DECIMAL  4-BITS
0    0        0000
1    1        0001
2    2        0010
3    3        0011
4    4        0100
5    5        0101
6    6        0110
7    7        0111
8    8        1000
9    9        1001
A    10       1010
B    11       1011
C    12       1100
D    13       1101
E    14       1110
F    15       1111

Converting Between Bases
● Binary or Hex to Decimal
  a. Consider each digit to have the value m*b^n, where b is the base (2 or 16), n is the position from the right, starting at zero, and m is the value of the digit
  b. Add the values of all the digits
  4D86B₁₆ = B*16^0 + 6*16^1 + 8*16^2 + D*16^3 + 4*16^4
          = 11 + 96 + 2048 + 53248 + 262144
          = 317,547
● Decimal to Binary
  a. Starting with your number, divide it by two and take note of the remainder (0 or 1); repeat with the quotient until it reaches zero
  b. Place the string of remainders from right to left
  156%2 -> 0; 78%2 -> 0; 39%2 -> 1; 19%2 -> 1; 9%2 -> 1; 4%2 -> 0; 2%2 -> 0; 1%2 -> 1
  156₁₀ = 10011100₂
● Decimal to Hex
  a. Starting with your number, divide it by sixteen and take note of the remainder (0-F); repeat with the quotient until it reaches zero
  b. Place the string of remainders from right to left
  317,547%16 -> B; 19,846%16 -> 6; 1,240%16 -> 8; 77%16 -> D; 4%16 -> 4
  317,547₁₀ = 4D86B₁₆
● Binary to Hex and Vice Versa
  a. For binary to hex, replace each 4 bit string (grouped from the right, padding with leading zeros) with its associated hex value
  b.
For hex to binary, replace each hex digit with its associated 4 bit string
  4D86B -> 0100, 1101, 1000, 0110, 1011 -> 1001101100001101011
  1001101100001101011 -> 4, D, 8, 6, B -> 4D86B

Boolean Logic and Gates

Boolean Algebra

A  B  AB  A+B  AxorB  AnandB  AnorB  A’
0  0  0   0    0      1       1      1
0  1  0   1    1      1       0      1
1  0  0   1    1      1       0      0
1  1  1   1    0      0       0      0

Laws
● Identity Laws
  ○ 1A = A, 0+A = A
● Null Laws
  ○ 0A = 0, 1+A = 1
● Idempotent Laws
  ○ AA = A, A+A = A
● Complement Law
  ○ AA’ = 0, A+A’ = 1
● Commutative Law
  ○ AB = BA, A+B = B+A
● Associative Law
  ○ (AB)C = A(BC), (A+B)+C = A+(B+C)
● Distributive Law
  ○ A+(BC) = (A+B)(A+C), A(B+C) = (AB)+(AC)
● Absorption Law
  ○ A(A+B) = A, A+(AB) = A
● DeMorgan’s Law
  ○ (AB)’ = A’+B’, (A+B)’ = A’B’
● Double Complement Law
  ○ A’’ = A

Optimising Functions
Either the laws listed above or Karnaugh maps can be used to simplify Boolean expressions.
[Figure: two-variable Karnaugh map] This Karnaugh map can be expressed in its simplest form as F(A, B) = A
[Figure: three-variable Karnaugh map] This Karnaugh map can be expressed in its simplest form as F(A, B, C) = A+CB’

Rules for KMaps
1. No group can contain a zero
2. Groups can never be diagonal
3. Group sizes must be a power of two
4. Groups can overlap
5. All ones must be in a group
6. Groups may wrap around the edges
7. Groups should be made as large as possible

CPUs

Von Neumann Architecture
Von Neumann architecture consists of the following:
● Central Processing Unit (CPU), which consists of:
  ○ Arithmetic/Logic Unit (ALU)
  ○ Registers
  ○ Control Unit (CU)
● Memory
● Input/Output
The CPU is the brain of the computer. It runs all the instructions sent to it and uses registers to store temporary data.

CPU Basics
Besides its internal components, the CPU is linked to the outside world by the bus: a set of wires connecting it to external devices, including the memory, monitor, network, etc. A CPU is given a program, which is a sequence of instructions taken from the CPU’s instruction set.

Instruction Set Architecture
The set of machine code instructions that a particular type of CPU understands.
The ISA must be able to do the following
● Take one or more words, do an operation on them, and store the result somewhere
● If a condition is true, continue with different instructions
● Transfer data to or from memory, or to or from an I/O device

Fetch, Decode, Execute
● Fetch
  ○ The PC register contains the memory address where the next instruction to be executed is stored.
  ○ In the fetch cycle, the CU transfers the instruction from memory into the IR
  ○ It then increments the PC by one so that it points to the next instruction
● Decode
  ○ The CU looks at the instruction in the IR and decodes what it means
● Execute
  ○ The actual operation encoded by the instruction is performed

BIOS/UEFI and Firmware

The Boot Process
The basic process involves the following steps:
1. The user turns the power on
2. The power supply checks the power and, if it’s good, issues a power good signal
3. The CPU starts and hands off the process to the BIOS
4. The BIOS starts the POST process
5. When the POST is complete, it launches the bootstrap loader
In more detail:
1. Power Good
   ○ The PC’s power supply circuitry tests for proper voltage
2. PC Wakes Up
   ○ The CPU is brought into the boot process
   ○ It activates the ‘Basic I/O System’ (BIOS)
3. POST
   ○ The BIOS starts the ‘Power On Self Test’ (POST)
4. CMOS and BIOS
   ○ The BIOS talks to the CMOS
     ■ The CMOS contains information on how the system should be configured
   ○ The BIOS and CMOS communicate to ensure that what is there and what should be there are the same
5. Loading Drivers
   ○ The BIOS loads very basic drivers for devices attached to the computer into system memory
6. System Bus
   ○ The BIOS has the CPU send signals over the system bus to make sure all basic components are working
7. Video
   ○ The POST launches the video display
     ■ It uses basic video settings
   ○ This is the first time in the process that anything is displayed on the monitor
8. Memory
   ○ The POST writes to and reads from the system memory
9. The Handoff
   ○ The POST hands off to a process called the bootstrap loader
10.
Bootstrap Loader
   ○ Loads the operating system from the boot device

Operating Systems
An operating system (OS) is a piece of software, or collection of software, that manages the resources in a computer and provides a convenient interface between applications and hardware.

Introduction
● What Does an OS Do?
  ○ An OS provides a level of abstraction between hardware and software. It hides complicated, diverse, low-level concepts behind an interface
  ○ An OS has the following tasks
    ■ Manage multiple processes running in parallel
    ■ Manage the memory a process uses
    ■ Provide access to file systems, the network, and other I/O resources
  ○ The core functionality of an OS is provided by its kernel
● Abstraction
  ○ An OS makes computers easier to use and to develop for
  ○ For users, it provides a UI, manages multiple applications, and provides protection from malicious or buggy code
  ○ For programmers, it provides a programming interface to easily access the hardware and I/O devices. It also manages resources for a program
  ○ These are functions of abstraction. The OS achieves this through virtualization: it provides a virtual form of each physical resource to each process. It gives each process the illusion that it has all resources to itself

Virtualizing the CPU
The goal of virtualizing the CPU is to run multiple processes at what appears to be the same time. This leads to two challenges
● A process shouldn’t know that it’s not running continuously
● Each process should get a fair share of CPU time
The mechanism for virtualizing a CPU classifies each process as being in one of three states
● Ready
● Running
● Blocked
When a process is created by placing it in memory, it is made ready. When the OS decides it should now be executed, it is classified as running. Now one of two things can happen
● The OS decides its time is up; the process is de-scheduled and placed back in the ready state
● The process requests some I/O and will have to wait until it’s done, so it is blocked.
  Once the I/O is finished, the process is set back to ready

Limited Direct Execution
CPU virtualization by switching between processes, as discussed, has a number of challenges
1. Performance
   ○ CPU time should be spent on the processes, not on managing them
2. Control
   ○ The OS should enable fair scheduling and offer protection against harmful code
The solution to these is to get support from the hardware, in a mechanism called limited direct execution (LDE). It does two things
1. Allows each process to run directly on the CPU (performance)
2. Limits the operations a process can perform depending on the CPU’s mode (control)
   ○ User mode
     ■ Only a subset of all instructions is allowed to be executed
   ○ Kernel mode
     ■ Code is run without any restrictions. The CPU typically switches to kernel mode on an interrupt
   ○ To access I/O operations that require kernel mode, a user mode process can make a system call

System Calls
A system call uses a special CPU instruction that switches the CPU from user mode to kernel mode and then jumps to a subroutine. This instruction is called a software interrupt. The application needs to let the OS know what kind of privileged operation to perform, so the OS sets up a table of system call handlers, which is a block of memory containing the address of each OS subroutine. When a user application makes a particular system call, the following happens:
1. The CPU is switched to kernel mode
2. It jumps to the interrupt handler for software interrupts
3. The interrupt handler saves the process context into memory
4. The handler makes an indirect jump to the subroutine listed in the system call table
5. The code for the system call is executed and returns to the interrupt handler
6. The interrupt handler restores the process context from memory
7. The CPU is switched back into user mode
8. The interrupt handler returns to the userspace application that called it
● Process Switching
  ○ So far one step is still missing: how does an OS switch from one process to another?
  ○ Look at step 6: the OS can simply decide to restore a different process context
● Co-op & Pre-emptive Time Sharing
  ○ The mechanism that triggers a switch is an interrupt. This includes I/O interrupts and system calls
  ○ If these are our only two sources, however, the OS may not get an interrupt for a long time
  ○ There are two solutions to this problem
    1. Cooperative Timeshare: Processes are expected to make system calls at regular intervals
       ● This is easy to implement; however, malicious code may never make a system call and so make the system unusable
    2. Pre-emptive Timeshare: A third type of interrupt, the timer interrupt, is introduced. It comes from a hardware circuit that generates an interrupt at regular intervals

Process Scheduling
The OS needs to decide how long a process can use the CPU before it switches to another. There are three scheduling policies
● First Come, First Served
  ○ Processes are served in the order in which they enter the queue, regardless of their length.
  ○ This doesn’t give a great turnaround time, as some short processes may have to wait a long time for long processes to finish
● Shortest Job First
  ○ We let short processes jump in front of the longer ones. This minimises the average turnaround time; however, long processes will have to wait even longer
● Round Robin Scheduling
  ○ The OS splits the execution of each process into small time slices and regularly switches between the processes. This gives the benefits of the first two, but not as pronounced.

Virtualizing the Memory
Virtualizing the memory has three main goals
1. To protect a process’s memory from other processes
2. To make programming easier, as a programmer won’t need to know how the memory is organised
3.
To enable processes to use more memory than is provided by RAM
Several concepts are used to achieve this virtualization
● Address Space
  ○ We call the addresses that can be used by a process its address space
  ○ Multiprogramming
    ■ The first multiprogramming systems divided the memory, and each process was allocated a fixed region to use
    ■ There are two challenges to this
      1. To enable programmers to program as if their code was loaded at address 0
      2. To prevent code from accessing memory belonging to another program
    ■ This created the need for virtual memory
● Virtualizing the Memory
  ○ In a virtual memory system, instructions operate on virtual addresses, which are translated into physical addresses
  ○ Virtual Address
    ■ A simple approach for implementing virtual addresses uses an additional register, the base register, that contains the address of the program’s starting position. So to load from a virtual address of 100, the CPU accesses the physical address B+100, where B is the value in the base register
  ○ Memory Protection
    ■ The CPU enforces bounds on the memory a program can use with another additional register, the bounds register, which contains the largest address a process is allowed to access
  ○ Real Virtual Memory Systems
    ■ The previous approach is unrealistic, as the OS can’t know how much RAM a particular process will need.
    ■ Realistic virtual memory systems, therefore, implement a more complex approach, where memory is allocated in pages. The OS keeps a list of pages for each process and can add or remove pages if the existing ones are full or empty.
    ■ Another advantage is that pages can be stored on the disk drive if a process hasn’t used them for a while

MARIE

Topics Covered
  Registers
  Instructions
  Subroutines
  From Instructions to Circuits
  Memory

MARIE has the following properties
● Words are 16 bits wide
● There are 16 instructions
● Each instruction is one word, composed of a 4 bit opcode and a 12 bit address
● There is one general purpose register

Registers
● AC (Accumulator)
  ○ General purpose register
● MAR (Memory Address Register)
  ○ Holds the address of a word that needs to be read from or written to memory
● MBR (Memory Buffer Register)
  ○ Holds the data read from or written to memory
● IR (Instruction Register)
  ○ Contains the instruction being executed
● PC (Program Counter)
  ○ Contains the address of the next instruction

Instructions
A MARIE instruction is a 16 bit word. The leftmost 4 bits are the opcode, with each number representing an instruction. The remaining 12 bits represent a memory address for the instruction to use. These raw bits are machine code, which is exceedingly difficult for humans to understand.

MARIE Instructions
We give each opcode a mnemonic to make the machine code easier to understand

OPCODE  MNEMONIC  EXPLANATION
0001    LOAD X    Load the value at X into AC
0010    STORE X   Store the value of AC at X
0011    ADD X     Add the value at X to AC
0100    SUBT X    Subtract the value at X from AC
0101    INPUT     Read user input into AC
0110    OUTPUT    Output the value of AC
0111    HALT      Stop execution
1010    CLEAR     Set AC to zero

Jumps, Loops and Conditionals
We can use instructions that jump to different parts of the program depending on conditions. Jumping backwards in the code allows loops to be implemented

OPCODE  MNEMONIC    EXPLANATION
1000    SKIPCOND X  Skip the next instruction under condition X
1001    JUMP X      Continue execution at location X

● Jump
  ○ Causes the CPU to continue executing instructions from a given address.
    It does this by changing the value of the PC
● SkipCond
  ○ A conditional instruction that behaves differently depending on the value of X and the value in the AC. Here, X is used to distinguish between the three versions of SkipCond
    ■ SkipCond 000: If AC < 0, skip
    ■ SkipCond 400: If AC == 0, skip
    ■ SkipCond 800: If AC > 0, skip

Indirect Addressing
Instead of accessing the value at a hardcoded location X, we can use the value stored at X as the address of the value we want

OPCODE  MNEMONIC  EXPLANATION
1011    ADDI X    Add the value at the address stored in X to AC
1100    JUMPI X   Continue execution at the address stored in X
1101    LOADI X   Load the value at the address stored in X into AC
1110    STOREI X  Store AC at the address stored in X

Subroutines
A subroutine, or function, is a piece of code that
● Has a purpose or function
● Is executed often
● Can be called and given an argument
● Returns to where it was called

OPCODE  MNEMONIC  EXPLANATION
0000    JNS X     Stores the return address at X, then runs the code after X

JnS X stores the address of the next instruction into X, then runs the code at X+1. The subroutine concludes with JumpI X to jump back to the address which JnS stored at X

From Instructions to Circuits

Data Paths
The data path in a CPU describes how the different functional units, in particular the registers and the ALU, are connected. The hardware implementation of the data paths is the system bus: the set of wires that connects the components in the CPU
● Data Bus
  ○ Transports individual words of data between the memory, registers, and ALU
● Address Bus
  ○ Connects the memory with the MAR. It selects the memory address that the CPU reads from or writes to
● Control Bus
  ○ Used by the CU to select different modes of operation on the other components

Register Transfer Language
Assembly language is very low level, but even lower is register transfer language (RTL), which describes the individual steps of the fetch-decode-execute cycle
● Fetch
  ○ The PC register contains the address of the next instruction that needs to be executed.
    The CU needs to load the next instruction from memory into the IR register and increment the PC
  ○ This itself consists of four small steps
    1. MAR ⟵ PC
    2. MBR ⟵ M[MAR]
    3. IR ⟵ MBR
    4. PC ⟵ PC + 1
● Decode
  ○ The CU looks at the instruction in the IR and figures out what needs to happen. Most instructions contain an address, X, which is copied into the MAR.
  ○ If the instruction needs to read data from memory, the data is placed in the memory buffer register (MBR). This makes up the next two steps
    5. MAR ⟵ X
    6. MBR ⟵ M[MAR]
● Execute
  ○ This depends on the instruction being executed

INSTRUCTION  RTL
LOAD X       7. AC ⟵ MBR
STORE X      6. MBR ⟵ AC
             7. M[MAR] ⟵ MBR
ADD X        7. AC ⟵ AC + MBR
SUBT X       7. AC ⟵ AC - MBR
SKIPCOND X   6. IF MAR = 0x800 & AC > 0: PC += 1
                IF MAR = 0x400 & AC = 0: PC += 1
                IF MAR = 0x000 & AC < 0: PC += 1
JUMP X       6. PC ⟵ MAR
CLEAR        5. AC ⟵ 0
ADDI X       7. MAR ⟵ MBR
             8. MBR ⟵ M[MAR]
             9. AC ⟵ AC + MBR
JUMPI X      7. PC ⟵ MBR
LOADI X      7. MAR ⟵ MBR
             8. MBR ⟵ M[MAR]
             9. AC ⟵ MBR
STOREI X     7. MAR ⟵ MBR
             8. MBR ⟵ AC
             9. M[MAR] ⟵ MBR
JNS X        6. MBR ⟵ PC
             7. M[MAR] ⟵ MBR
             8. AC ⟵ MAR
             9. PC ⟵ AC + 1

Control Signals
Each control signal is a set of wires in the control bus that can switch a component on or off, or select its mode of operation

SIGNAL          WIRES       NUM BITS  VALUES
REGISTER READ   P2, P1, P0  3         000 (NONE), 001 (MAR), 010 (PC), 011 (MBR), 100 (AC), 111 (IR)
REGISTER WRITE  P5, P4, P3  3         000 (NONE), 001 (MAR), 010 (PC), 011 (MBR), 100 (AC), 111 (IR)
MEMORY READ     MR          1         0 (FALSE), 1 (TRUE)
MEMORY WRITE    MW          1         0 (FALSE), 1 (TRUE)
ALU OPERATION   A2, A1, A0  3         000 (NOTHING), 010 (ADD), 001 (SUBTRACT), 011 (CLEAR), 100 (INCREMENT BY 1)

Now we can annotate each RTL step with the control signals, for example, for ADD X

STEP                  [P5P4P3, P2P1P0, MRMW, A2A1A0]
1. MAR ⟵ PC           001, 010, 00, 000
2. MBR ⟵ M[MAR]       011, 000, 10, 000
3. IR ⟵ MBR           111, 011, 00, 000
4. PC ⟵ PC + 1        010, 010, 00, 100
5. MAR ⟵ X            001, 000, 10, 000
6.
   AC ⟵ AC + MBR      100, 100, 00, 010

Memory
Memory is a sequence of locations, each of which has an address and can store one data value of a fixed width

Addressing Memory Locations
An address is simply an unsigned integer that references one unique memory location. In MARIE, one memory location stores one word. Each address, therefore, references a whole word, which we call ‘word addressable memory’. In order to address 2^n memory locations, we need n bits for the address

RAM
Random Access Memory allows the CPU to access any location in RAM in the same amount of time

Memory Organisation
RAM modules are made up of multiple chips of a fixed size L*W, where L is the number of locations and W is the number of bits per location. To know which RAM chip to access for a given address, we split the address into two parts
● Use the leftmost 4 bits for the row
● Use the rightmost 12 bits to select the byte in the row

Networks

Topics Covered
  Introduction
  Layers and Protocols
  Application Layer
  Physical Layer
  Data Link Layer
  Network Layer
  Transport Layer
  The Internet

Introduction
The primary goal of connecting computers with each other is to enable them to communicate

Network Structure
● Client
  ○ A device that enables users to access the network
● Server
  ○ A device that provides services to clients
● Switch
  ○ Connects multiple devices into a Local Area Network (LAN). All devices in the LAN can communicate with each other
● Router
  ○ Connects multiple networks to each other
● Addresses
  ○ An address, in general, is a unique identifier. On the internet, each device needs an IP address
● Types of Networks
  ○ Local Area Network (LAN)
    ■ A group of clients and servers that share a local circuit, i.e., they are directly connected to each other
  ○ Backbone Network (BN)
    ■ Connects multiple LANs together using routers. Usually still within a local geographical area
  ○ Metropolitan Area Network (MAN)
    ■ Connects LANs and BNs.
      Usually spans several kilometres
  ○ Wide Area Network (WAN)
    ■ Similar to a MAN, except it usually spans a much longer distance
● Transmission Rates
  ○ Measured in bits per second (bps), this is how much data is sent through a network in a given amount of time
● Latency
  ○ How long the data takes to travel from sender to receiver

Application Architectures
Usually, clients provide an application to users. Each application is made up of parts with different tasks to fulfil
● Presentation Logic
  ○ The part of the app that provides the user interface
● Application/Business Logic
  ○ Defines how the app behaves
● Data Access Logic
  ○ Defines how the app manages its data
● Data Storage
  ○ Where the data is kept
In a networked application, the work can be split between clients and servers according to different application architectures
● Server Based
  ○ All tasks are performed by the server
● Client Based
  ○ All tasks are performed by the client, but data storage is on the server
● Client-Server
  ○ Client performs presentation and application logic
  ○ Server implements data access and storage logic
● Thin-Client
  ○ Client performs only the presentation
  ○ Server implements application, data access, and data storage
● Multi-Tier
  ○ Tasks are further split, with servers handling application logic and databases handling data access and storage
● Peer-to-Peer
  ○ Clients are connected to each other without servers

Layers and Protocols
Like an operating system, networks provide layers of abstraction. Each layer defines a language, or protocol, to implement it.

Internet Model
The internet is a packet switching network. That means any communication between devices is sent in short packets, so large data needs to be split up by the sender and reassembled by the receiver. We will consider the following layers (mnemonic: “People Don’t Need These Anyway”).
1. Hardware/Physical Layer
   ○ The actual hardware and signals
2. Data Link Layer
   ○ Defines the interface between hardware and software
3. Network Layer
   ○ Responsible for routing
4.
Transport Layer
   ○ Establishes a logical connection between the sending application and the receiving application
5. Application Layer
   ○ The actual software a user interacts with

Protocols and Message Encapsulation
● Protocol
  ○ A formal language that defines how two applications talk to each other
  ○ For layers 2, 3, and 4 in the Internet model, the protocols take the form of headers that are added to the message, which help maintain data integrity
  ○ It’s like an envelope with the message inside and the added protocol data on the outside
● Message Encapsulation
  ○ Each layer adds its own envelope, with the envelopes of all the layers above it inside

Application Layer
Software at the application layer implements the functionality of a networked application. The different application architectures are implemented at this level. The most widely used applications are the World Wide Web and email.

The World Wide Web (WWW)
The WWW is what you access through a web browser. It consists of web pages, each of which can be accessed at a URL and can link to other pages. Three technologies were created to make this work
● Uniform Resource Locator (URL)
  ○ The address of a document
● Hypertext Transfer Protocol (HTTP)
  ○ A set of commands understood by browsers and servers
  ○ Used to send and request documents
● Hypertext Markup Language (HTML)
  ○ Describes the contents of the document

HTTP
● Request-Response Cycle
  ○ Request
    ■ Consists of a request line which includes the method, a path, and the protocol version.
    ■ The request line is followed by the request headers (for example, the name of the host) and, optionally, a request body (for example, a document being sent)
    ■ Request line: GET /doc.html HTTP/1.1
    ■ Request header: Host: www.google.com
  ○ Response
    ■ Consists of the response status line, with the status code, followed by the response headers, which can provide metadata, and the response body
    ■ Response status: HTTP/1.0 200 OK
● State in HTTP
  ○ HTTP is stateless; every request is sent independently and the server doesn’t know whether two requests are from the same user
  ○ To recognise returning users, the server gives each user a session ID, in one of two ways
    ■ The server returns the document with links that contain the ID
    ■ A cookie is sent as part of the response header to record and share the session ID

Electronic Mail (email)
● Client-Server Approach
  ○ The traditional email setup is a two-tier client-server application. A sender uses the Simple Mail Transfer Protocol (SMTP) to send a message to a mail server.
  ○ The recipient uses either the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP) to access email on a server
  ○ Multipurpose Internet Mail Extensions (MIME) encodes data as text so that files can be attached to an email
● WebMail Approach
  ○ In this approach, users access their email through a web page rather than a dedicated application

Physical Layer
The physical layer is the lowest layer of the internet model. It contains the actual network hardware

Network Hardware
● Network Interface Card (NIC)
  ○ The hardware component that connects devices to a network.
    The most common NICs connect to WiFi, mobile, Bluetooth, or cabled networks
● Network Cables
  ○ Components used for sending signals
  ○ Unshielded Twisted Pair (UTP)
    ■ Several pairs of copper wire twisted together to reduce interference
    ■ Each wire is insulated with plastic
  ○ Shielded Twisted Pair (STP)
    ■ Like UTP but shielded with metal to further prevent interference
  ○ Coaxial Cable
    ■ An inner wire surrounded by insulation and wire mesh
  ○ Optical Fibre
    ■ Light from lasers is sent through the fibre at high speeds with little loss

Signals
A signal is defined as energy travelling through a medium. In computer networks we have, for example, electrical signals that travel through copper cables.
● Digital
  ○ Digital means we send discrete states (0 or 1)
  ○ A digital signal encodes ones and zeroes as two different, arbitrarily chosen voltages, each held for a fixed amount of time.
  ○ To prevent loss of synchronisation between sender and receiver, each one or zero can instead be represented as a change between two values, so that the signal is always changing and resynchronisation is easy
● Analogue
  ○ Analogue means we send continuous states (0 to 1)
  ○ Analogue signals are propagated as sine waves and can, therefore, carry richer signals, because we can describe the wave with three parameters rather than one: amplitude, frequency, and phase
  ○ Frequency Modulation
  ○ Amplitude Modulation
  ○ Phase Modulation
  ○ These three modulations can be combined to give many values
● Attenuation
  ○ The weakening of signals with increasing distance

Modems
The process of converting digital data into an analogue signal is called modulation, and the reverse is demodulation. Both are done by a modulator/demodulator, or modem for short

Data Link Layer
The data link layer is responsible for controlling the hardware and for error detection

Media Access Control (MAC)
On a network which uses a single shared medium, only one device is allowed to transmit at a time. Coordinating this is the problem MAC aims to solve.
There are two main approaches to MAC
● Controlled Access
  ○ Only one device can send at a time, with either one authority assigning permission, or permission being passed from device to device
● Contention Based Access
  ○ Access is provided on a first come, first served basis. Senders sense whether other devices are sending before committing to a transmission; however, collisions can still happen and need to be detected by the MAC
MAC in ethernet is based on the CSMA/CD method:
● Carrier Sense (CS)
  ○ A device listens to the network and only transmits when no others are transmitting
● Multiple Access (MA)
  ○ Multiple devices can share the same medium
● Collision Detection (CD)
  ○ While a device is sending, it listens for other devices sending. If it hears one, a collision has occurred, and it waits for a random amount of time before sending again
Using ethernet with multiple local devices can be implemented in two ways
● Ethernet as a Shared Bus
  ○ Originally, all devices on an ethernet shared a single cable or bus. A consequence of this is that all devices receive all messages, so each message needs a destination address.
  ○ This is given in the form of a MAC address, which identifies a device’s NIC and consists of 6 bytes
  ○ Problems started to arise with the shared bus topology, so a star topology was used instead, where all devices connect to a hub, which takes control of the network while maintaining the behaviour of a shared network
● Switched Ethernet
  ○ The solution to the problems of hub-based, shared medium ethernet is to move from a logical bus topology to a logical star topology.
  ○ The network is no longer shared, and messages are sent from one device directly to another. This is known as a switched network.
  ○ Switches follow a simple approach. Initially, a switch doesn’t know where the destination devices are, so it sends a message for an unknown device to every connected device. As devices send and respond, the switch stores where each device is in a forwarding table. From then on, it avoids collisions by only sending messages over the necessary cables.
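The forwarding-table behaviour of a switch described above can be sketched in a few lines. This is a toy model only, not how switch hardware is built: the class name `LearningSwitch`, the port numbers, and the MAC addresses are all assumptions made for the illustration. It also assumes that the switch learns a device's location from the source address of each message it receives.

```python
from typing import Dict, List


class LearningSwitch:
    """Toy model of a switch that learns which port each MAC address is on."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        # Maps a MAC address to the port it was last seen on.
        self.forwarding_table: Dict[str, int] = {}

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Handle one message and return the ports it is sent out of."""
        # Learn: the sender is reachable through the port the message came in on.
        self.forwarding_table[src_mac] = in_port
        if dst_mac in self.forwarding_table:
            # Known destination: use only the necessary cable.
            return [self.forwarding_table[dst_mac]]
        # Unknown destination: send to every port except the incoming one.
        return [p for p in range(self.num_ports) if p != in_port]


# Hypothetical devices: A on port 0, B on port 2.
switch = LearningSwitch(num_ports=4)
first = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")
reply = switch.receive(2, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa")
second = switch.receive(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb")
print(first, reply, second)
```

The first message to B is sent to all other ports because B is unknown. Once B has replied, the second message to B goes out of a single port.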

Wireless Local Area Network (WLAN)
The advantages of WLAN are obvious
● No cables
● Flexible network access
● New ways to use the network
Since WLAN is wireless, it uses radio waves to communicate. The radio spectrum is strictly regulated. WLAN uses a part of the spectrum that can be used freely in a local area: the 2.4GHz or 5GHz band. IEEE defines a number of channels around the 2.4GHz band so that multiple WLAN devices can be used at a time

CHANNEL NUM  FREQ RANGE (GHz)
1            2.401 - 2.423
2            2.406 - 2.428
3            2.411 - 2.433
4            2.416 - 2.438
5            2.421 - 2.443
6            2.426 - 2.448
7            2.431 - 2.453
8            2.436 - 2.458
9            2.441 - 2.463
10           2.446 - 2.468
11           2.451 - 2.473
12           2.456 - 2.478
13           2.461 - 2.483

● WLAN Topology
  ○ Independent Basic Service Set (IBSS)
    ■ A number of devices that talk directly to each other
    ■ Acts like shared ethernet but wirelessly
  ○ Access Point (AP)
    ■ The access point is connected to the infrastructure of the network, and wireless devices connect to the access point
    ■ All communication is through the AP
    ■ Also known as an infrastructure BSS
  ○ Extended Service Set (ESS)
    ■ Multiple access points working together
    ■ They are all connected to the same network, use the same identifier (name), and have some overlap in coverage. That way a device can roam between APs
    ■ This is handled entirely within the data link layer
● WLAN MAC
  ○ MAC on WLAN is similar to that on shared ethernet; however, devices are less likely to detect each other’s transmissions. Because of this, WLAN uses CSMA/CA rather than CSMA/CD, where CA is collision avoidance
  ○ There are two different CA mechanisms
    ■ Automatic Repeat Request (ARQ)
      ● After transmitting, a device waits for an acknowledgement from the AP. If there is no response, the device waits a random time and resends
    ■ In addition, WLAN devices may use controlled access.
      In that case, a device sends a request to send (RTS) and only sends data once it receives a clear to send (CTS)

Network Layer
The main responsibility of the network layer is routing, which is the main mechanism for moving packets between networks. The network layer is implemented by the Internet Protocol, the IP part of TCP/IP

IP Addresses
Each device on the internet requires a unique address for each of its NICs.
● IPv4
  ○ An IPv4 address is a 32 bit address written in dot notation to represent its four bytes (each 0-255)
    ■ 130.94.66.43
  ○ IP addresses are hierarchical to make it easier to route a packet. The first 16 bits may identify the network, the next 8 the subnet, and the final 8 the device. This fixed split, however, limits how we can distribute the last 16 bits; we may need many subnets with few devices each. The solution is a subnet mask, which tells us how many bits are used for the network and subnet identifier.
    ■ 130.94.66.43/24 or
    ■ 130.94.66.43/255.255.255.0
● Subnets and LANs
  ○ There’s a link between subnet masks and the lower layers. Each subnet corresponds to a single LAN, so all devices in a LAN belong to the same subnet, while those outside the LAN have a different subnet. This enables devices to make decisions on where to send a packet.
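To make the subnet mask idea concrete, here is a small sketch using Python's standard `ipaddress` module. The address 130.94.66.43/24 is the one from the notes; the two comparison addresses and the helper name `same_subnet` are assumptions chosen only for the example.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses share the same network and subnet bits."""
    # strict=False lets us pass a device address rather than the subnet address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in network


network = ipaddress.ip_network("130.94.66.43/24", strict=False)
print(network.network_address)  # the address with all device bits set to zero
print(network.netmask)          # the same mask written in dot notation

# Hypothetical neighbours: one on the same LAN, one on a different subnet.
on_lan = same_subnet("130.94.66.43", "130.94.66.200", 24)
off_lan = same_subnet("130.94.66.43", "130.94.67.1", 24)
print(on_lan, off_lan)
```

A device can use exactly this kind of comparison to decide whether a destination is on its own LAN or has to be reached through a router.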
● IPv6
  ○ IP addresses that are 32 bits long only allow 2^32, or about 4.3 billion, addresses, which is far fewer than the actual number of devices. IPv6 addresses are 128 bits long, which allows 2^128, or about 340 undecillion, addresses
  ○ A typical IPv6 address is broken down as follows
    ■ Regional Internet Registry (RIR)
      ● An authority for allocating addresses in a region
    ■ Internet Service Provider (ISP)
      ● Provides addresses and backbone for an organisation
      ● Gives the organisation 16 unique bits
    ■ Organisation
      ● A body running its own network
    ■ Subnet
      ● An organisation can own 2^16 subnets
    ■ Interface ID
      ● Identifies a device in the subnet
      ● A device can have many IDs
      ● Provides anonymity

Address Resolution
The step that maps a higher layer address to a lower layer address: a host name (as used in a URL) to an IP address, or an IP address to a MAC address.
● The Domain Name System (DNS)
  ○ Responsible for mapping human readable names to IP addresses.
  ○ A client sends a DNS request (to find the IP) to a DNS server, which replies with one of the following
    ■ An error, if no IP is found
    ■ The IP of another DNS server that can handle the request
    ■ The IP address that was requested
● Mapping IP to MAC
  ○ This is only required within a LAN
  ○ The Address Resolution Protocol (ARP) broadcasts a request asking every device on the LAN who has a specific IP. The device with that IP (for example, the router) receives the request and replies with its MAC address

Routing
A router is a device that is connected to multiple networks and forwards packets between them.
● Routing Tables
  ○ Routers use routing tables to decide where to send packets. A routing table contains entries for different networks, and for each network it tells the router which other router can handle packets for that network
● Static Routing
  ○ The simplest mechanism for setting up a router is static routing, which means the router has a fixed table set by a human or a configuration protocol
● Dynamic Routing
  ○ The routers talk to each other to build accurate routing tables.
  ○ There are two types of dynamic routing protocols:
    ■ Distance Vector Routing
      ● Exchanges information about the distance to a destination network and the target router for that network
      ● Distance is measured by the number of routers a packet needs to go through
    ■ Link State Routing
      ● Exchanges information about distance and also the quality of a link

Transport Layer
The transport layer ensures packet integrity and delivery. The Transmission Control Protocol (TCP) provides a virtual circuit. To achieve this, TCP splits application layer messages into segments, makes sure the segments arrive correctly, and reassembles them at the destination.

Addressing Applications
Since many applications on a single device can use the network at the same time, there needs to be a way to decide which packet goes to which application. So each application is given a port number, which is written together with the IP address, e.g. 130.194.77.37:57017
There are also reserved ports which identify a service we may want to use
● 80 = HTTP
● 25 = SMTP

TCP Error Control and Session Management
TCP sets up a reliable channel by splitting application messages into segments and ensuring their arrival. TCP implements Automatic Repeat Request (ARQ) using two numbers sent with every segment: a sequence number and an acknowledgement number. TCP has three phases:
1. 3-way Handshake
   a. Client and server exchange initial sequence numbers to set up the connection
      i. Client sends a SYN packet
      ii. Server responds to the SYN with SYN and ACK
      iii. Client responds with ACK
2. Data Transmission
3. 4-way Handshake
   a. Client and server both send FIN packets upon completion, and each FIN is acknowledged

TCP Parameters
Maximum segment size:
● Determined by the smallest maximum frame size on the path between the two ends of the connection
● This limit is the Maximum Transmission Unit (MTU)
Speed of segment transmission:
● A number of segments are sent, and then the sender waits a certain time for an acknowledgement

The Internet
A network of networks: the internet is the collection of all devices that run the TCP/IP protocol suite and are connected via routers.

Autonomous Systems
At the highest level, the internet is made up of Autonomous Systems (AS), which are networks that are each operated by a single organisation. Splitting the internet into ASs is important for routing. When a packet is sent between devices in the same AS, it is routed internally; otherwise, it is routed externally via a border router.

Internet Structure and Peering
ASs need to be connected to each other. Most users access the internet via an ISP, using mobile networks, cable, or the NBN. ISPs can connect to each other at an Internet Exchange Point (IXP), which provides the hardware for multiple ISPs to connect.

Scaling Up the Internet
The original internet protocols were not designed for very large networks or for low latency. We'll look at three techniques that mitigate the problems
● Load Balancing
  ○ Many services on the internet are used by so many people simultaneously that they cannot be implemented with a single server computer.
  ○ The solution is to use multiple servers around the world. The server used is decided with DNS-based load balancing: the same DNS name resolves to different servers depending on where you are in the world.
● Content Caching
  ○ In addition to pointing a user to a geographically close server via DNS, a router can store a copy of frequently requested documents.
    This is content caching
  ○ Content caching is supported by HTTP: when a user makes a GET request, the server can respond with an additional header field, Expires, that states until when the cached copy may be used
● Content Delivery Networks
  ○ A content delivery network (CDN) is a collection of independent data centres connected by high speed dedicated networks. The CDN operator sells access to the CDN to content providers

Security

Topics Covered
Cryptography
Access Control, Passwords, and User Authentication
Security Protocols
Firewalls and Viruses

Cryptography
The ciphering and deciphering of messages in a secret code, or cipher. Also: the computerised encoding and decoding of information.

Symmetric Encryption
The main idea of a symmetric algorithm is to use the same key for encryption and decryption. Most classical ciphers are symmetric.
Sender: ‘Hello’ ⟶ key ⟶ ‘-AXFf>’
Receiver: ‘-AXFf>’ ⟶ key ⟶ ‘Hello’
Modern symmetric ciphers take blocks of a message and scramble them so that each change in the original block affects the complete encrypted block. As a result, there are no detectable dependencies between individual bits in the original block and specific bits in the encrypted block.
Advantages:
● Very efficient
● Low impact on message size
● Short keys
Disadvantages:
● Key distribution
  ○ One needs to establish a shared secret
● Scalability
  ○ Each pair of sender and receiver needs a unique key.
  ○ The number of keys grows quadratically with the number of participants: n participants need n(n-1)/2 keys
● Non-repudiation is impossible
  ○ There is no evidence of who generated a message, as sender and receiver have the same key

S-Boxes and Permutations
An S-Box is a lookup table that has the first hex digit of a byte on the side and the second hex digit on top. The two digits are replaced by the value found where that row and column intersect. The resulting substituted message is then permuted, i.e. the order of the bits is changed.

The Advanced Encryption Standard (AES)
AES uses 14 cycles in the 256-bit version.
AES works on message blocks. In its basic form it provides confidentiality for the encrypted data: only someone with the correct key can view it. AES is typically used with block chaining, so we start with an initialisation vector and then combine each encrypted block with the next plaintext block. Thus, blocks in the wrong order cannot be decrypted, and a changed block will disrupt decryption of the following blocks. AES is often used with Cipher Block Chaining (CBC) and a Message Authentication Code (MAC), so that one block can be used to check the integrity of the complete message.

Public Key Cryptography
Unlike symmetric cryptography, a public key system works with a pair of keys: a public key used for encryption and a private key used for decryption. The idea is to rely on a difficult mathematical problem to generate the key pair, such that the private key cannot be derived from the public key without solving the problem. This can solve the problems of symmetric cryptography, at the cost of larger keys and lower efficiency. The private key can also produce a digital signature for a message, which can be checked with the public key to verify who signed the message. RSA is a public key system whose security relies on the difficulty of finding the two prime factors of a large number. For secure encryption, algorithms should also rely on random numbers and hash functions.

Cryptographic Hash Functions
A hash function maps an input of arbitrary length to a fixed length output. Cryptographic hash functions have additional properties; in particular, they are infeasible to invert. They are used in digital signatures, storing and comparing passwords, etc. Ideal hash functions have the following properties:
● Computing a hash value is fast and uses few resources
● Given just a hash, it is infeasible to find the original message
● Hashes for similar messages shouldn't be correlated
● It is infeasible to find collisions

Generating Public and Private Keys
1. Pick two prime numbers, p and q
2. Let n = p * q
3. Let z = (p-1) * (q-1)
4. Let e be a prime number less than z that doesn't divide z. n and e are the public keys
5. Let d be our private key such that e * d = 1 (mod z), so that e * d = (k * z) + 1 for some natural number k

Encrypting a Message
1. P^e = E (mod n)
   ● P is our message
   ● n and e are the public keys
   ● E is the encrypted message
2. E = P^e - ((P^e // n) * n)

Decrypting a Message
1. E^d = P (mod n)
   ● E is the encrypted message
   ● d is the private key
   ● P is the message
   ● n is part of the public key
2. P = E^d - ((E^d // n) * n)

Security Protocols
We need protocols that specify how to use the algorithms to satisfy particular security goals. Some tasks for protocols include:
● Negotiate the cryptographic algorithms to use
● Include necessary information
● Interoperability, how to talk to each other
● Key exchange and the establishment of short-term session keys
● Combining security mechanisms

Transport Layer Security (TLS)
Cryptographic protocols that provide communications security over a computer network. SSL 2.0 was the first practical version, though we currently use TLS 1.2. TLS sits between the application layer and the transport layer.
TLS Phases:
1. TLS Handshake
   a. Results in a shared key and session ID
2. TLS Record
   a. After an exchange of ChangeCipherSpec, all subsequent traffic is encrypted
3. TLS Alert
   a. Closes a session

Certificates
A certificate provides additional information for a public key
● Validity (expiry date)
● Subject Name
● Issuer Name
● Other Parameters
A trusted certificate is signed by a known certification authority. Browsers have a list of these authorities.

Virtual Private Network (VPN)
A VPN logically connects a client, or network, to another network via an encrypted channel.
Tunnels can be established by TLS or IPSec, but security is only maintained within the tunnel.

IPSec
A protocol suite on the level of IP packets
● Authenticates and encrypts data for each IP packet
● Transport Mode
  ○ The payload of each packet is encrypted; the original IP header is kept and can be integrity-protected, but is not encrypted
● Tunnel Mode
  ○ Complete IP packets are encrypted and placed in a new IP packet

Firewalls and Viruses

Firewalls
A barrier between an internal network and an outside network. It filters traffic based on security rules
● Operates on the network layer
● Filters based on source, destination, protocols, ports, etc
  ○ Source IP Address
    ■ Any address can connect to a web server
    ■ Management access should be restricted to specific IPs
  ○ Destination IP Address
    ■ IP of the server running the service
    ■ Destination needs to be defined
    ■ Never allow any IP address
  ○ Destination Port
    ■ The service is accessed via a specific port
    ■ Never allow any port
● It inspects the first few bytes of TCP and UDP headers in IP packets
Firewalls should be set up on PCs, home networks, and company networks. The internal network should never be directly accessible.
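The filtering rules listed above can be sketched as a small rule table with a default-deny policy. This is only an illustration of the idea, not how a real firewall is configured: the `Rule` class, the `is_allowed` function and all addresses (taken from the ranges reserved for documentation) are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str        # allowed source network, in CIDR notation
    destination: str   # IP address of the server running the service
    port: int          # destination port of the service
    protocol: str = "tcp"


def is_allowed(rules, src_ip, dst_ip, dst_port, protocol="tcp"):
    """Return True if some rule matches the packet; otherwise deny."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (protocol == rule.protocol
                and dst_port == rule.port
                and dst == ipaddress.ip_address(rule.destination)
                and src in ipaddress.ip_network(rule.source)):
            return True
    return False  # default deny: anything not explicitly allowed is dropped


# Hypothetical rule set: the destination and port are always specific.
RULES = [
    Rule("0.0.0.0/0", "192.0.2.10", 80),        # web server: any source may connect
    Rule("198.51.100.0/24", "192.0.2.10", 22),  # management: restricted sources only
]

print(is_allowed(RULES, "203.0.113.7", "192.0.2.10", 80))   # True
print(is_allowed(RULES, "203.0.113.7", "192.0.2.10", 22))   # False
print(is_allowed(RULES, "198.51.100.5", "192.0.2.10", 22))  # True
```

Only the source is ever left open ("any address"), and only for the public web service; destination and port are always named explicitly, which mirrors the "never allow any IP address" and "never allow any port" rules.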
Demilitarized Zone (DMZ)
A zone less secure than the internal network, but protected from direct access

Proxies and NAT
● Proxies
  ○ Hide individual devices in the internal network
● Network and Port-Address Translation (NAT)
  ○ The internal network uses internal IP addresses that are not visible to the outside

IDS and IPS
● Intrusion Detection System (IDS)
  ○ Monitors network and system activities
  ○ Alerts when malicious activity is found
  ○ Logs information
● Intrusion Prevention System (IPS)
  ○ IDS with additional functionality
  ○ Blocks malicious activities
  ○ Signature or anomaly based

Anti-Virus Software
● Prevents infections from known malware, but cannot detect new malware
● Computer Worm
  ○ A standalone malware program that replicates itself in order to spread to other computers
● Phishing
  ○ The use of impersonation to convince a victim to share private information
● Ransomware
  ○ Malicious software, installed on a victim's device, that threatens the victim in order to extort money
● Botnets
  ○ A network of compromised devices, controlled by an attacker, that runs automated attacks aiming to prevent a service from working, most commonly through a Denial of Service attack
  ○ Distributed Denial of Service (DDoS)
    ■ An attack that prevents a service from working by making massive numbers of requests from many devices, causing traffic which brings the service to a halt
● Virus
  ○ Inserts itself into a program or document, and runs and spreads when that program or document is executed or opened
● Trojan
  ○ Malware disguised as a legitimate piece of software

Security Properties
● Authenticity
  ○ Something has definitely happened in the way we assume
● Integrity
  ○ Some data has not been changed since some authentic event
● Confidentiality
  ○ Some information is only known to some principals
● Privacy
  ○ Protection of personal information
● Availability
  ○ Some service or resource can be used within a particular time with a particular quality
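
Worked Example: RSA
As a worked example for the Cryptography topic, the RSA key generation, encryption and decryption steps can be traced with small numbers. This is a toy sketch only: the primes p = 61 and q = 53, the exponent e = 17 and the message P = 65 are chosen for illustration, the function names are made up, and real RSA uses primes hundreds of digits long together with padding.

```python
def generate_keys(p, q, e):
    """Follow the key generation steps for given primes p, q and exponent e."""
    n = p * q
    z = (p - 1) * (q - 1)
    # d is the inverse of e modulo z, so that e * d = 1 (mod z).
    # pow with a negative exponent needs Python 3.8 or later.
    d = pow(e, -1, z)
    return (n, e), d


def encrypt(message, public_key):
    n, e = public_key
    return pow(message, e, n)   # E = P^e mod n


def decrypt(cipher, d, n):
    return pow(cipher, d, n)    # P = E^d mod n


public_key, d = generate_keys(61, 53, 17)
cipher = encrypt(65, public_key)
plain = decrypt(cipher, d, public_key[0])

print(public_key)  # (3233, 17)
print(d)           # 2753
print(cipher)      # 2790
print(plain)       # 65
```

The three-argument form of `pow` computes the remainder directly, which gives the same result as the subtraction formula E = P^e - ((P^e // n) * n) without building the very large intermediate number P^e.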