AQA GCSE (9–1) Computer Science
Second Edition
FOR THE 8525 SPECIFICATION
George Rouse, Lorne Pearcey, Gavin Craddock

Although every effort has been made to ensure that website addresses are correct at time of going to press, Hodder Education cannot be held responsible for the content of any website mentioned in this book. It is sometimes possible to find a relocated web page by typing in the address of the home page for a website in the URL window of your browser.

ISBN: 978 1 5104 8430 6
© George Rouse, Lorne Pearcey and Gavin Craddock 2020
First published in 2020 by Hodder Education, An Hachette UK Company, Carmelite House, 50 Victoria Embankment, London EC4Y 0DZ
www.hoddereducation.co.uk

All rights reserved. Apart from any use permitted under UK copyright law, no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, or held within any information storage and retrieval system, without permission in writing from the publisher or under licence from the Copyright Licensing Agency Limited. Further details of such licences (for reprographic reproduction) may be obtained from the Copyright Licensing Agency Limited, www.cla.co.uk

Cover illustration © By ToheyVector – stock.adobe.com
Illustrations by Integra Software Services Pvt. Ltd., Pondicherry, India.
Typeset in Integra Software Services Pvt. Ltd., Pondicherry, India.
Printed in Italy
A catalogue record for this title is available from the British Library.
CONTENTS

How to use this book

1 FUNDAMENTALS OF ALGORITHMS
1.1 Representing algorithms
1.2 Efficiency of simple algorithms
1.3 Sorting and searching algorithms

2 PROGRAMMING
2.1 Data types
2.2 Programming concepts
2.3 Arithmetic operations
2.4 Relational operations
2.5 Boolean operations
2.6 Data structures
2.7 Input and output
2.8 String-handling operations
2.9 Random number generation
2.10 Structured programming and subroutines
2.11 Robust and secure programming

3 FUNDAMENTALS OF DATA REPRESENTATION
3.1 Number bases
3.2 Converting between number bases
3.3 Units of information
3.4 Binary arithmetic
3.5 Character encoding
3.6 Representing images
3.7 Representing sound
3.8 Data compression

4 COMPUTER SYSTEMS
4.1 Hardware and software
4.2 Boolean logic
4.3 Software classification
4.4 Classification of programming languages and translators
4.5 Systems architecture

5 FUNDAMENTALS OF COMPUTER NETWORKS
5.1 Computer networks

6 CYBER SECURITY
6.1 Fundamentals of cyber security
6.2 Cyber security threats
6.3 Methods to detect and prevent cyber security threats

7 RELATIONAL DATABASES AND STRUCTURED QUERY LANGUAGE (SQL)
7.1 Relational databases
7.2 Structured Query Language (SQL)

8 ETHICAL, LEGAL AND ENVIRONMENTAL IMPACTS OF DIGITAL TECHNOLOGY ON WIDER SOCIETY
8.1 Ethical, legal and environmental impacts and risks of digital technology

GLOSSARY
KNOWLEDGE CHECK ANSWERS
QUESTION PRACTICE ANSWERS
INDEX
ACKNOWLEDGEMENTS

HOW TO USE THIS BOOK

To help you get the most out of it, this textbook uses the following learning features:

Important words
Highlighted in green, these are terms that you will be expected to know and understand in your exams.
    Important words
    You will need to know and understand the following terms:
    algorithm
    decomposition
    abstraction

Tech terms
Jargon or technical definitions in blue that you may find useful.

    Tech terms
    Inverter: An alternative name for a NOT gate, because the output inverts the input.

Key point
An important idea or concept.

    Key point
    A computer program is an implementation of an algorithm – an algorithm is not a computer program!

Worked examples
Used to illustrate an idea or a concept, these will guide you through the reasoning behind each step of a calculation or process.

    Worked example
    FOR a ← 1 to 4
        FOR b ← 1 to 3
            OUTPUT a * b
        ENDFOR
    ENDFOR

Beyond the spec
Information that you will not be expected to know or state in an exam but will aid understanding or add some useful context.

    Beyond the spec
    Computer Scientists use ‘Big O notation’ to measure mathematically how efficient algorithms are. Big O attempts to measure the change in efficiency as the size of the data input increases. This is covered fully in the A-level Computer Science specification.

Knowledge check
Quick check-ins that help you to recap and consolidate your understanding of the previous section.

    Knowledge check
    1 Define the following terms:
      (a) Decomposition
      (b) Abstraction
    2 A programmer is creating software to send and receive encrypted messages via email. Describe how decomposition can be used to allow this problem to be solved.

Recap and review
A targeted summary of everything you have learned in the chapter. Use this to help you recap as you work through your course.

Question practice
More formal questions to help you to prepare for both examination papers.

Answers
Answers to all of the Knowledge check and Question practice questions are at the back of the book and also at www.hoddereducation.co.uk/AQAGCSEComputerScience.
1 FUNDAMENTALS OF ALGORITHMS

CHAPTER INTRODUCTION
In this chapter you will learn about:

1.1 Representing algorithms
➤ Understand and explain the term algorithm
➤ Understand and explain the terms decomposition and abstraction
➤ Explain algorithms in terms of their inputs, outputs and processing
➤ Use a systematic approach to problem solving including pseudo-code, program code and flowcharts
➤ Determine the purpose of simple algorithms using trace tables and visual inspection

1.2 Efficiency of simple algorithms
➤ Understand that more than one algorithm can be used to solve the same problem
➤ Compare the efficiency of algorithms

1.3 Sorting and searching algorithms
➤ Understand and explain how linear search and binary search algorithms work
➤ Understand and explain how merge sort and bubble sort algorithms work

1.1 Representing algorithms

Computers are simply electronic machines that carry out instructions given by human programmers. Although a computer can solve many complex problems, it can only do so if it is given instructions that tell it how. Writing these instructions in a form that is suitable for a computer to carry out can be a challenging task!

An algorithm is a step-by-step sequence of instructions that are used to solve a problem or complete a task. Each instruction in an algorithm must be precise enough to be understood independently. An algorithm must also clearly show the order in which instructions are carried out and where decisions are made or sections repeated.

Key point
A computer program is an implementation of an algorithm – an algorithm is not a computer program!

Decomposition and abstraction

In order to define algorithms effectively, decomposition and abstraction are often used.
● Decomposition is breaking a problem down into smaller sub-problems that each accomplish an identifiable task, which might itself be further subdivided.
● Abstraction is removing or hiding unnecessary details from a problem so that the important details can be focused on or more easily understood.

For example, imagine a programmer wishes to create a system to sell T-shirts. This could be a relatively large problem. Where would they start? By using decomposition and abstraction, the programmer can begin to work out how to tackle the problem.

Decomposition
How can the T-shirt sales system be split up? It perhaps involves:
● A login system for users
● A search function to find a particular T-shirt
● A system to allow users to buy a T-shirt
● A reordering system

Abstraction
What can we focus on and what can we ignore? Let’s consider the login and search functions:
● The login system could focus on a customer’s email address and password but ignore everything else.
● The search function could focus only on certain T-shirt details (for example colour, size, material, price, how many in stock).

A particularly powerful example of abstraction is Harry Beck’s well-known London Underground map. The map focuses on the connections between stations and how lines intersect. It is a useful tool for anyone who wants to plan a journey in London. However, it ignores details such as tunnels, distances between stations and which streets pass overhead. These details are not important for Underground travellers and would make the map more confusing if included.

To see Harry Beck's original design, go to the Transport for London website, at https://tfl.gov.uk/corporate/about-tfl/culture-and-heritage/art-and-design/harrybecks-tube-map.
Figure 1.1 A modern interpretation of a transport system map

Knowledge check
1 Define the following terms:
  (a) Decomposition
  (b) Abstraction
2 A programmer is creating software to send and receive encrypted messages via email. Describe how decomposition can be used to allow this problem to be solved.
3 A chess club develops a system to store details about games played. For each game, the winner's and loser’s names are stored alongside the date that the game was played. Explain how abstraction has been used in the development of this system.

Explain simple algorithms in terms of their inputs, processing and outputs

An algorithm can be represented by the diagram below:

Input → Process → Output
Figure 1.2 Input–Process–Output

Input refers to data that is given to the algorithm by the user. This may be typed in via the keyboard, entered with another input device such as a microphone or read in from an external data file.

Output is the data that is given back to the user by the algorithm. This may be done with a message on the screen, using another output device such as a printer or by writing to an external data file.

Processing describes the steps that the algorithm takes in order to solve the problem. This stage involves processing the given input in order to produce the desired output.

Going back to the example of the system to sell T-shirts, what are the inputs, outputs and processes for the section of the system that allows the user to search for suitable T-shirts?

The inputs include everything that the system requires the user to enter: for example, the size, colour and style of T-shirt that they require. Another input would be an external file containing details of all the T-shirts available.
With these inputs, the system will then carry out processes to find T-shirts to match the user’s search criteria, such as only selecting those in the correct size and colour and excluding any that are out of stock. Once this has been completed, a list of matching T-shirts will be produced as an output to the user.

Input:
● Size of T-shirt required
● Colour of T-shirt required
● File of all T-shirts available
Process:
● Search through list
● Find matching T-shirts
● Exclude T-shirts out of stock
Output:
● List of T-shirts that meet customer requirements
Figure 1.3 Example of Input–Process–Output

Problem solving and algorithm creation

Algorithms can be created using flowcharts, pseudo-code or program code.

Flowcharts

The inputs, processes and outputs can be put together into an algorithm by using a flowchart. This is a graphical representation of an algorithm and uses symbols to denote each step, with arrows showing how to move between each step.

A flowchart may use any of the following symbols:

Line: An arrow represents control passing between the connected shapes.
Process: This shape represents something being performed or done.
Subroutine: This shape represents a subroutine call that will relate to a separate, non-linked flowchart.
Input/output: This shape represents the input or output of something into or out of the flowchart.
Decision: This shape represents a decision (Yes/No or True/False) that results in two lines representing the different possible outcomes.
Terminal: This shape represents the ‘Start’ and ‘End’ of the process.
Figure 1.4 Flow diagram symbols

All flowcharts begin and end with the terminal shape, indicating the start and end of the flowchart. Inputs and outputs are represented by a parallelogram, with decisions using a diamond shape. Decision boxes must have two possible outcomes: True/False or Yes/No. All other processes are shown as a rectangle.
Where algorithms are decomposed into separate subroutines, a rectangle with two additional vertical lines is used to show a call to a different subroutine.

The following flowchart shows part of an algorithm for the T-shirt system that deals with re-ordering T-shirts when stock runs low.

    Start
    → Enter ItemCode
    → Read StockLevel, ReorderLevel from User
    → Is StockLevel ≤ ReorderLevel?
        True:  Call Reorder Subroutine → Output ‘Stock ordered’ → End
        False: Output ‘Stock not ordered’ → End
Figure 1.5 Flow diagram for T-shirt reordering system

Pseudo-code

Alternatively, an algorithm may be represented using pseudo-code. Pseudo-code is a textual, English-like method of describing an algorithm. It is much less strict than high-level programming languages, although it may look a little like a program that could be entered directly into a computer.

Key point
In general, pseudo-code is not strictly defined so many variations would be acceptable. For example, the keyword OUTPUT could instead be replaced with PRINT, or even DISPLAY – as long as the steps to be taken are clear, that is sufficient.

Key point
Questions in the exam use AQA Pseudo-code to define the algorithm. This has a particular syntax – an AQA Pseudo-code guide is available on the AQA website. All examples in this book will also be written using AQA Pseudo-code.

The same T-shirt reordering system that was described in a flowchart could also be represented in pseudo-code as follows:

    ItemCode ← USERINPUT
    StockLevel ← USERINPUT
    ReorderLevel ← USERINPUT
    IF StockLevel ≤ ReorderLevel THEN
        ReOrder()
        OUTPUT 'Stock ordered'
    ELSE
        OUTPUT 'Stock not ordered'
    ENDIF

Program code

Program code refers to instructions given in a high-level language. For the AQA GCSE Computer Science specification, you must choose to use either Python, VB.Net or C#; each of these languages has a separate examination paper.
Where answers are requested as program code, it is important that you are as precise with the syntax as when you are actually coding. Your code answers should work if they were typed into your computer.

The following shows the T-shirt ordering algorithm written in each of these high-level languages. In each version, ReOrder() is a subroutine that is assumed to be defined elsewhere in the program.

Python

    # Read the item code and the two stock figures from the user
    ItemCode = input("enter item code")
    StockLevel = int(input("enter stock level"))
    ReorderLevel = int(input("enter reorder level"))
    # Reorder when stock has fallen to or below the reorder level
    if StockLevel <= ReorderLevel:
        ReOrder()
        print("stock ordered")
    else:
        print("stock not ordered")

VB.Net

    Dim ItemCode As String
    Dim StockLevel As Integer
    Dim ReorderLevel As Integer
    ItemCode = Console.ReadLine()
    ' The text read in is converted to Integer implicitly (requires Option Strict Off)
    StockLevel = Console.ReadLine()
    ReorderLevel = Console.ReadLine()
    If StockLevel <= ReorderLevel Then
        ReOrder()
        Console.WriteLine("stock ordered")
    Else
        Console.WriteLine("stock not ordered")
    End If

C#

    string ItemCode;
    int StockLevel;
    int ReorderLevel;
    ItemCode = Console.ReadLine();
    // Convert the text read in to whole numbers
    StockLevel = Convert.ToInt32(Console.ReadLine());
    ReorderLevel = Convert.ToInt32(Console.ReadLine());
    if (StockLevel <= ReorderLevel)
    {
        ReOrder();
        Console.WriteLine("stock ordered");
    }
    else
    {
        Console.WriteLine("stock not ordered");
    }

Determine the purpose of simple algorithms

Trace tables

If an algorithm has been given, its purpose can be found by checking what it does. Very simple algorithms can be visually inspected to check their purpose – for instance, it might follow the same pattern as an algorithm you are familiar with, or it might be so simple that you can clearly see what the algorithm is doing.

For more complex algorithms visual inspection becomes very difficult. In these cases, a trace table can be used to follow each line of an algorithm through, step by step. The trace table will list each line of code, all of the variables and the outputs. By filling in the table you can track the contents of each variable and the outputs after each line has been carried out.
By manually following through an algorithm in this way, you can check the purpose of each line of the algorithm and the algorithm as a whole.

For example, we can trace through the pseudo-code T-shirt reordering algorithm shown previously when considering stock item 009. Let’s assume we currently have eight items in stock and it has a reorder level of three.

Line | ItemCode | StockLevel | ReorderLevel | Output | Comments
01: ItemCode ← USERINPUT | 009 | | | | Item code 009 entered by user
02: StockLevel ← USERINPUT | 009 | 8 | | | Stock level of 8 entered by user
03: ReorderLevel ← USERINPUT | 009 | 8 | 3 | | Reorder level of 3 entered by user
04: IF StockLevel ≤ ReorderLevel THEN | 009 | 8 | 3 | | 8 is NOT smaller than or equal to 3, so line 7 executed next
07: OUTPUT "Stock not ordered" | 009 | 8 | 3 | 'Stock not ordered' | Correct output printed

By tracing this algorithm using a trace table, we can see that it decides whether to reorder the stock based on whether the current stock level has fallen to or below the reorder level. We could also have traced through with stock values that were just equal to the reorder level, and with another item that was out of stock, to be sure that this algorithm works perfectly in every scenario. (We will consider this more thoroughly when discussing testing in Chapter 2.)

Key point
A trace table shows the values of each variable after each step has been executed. Although the item code is only entered on line 1, it still holds the same value throughout the rest of the program and so its value is repeated in later lines of the trace table.

Knowledge check
4 The following algorithm has been designed to decide which of two numbers is the larger. Complete the trace table to check that the algorithm works correctly when the values 8 and 5 are entered by the user.
    01 NumOne ← USERINPUT
    02 NumTwo ← USERINPUT
    03 IF NumOne ≥ NumTwo THEN
    04     OUTPUT NumTwo
    05 ELSE
    06     OUTPUT NumOne
    07 ENDIF

Line | NumOne | NumTwo | Output | Comments

1.2 Efficiency of simple algorithms

An important point in computer science is that there are usually many ways of solving the same problem. For example, even to print out the numbers one to five could be done in any of the following ways:

    #Version 1
    OUTPUT 1
    OUTPUT 2
    OUTPUT 3
    OUTPUT 4
    OUTPUT 5

    #Version 2
    FOR x ← 1 to 5
        OUTPUT x
    ENDFOR

    #Version 3
    x ← 1
    WHILE x ≤ 5
        OUTPUT x
        x ← x + 1
    ENDWHILE

Each of the algorithms above will produce exactly the same output. If we changed the algorithm to print out the numbers 1 to 100 000, it is perhaps obvious that the first example would be more time-consuming to write out whereas the others would be simpler to change. However, this does not in itself make the first algorithm any less efficient than the other two.

The efficiency of an algorithm is the time taken for a computer to complete that algorithm. Because computers all run at different speeds, we count up the number of steps in an algorithm as a rough guide to its efficiency.

The following two algorithms complete the same task: finding the greatest common divisor of two numbers.

    #Algorithm 1
    a ← USERINPUT
    b ← USERINPUT
    WHILE b ≠ 0
        IF a > b THEN
            a ← a - b
        ELSE
            b ← b - a
        ENDIF
    ENDWHILE
    OUTPUT a

    #Algorithm 2
    a ← USERINPUT
    b ← USERINPUT
    gcd ← 0
    IF a > b THEN
        large ← a
    ELSE
        large ← b
    ENDIF
    FOR x ← 1 TO large
        IF (a MOD x = 0) AND (b MOD x = 0) THEN
            gcd ← x
        ENDIF
    ENDFOR
    OUTPUT gcd

Beyond the spec
Computer Scientists use ‘Big O notation’ to measure mathematically how efficient algorithms are. Big O attempts to measure the change in efficiency as the size of the data input increases. This is covered fully in the A-level Computer Science specification.
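As a rough illustration of counting steps, the two greatest-common-divisor algorithms above can be sketched in Python. This is only a sketch under stated assumptions: the function names are hypothetical, the inputs are passed as parameters rather than typed in, both inputs are assumed to be positive whole numbers, and one "step" is taken to mean one pass through the loop.

```python
def gcd_by_subtraction(a, b):
    """Algorithm 1: repeated subtraction. Returns (gcd, steps)."""
    steps = 0  # assumption: one step = one pass through the loop
    while b != 0:
        if a > b:
            a = a - b
        else:
            b = b - a
        steps += 1
    return a, steps


def gcd_by_trial_division(a, b):
    """Algorithm 2: try every integer up to the larger number. Returns (gcd, steps)."""
    gcd = 0
    large = a if a > b else b
    steps = 0
    for x in range(1, large + 1):
        if a % x == 0 and b % x == 0:
            gcd = x  # x divides both numbers, so remember it
        steps += 1
    return gcd, steps


print(gcd_by_subtraction(10, 50))     # (10, 5)
print(gcd_by_trial_division(10, 50))  # (10, 50)
```

Both functions return the same divisor, but for 10 and 50 the first needs 5 passes through its loop and the second needs 50.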
Algorithm 1 does this using a special mathematical technique (called the Euclidean algorithm) that works by subtracting one number from the other, depending on which is larger, until zero is reached. Algorithm 2 simply tries dividing both numbers by every integer from 1 up to the larger of the two numbers. Both will produce the same result but algorithm 2 will usually take many more steps to find the answer. We can therefore say that algorithm 1 is more efficient.

The efficiencies of these two algorithms also depend on the input values. Running the algorithms with numbers such as 10 and 50 will show a small difference in efficiency between them, but if numbers in the millions were used algorithm 1 would typically be much more efficient than algorithm 2.

1.3 Sorting and searching algorithms

One of the great things about an algorithm is that it can be reused. Once a computer scientist has written down a clever set of instructions for how to do something, other people can simply follow those instructions to solve the same problem.

A sorting algorithm is a set of instructions used to put a list of values into order. A searching algorithm is used to find a value within a list, or to confirm that value is not present in the list.

Beyond the spec
The GCSE Computer Science specification covers two sorting algorithms and two searching algorithms, but there are many, many more that each work in different ways. Why not investigate how quick sort, insertion sort, shell sort or even bogo sort work?

Sorting algorithms

Bubble sort

The bubble sort algorithm works by comparing adjacent pairs of values. If the two values are in the wrong order with respect to each other, they are swapped around. This is repeated for each adjacent pair of values. When the last pair of values has been compared, the first pass of the bubble sort algorithm is complete. The algorithm then repeats the whole process again in a second pass, then a third pass and so on.
It will continue to repeat until a pass has been completed with no swaps occurring. Once this happens, the list is guaranteed to be in order.

Worked example
The following list of numbers will be sorted into ascending order using the bubble sort algorithm.

    7 2 9 4 3

The first two values in the list, 7 and 2, are compared. These are in the wrong order, so they are swapped over.

    2 7 9 4 3

Now 7 and 9 are compared. These are in the correct order and so no swap is necessary.

    2 7 9 4 3

On the next step, 9 and 4 are compared. These are in the wrong order and so are swapped.

    2 7 4 9 3

Finally, 9 and 3 are compared, which are in the wrong order and so are swapped.

    2 7 4 3 9

The first pass of the bubble sort algorithm has been completed. However, at least one swap has taken place and so the algorithm is repeated. After this second pass, the numbers will be in the following order, with two swaps having taken place:

    2 4 3 7 9

Again, because swaps have taken place, the algorithm must repeat. This time, only one swap is needed, giving the following list:

    2 3 4 7 9

Key point
The list now appears to be in numerical order. However, the algorithm only stops when a pass is completed without any swaps taking place. This is not yet the case.

The final pass of the algorithm compares each pair of numbers and finds no numbers that need to be swapped. The algorithm is therefore complete and the values are in order.

The bubble sort algorithm gets its name because numbers ‘bubble’ to the top after every pass.
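The passes in the worked example above can be reproduced with a short Python sketch. It is one possible way to write a bubble sort, not a required form: the function name and the pass counter are hypothetical additions for illustration, and the function sorts a copy so that the original list is left unchanged.

```python
def bubble_sort(values):
    """Sort a copy of values into ascending order. Returns (sorted_list, passes)."""
    items = list(values)  # work on a copy
    passes = 0
    swapped = True
    while swapped:  # keep going until a pass makes no swaps
        swapped = False
        for x in range(len(items) - 1):
            # items[x] and items[x + 1] form one adjacent pair
            if items[x] > items[x + 1]:
                items[x], items[x + 1] = items[x + 1], items[x]
                swapped = True
        passes += 1
    return items, passes


print(bubble_sort([7, 2, 9, 4, 3]))  # ([2, 3, 4, 7, 9], 4)
```

The result shows four passes: three that make swaps and a final pass that confirms no swaps are needed.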
We can write out the bubble sort algorithm in very informal pseudo-code as follows:

    REPEAT
        REPEAT for each pair in list
            # where list[x] is 1st item in a pair
            # and where list[x + 1] is 2nd item in a pair
            IF list[x] > list[x + 1] THEN
                swap list[x], list[x + 1]
            ENDIF
        UNTIL all pairs of values checked
    UNTIL no swaps made

Key point
At GCSE level, you will not be expected to remember the sorting and searching algorithms in pseudo-code, but you are expected to understand how they work.

Merge sort

Key point
You will be expected to know how to perform a merge sort on a list with an even or odd number of elements. With an odd number of elements there are two choices of where to divide lists – you can choose either but you must apply your choice consistently. Each of the merge stages must resemble the divide stages, just with elements in a different order.

The merge sort algorithm uses a ‘divide and conquer’ approach to split data up into individual lists and then merge them back together in order. The way that the lists are merged back together is key to understanding how this algorithm works.

For example, we have a list like this one below and we want to sort it in ascending order (from lowest to highest):

    7 2 9 4 3 8 5 1

First, in the ‘divide’ stage the list of values is split into two separate sublists. Each sublist is repeatedly split in half until we have individual lists of size 1 each.

    [7] [2] [9] [4] [3] [8] [5] [1]

Then each pair of lists is merged together into a new list in the ‘conquer’ stage. Where there is an uneven number of lists, the odd list will simply remain unmerged until the next step in the process. When two lists are merged together, the first numbers in the two lists are compared and whichever should be first is taken to be first in the new list. This process is repeated until all numbers have been inserted into the new list.
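The merge step described above can be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: the function names are hypothetical, index counters stand in for removing values from the old lists, and merge_sort is written recursively (split in half, sort each half, merge the results), which is an assumption about structure rather than the level-by-level layout of the diagrams. The final sorted list is the same either way.

```python
def merge(left, right):
    """Merge two lists that are each already in ascending order."""
    merged = []
    i = 0  # position of the first unused value in left
    j = 0  # position of the first unused value in right
    while i < len(left) and j < len(right):
        # Only the first unused value of each list needs comparing
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is used up, so the rest of the other is already in order
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Divide the list in half until size 1, then merge the halves in order."""
    if len(values) <= 1:
        return list(values)
    middle = len(values) // 2
    return merge(merge_sort(values[:middle]), merge_sort(values[middle:]))


print(merge([2, 7], [4, 9]))                 # [2, 4, 7, 9]
print(merge_sort([7, 2, 9, 4, 3, 8, 5, 1]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```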
    [7] [2] [9] [4] [3] [8] [5] [1]
    [2 7] [4 9] [3 8] [1 5]

Here, 7 and 2 are compared, with 2 being inserted into a new list before 7. Similarly, 4 is inserted before 9 in the next merged list. This continues for the other pairs of lists, merging them together into four new lists.

The merging process is repeated again to merge pairs of lists together. This time the first list is made of 2 and 7, and the second list is made of 4 and 9. The algorithm compares the first numbers in each list, 2 and 4, to decide which value will be first in the new list. 2 is inserted into the new list. Next, 7 and 4 are compared, with 4 being inserted. Then, 7 and 9 are compared, with 7 being inserted before 9 into the first new list. The same process is then repeated for the lists made of 3 and 8, and 1 and 5, leaving two new lists in order, each made of four numbers.

    [2 7] [4 9] [3 8] [1 5]
    [2 4 7 9] [1 3 5 8]

The merging process is again repeated with the final two lists. The first numbers in each list (2 and 1) are compared to decide which value will be first in the merged list, with 1 being inserted. 2 is then compared against 3, with 2 being inserted into the new list next. This continues until all numbers have been merged from the two lists into the final list. This final list is now in order.

    [2 4 7 9] [1 3 5 8]
    [1 2 3 4 5 7 8 9]

Key point
At any point in the merging stage of this algorithm, each merged list is always in order. This means that it is only the first numbers from each list that need to be considered when deciding on how to merge them.

We can write out the merge sort algorithm in very informal pseudo-code as follows:

    REPEAT
        Divide each list in half
    UNTIL each list is of size 1

When all numbers are split up into separate lists, the merge stage can begin.
This uses the following steps in very informal pseudo-code:

    REPEAT
        REPEAT
            REPEAT
                Compare the first value in both lists
                Insert smaller of two values into new merged list
                Remove value from old list
            UNTIL all numbers in pairs of lists are merged
        UNTIL each pair of lists have been considered
    UNTIL all lists are merged

If at any stage an odd number of lists are present, the odd list can simply be ignored until the next iteration.

Comparison of sorting algorithms

Both of the above algorithms will result in a sorted list, but they do it in very different ways. As previously discussed, an algorithm's efficiency depends on the number of steps it takes to execute – the more steps, the lower the efficiency.

A bubble sort is generally thought of as a simple but slow algorithm – as the size of the list of values increases, it slows down significantly because it requires multiple passes over the same data. A merge sort is much more efficient, especially with large lists of values.

Knowledge check
5 Explain how a bubble sort would sort the values [6, 9, 2, 5, 8] into order.
6 A merge sort is an example of a divide and conquer algorithm. State what happens during the divide stage.

Searching algorithms

A searching algorithm is used to find an item of data in a list, or to confirm that it is not in the list. The simplest searching algorithm is a linear search.

Linear search

A linear search is carried out by inspecting each item of the list in turn to check if it is the desired value. If it is, we have found the item; if it is not, the next item in the list must be checked. If the algorithm gets to the end of the list without finding the item, then it is not in the list.

    7 2 9 4 3

To find the value 9 in this list, the algorithm would first check 7 and then check 2 before finally finding 9. To find the value 8, the algorithm would check every item in the list (7, 2, 9, 4 and 3).
Only after checking the last value can we be sure that 8 is not in the list.

A linear search is simple but inefficient. If we have a list of a million values, the algorithm would have to check all one million of them before being sure that a particular value was not in the list.

We can write out the linear search algorithm in very informal pseudo-code as follows:

    REPEAT
        Check one value (from the start)
        IF value matches what is being searched for THEN
            Output value
        ELSE
            Move to next value
        ENDIF
    UNTIL number is found OR end of list is reached

Binary search

A much more efficient algorithm to find values in a list is a binary search. However, this algorithm has the pre-requisite that the list it searches must be in order. A binary search on an unsorted list will not work.

The binary search algorithm picks the middle value in the sorted list. ‘Middle’ means there are equal numbers of values either side of it. If there are an even number of values in the list then there isn’t an exact middle value – however, generally the value to the left of the middle is chosen. (Either side can be picked as long as we are consistent.)

If the middle value is the one that we are searching for then the algorithm finishes. However, if not, then the algorithm can discard half of the list because the value we are looking for cannot be in it. It can discard the lower half of the list if the middle value is smaller than the one we are searching for (as all of the values in the lower half will be smaller than the middle value); or it can discard the upper half of the list if the middle number is larger than the one we are searching for (as all of the values in the upper half will be larger than the middle value). Either way, we always also discard the middle value.

If we get to a situation where the list only has one item and it is not the one that we are searching for, then the value is not in the list.
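The two searches described above can be sketched in Python to compare how many values each one checks. Treat this as a sketch under assumptions: the function names are hypothetical, each function returns the position of the value (or -1 if it is absent) together with the number of values it checked, and the binary search narrows a low/high range instead of physically discarding parts of the list.

```python
def linear_search(values, target):
    """Check each value in turn. Returns (index or -1, number of checks)."""
    checks = 0
    for index, value in enumerate(values):
        checks += 1
        if value == target:
            return index, checks
    return -1, checks  # reached the end without finding the target


def binary_search(sorted_values, target):
    """Search a list that is already in order. Returns (index or -1, number of checks)."""
    low = 0
    high = len(sorted_values) - 1
    checks = 0
    while low <= high:
        middle = (low + high) // 2  # left-of-middle when the count is even
        checks += 1
        if sorted_values[middle] == target:
            return middle, checks
        if sorted_values[middle] < target:
            low = middle + 1   # discard the middle value and the lower half
        else:
            high = middle - 1  # discard the middle value and the upper half
    return -1, checks


print(linear_search([7, 2, 9, 4, 3], 9))          # (2, 3)
print(linear_search([7, 2, 9, 4, 3], 8))          # (-1, 5)
print(binary_search(list("ACDFHKPQSTVWZ"), "Q"))  # (7, 3)
```

In the last call the binary search checks only three of the thirteen letters before it finds Q.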
Worked example

The list below is in alphabetical order and so can be used in a binary search. We will look for the value Q in this list.

    A  C  D  F  H  K  P  Q  S  T  V  W  Z

We first take the middle value P. This is not the value that we are looking for and is smaller than Q alphabetically, so we can discard the lower half of the list, up to and including P.

    Q  S  T  V  W  Z

We now have a list of six values. As there is no middle value, we pick the value to the left of the middle, which is T. This is not the value that we are looking for and T is larger than Q alphabetically, so the upper half of the list (including T) can be discarded.

    Q  S

We now have a list of two values. Again, there is no middle value so we pick the value to the left of the middle, which is Q. This is the value that we are searching for and so the algorithm stops.

We can write out the binary search algorithm in very informal pseudo-code as follows:

    REPEAT
        Pick the middle value in a list (if even number of values, pick left-of-middle value)
        IF value matches what is being searched for THEN
            Output value
        ELSE IF value searched for > middle value THEN
            Discard middle value AND lower half of list
        ELSE IF value searched for < middle value THEN
            Discard middle value AND upper half of list
        ENDIF
    UNTIL (value found) or (list is of size 1 and value is not found)

Comparison of searching algorithms

A linear search and a binary search will both find a value in a list, but will do so in very different ways. As previously discussed, an algorithm's efficiency depends on the number of steps it takes to execute – the more steps, the lower the efficiency.

A linear search will work with any list of values, but may be very slow as it checks every value in the list. A binary search will be much more efficient, but requires the list to be sorted into order. A binary search halves the size of the list to be searched on every comparison.
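One way to see the difference in efficiency is to count how many values each search inspects. The sketch below is an illustration under stated assumptions: the function names are hypothetical, a "comparison" is taken to mean one inspected value, and the test list is simply the whole numbers from 0 upwards with a search for a value that is not present (the worst case for both algorithms).

```python
def linear_search_comparisons(values, target):
    """Count the values a linear search inspects while looking for target."""
    comparisons = 0
    for value in values:
        comparisons += 1
        if value == target:
            break
    return comparisons


def binary_search_comparisons(values, target):
    """Count the middle values a binary search inspects (values must be sorted)."""
    low, high = 0, len(values) - 1
    comparisons = 0
    while low <= high:
        middle = (low + high) // 2  # left-of-middle for even-sized ranges
        comparisons += 1
        if values[middle] == target:
            break
        if values[middle] < target:
            low = middle + 1   # discard middle value and lower half
        else:
            high = middle - 1  # discard middle value and upper half
    return comparisons


# Hypothetical test data: one million sorted numbers and a value that is absent.
numbers = list(range(1_000_000))
missing = 1_000_000
print(linear_search_comparisons(numbers, missing))
print(binary_search_comparisons(numbers, missing))
```

Run on this data, the linear count matches the length of the list, while the binary count stays at roughly twenty because the range still to be searched halves each time.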
Imagine searching for a value in a list of one million numbers sorted in order. In the worst-case scenario, a linear search will need to compare against each and every one of these million numbers. A binary search, however, needs only to make a maximum of 21 comparisons before it has completed its search.

This is because the number of values in the list halves on every comparison. So, by the second comparison the number has halved from one million to 500 000. After three comparisons it has halved again to 250 000. After 21 comparisons the original list has been reduced to a single number. This means we can be sure that if a binary search of a million items cannot find the value being searched for in 21 comparisons, then the number is not in the list.

Knowledge check
7 Explain how a linear search would find the value 18 in the list [1, 8, 6, 2, 18, 14, 7].
8 Which value would be the first to be compared in a binary search through the list [A, B, C, D, E]?
9 How does a linear search determine that a value does not appear in a list?

RECAP AND REVIEW

1 FUNDAMENTALS OF ALGORITHMS

Important words

You will need to know and understand the following terms: algorithm, decomposition, abstraction, input, process, output, flowchart, pseudo-code, program code, trace table, efficiency, sorting algorithm, bubble sort, merge sort, searching algorithm, linear search, binary search.

1.1 Representing algorithms

Decomposition and abstraction

An algorithm is a step-by-step sequence of instructions that are used to solve a problem.
■ Each instruction in an algorithm must be precise enough to be understood independently.
■ An algorithm must also clearly show the order that each instruction is carried out in.

Decomposition breaks a problem into smaller sub-problems that can each be tackled individually. Each stage can be further decomposed if needed.

Abstraction means focusing on what is important in a problem and ignoring or hiding the irrelevant details.
Explain simple algorithms in terms of their inputs, processing and outputs

An algorithm:
■ accepts inputs from a user (or from sensors or data files)
■ processes this data, and then
■ outputs the result back to the user in some format.

Identifying the inputs and the required outputs for a system enables us to decide how to map between the two. Programmers write code that processes the inputs in order to provide the required outputs.

    Input → Process → Output

Problem solving and algorithm creation

Algorithms can be created using flowcharts, pseudo-code or program code.

Flowcharts

A flowchart is a graphical representation of an algorithm. It can be followed from top to bottom, making decisions as appropriate.
■ Decisions are represented by diamond shapes and must have two output lines corresponding to Yes/No or True/False.
■ Inputs and outputs are represented by parallelograms.
■ Processes are represented by rectangles.
■ Start/Stop instructions are represented by rounded rectangles.
■ Each flowchart must begin and end with a Start and Stop instruction.

    Line:          An arrow represents control passing between the connected shapes.
    Process:       This shape represents something being performed or done.
    Subroutine:    This shape represents a subroutine call that will relate to a separate, non-linked flowchart.
    Input/output:  This shape represents the input or output of something into or out of the flowchart.
    Decision:      This shape represents a decision (Yes/No or True/False) that results in two lines representing the different possible outcomes.
    Terminal:      This shape represents the ‘Start’ and ‘End’ of the process.

As an example, the algorithm in the figure below will decide whether a positive number entered is odd or even using repeated subtraction.

[Flowchart: Start; Enter Number; Is Number = 0? (TRUE: Output ‘Even’; FALSE: next decision); Is Number = 1? (TRUE: Output ‘Odd’; FALSE: Number = Number − 2); End]

Pseudo-code

■ Pseudo-code is a textual representation of an algorithm.
■ Pseudo-code enables programmers to communicate algorithms to other programmers, showing the steps used without worrying about which language they are using.

The pseudo-code algorithm below carries out the same algorithm as the flowchart shown previously – finding whether a positive number is odd or even through repeated subtraction.

    Number ← USERINPUT
    WHILE Number ≥ 0
        IF Number = 1 THEN
            OUTPUT 'odd'
        ELSE IF Number = 0 THEN
            OUTPUT 'even'
        ENDIF
        Number ← Number – 2
    ENDWHILE

Program code

Program code is code that is written in a high-level language. As you are studying AQA’s GCSE Computer Science specification, this will be either Python, VB.Net or C#. Program code in answers must be precise.

Determine the purpose of simple algorithms

Trace tables

■ Very simple algorithms can be visually inspected to check their purpose.
■ Trace tables are used to follow each line of a more complex algorithm from start to finish. At each point, the value of each variable and any outputs are recorded.

A table such as the one shown below is used. More columns are added if more variables are used.

    Line | Variable1 | Variable2 | Variable3 | Output | Comments

If the program repeats certain lines, this is reflected in the trace table. Each row shows one line that is executed and lines of code can be repeated as many times as required.

The completed trace table below shows the previous algorithm running to check whether 5 is an odd or even number.
    Line                        | Number | Output | Comments
    ----------------------------+--------+--------+------------------------------------------------
    01: number ← USERINPUT      | 5      |        | Number inputted by user
    02: WHILE number ≥ 0        | 5      |        | Number is ≥ 0, continue
    03: IF number = 1           | 5      |        | Number is not equal to 1
    05: ELSE IF number = 0      | 5      |        | Number is not equal to 0
    07: ENDIF                   | 5      |        | End of IF statement
    08: number ← number - 2     | 3      |        | 2 subtracted from number
    09: ENDWHILE                | 3      |        | End of loop, repeat again
    02: WHILE number ≥ 0        | 3      |        | Number is ≥ 0, continue
    03: IF number = 1           | 3      |        | Number is not equal to 1
    05: ELSE IF number = 0      | 3      |        | Number is not equal to 0
    07: ENDIF                   | 3      |        | End of IF statement
    08: number ← number - 2     | 1      |        | 2 subtracted from number
    09: ENDWHILE                | 1      |        | End of loop, repeat again
    02: WHILE number ≥ 0        | 1      |        | Number is ≥ 0, continue
    03: IF number = 1           | 1      |        | Number does equal 1
    04: OUTPUT 'odd'            | 1      | 'odd'  | 'odd' printed
    07: ENDIF                   | 1      |        | End of IF statement
    08: number ← number - 2     | -1     |        | 2 subtracted from number
    09: ENDWHILE                | -1     |        | End of loop, repeat again
    02: WHILE number ≥ 0        | -1     |        | Number is NOT ≥ 0, WHILE loop and program ends.

1.2 Efficiency of simple algorithms

■ An algorithm’s efficiency depends on the time it takes to execute.
■ As computers vary in speed, the number of steps taken is counted as an estimate of this.
■ The more steps a computer takes to complete a process, the lower the efficiency of the algorithm.

1.3 Sorting and searching algorithms

Sorting algorithms

Bubble sort

The bubble sort algorithm uses the following steps in very informal pseudo-code:

    REPEAT
        REPEAT for each pair of numbers in the list
            # where list[x] is 1st item in a pair
            # and where list[x + 1] is 2nd item in a pair
            IF list[x] > list[x + 1] THEN
                swap list[x], list[x + 1]
            ENDIF
        UNTIL all pairs of values checked
    UNTIL no swaps made

Bubble sort is a relatively simple algorithm. However, it is quite inefficient and may take much longer to complete than other sorting algorithms on very large lists.

Merge sort

The merge sort is a divide and conquer algorithm.
The divide stage uses the following steps in very informal pseudo-code:

    REPEAT
        Divide each list in half
    UNTIL each list is of size 1

When all numbers are split up into separate lists, the merge stage can begin. This uses the following steps in very informal pseudo-code:

    REPEAT
        REPEAT
            REPEAT
                Compare the first value in both lists
                Insert smaller of two values into new merged list
                Remove value from old list
            UNTIL all numbers in pairs of lists are merged
        UNTIL each pair of lists have been considered
    UNTIL all lists are merged

If at any stage an odd number of lists are present, then one list can simply be ignored until the next iteration.
● The merge sort is much more efficient than the bubble sort.
● It will sort a large list of random values into order in a quicker time than a bubble sort.

Searching algorithms

Linear search

A linear search uses the following steps:

    REPEAT
        Check one value (from the start)
        IF value matches what is being searched for THEN
            Output value
        ELSE
            Move to next value
        ENDIF
    UNTIL number is found OR end of list is reached

The linear search is relatively inefficient, but it works on any list, regardless of whether it is in any particular order. Every single value in the list needs to be checked before you can be certain that a value is not present in a list.

Binary search

A binary search requires the list of values to be in order. It uses the following steps:

    REPEAT
        Pick the middle value in a list (if even number of values, pick left-of-middle value)
        IF value matches what is being searched for THEN
            Output value
        ELSE IF value searched for > middle value THEN
            Discard middle value AND lower half of list
        ELSE IF value searched for < middle value THEN
            Discard middle value AND upper half of list
        ENDIF
    UNTIL (value found) or (list is of size 1 and value is not found)

■ A binary search is highly efficient.
■ If an ordered list of one million numbers is used, the binary search could find a number in the list with no more than 21 comparisons. The linear search by contrast could take up to one million comparisons.

However, the binary search will only work if the list of values is ordered. Therefore, it cannot always be used.

QUESTION PRACTICE

1 Fundamentals of algorithms

01 State two computational thinking principles and describe how they could be used when developing a computerised solution to a problem. [6 marks]

02 Draw a flowchart to describe a solution to the following problem.
A menu offers four options:
• A Add a name
• B Delete a name
• C Change a name
• D End
The program should ask for an input repeatedly until option D is selected, at which point it ends. If option A, B or C is selected, the program should pass control to a procedure depending on which option is selected:
• A: Add
• B: Delete
• C: Change
[10 marks]

03 An amateur football team wants a computer program to calculate the points scored. The user inputs:
• the number of games won
• the number of games drawn.
There are 3 points for a win, 1 for a draw and 0 for a loss. The program should prompt the user for the inputs and output a suitable message with the points total. Using AQA pseudo-code or a high-level language that you have learned, write a program to input the data above and output the total points scored. [6 marks]

04 Here is an algorithm that is intended to collect and store the names of people going into an event. Only 15 people are allowed to enter the event.

    number
    WHILE number < 15
        name[1] ← USERINPUT
        number ← number + 1
    ENDWHILE
    OUTPUT 'The total number of names entered is'
    OUTPUT number

04.1 Identify the mistakes in this algorithm. [2 marks]
04.2 Rewrite the algorithm so that it does what is intended. [2 marks]
04.3 Add to the algorithm so that it outputs all the names entered.
[4 marks]

05 Complete the trace table for the following program: [6 marks]

    x ← 0
    y ← 0
    WHILE y < 6
        y ← y + 1
        x ← x + y
    ENDWHILE
    OUTPUT x

    x | y | output

06 Show the stages of a bubble sort when applied to the following data: Elephant, Dog, Cat, Dolphin, Sheep, Frog [4 marks]

07 Show the stages of a bubble sort on the following data: Pear, Apple, Grape, Banana, Strawberry, Raspberry [5 marks]

08 For the following data: Pizza, Apple, Banana, Chips, Sandwich, Crisps, Egg, Pasty
08.1 Show the stages of a merge sort on the data. [4 marks]
08.2 Outline the advantage of a merge sort over a bubble sort. [2 marks]

09 For the following list: Jeremy, Adrian, Ben, Harry, Frank, James
09.1 Show the stages of a linear search for Harry in the list. [4 marks]
09.2 Explain why a binary search could not be used with this list to find Harry. [1 mark]

10 The following shows airline departures from an airport.
10.1 Show the stages of a binary search for HA102 in the list: AD245, BE767, FR226, HA102, HC224, JA233, KE124, MA267, PE334 [3 marks]
10.2 State the number of comparisons needed to find MA267 using a binary search. [1 mark]
10.3 State the number of comparisons needed to find MA267 using a linear search.
[1 mark]

2 PROGRAMMING

CHAPTER INTRODUCTION

In this chapter you will learn about:

2.1 Data types
➤ Understand the concept of a data type
➤ Understand the use of integer, real, Boolean, character and string data types

2.2 Programming concepts
➤ Use and understand variables, constants and assignments
➤ Use meaningful identifier names and understand why it is important to use them
➤ Understand the three combining principles of sequence, selection/choice and iteration/repetition
➤ Use definite (count-controlled) and indefinite (condition-controlled) iteration
➤ Use nested selection and iteration structures

2.3 Arithmetic operations in a programming language
➤ Be familiar with and use addition, subtraction, multiplication and division (real and integer division, including remainders)

2.4 Relational operations in a programming language
➤ Be familiar with and use equal to, not equal to, less than, greater than, less than or equal to, and greater than or equal to

2.5 Boolean operations in a programming language
➤ Be familiar with and be able to use NOT, AND, OR

2.6 Data structures
➤ Understand the concept of data structures
➤ Use one and two-dimensional arrays (or equivalent) in the design of solutions to simple problems
➤ Use records (or equivalent) in the design of solutions to simple problems

2.7 Input/output
➤ Be able to obtain user input from the keyboard
➤ Be able to output data and information from a program to the computer display

2.8 String-handling operations in a programming language
➤ Understand and be able to use standard string-handling operations such as length, position, substring, concatenation and type conversion

2.9 Random number generation in a programming language
➤ Be able to use random number generation within a computer program

2.10 Structured programming and subroutines (procedures and functions)
➤ Understand the concept and advantages of subroutines
➤ Describe how parameters are used to pass data within programs
➤ Use subroutines that return values to the calling routine
➤ Understand and use local variables
➤ Describe and explain the advantage of a structured approach to programming

2.11 Robust and secure programming
➤ Understand simple data validation and authentication routines
➤ Test algorithms and programs, correcting errors
➤ Understand and justify the choice of test data, including normal, boundary and erroneous data
➤ Understand syntax and logic errors
➤ Identify and categorise errors within algorithms and programs

Key point
Paper 1 requires significant knowledge and understanding of programming. It is a requirement of the AQA GCSE Computer Science course that you are able to program in one of the specified high-level languages (Python, VB.Net or C#). Assessment of programming skills will only be made through this written examination.
Some questions in the examination will be presented in AQA Pseudo-code. Full details of this are given on the AQA website. Other questions will be presented in Python, VB.Net or C#, depending upon the option chosen by your teacher; AQA provide a separate examination paper for each of these languages.
Examples in the main text are given in AQA Pseudo-code. Where possible, all worked examples given in this chapter are written in AQA Pseudo-code, Python, VB.Net and C#. Please ensure that you are aware which of these you are studying. Please refer to www.aqa.org.uk for full details of the assessment.

2.1 Data types

Computer programs store data in memory. However, the type of data to be stored determines how much memory needs to be allocated for that value. There are five main data types that GCSE Computer Scientists need to be aware of.

Integer

Integers are whole numbers, positive or negative, that have no decimal or fractional part. Integers are used for counting or storing quantities.
For example:

    score ← 25
    highScore ← 100
    numOfAttempts ← 0

All variables used in this example are integers.

Tech term
Floating point numbers: Another name for real numbers – the floating point refers to the position of the decimal point, which can be different (or ‘float’) for different numbers.

Real

The real data type is used for numbers, positive or negative, that have (or may have) a decimal or fractional part. Real numbers are sometimes called float or floating point numbers. For example:

    price ← 19.99
    fastestTime ← 9.983
    score ← 17.0

All variables used in this example are real numbers.

Boolean

Boolean variables only ever store True or False values. They are used to indicate the result of a condition. For example:

    sorted ← False
    LoggedIn ← True

All variables used in this example are Boolean values.

Character

A character is a single item from the character set used on the computer, such as H, r, 7 or &. Uppercase and lowercase are different characters and space is also a character. When assigning a character to a variable, quotation marks are required to indicate that the value to be assigned is a character. For example:

    a ← 't'
    b ← '4'
    c ← '%'
    d ← ' '

All variables used in this example are characters.

String

A string data type stores a collection of characters and is typically used for names, addresses or other textual information. Note that numbers and symbols can also be characters and so can be included in a string. Just like characters, when assigning a string to a variable, quotation marks are required:

    a ← 'the colour blue'
    b ← '47 times'
    c ← '95% increase'
    d ← 'p@55w0rd'

All variables used in this example are strings.

Key point
Strings and characters require quotation marks around the values to be assigned. This is to differentiate them from variables.
Compare the following lines of code:

    colour ← 'blue'
    colour ← blue

The first line assigns the string value of 'blue' to the variable colour. However, the second line treats both colour and blue as variables and assigns the contents of variable blue to the variable colour. If the variable blue does not exist, this would cause an error in the program.

Knowledge check
1 State whether the following data are real numbers, integers, characters, Boolean or strings:
(a) 35
(b) &
(c) 3 ≠ 2
(d) twenty
(e) 35.0
(f) 6.63

2.2 Programming concepts

Tech terms
Identifier: The name of a variable or constant.
Reserved keyword: A word in a programming language that has some special purpose and cannot be used for a variable or constant identifier.

Variables, constants and assignments

Variables are used in a computer program to store a single piece of data. For example, you may wish to store someone’s score in a game. As the name suggests, the contents of a variable may be changed (or varied) during the running of a program. When a variable is first defined, the programming language allocates a small area of memory to store this data.

A variable has an identifier, or name. This is the label given to the area of memory. Variable identifiers can be almost anything, but they must not contain a space, start with a number or be a reserved keyword (that is, a word that means something else in that programming language).

    Examples of allowed variable names | Examples of disallowed variable names
    score                              | new score (contains a space)
    max_points                         | 2ndName (starts with number)
    item2                              | print (reserved keyword)

Identifier names should also be meaningful. This means that the purpose of the variable should be obvious to another programmer. For example, although variables are allowed to be called x, temp or abc, the purpose of these is unclear. It would be much better to give a variable an identifier that describes what data is being held, such as userName, totalScore or goalScored.
Most programming languages (including Python and C#) are case sensitive; this means that score and Score are treated as different variables.

Constants are used in computer programs to store a single piece of data that has a fixed value and cannot change during the running of the program. A constant also has an identifier that acts as a label for a memory location that stores the data.

Key point
A common mistake is to describe a constant as a variable that doesn’t change – variables and constants are similar, but they are not the same. Would you describe a car as a bike with four wheels?

Assignment means to give a value to a variable or constant. This is done using the ‘←’ sign in AQA Pseudo-code and the ‘=’ symbol in Python, VB.Net and C#. The variable or constant identifier always goes on the left and the value to be assigned goes on the right.

A variable can be assigned a value multiple times, but when a new value is assigned the old value is overwritten and lost. A constant can only be assigned a value once, usually at the start of a program.

In most high-level languages, variables and constants are declared (defined) at the beginning of the program. (Note, however, that Python does not use declarations.)

Worked example

The following program shows a variable and a constant being assigned values, using the AQA Pseudo-code. Note how the contents of the variable score are changed, but the constant maxScore is fixed and cannot change.

    constant maxScore ← 100
    score ← 20
    score ← 30
    score ← score + 10

After this algorithm is run, maxScore has the value 100 and score has the value 40.

Examples in high-level languages – variables

Python does not require variables to be declared before they are used. The data type will automatically be chosen by Python based on the data given.

    score = 20
    playerName = "Kirstie"

VB.Net allows the programmer to declare the variable before it is used with the Dim keyword.
The data type is declared at this point. However, this declaration is optional.

    Dim score As Integer = 20
    Dim playerName As String = "Kirstie"

C# makes it compulsory to declare a variable before it is used with the int, float, bool, char or string keywords.

    int score = 20;
    string playerName = "Kirstie";

Once a variable has been declared, its value can be changed by simply reassigning a new value. It does not need to be declared again.

    score = 30      #Python
    score = 30      'VB.Net
    score = 30;     //C#

Examples in high-level languages – constants

Python does not allow the programmer to define constants. By convention, variable names in capitals (such as SCORE) will be treated as constants by other programmers, but there is no restriction on their value being changed.

    MAXSCORE = 100                        #Python
    Const maxScore As Integer = 100       'VB.Net
    const int maxScore = 100;             //C#

Sequence, selection and iteration

The building blocks for any computer program can be reduced down to only three constructs: sequence, selection and iteration.

Sequence

Sequence is the execution of statements one after the other, in order. The program runs from top to bottom and each instruction completes fully before the next one is executed.

    Program A
    x ← 10
    x ← x * 3
    x ← x + 1
    OUTPUT x

    Program B
    x ← 10
    x ← x + 1
    OUTPUT x
    x ← x * 3

Both programs have the same instructions but in a different order. Program A will print out 31 but program B will print out 11. However, in program B the final value of x equals 33 because there is another line of code after the OUTPUT x statement. The sequence of instructions is important and changing this can change how the program works.

Selection

Selection is the construct used to make decisions in a program. These decisions are based on Boolean conditions and the program then takes one of two paths based on this condition.

Key point
A Boolean condition is a statement that can be evaluated to be either True or False.
‘What do I want for lunch today?’ is not a Boolean condition as the answer could be one of many things, but ‘Do I want pizza for lunch?’ would be a Boolean condition as the answer could only be True (Yes, I want pizza) or False (No, I do not want pizza).

[Figure 2.1 Selection based on a Boolean condition: a decision on Condition leads either to the code to execute if the condition is True or to the code to execute if the condition is False.]

The most common way of implementing selection is by using IF statements. The IF keyword is used to define the condition to be evaluated, with the code to be executed if true indented between the IF and ENDIF keywords.

Worked example

    name ← USERINPUT
    IF name = 'George' THEN
        OUTPUT 'Hello George'
    ENDIF

In this example, the condition that is evaluated is whether the inputted name matches the value given (George). If it does, the third line is executed. If not, the program skips over this line entirely.

To extend this program to do something else if the condition is false, the ELSE keyword can be used.

Worked example

    name ← USERINPUT
    IF name = 'George' THEN
        OUTPUT 'Hello George'
    ELSE
        OUTPUT 'Hello stranger'
    ENDIF

But what if we want to check for multiple conditions, each with their own associated code to run if true? The ELSE IF keyword allows us to do this.

Worked example

    name ← USERINPUT
    IF name = 'George' THEN
        OUTPUT 'Hello George'
    ELSE IF name = 'Lorne' THEN
        OUTPUT 'Great work Lorne'
    ELSE IF name = 'Kirstie' THEN
        OUTPUT 'Nice to see you again'
    ELSE
        OUTPUT 'Hello stranger'
    ENDIF

In this example, each possible condition is evaluated in turn. First, the inputted name is checked against ‘George’. If this is true, the message on line 3 is printed. If not, the second possible condition (‘Lorne’) is checked. If not true, the third and final condition (‘Kirstie’) is checked. If none of these is true, the ‘Hello stranger’ message is printed.
Examples in high-level languages – IF statements

Python uses the if, elif and else keywords. Indentation is used to show which code is inside each block. A double equals sign (==) is used to compare if values are equivalent.

    #Python
    name = input("enter a name")
    if name == "George":
        print("Hello George")
    elif name == "Lorne":
        print("Great work Lorne")
    elif name == "Kirstie":
        print("Nice to see you again")
    else:
        print("Hello stranger")

VB.Net uses If, ElseIf, Else and End If keywords. Indentation is optional and a single equals (=) is used to compare if values are equivalent.

    'VB.Net
    Dim name As String = Console.ReadLine()
    If name = "George" Then
        Console.WriteLine("Hello George")
    ElseIf name = "Lorne" Then
        Console.WriteLine("Great work Lorne")
    ElseIf name = "Kirstie" Then
        Console.WriteLine("Nice to see you again")
    Else
        Console.WriteLine("Hello stranger")
    End If

C# uses if, else if and else keywords. Indentation is optional and a double equals (==) is used to compare if values are equivalent.

    //C#
    string name = Console.ReadLine();
    if (name == "George")
    {
        Console.WriteLine("Hello George");
    }
    else if (name == "Lorne")
    {
        Console.WriteLine("Great work Lorne");
    }
    else if (name == "Kirstie")
    {
        Console.WriteLine("Nice to see you again");
    }
    else
    {
        Console.WriteLine("Hello stranger");
    }

With the above example, the program will always have one path to follow. It is important to note that the conditions are checked in the sequence given.

Worked example

    OUTPUT 'Enter a score out of 20'
    mark ← USERINPUT
    IF mark > 5 THEN
        OUTPUT 'Could do better'
    ELSE IF mark > 10 THEN
        OUTPUT 'Average mark'
    ELSE IF mark > 15 THEN
        OUTPUT 'Excellent'
    ENDIF

What would happen if someone here got a mark of 19 out of 20? Unfortunately, the message ‘Could do better’ would be displayed as 19 is larger than 5 and so the first condition is true. The other conditions will not be checked.
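The effect of condition order can be tried out with a short Python sketch. It is only an illustration: the function names are hypothetical, the messages are returned rather than printed so that the two versions can be compared, and returning an empty string when no condition matches is an assumption (the pseudo-code simply outputs nothing in that case).

```python
def feedback_as_written(mark):
    """Conditions in the same order as the worked example above."""
    if mark > 5:
        return "Could do better"
    elif mark > 10:
        return "Average mark"   # never reached: any mark > 10 is also > 5
    elif mark > 15:
        return "Excellent"      # never reached for the same reason
    return ""                   # assumption: no message for marks of 5 or less


def feedback_highest_first(mark):
    """Same thresholds, but the highest one is tested first."""
    if mark > 15:
        return "Excellent"
    elif mark > 10:
        return "Average mark"
    elif mark > 5:
        return "Could do better"
    return ""                   # assumption: no message for marks of 5 or less


print(feedback_as_written(19))
print(feedback_highest_first(19))
```

With the thresholds tested from highest to lowest, each mark reaches the first message that genuinely applies to it.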
The solution to this problem would be to change the sequence of the conditions so that the highest mark is checked first.

Iteration (definite and indefinite loops)

Iteration is the construct used to repeat sections of code. Iteration is commonly called looping.

Tech term
Increment: Add a value to; ‘increment by one’ adds 1 to the counter each time the loop repeats.

Key point
Iteration can be definite (count-controlled) or indefinite (condition-controlled). Definite iteration repeats a set number of times (e.g. ‘repeat this code 5 times’) whereas indefinite iteration repeats based on a Boolean condition.

Definite loops

Definite iteration uses the FOR and ENDFOR keywords in AQA Pseudo-code. This uses a variable to act as a counter for the loop. By default, the program will assign this variable a starting value and then automatically increment the variable by one every time the code repeats. Once the code has run with the counter equal to the final value, the loop stops.

This code and flowchart are logically identical. Both print out the 8 times table from 1 to 10.

Worked example

    FOR p ← 1 TO 10
        OUTPUT p * 8
    ENDFOR

[Figure 2.2 Flowchart representing this code. Shapes: Start; p = 1; print (p * 8); p = p + 1; decision ‘Is p = 10?’ with No and Yes branches; End.]

Both the program code and flowchart will print out the values 8, 16, 24 and so on, up to and including the final value 80.

Key point
A FOR loop will automatically increment the value of the counter variable by 1 each time; there is no need to do this manually.

Using the FOR loop shown so far, the programmer defines exactly how many times the given code will repeat.

Examples in high-level languages – FOR loops

Python repeats using the range function. The number given controls how many times the loop iterates, but the values will start at 0. The code below will therefore print out 0 to 9.

    for x in range(10):
        print(x)

VB.Net uses syntax very similar to the AQA Pseudo-code. The following code will print out 0 to 9.
    For x = 0 To 9
        Console.WriteLine(x)
    Next x

C# uses a for loop that has three parts: the initialisation of the counter variable, the condition to check whether to repeat again, and the step to increase the counter variable by 1 each time (in this case ++ meaning to add one to the variable). The following code will print out 0 to 9.

    for (int x=0; x<10; x++)
    {
        Console.WriteLine(x);
    }

Indefinite loops

Indefinite iteration instead checks a condition each time around the loop to decide whether to repeat the loop again or continue. For this type of iteration, the programmer will not know how many times the code will repeat.

AQA Pseudo-code, VB.Net and C# provide two types of indefinite iteration – WHILE loops and REPEAT UNTIL loops. These both perform a similar task but differ in when the condition is checked and whether the condition needs to be True or False to repeat again. Python only provides a WHILE loop.

WHILE loop

    total ← 0
    WHILE total < 20
        num ← USERINPUT
        total ← total + num
    ENDWHILE
    OUTPUT 'done!'

REPEAT UNTIL loop

    total ← 0
    REPEAT
        num ← USERINPUT
        total ← total + num
    UNTIL total ≥ 20
    OUTPUT 'done!'

Note that the WHILE loop will repeat while the total is less than 20 whereas the REPEAT UNTIL loop will repeat until the total is larger than or equal to 20. They are both logically equivalent and produce exactly the same results but check different conditions at different times in the code.

Key point
A WHILE loop checks the condition before starting the loop whereas a REPEAT UNTIL loop checks the condition after the loop has completed. In some circumstances, this could mean that the WHILE loop will never start but the REPEAT UNTIL loop will always execute at least once.

Both types of loop in the example above will run infinitely if the user simply keeps typing in negative values.
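The difference in when the condition is checked can be sketched in Python. Python has no REPEAT UNTIL loop, so the second function below imitates one with while True and break; this is one common workaround, not an official equivalent. The function names, the start parameter and the use of a list in place of USERINPUT are assumptions made so that the sketch runs without keyboard input.

```python
def while_total(numbers, start=0):
    """WHILE style: the condition is checked before each repetition."""
    total = start
    repeats = 0
    values = iter(numbers)  # stands in for USERINPUT (assumption)
    while total < 20:
        total = total + next(values)
        repeats = repeats + 1
    return total, repeats


def repeat_until_total(numbers, start=0):
    """REPEAT UNTIL style: the body runs once before the first check."""
    total = start
    repeats = 0
    values = iter(numbers)  # stands in for USERINPUT (assumption)
    while True:
        total = total + next(values)
        repeats = repeats + 1
        if total >= 20:  # plays the role of the UNTIL condition
            break
    return total, repeats


print(while_total([8, 7, 9]))             # (24, 3)
print(repeat_until_total([8, 7, 9]))      # (24, 3)
print(while_total([5], start=25))         # (25, 0): the body never runs
print(repeat_until_total([5], start=25))  # (30, 1): the body runs once
```

With a starting total below 20 the two functions behave identically. They only differ when the condition is already satisfied before the loop begins.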
Examples in high-level languages – WHILE and REPEAT UNTIL loops

Python uses indentation to decide what code fits inside the WHILE loop. The code below repeats, adding up numbers input by the user until the total is 20 or above.

    total = 0
    while total < 20:
        num = int(input("enter number"))
        total = total + num
    print("done!")

Python does not have an equivalent of the REPEAT UNTIL loop.

VB.Net has both a WHILE and a REPEAT UNTIL loop (called DO LOOP UNTIL in VB.Net). Technically, the UNTIL statement can be put after the DO or after the LOOP keyword, depending on where it should be evaluated.

    'WHILE loop example
    Dim total As Integer = 0
    Dim num As Integer = 0
    While total < 20
        num = Console.ReadLine()
        total = total + num
    End While

    'DO LOOP UNTIL example
    Dim total As Integer = 0
    Dim num As Integer = 0
    Do
        num = Console.ReadLine()
        total = total + num
    Loop Until total >= 20

VB also has a LOOP WHILE statement (where the condition is checked at the end of the loop) and also a DO WHILE loop, which is functionally equivalent to a WHILE loop.

C# has a WHILE loop in the same way that Python and VB.Net do. C# does not have a DO UNTIL loop, but it has a DO WHILE loop, which can be used in an equivalent way.

    //WHILE loop example
    int total = 0;
    while (total < 20)
    {
        string num = Console.ReadLine();
        total = total + Convert.ToInt32(num);
    }

    //DO WHILE loop example
    int total = 0;
    do
    {
        string num = Console.ReadLine();
        total = total + Convert.ToInt32(num);
    } while (total < 20);

The two C# examples given above are functionally equivalent, apart from where the condition is checked. The first example checks the condition before beginning the first iteration, whereas the second does not check it until the loop has run once.

Nested selection and iteration

Selection and iteration can both be nested. This is where more than one statement is used in such a way that they occur inside each other.
Worked example

    OUTPUT 'Enter an even number up to 100'
    num ← USERINPUT
    IF num > 100 THEN
        OUTPUT 'That is too large'
        IF num MOD 2 ≠ 0 THEN
            OUTPUT 'It isn’t an even number either!'
        ENDIF
    ENDIF

The code will first check if the number is above 100 and output a warning message if this is the case. However, another IF statement is then nested inside this to give another warning message if the number is not even. Notice that if an odd number over 100 is entered, both warning messages are displayed, but if an odd number below 100 is entered, nothing is displayed; this differentiates it from simply using separate IF or ELSE IF statements. The nested IF statement is only checked if the first one is True.

Worked example

    FOR a ← 1 TO 4
        FOR b ← 1 TO 3
            OUTPUT a * b
        ENDFOR
    ENDFOR

In this example, the outer loop will repeat four times, but each of these repetitions involves three executions of the inner loop. This code will therefore produce 12 lines of output (4 × 3 = 12). WHILE loops can be nested in exactly the same way.

Worked example

    age ← 0
    WHILE age < 18
        OUTPUT 'Enter your age'
        age ← USERINPUT
        IF age = 17 THEN
            OUTPUT 'You cannot vote but you can drive a car'
        ELSE IF age = 16 THEN
            OUTPUT 'You cannot vote but you can drive a moped'
        ELSE IF age < 16 THEN
            OUTPUT 'Sorry, you are too young to vote or drive'
        ELSE
            OUTPUT 'Congratulations, you can vote and drive'
        ENDIF
    ENDWHILE

In this example, the code checks whether the user can vote (if they are aged 18 or over) or drive (if they are aged 17 or 16). This is all nested inside a WHILE loop that repeats while the user’s age is under 18.

Knowledge check
2 State the three basic program constructs and describe each one.
3 What is the difference between a WHILE loop and a REPEAT UNTIL loop?

2.3 Arithmetic operations

The arithmetic operators are used to carry out basic mathematical operations on values in computer programs.
The common operators are:

Operator | Name | Example | Comment
+ | Addition | a ← b + c | Adds the two values given.
- | Subtraction | x ← y - 1 | Subtracts the second value from the first.
* | Multiplication | score ← score * 10 | Multiplies the two values together.
/ | Division | value ← num1 / num2 | Divides the first value by the second.
MOD | Modulus | r ← score MOD 2 | Returns the remainder after dividing the first value by the second value. So, 5 MOD 2 would return 1.
DIV | Integer division | q ← score DIV 2 | Returns the whole number part after dividing the first value by the second value, ignoring any remainder. So, 5 DIV 2 would return 2.

Examples in high-level languages – arithmetic operators

Python uses the following operators:

    a = b + c            #addition
    x = y - 1            #subtraction
    score = score * 10   #multiplication
    value = num1 / num2  #division
    r = score % 2        #modulus
    q = score // 2       #integer division

(See C# for a note specific to Python version 2.)

VB.Net uses the following operators:

    a = b + c            'addition
    x = y - 1            'subtraction
    score = score * 10   'multiplication
    value = num1 / num2  'division
    r = score Mod 2      'modulus
    q = score \ 2        'integer division

C# uses the following arithmetic operators:

    a = b + c;           //addition
    x = y - 1;           //subtraction
    score = score * 10;  //multiplication
    r = score % 2;       //modulus

Tech terms
Operator precedence: The order in which operations are carried out.
BIDMAS: This stands for Brackets, Indices, Division, Multiplication, Addition and Subtraction and is the order in which mathematical operations should take place if no other information is provided.

For division, the ‘/’ operator is used, but the output depends on whether the values given are defined as integers or real numbers. If integers are divided then the answer will always be an integer and the operation is integer division. If real numbers are divided then the answer will always be a real number and the operation is normal division.
This is also the case for Python prior to version 3.

Operator precedence is the same as in GCSE Mathematics, with BIDMAS being important. Any operators in brackets are applied first, with indices next, then division and multiplication, then addition and subtraction last.

Worked example
For example, (3 + 6) + 7 * 2 means that the (3 + 6) is completed first, with the multiplication (7 * 2) completed next and then the results of these added together:

    (3 + 6) + 7 * 2
    = 9 + 7 * 2
    = 9 + 14
    = 23

This gives the answer of 23. If the calculation was (incorrectly) carried out sequentially then this would give the wrong answer of 32.

Key point
The MOD operator can be used to decide if a number is odd or even. If we hold a value in the variable num, then num MOD 2 will give 0 if the value is an even number and 1 if the value is an odd number. It can also be used in the same way to decide if a number is an exact multiple of another smaller number.

2.4 Relational operations

The relational operators (also known as comparison operators) all evaluate to a Boolean True or False outcome.

Operator in AQA Pseudo-code | Name | Examples | Comment
= | Equal to | 7 = 7 (true); 8 = 2 (false) | Some languages such as Python and C# may use a double == sign.
≠ | Not equal to | 7 ≠ 9 (true); 3 ≠ (1+2) (false) | Some languages may use <> or !=
< | Less than | 4 < 7 (true); 4 < 4 (false) | Note – gives a False output if the values are equal.
≤ | Less than or equal to | 7 ≤ 7 (true); 6 ≤ 4 (false) | Note – gives a True output if the values are equal. <= used in high-level languages.
> | Greater than | 2 > 1 (true); 3 > 3 (false) | Note – gives a False output if the values are equal.
≥ | Greater than or equal to | 9 ≥ 9 (true); 6 ≥ 9 (false) | Note – gives a True output if the values are equal. >= used in high-level languages.
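The precedence rule and the MOD key point from section 2.3, together with the comparisons tabulated above, can be checked directly in Python. This is a minimal sketch; the helper names is_even and is_multiple are hypothetical and are not part of the AQA Pseudo-code.

```python
# Brackets first, then multiplication, then addition
bidmas = (3 + 6) + 7 * 2

# Working strictly left to right instead gives a different answer
left_to_right = ((3 + 6) + 7) * 2


def is_even(num):
    """True when num MOD 2 gives 0."""
    return num % 2 == 0


def is_multiple(num, factor):
    """True when num is an exact multiple of factor."""
    return num % factor == 0


print(bidmas, left_to_right)          # 23 32
print(is_even(14), is_even(7))        # True False
print(is_multiple(21, 7))             # True
print(4 < 4, 7 <= 7, 3 != (1 + 2))    # False True False
```

Each comparison on the last line evaluates to a Boolean value, matching the true/false outcomes in the table.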
Examples in high-level languages – relational operators

Python uses the following relational operators:

    a == b   #equal to
    a != b   #not equal to
    a < b    #less than
    a <= b   #less than or equal to
    a > b    #greater than
    a >= b   #greater than or equal to

VB.Net uses the following relational operators:

    a = b    'equal to
    a <> b   'not equal to
    a < b    'less than
    a <= b   'less than or equal to
    a > b    'greater than
    a >= b   'greater than or equal to

C# uses the following relational operators:

    a == b   //equal to
    a != b   //not equal to
    a < b    //less than
    a <= b   //less than or equal to
    a > b    //greater than
    a >= b   //greater than or equal to

2.5 Boolean operations

If multiple conditions need to be evaluated, the Boolean operators AND and OR can be used. A condition can also be reversed using the NOT operator. These function in the same way as the Boolean logic gates with the same names in section 4.2.

The AND operator requires both conditions to be True for the overall condition to be True. If either or both of the conditions are False, the overall condition will be False.

Worked example

    OUTPUT 'enter your username'
    username ← USERINPUT
    OUTPUT 'enter your password'
    password ← USERINPUT
    IF username = 'admin' AND password = 'changeme123' THEN
        OUTPUT 'Correct details. Logged in'
    ELSE
        OUTPUT 'Incorrect details'
    ENDIF

In this example, the AND operator means that both conditions (the username and password both matching the correct ones) need to be True for the user to be logged in.

The OR operator requires one or the other (or both) of the conditions to be True for the overall condition to be True. If both of the conditions are False, the overall condition will be False.

Worked example

    num1 ← USERINPUT
    num2 ← USERINPUT
    IF num1 > 10 OR num2 > 10 THEN
        OUTPUT 'Accepted'
    ELSE
        OUTPUT 'Rejected'
    ENDIF

In this example, the OR keyword means that either of the entries needs to be over 10 for the ‘Accepted’ message to be displayed.
If both are over 10 then the ‘Accepted’ message is still displayed. The ‘Rejected’ message is only displayed if neither of the two entries is over 10.

Key point
Each condition given to an AND or OR operator must be a full condition. A very common mistake is to write code such as:

    IF num1 OR num2 > 10 THEN

This is incorrect; the first condition (num1) has nothing to be compared against and so this code will not work as intended. The correct way to write this is:

    IF num1 > 10 OR num2 > 10 THEN

The NOT operator reverses the True or False outcome from a comparison.

Worked example

    num1 ← USERINPUT
    num2 ← USERINPUT
    IF NOT(num1 = num2) THEN
        OUTPUT 'the numbers are not the same'
    ENDIF

In this example, the equivalence operator (=) would give a True outcome if the input values were the same. However, the NOT operator reverses this, so that a False is returned if the numbers are the same and True if they are not the same. Of course, this is logically the same as using the ≠ operator.

Examples in high-level languages – Boolean operators

Python uses the following Boolean operators:

    a and b   #AND
    a or b    #OR
    not a     #NOT

VB.Net uses the following Boolean operators:

    a And b   'AND
    a Or b    'OR
    Not a     'NOT

C# uses the following Boolean operators, which are considerably different from the AQA Pseudo-code:

    a & b     //AND
    a | b     //OR
    !a        //NOT

In C# programs the doubled forms && and || are more commonly seen; they give the same True or False result but skip evaluating the second condition when the first has already decided the outcome.

Knowledge check
4 What is the value assigned to the variable x in each of the following?
(a) x ← 23 - 4 * 3
(b) x ← (12 - 5) * 3
(c) x ← 6 * 2 / 3
(d) x ← 8 / (5 - 1)
(e) x ← 19 MOD 5
(f) x ← 22 MOD 4 * 2
(g) x ← 28 DIV 6
(h) x ← 23 DIV 2 * 4
5 What will be returned by the following comparisons?
(a) a ≠ b if a = 8 and b = 5
(b) a ≥ b if a = 9 and b = 5
(c) a > b OR c < d if a = 5, b = 5, c = 3, d = 2
(d) a ≤ b AND c ≠ d if a = 6, b = 6, c = 2, d = 4

2.6 Data structures

The concept of data structures and arrays

We have seen that a variable can be used to store a single item of data in a computer program. An array is a data structure that allows a programmer to store multiple items of data under a single identifier. This would be useful, for example, in a computer game to store the names of multiple players.

Arrays have a fixed number of items that they can store; this number is defined when the array is created. Each item in the array must also be the same data type.

1-dimensional arrays

A 1-dimensional (1D) array is accessed via a single numeric index value.

Tech term
Index value: A number that corresponds to the location of an item of data in an array.

Array index | Data
0 | ‘Fletcher’
1 | ‘Imogen’
2 | ‘Tia’
3 | ‘Muhammad’

The above 1D array can be created using the following code in AQA Pseudo-code:

    Students ← ['Fletcher', 'Imogen', 'Tia', 'Muhammad']

Note that the array index starts at 0, so an array of 10 values would have indexes from 0 to 9.

Key point
Python does not have simple arrays. Instead, it uses another data structure concept called a list. Lists are similar in operation but have two key differences. First, lists can contain a mix of different data types. Second, lists are not of a fixed size and can be added to or reduced in size during the running of the program.

Arrays are commonly used with FOR loops to access each item in the array in turn. The code below uses a FOR loop to add up each element in an array called scores, which has already been set up, where the array has eight items.

Array index | Data
0 | 5
1 | 7
2 | 0
3 | 10
4 | 8
5 | 3
6 | 7
7 | 3

    total ← 0
    FOR i ← 0 TO 7
        total ← total + scores[i]
    ENDFOR
    OUTPUT total

This code would return a total of 43.
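The two differences between arrays and Python lists described in the key point above can be seen in a short sketch. The variable names players and mixed are chosen here for illustration only; the scores data is the same as in the table.

```python
players = ["Fletcher", "Imogen", "Tia", "Muhammad"]
print(players[0])        # Fletcher: indexing starts at 0
print(len(players))      # 4

# A list is not fixed in size: items can be added or removed
players.append("Seth")
players.remove("Tia")
print(players)           # ['Fletcher', 'Imogen', 'Muhammad', 'Seth']

# A list can hold a mix of data types, unlike an array
mixed = ["Fletcher", 17, 3.5, True]
print(mixed)

# Using len() keeps the loop correct even if the list changes size
scores = [5, 7, 0, 10, 8, 3, 7, 3]
total = 0
for i in range(len(scores)):
    total = total + scores[i]
print(total)             # 43
```

Using range(len(scores)) instead of a fixed number is a design choice that suits lists, because their length can change while the program runs.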
Examples in high-level languages – arrays

Python uses the following syntax to define and iterate through a one-dimensional list of eight scores and add up the total. Note that Python uses lists and not arrays, although at GCSE level the two are almost interchangeable.

    scores = [5,7,0,10,8,3,7,3]
    total = 0
    for x in range(8):
        total = total + scores[x]
    print(total)

VB.Net uses the following syntax. Note that an array is initially defined using {curly braces} but then is accessed using (brackets).

    Dim scores As Integer() = {5,7,0,10,8,3,7,3}
    Dim total As Integer = 0
    For x = 0 To 7
        total = total + scores(x)
    Next x
    Console.WriteLine(total)

C# uses the following syntax:

    int[] scores = new int[] {5,7,0,10,8,3,7,3};
    int total = 0;
    for (int x = 0; x < 8; x++)
    {
        total = total + scores[x];
    }
    Console.WriteLine(total);

Examples in high-level languages – arrays using FOR EACH

All three languages offer an alternative way to loop through arrays using a FOR EACH loop. This is not given as an example in AQA Pseudo-code but is possible in Python, VB.Net and C#. Instead of using the index of the array, each element is accessed in turn.

Python uses the following syntax to do this:

    scores = [5,7,0,10,8,3,7,3]
    total = 0
    for item in scores:
        total = total + item
    print(total)

VB.Net uses the following syntax:

    Dim scores As Integer() = {5,7,0,10,8,3,7,3}
    Dim total As Integer = 0
    For Each item As Integer In scores
        total = total + item
    Next
    Console.WriteLine(total)

C# uses the following syntax:

    int[] scores = new int[] {5,7,0,10,8,3,7,3};
    int total = 0;
    foreach (int item in scores)
    {
        total = total + item;
    }
    Console.WriteLine(total);

2-dimensional arrays

A 2-dimensional (2D) array is accessed using two index numbers.
It can be represented using a table as shown here:

Table 2.1 2D array showing scores for a game

  | 0 | 1 | 2 | 3
0 | 19 | 27 | 84 | 102
1 | 123 | 99 | 35 | 33
2 | 85 | 75 | 20 | 7

To access each value, two index numbers are required, such as scores[0][2].

Worked example
To access each element in this 2D array, the following nested FOR loops could be used.

    FOR x ← 0 TO 2
        FOR y ← 0 TO 3
            OUTPUT scores[x][y]
        ENDFOR
    ENDFOR

This code accesses the [0][0] entry first, then the [0][1] entry, then [0][2], then [0][3]; then [1][0], [1][1], [1][2] and so on up until the final [2][3] entry.

Key point
There is no one ‘correct’ way to represent a 2D array; it can be thought of as being accessed via [row][column] or [column][row]. A table is simply an abstract representation of how the data is stored in a 2D array. Any exam question using a table for this will tell you how to access the array.

Examples in high-level languages – 2D arrays

Tech term
Nested loop: One loop that sits within another loop.

Python does not have true 2-dimensional arrays, but instead allows programmers to use lists where each element is itself a list. The following code will access each element in these lists using nested FOR loops.

    for x in range(3):
        for y in range(4):
            print(scores[x][y])

VB.Net uses the following syntax, with each element accessed by two index numbers in brackets separated by a comma.

    For x = 0 To 2
        For y = 0 To 3
            Console.WriteLine(scores(x,y))
        Next y
    Next x

C# uses the following syntax, with square brackets used and a comma to separate the index values.

    for (int x = 0; x < 3; x++)
    {
        for (int y = 0; y < 4; y++)
        {
            Console.WriteLine(scores[x,y]);
        }
    }

Knowledge check
6 The contents of the array fruit are displayed below, where fruit[1][3] is ‘Apple’.
  | 0 | 1 | 2
0 | Pear | Raspberry | Strawberry
1 | Grape | Blueberry | Greengage
2 | Banana | Blackcurrant | Lemon
3 | Damson | Apple | Lime
4 | Orange | Grapefruit | Kiwi

(a) What is the value of fruit[1][1]?
(b) What is the value of fruit[2][3]?
(c) What is the array element for ‘Blackcurrant’?

The use of records to store data

A record is a data structure that allows multiple data items to be stored, using field names to identify each item of data. To create a record, we must first define the field names that will make up each record. For a record about a student, these might be:
● FirstName
● Surname
● YearGroup
● Email.

We can then store data under these field names in a database management system using a table.

Table 2.2 Table called 'Student' showing three records

FirstName | Surname | YearGroup | Email
Bradley | Jenkins | 9 | bjenkins@notreal.co.uk
Ghita | Cable | 10 | gcable@notreal.co.uk
Charlotte | Pegg | 9 | cpegg@notreal.co.uk

Worked example
To define the record structure shown above, we could use the following code:

    RECORD Student
        FirstName: String
        Surname: String
        YearGroup: Integer
        Email: String
    ENDRECORD

2.7 Input and output

Inputs and outputs

It is useful to be able to input data from a user and output data back to the user. The keywords USERINPUT and OUTPUT are used for these purposes in AQA Pseudo-code.

Worked example
The following program in AQA Pseudo-code calculates the circumference of a circle using pi and user input before outputting the answer.
    constant pi ← 3.14159
    radius ← USERINPUT
    circumference ← 2 * pi * radius
    OUTPUT circumference

Examples in high-level languages – input and output

Python uses the following syntax:

    PI = 3.14159 #capitals to denote constant
    radius = float(input("enter a radius"))
    circumference = 2 * PI * radius
    print(circumference)

VB.Net uses the following syntax:

    Dim radius As Single
    Dim circumference As Single
    Const pi As Single = 3.14159
    radius = Console.ReadLine()
    circumference = 2 * pi * radius
    Console.WriteLine(circumference)

C# uses the following syntax:

    const float pi = 3.14159f;
    float radius;
    float circumference;
    radius = Convert.ToSingle(Console.ReadLine());
    circumference = 2 * pi * radius;
    Console.WriteLine(circumference);

2.8 String-handling operations

Just as integers and real numbers can be manipulated using arithmetic operators, strings can be manipulated using basic string-handling keywords.

Length, substring and character codes

The following table gives the AQA Pseudo-code keywords used for length, substring and character code operations. The examples assume name ← 'Seth Bottomley'.

Keyword | Use | Examples
LEN(string) | To count how many characters are contained in a string | LEN(name) gives 14
POSITION(string, character) | To return the position of a character in the string, with indexing starting at 0 unless otherwise stated | POSITION('Seth', 'h') gives 3
SUBSTRING(x,y, string) | To extract characters from the middle of a string where x is the starting point (beginning at 0) and y is the end point (beginning at 0) | SUBSTRING(2,6, name) gives 'th Bo'; SUBSTRING(0,2, name) gives 'Set'
CHAR_TO_CODE(char) | To find the ASCII value of a character | CHAR_TO_CODE('D') gives 68
CODE_TO_CHAR(int) | To find the character that relates to the integer ASCII value given | CODE_TO_CHAR(68) gives 'D'

(For more on ASCII see section 3.5.)
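One detail in the table is easy to miss: the end point y in SUBSTRING is included in the result, whereas a Python slice stops one position before its end index. The sketch below wraps Python's own operations in two hypothetical helper functions, substring and position, that follow the AQA conventions. Returning -1 when the character is absent is the behaviour of Python's str.find, not something stated in the table.

```python
def substring(start, end, text):
    """Characters from index start to index end, both included."""
    return text[start:end + 1]  # +1 because a slice excludes its end


def position(text, character):
    """Index of the first match, counting from 0 (-1 if absent)."""
    return text.find(character)


name = "Seth Bottomley"
print(len(name))              # 14
print(position("Seth", "h"))  # 3
print(substring(2, 6, name))  # th Bo
print(substring(0, 2, name))  # Set
print(ord("D"), chr(68))      # 68 D
```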
Examples in high-level languages – string handling

Python uses the following syntax:

    len(string)   #returns the number of characters in the string
    string[2:7]   #returns the 3rd through to 7th characters
    ord(char)     #returns the ASCII value of a character
    chr(int)      #returns the character of a given ASCII code

VB.Net uses the following syntax:

    Len(string)       'returns the number of characters in the string
    Left(string,x)    'returns x characters from the left
    Right(string,x)   'returns x characters from the right
    Mid(string,x,y)   'returns y characters from position x
    Asc(char)         'returns the ASCII value of a character
    Chr(int)          'returns the character of a given ASCII code

C# uses the following syntax:

    string.Length;           //returns the number of characters in the string
    string.Substring(x,y);   //returns y characters starting at position x
    Convert.ToInt32(char);   //returns the ASCII value of a character
    Convert.ToChar(int);     //returns the character of a given ASCII value

Note that C# does not have specific tools for converting to and from ASCII, but rather an ASCII value is implicitly returned or used when converting between a char and an integer.

Concatenation

Concatenation of strings means to join multiple strings together. This is done using the + operator. When strings are concatenated, they are joined together in the order given.

Worked example

    texta ← 'this is a message'
    textb ← 'great string'
    new ← 'GCSE ' + SUBSTRING(5,6,texta) + ' ' + SUBSTRING(0,4,textb)
    OUTPUT new

The above code would join together extracts from the two strings to give ‘GCSE is great’.

Key point
The + operator is ‘overloaded’, which means that it does different things depending on what data is given to it. Compare the result of 2+7 (which gives 9) with the result of "hello"+"world" (which gives "helloworld"). The + operator concatenates strings but adds numeric values.
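The concatenation worked example and the overloading key point can be reproduced in Python as a minimal sketch. The slice end points are one higher than the SUBSTRING end points because a Python slice excludes its end index. The label variable and the use of str() to repair the mixed-type case are illustrative additions, not part of the worked example.

```python
texta = "this is a message"
textb = "great string"

# SUBSTRING(5,6,texta) includes index 6, so the Python slice ends at 7
new = "GCSE " + texta[5:7] + " " + textb[0:5]
print(new)                # GCSE is great

print(2 + 7)              # 9: numbers are added
print("hello" + "world")  # helloworld: strings are joined

# Mixing a string and a number with + is an error in Python,
# so the number is converted to a string first
try:
    label = "score: " + 7
except TypeError:
    label = "score: " + str(7)
print(label)              # score: 7
```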
String/type conversion

Key point
Note that some languages such as Python also allow programmers to treat a string as an array of characters and extract individual characters this way. This would be an acceptable alternative in examination questions.

Tech term
Casting: Changing one data type to another.

Type conversion (sometimes referred to as casting) means to convert data from one data type to another. String conversion means converting another data type to or from the string data type. To do this, the following keywords are used in AQA Pseudo-code:

Keyword | Examples | Comment
STRING_TO_INT() | a ← STRING_TO_INT('123'); b ← STRING_TO_INT('7') | Converts a value stored as a string to an integer.
STRING_TO_REAL() | c ← STRING_TO_REAL('12.9'); d ← STRING_TO_REAL('46') | Converts a value stored as a string to a real/floating point number.
INT_TO_STRING() | e ← INT_TO_STRING(17); f ← INT_TO_STRING(140) | Converts a value stored as an integer to a string.
REAL_TO_STRING() | g ← REAL_TO_STRING(17.2); h ← REAL_TO_STRING(3.8) | Converts a value stored as a real/floating point number to a string.

Not all values can be converted to another data type. For example, if a programmer attempted to convert the string ‘hello’ to an integer value using

    newval ← STRING_TO_INT('hello')

an error would be raised. There is no sensible way of deciding what integer value ‘hello’ should be converted to.
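In Python, the error raised by a failed conversion is a ValueError. The sketch below shows one way a program might guard against it; the helper name to_int and the choice to return None for unconvertible text are assumptions made for illustration, and exception handling itself goes slightly beyond the keywords in the table.

```python
def to_int(text):
    """Return text as an integer, or None if it cannot be converted."""
    try:
        return int(text)
    except ValueError:
        return None


print(to_int("123"))                # 123
print(to_int("hello"))              # None
print(float("12.9"), float("46"))   # 12.9 46.0
print(str(17) + str(140))           # 17140: strings are joined, not added
```

The last line shows why the data type matters: once the integers have been converted to strings, the + operator concatenates them.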
Figure 2.3 Example of an error in Python when converting a string to an integer

Examples in high-level languages – type conversion

Python uses the following syntax:

    int(x)     #returns an integer version of x
    str(x)     #returns a string version of x
    float(x)   #returns a real version of x

VB.Net uses the following syntax:

    CInt(x)   'returns an integer version of x
    CStr(x)   'returns a string version of x
    CSng(x)   'returns a real version of x

C# uses the following syntax:

    Convert.ToInt32(x);    //returns an integer version of x
    Convert.ToString(x);   //returns a string version of x
    Convert.ToSingle(x);   //returns a real version of x

Note that the Single data type in VB.Net, which is written as float in C#, is another name for float/real. (There is also a double data type that uses twice as many bits of storage and therefore increases precision.)

Knowledge check
7 If text ← 'Computing is fun', what is returned by:
(a) LEN(text)
(b) SUBSTRING(0,1,text)
(c) SUBSTRING(8,LEN(text)-1,text)
(d) What command returns the string ‘fun’?

2.9 Random number generation

Random numbers can be generated by a programming language. For example, a game may require that a random score between 1 and 10 is given when a certain action is completed. Typically, upper and lower limits to the random number are passed in as parameters and a random value between these two limits is returned by the subroutine.

In the AQA Pseudo-code, the RANDOM_INT(x, y) subroutine allows us to do this. Programmers can pass in the lowest and highest number required as arguments (in brackets):

    RANDOM_INT(1, 5)     #chooses a random integer between 1 and 5
    RANDOM_INT(20, 30)   #chooses a random integer between 20 and 30

Worked example

    r ← RANDOM_INT(1, 10)
    OUTPUT RANDOM_INT(20, 50)

Here, the RANDOM_INT subroutine is used in two different ways. The first line shows how the returned value can be assigned to a variable, to be used later on in the program.
The second line shows another random value being printed for the user.

Examples in high-level languages

Python uses the following syntax to pick a random value between 1 and 100:

    import random
    print(random.randint(1, 100))

VB.Net uses the following syntax to pick a random value between 1 and 100:

    Dim r As Random = New Random
    Console.WriteLine(r.Next(1, 101))

C# uses the following syntax to pick a random value between 1 and 100:

    Random r = new Random();
    Console.WriteLine(r.Next(1, 101));

2.10 Structured programming and subroutines

Subroutines

When programs grow in size, they can become hard to manage. Ideally, larger programs should be broken down into subroutines. When a subroutine is called, control passes from the main program to the subroutine. Once the subroutine has completed, control is passed back to the main program.

[Figure 2.4 A subroutine being called from a main program. Stages shown: Main program running; Subroutine called; Subroutine runs; Main program continues]

Defining subroutines

A subroutine is a section of code that is defined outside the main body of the program. It is given its own identifier, which is then used to call the subroutine as many times as required.

    #subroutine definition
    SUBROUTINE timestableuser()
        OUTPUT 'enter times table'
        num ← USERINPUT
        FOR x ← 1 TO 10
            OUTPUT num * x
        ENDFOR
    ENDSUBROUTINE

    #call the subroutine to run
    timestableuser()

Here, a subroutine to produce a printed times table is defined once and given the identifier timestableuser. Once defined, the subroutine can be called from within a program as many times as required.

Passing parameters

Note the use of brackets after the subroutine’s identifier. In the last example, these brackets were empty but they can be used to pass parameters into the subroutine. A parameter is a value that the subroutine will use.
We can rewrite the previous subroutine with parameters to control which times table will be produced and how many numbers to print out.

Worked example

    #subroutine definition with parameters
    SUBROUTINE timestable(tt, nums)
        FOR x ← 1 TO nums
            OUTPUT tt * x
        ENDFOR
    ENDSUBROUTINE

    #call the subroutine to run
    timestable(8, 10)
    timestable(9, 12)

Another subroutine is defined, called timestable. This time it expects two parameters to be passed into the subroutine in the form of variables tt and nums. When the subroutine is called, values for these parameters have to be given in the brackets after the subroutine identifier.

In this example, the first time the subroutine is called the values 8 and 10 are passed into the subroutine. The second time the subroutine is called, the values 9 and 12 are passed into the subroutine. The first call will print out the 8 times table from 8 × 1 to 8 × 10 whereas the second call will print out the 9 times table from 9 × 1 to 9 × 12.

Using this subroutine is far more flexible and efficient than writing new code for each different times table.

Beyond the spec
In computer science generally, the term ‘argument’ is sometimes used instead of parameter. However, only the term ‘parameter’ will be used in examinable material.

Key point
A subroutine can have any number of parameters passed in, including none. Each parameter passed is separated by a comma.

Subroutines that return values

A subroutine can return a value back to the main program when it returns control. This value can then be stored, printed or otherwise used in the main program. Subroutines that do this are also known as functions. To return a value from a subroutine in AQA Pseudo-code, the RETURN keyword is used.
Worked example

    #subroutine definition that returns a value
    SUBROUTINE circle_area(radius)
        constant pi ← 3.14159
        area ← pi * radius^2
        RETURN area
    ENDSUBROUTINE

    #how to use the subroutine
    new ← circle_area(10)
    OUTPUT new

Key point
Subroutines that return a value are known as functions. Subroutines that do not return a value are known as procedures.

In this example, a new subroutine is defined called circle_area(radius), which calculates the area of a circle. It requires a value for the circle’s radius to be passed to it as a parameter. Note that in the subroutine definition the RETURN keyword is used to define what is passed back to the main program – in this case the value of the circle’s area. When the subroutine is called, this value is then stored in a variable new.

Note that the programmer could have chosen to do something else with the returned value for the area, such as print it out or use it in a further calculation. This choice is made when the function is called.

Examples in high-level languages – subroutines that return a value

Python uses the following syntax to define a subroutine that returns a value:

    def circle_area(radius):
        PI = 3.14159
        area = PI * radius**2
        return area

VB.Net uses the following syntax to define a subroutine that returns a value:

    Function circle_area(ByVal radius As Single) As Single
        Dim area As Single
        Const pi As Single = 3.14159
        area = pi * radius^2
        Return area
    End Function

C# uses the following syntax to define a subroutine that returns a value:

    static float circle_area(float radius)
    {
        const float pi = 3.14159f;
        float area = pi * radius * radius;
        return area;
    }

Note that in both VB.Net and C# the subroutine definition may look very different, depending on whether data types and the scope of the subroutine are declared.

Local variables

In the circle_area() subroutine used previously, the values area and pi are both used within the subroutine definition.
However, these variables have local scope. This means that these variables only exist within the subroutine and cannot be accessed by the main program. Such variables are called local variables. If a user attempts to access one of these variables from the main program in a high-level language, an error will occur.

Worked example
Here, the circle_area subroutine has been defined in Python and then called by passing a radius of 10 in as a parameter. The programmer has also tried to print the value of the local variable area.

    def circle_area(radius):
        PI = 3.14159
        area = PI * (radius**2)
        return area

    print(circle_area(10))
    print(area)

However, attempting to print the local variable area will result in an error as the variable does not exist outside the subroutine:

    print(area)
    NameError: name 'area' is not defined

The exact same issue occurs in VB.Net and C#.

Advantages of the structured approach to programming

The structured approach to programming involves modularised programming, clear well-documented interfaces and utilising return values from subroutines. Splitting programs down into multiple subroutines rather than having one large program is known as modularised or structured programming. By doing this, code repetition can be reduced and the overall number of lines of code can be smaller. This also means that important values and processes can be modified in one place in the code without having to find every place where that process is used. For these subroutines, parameters should be passed in rather than asking the user for input. Local variables should also be used where possible inside the subroutine and the final result returned back to the main program.

The main advantages of the structured approach to programming include the following:
● It reduces the overall size of the program as code does not need to be repeated in multiple places.
● It makes the code much easier to maintain as it is easier to read and understand the purpose of each subroutine.
● It reduces development and testing time as code is much easier to write and debug.
● It allows reuse of code between programs, especially where pre-written and pre-tested subroutines can be used.

A function is a subroutine that returns a value. A procedure is a subroutine that does not return a value.

Worked example

Example A

    price1 ← 100 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price1
    price2 ← 87 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price2
    price3 ← 35 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price3
    price4 ← 17 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price4
    price5 ← 99 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price5
    price6 ← 400 * 1.2
    OUTPUT 'the price is £'
    OUTPUT price6

Example B

    SUBROUTINE calculate(price)
        newprice ← price * 1.2
        RETURN 'the price is £' + newprice
    ENDSUBROUTINE

    OUTPUT(calculate(100))
    OUTPUT(calculate(35))
    OUTPUT(calculate(87))
    OUTPUT(calculate(17))
    OUTPUT(calculate(99))
    OUTPUT(calculate(400))

Both of the examples above take six prices and add 20% in tax before printing the value. However, Example A achieves this through copying and pasting the code whereas Example B uses a subroutine. Example B is not only shorter but is much easier to follow through. Crucially, if the tax rate changes from 20%, it only needs to be changed in one place in Example B. This makes the program much easier to maintain. Example B utilises a modular approach where the calculate() subroutine accepts a parameter for the price to be used, has a local variable newprice and then returns the overall calculation result rather than printing it.

Knowledge check
8 Describe one difference between a function and a procedure.
9 Describe two benefits of using subroutines.

2.11 Robust and secure programming

Tech terms
Robust program A program that functions correctly under less than ideal conditions.
Secure program A program that allows use by authorised users only.

It is not enough for a programmer to just ensure that their programs work correctly for them on their own computer – what happens when the software is used by a different user, with different input data or on different hardware? A robust program will handle all of this and still function correctly. A secure program is one that only allows authorised users to access it. A secure program is necessarily robust as attackers often deliberately cause errors to find ways into a system.

Data validation

When designing programs to be robust, it is important to think about problems that could potentially occur and then nullify them before they happen. For example, if a user is asked for a number between 1 and 10, how do we know that they will actually enter a number at all, let alone one in the correct range? A well-designed robust program will check both of these things. Misuse of a program could be deliberate (users looking for ways to hack into a system by making it crash) or accidental (users not understanding what they are supposed to enter or accidentally entering the wrong data). Either way, we must deal with all of these possible events.

Worked example
The following is a program for a banking application that allows the user to withdraw money from their account.

    balance ← 100
    withdraw ← USERINPUT
    balance ← balance - withdraw
    OUTPUT 'your balance is now' + balance

We could check that this program works if the balance is £100 by withdrawing £20, expecting to see a balance of £80 remaining. However, if we check for robustness, some potential issues become clear:
➤ Should we be allowed to withdraw more than our balance?
➤ What if we withdraw a negative amount? This will effectively add money to our account.
➤ What if the user enters an amount in words, such as ‘TEN POUNDS’? Will this cause the program to crash?
➤ What if the user enters numbers with the preceding £ sign? Will this cause the program to crash?
All of these issues would need to be dealt with by the programmer before the program could be said to be robust.

Data validation is the process of checking input data against rules defined by the programmer to ensure that the data is sensible and as expected.

Key point
Validation cannot check that data is correct, only that it follows certain rules. For example, a programmer could validate that a phone number entered by the user is made up entirely of numbers and starts with a 0. If a phone number of 123ABC was entered, this would be rejected as invalid. However, if a user entered a valid phone number but mistakenly entered the wrong digits, such as someone else’s phone number, the program would not identify this.

Data can be validated to ensure that it is:
● within a sensible range, such as between 1 and 100, for example.

    OUTPUT 'Enter a number'
    num ← USERINPUT
    IF num ≥ 1 AND num ≤ 100 THEN
        valid ← True
    ELSE
        valid ← False
    ENDIF

● of the correct length, such as eight characters or more, for example.

    OUTPUT 'Enter some text'
    text ← USERINPUT
    IF LEN(text) ≥ 8 THEN
        valid ← True
    ELSE
        valid ← False
    ENDIF

● present, to stop users leaving certain information empty. This is known as a presence check.

    OUTPUT 'Enter some text'
    text ← USERINPUT
    IF text = '' THEN
        valid ← False
    ELSE
        valid ← True
    ENDIF

Tech term
Presence check A data validation method that prevents users from leaving a field blank.

Worked example
The following program shows the previous banking application program, but this time the input has been partially validated. The user cannot enter a value below 0 and cannot withdraw more than their current balance. The WHILE loop ensures that they are continually asked for a withdrawal amount until the value entered is validated successfully.
    balance ← 100
    withdraw ← 0
    WHILE (withdraw ≤ 0) OR (withdraw > balance)
        OUTPUT 'Enter amount to withdraw in £'
        withdraw ← USERINPUT
        IF withdraw ≤ 0 THEN
            OUTPUT 'You must withdraw a positive amount'
        ELSE IF withdraw > balance THEN
            OUTPUT 'You cannot withdraw more than your balance'
        ENDIF
    ENDWHILE
    balance ← balance - withdraw
    OUTPUT 'Your balance is now'
    OUTPUT balance

Of course, there would be more to do to fully validate the input because the user can still leave the input amount blank or enter a non-numeric value.

Knowledge check
10 State what is meant by validation.
11 Suggest two rules that could be used to validate a date of birth entered by a user.
12 Explain why someone’s name could not easily be validated.
13 In the validated code in the worked example above, write down what happens in each line of code when the user enters:
(a) −10
(b) 110
(c) 60

Authentication

Some computer systems, such as online shopping websites, are available to all users on an anonymous basis. Users can search for products without giving any personal details. However, in order to do any more than this, users are often required to be authenticated. Some computer systems require authentication to be allowed to use the system at all. Authentication is the process of establishing a user’s identity. How can the computer system be sure that the user is who they say they are? Users can be authenticated in many ways:
● Usernames and passwords are perhaps the most common method. A username and a secret password are chosen by the user and when these are entered, a computer system can check that they match those of a known user.
● Possession of an electronic key, device or account can be used for authentication since only one person will have access to that particular device. Some computer systems will check that an email address or phone number belongs to you by sending an email or text containing a secret code.
● Biometrics is the use of measurements relating to biological traits. If your school uses fingerprint scanners to identify you, you will have experience of this. Banks are increasingly using voice recognition to authenticate users who use telephone services.

For example, the following pseudo-code will only allow entry if a username and password are both entered correctly:

    username ← 'admin'
    pwhash ← '5fa345bcd3c1'
    OUTPUT 'enter a username'
    un ← USERINPUT
    OUTPUT 'enter password'
    pw ← USERINPUT
    IF un = username AND hash(pw) = pwhash THEN
        loggedin ← True
    ELSE
        loggedin ← False
    ENDIF

Tech term
Two-factor authentication Establishing a user’s identity by means of two separate methods, such as passwords and biometrics.

Two-factor authentication is where two of the above are checked simultaneously. For example, you may log in to a system with a username and password (first method of authentication) and are immediately then sent a text message or email to respond to (second method of authentication).

Key point
Authentication can typically be reduced to three areas: what you know (passwords), what you have (devices and keys) and what you are (biometrics).

Beyond the spec
Passwords should never be stored by computer systems in plain text as this would be very dangerous – if a hacker was able to get access to the database then they would know everyone’s passwords. Instead, a hash of each password is stored; this is a one-way mathematical process that returns a very long number for each password. When a user enters a password, this is hashed and compared with the hash stored in the database; if the two match, the user can be authenticated without ever having their password stored.

Knowledge check
14 What is meant by the term ‘robust programming’?
15 Explain two robust programming considerations.
16 What is meant by ‘two-factor authentication’?
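The idea of storing and comparing password hashes, described in the authentication section above, can be sketched in Python. This is only an illustration under stated assumptions: the names hash_password, check_login, stored_username and stored_pwhash are hypothetical, the sample password is made up, and SHA-256 from Python's standard hashlib module is used simply because it is readily available. Real systems usually go further, for example by adding a random 'salt' and using a deliberately slow hashing method.

```python
import hashlib


def hash_password(password):
    """Return the SHA-256 hash of a password as a hexadecimal string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical stored details: only the hash is kept, never the password.
stored_username = "admin"
stored_pwhash = hash_password("correct horse")


def check_login(username, password):
    """Return True only if the username matches and the hashes are equal."""
    return username == stored_username and hash_password(password) == stored_pwhash


print(check_login("admin", "correct horse"))  # matching details
print(check_login("admin", "wrong guess"))    # wrong password
```

Note that check_login() takes its values as parameters and returns a Boolean rather than asking for input itself, which keeps it easy to test.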
Testing

Tech terms
Destructive testing Instead of simply checking whether a program works as intended, destructive testing actively tries to find ways to break the program.
Test plan A list of test data, expected and actual results.

No matter how well written a program is, there is always the chance that errors have crept in. Testing allows us to systematically check that a program functions as it should in all circumstances. The purpose of testing is to ensure that the program functions as expected and meets all requirements. However, testing should also be destructive; that is, we should not simply aim to prove that the program works, but we should also try to do all we can to break it. Only by knowing that it cannot easily be broken can we be satisfied that it works fully.

For instance, if we create a program for a hot drinks machine that should give us a selection of drinks for £1 each, is it enough to test it by inserting £1 and pressing the corresponding drink? This is a starting point, but if we only relied on this test we may not realise that someone is able to get a drink without inserting any money. Only by testing it destructively, and trying to see if there is any other way of getting a drink for less than £1, can we be sure that it works fully.

Selecting test data

To test a program effectively, a test plan is needed. This plan lists all of the tests that will be carried out, the test data to be used and the expected outcome. Test data should cover as many of the following situations as possible.

Normal
Normal (or typical) test data is data of the correct type that would be expected from a user who is correctly using the system. This should be accepted by the program without causing errors.

Boundary
Boundary (or extreme) test data is data that is of the correct type but is on the very edge of being valid. Boundary test data should be accepted by the program without causing errors.
Erroneous
Erroneous test data is data that should be rejected, either because it is outside the expected values or because it is of the incorrect type. For example, if a program expected a numeric input between 1 and 10, both the value 25 and a string such as ‘hello’ would be erroneous inputs.

Consider this example: A system allows a user to enter a value between 0 and 100, with the number being rounded up or down to the nearest 10. Any numbers outside the range 0 to 100 should be rejected. There are many possible tests but a typical test plan could be as shown in the table below.

| Test data | Type of test data | Reason | Expected result |
|---|---|---|---|
| 47 | Normal | To check if values round up | 50 |
| 32 | Normal | To check if values round down | 30 |
| 0 | Boundary | To check the low boundary | 0 |
| 100 | Boundary | To check the high boundary | 100 |
| -5 | Erroneous | To check if numbers below 0 are rejected | Rejected |
| 150 | Erroneous | To check if numbers above 100 are rejected | Rejected |
| “Twelve” | Erroneous | To check data of the wrong type is rejected | Rejected |

Key point
Test data should be listed on a test plan using the actual data that would be entered. A very common mistake is to simply describe the test data (such as ‘a number larger than 100’). This is not specific enough; instead an actual value (e.g. 101) should be included in the test plan.

Knowledge check
17 State what is meant by normal test data.
18 Describe how boundary test data is different from erroneous test data.
19 A system should allow passwords that are between 8 and 15 characters in length. Suggest one suitable item of erroneous test data for this system.
20 Complete the following test plan for a system that checks if users are 18 years of age or older. Anyone younger than 18 should be rejected. Anyone 18 or over should be accepted.
| Test data | Type of test data | Expected result |
|---|---|---|
|  | Normal |  |
|  | Boundary |  |
|  | Erroneous |  |

Correcting errors in algorithms and programs

Refining an algorithm means to improve it. If testing has picked up any errors, an obvious improvement would be to fix the problem.

Worked example
The code below should allow values between 1 and 10.

    OUTPUT 'Enter a value between 1 and 10'
    num ← USERINPUT
    IF num > 1 AND num < 10 THEN
        OUTPUT 'Allowed'
    ELSE
        OUTPUT 'Not allowed'
    ENDIF

When this code is tested thoroughly, a number of errors are discovered. These are shown in the test plan below.

| Test data | Type of test data | Reason | Expected result | Actual result |
|---|---|---|---|---|
| 5 | Normal | Check valid data | Allowed | Allowed |
| 1 | Boundary | Check low extreme of range | Allowed | Not allowed |
| 10 | Boundary | Check high extreme of range | Allowed | Not allowed |
| 0 | Erroneous | Check low outside range | Not allowed | Not allowed |
| 11 | Erroneous | Check high outside range | Not allowed | Not allowed |

The code can then be refined to ensure that it works on these boundaries and then tested again.

    OUTPUT 'Enter a value between 1 and 10'
    num ← USERINPUT
    IF num ≥ 1 AND num ≤ 10 THEN
        OUTPUT 'Allowed'
    ELSE
        OUTPUT 'Not allowed'
    ENDIF

This refined code should then be tested again using suitable test data. Depending on the changes that were made to the code, new test data might need to be selected – do not assume you can simply use the same test plan.

Identifying syntax and logic errors

If errors are found, they could either be syntax errors or logic errors. A syntax error breaks the grammatical rules of the programming language in some way, such as missing off a quotation mark, misspelling a keyword or using assignment incorrectly. A syntax error will cause the program to stop running (or not run in the first place) because the translator does not understand the instruction given.
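As a rough illustration of how a translator rejects code that breaks the language's rules, the sketch below asks Python to translate some source text without running it. The helper name has_valid_syntax is hypothetical; compile() is a built-in Python function and SyntaxError is the built-in error it raises when the text breaks Python's grammar.

```python
def has_valid_syntax(source):
    """Return True if Python can translate the source text, False otherwise."""
    try:
        # Translate only; the code in 'source' is never executed.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(has_valid_syntax("x = 10"))         # assignment the right way around
print(has_valid_syntax("10 = x"))         # assignment the wrong way around
print(has_valid_syntax("print('hello)"))  # missing closing quotation mark
```

The first line of source text is accepted, while the other two are rejected before any instruction is carried out.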
Worked example
The code below contains a number of syntax errors.

    num ← USERINPTU
    10 ← x
    OUTTPU num + x

Firstly, the USERINPUT statement on the first line is misspelled as USERINPTU. Next, the assignment statement on the second line is the wrong way around (it should be x ← 10). Lastly, the keyword OUTPUT has been misspelled as OUTTPU. Any one of these errors would stop the program from running and produce a syntax error.

A logic error by comparison is an error in the algorithm that does not stop the program from running but means it does not always produce the correct output. This is usually caused by the programmer writing instructions that have the correct syntax but do not do what was intended.

Worked example
The code below contains a logic error.

    SUBROUTINE addup(a, b)
        RETURN a * b
    ENDSUBROUTINE

The intention of the subroutine is to add up the two numbers passed in as parameters. However, the subroutine instead multiplies the two numbers. This is a valid instruction and would not cause the program to stop, but it is certainly not what the programmer intended. Note that a logic error in the code does not necessarily mean that the program always produces an incorrect output – it may only occur in some parts of the program, or only when certain inputs are used.

Key point
Syntax errors are errors relating to the rules of the programming language. A program containing a syntax error will not run and hence it is clear that there is an issue with it. Programs containing logic errors do run but do not produce the desired output. Logic errors are harder to spot because it is not always immediately clear that there is an issue with the program.

Knowledge check
21 Give one difference between a logic error and a syntax error.
22 A programmer writes the line IF x > FOR. Explain whether this would cause a logic or a syntax error.
23 A programmer finds that a subroutine to calculate values does not output the result expected when negative values are passed in as parameters. Explain whether this would cause a logic or a syntax error.

RECAP AND REVIEW
2 PROGRAMMING

Important words
You will need to know and understand the following terms: integer, real numbers, Boolean, character, string, variable, constants, identifier, declared, assignment, sequence, selection, iteration, looping, definite iteration, nested selection, nested iteration, arithmetic operators, modulus (MOD), integer (DIV), relational operators, comparison operators, Boolean operators, data structure, array, record, field names, string handling, substring, length (string), character code, concatenation, string conversion, random numbers, subroutines, call (subroutines), parameters, return values (subroutines), local variables

2.1 Data types

Integers are whole numbers, positive or negative, that have no decimal or fractional part. For example, 99 is an integer.
The real data type is used for numbers, positive or negative, that have (or may have) a decimal or fractional part. For example, 18.779 is a real number.
Boolean variables only ever store True or False values. For example, True is a Boolean value.
A character is a single item from the character set used on the computer. For example, @ is a character. When assigning a character to a variable, quotation marks are required.
A string data type stores a collection of characters. For example, “Hello world” is a string. When assigning a string to a variable, quotation marks are required.

2.2 Programming concepts

Variables, constants and assignments
■ A variable stores a single piece of data. It is a label for an allocated area of memory. The value of a variable can be changed during the execution of the program.
■ A constant is also a label for an allocated area of memory. Unlike a variable however, the value of a constant cannot change during the execution of the program.
Variables and constants are given an identifier (or name). Their identifiers can be almost anything but must:
■ not contain spaces
■ not start with a number
■ not be particular words reserved for use in the programming language.

In most high-level languages, variables and constants are declared (defined) at the beginning of the program. (Note, however, that Python does not use declarations.)

Variables and constants have values assigned to their identifiers.
■ This is done using the ‘=’ operator in Python, VB.Net and C#.
■ This is done using the ‘←’ operator in AQA Pseudo-code.

Important words
modularised programming, robust program, secure program, validation, authentication, testing, test data, normal test data, boundary test data, erroneous test data, syntax error, logic error

Variables can be assigned new values throughout a program, which overwrite the previous value. Constants can only be assigned a value once.

Sequence and selection

Sequence is the execution of statements one after the other, in order. A program runs from top to bottom and each instruction completes fully before the next one is executed.

Selection is the construct used to make decisions in a program. These decisions are based on Boolean conditions and the program then takes one of two paths based on this condition.

[Figure: flowchart of selection. A condition is tested; one block of code is executed if the condition is True and another if the condition is False.]

Selection can be implemented using IF, ELSE and ELSE IF statements:

    name ← USERINPUT
    IF name = 'George' THEN
        OUTPUT 'Hello George'
    ELSE IF name = 'Lorne' THEN
        OUTPUT 'Great work Lorne'
    ELSE IF name = 'Kirstie' THEN
        OUTPUT 'Nice to see you again'
    ELSE
        OUTPUT 'Hello stranger'
    ENDIF

Iteration

Iteration is used to repeat sections of code. Iteration is commonly called looping.

Definite iteration repeats code a defined number of times.
FOR loops can be used to implement count-controlled iteration. A step can also be defined. For example:

    FOR p ← 1 TO 10
        OUTPUT p * 8
    ENDFOR

Indefinite iteration checks a condition each time around the loop and decides whether to repeat the code again or continue. WHILE loops and REPEAT UNTIL loops can be used to implement condition-controlled iteration.

WHILE loop

    total ← 0
    WHILE total < 20
        num ← USERINPUT
        total ← total + num
    ENDWHILE
    OUTPUT 'done!'

REPEAT UNTIL loop

    total ← 0
    REPEAT
        num ← USERINPUT
        total ← total + num
    UNTIL total ≥ 20
    OUTPUT 'done!'

Note that the WHILE loop will repeat while the total is less than 20 whereas the REPEAT UNTIL loop will repeat until the total is larger than or equal to 20. They produce exactly the same results but check different conditions at different times in the code.

Nested selection and iteration

Nesting is where multiple iteration or selection constructs are used inside each other. The following code nests an IF statement inside a FOR loop, meaning that the question is asked and evaluated ten times.

    FOR x ← 1 TO 10
        OUTPUT 'Enter a number'
        num ← USERINPUT
        IF num MOD 2 = 0 THEN
            OUTPUT 'even'
        ELSE
            OUTPUT 'odd'
        ENDIF
    ENDFOR

2.3 Arithmetic operations

A computer program uses operators to perform some sort of action. Arithmetic operators can be used to carry out basic mathematical operations on numeric values.

| Operator | Name | Example |
|---|---|---|
| + | Addition |  |
| – | Subtraction |  |
| * | Multiplication |  |
| / | Division |  |
| MOD | Modulus – returns the remainder after division | r ← 5 MOD 2 would give the result that r = 1 |
| DIV | Integer division – returns the whole number after division | q ← 5 DIV 2 would give the result that q = 2 |

2.4 Relational operations

Relational (or comparison) operators are used to evaluate expressions to a Boolean True or False outcome.
| Operator | Name |
|---|---|
| = | Equal to |
| ≠ | Not equal to |
| < | Less than |
| ≤ | Less than or equal to |
| > | Greater than |
| ≥ | Greater than or equal to |

2.5 Boolean operations

Boolean operators allow multiple conditions to be evaluated.
■ The AND operator requires both conditions to be True for the overall condition to be True.
■ The OR operator requires one or the other (or both) of the conditions to be True for the overall condition to be True.
■ The NOT operator reverses the True or False outcome from a comparison.

2.6 Data structures

■ A variable can be used to store a single item of data in a computer program.
■ Data structures allow a programmer to store multiple items of data under a single identifier.
■ Common data structures are arrays and records.

1D and 2D arrays

A 1-dimensional array allows a programmer to store multiple items of data under a single identifier.

| Array index | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Data | ‘hello’ | ‘world’ | ‘how’ | ‘are’ | ‘you?’ |

The above 1D array can be created using the following code in AQA Pseudo-code:

    phrase ← ['hello', 'world', 'how', 'are', 'you?']

A 2-dimensional array allows a programmer to store multiple items of data under a single identifier, using two index values to locate each item. It can be represented in a table form as shown below:

|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 23 | 83 | 5 | 15 | 64 |
| 1 | 3 | 92 | 4 | 7 | 5 |

Any exam question using a table for this will tell you whether you access the array as [row, column] or [column, row].

Lists

Python does not have simple arrays. Instead, it uses another data structure concept called a list. Lists are similar in operation to arrays but have three key differences.
■ Lists can contain a mix of different data types.
■ Lists are not of a fixed size and can be added to or reduced in size during the running of the program.
■ There is no such thing as a 2D list – however, you can create lists within lists that perform the same function as a 2D array.
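The three differences listed above can be seen in a short Python sketch. The variable names grid and mixed and the sample values are chosen purely for illustration, and the grid is accessed here as [row][column].

```python
# A list of lists doing the job of a 2D array: 2 rows, 5 columns.
grid = [
    [23, 83, 5, 15, 64],
    [3, 92, 4, 7, 5],
]

print(grid[1][2])   # row 1, column 2 -> prints 4

# Lists are not a fixed size: items can be added or removed while running.
grid[0].append(10)  # row 0 now holds six items
grid[1].pop()       # row 1 now holds four items

# A single list can hold a mix of data types.
mixed = ["hello", 42, 3.5, True]
print(len(mixed))   # prints 4
```

Because each inner list can change size independently, the rows are no longer guaranteed to be the same length, which is something a fixed-size 2D array would not allow.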
The use of records to store data

■ A record is a data structure that allows multiple data items to be stored, using field names to identify each item of data.
■ To create a record, we must first define the field names that will make up each record.
■ We can then store data under these field names in a database management system using a table. For example:

Table called ‘Student’ showing three records.

| FirstName | Surname | YearGroup | Email |
|---|---|---|---|
| Bradley | Jenkins | 9 | bjenkins@notreal.co.uk |
| Ghita | Cable | 10 | gcable@notreal.co.uk |
| Charlotte | Pegg | 9 | cpegg@notreal.co.uk |

2.7 Input and output

■ It is useful to be able to input data from a user and output data back to the user.
■ The keywords USERINPUT and OUTPUT are used for these purposes in AQA Pseudo-code.

2.8 String-handling operations

Length, substring and character codes

The table below gives the AQA Pseudo-code for common string-handling operations. The examples assume name ← ‘Seth Bottomley’.

| Keyword | Use | Examples |
|---|---|---|
| LEN(string) | To count how many characters are contained in a string | LEN(name) gives 14 |
| SUBSTRING(x,y, string) | To extract characters from the middle of a string, where x is the start position and y is the end position (both counted from 0) | SUBSTRING(2,6,name) gives ‘th Bo’; SUBSTRING(0,2,name) gives ‘Set’ |
| CHAR_TO_CODE(char) | To find the ASCII value of a character | CHAR_TO_CODE(‘D’) gives 68 |
| CODE_TO_CHAR(int) | To find the character that relates to the integer ASCII value given | CODE_TO_CHAR(68) gives “D” |

(For more on ASCII see section 3.5.)

Concatenation

■ Concatenation means joining multiple strings together. This is done using the + operator.
■ When strings are concatenated, they are joined together in the order given:

    texta ← 'two'
    textb ← 'words'
    new ← texta + textb
    OUTPUT(new)

The above code would join together the two strings to print out ‘twowords’.

String/type conversion

String conversion (or casting) means to convert data from or to the string data type.
The following keywords are used in AQA Pseudo-code:

| Keyword | Examples | Comment |
|---|---|---|
| STRING_TO_INT() | a ← STRING_TO_INT(‘123’); b ← STRING_TO_INT(‘7’) | Converts a value stored as a string to an integer. |
| STRING_TO_REAL() | c ← STRING_TO_REAL(‘12.9’); d ← STRING_TO_REAL(‘46’) | Converts a value stored as a string to a real/floating point number. |
| INT_TO_STRING() | e ← INT_TO_STRING(17); f ← INT_TO_STRING(140) | Converts a value stored as an integer to a string. |
| REAL_TO_STRING() | g ← REAL_TO_STRING(17.2); h ← REAL_TO_STRING(3.8) | Converts a value stored as a real/floating point number to a string. |

Not all strings can be converted to a numeric data type!

2.9 Random number generation

■ Random numbers can be generated by a programming language.
■ Typically, upper and lower limits to the random number are passed in as parameters and a random value between these two limits is returned by the subroutine.
■ In AQA Pseudo-code, the RANDOM_INT(x, y) subroutine allows us to do this. Programmers can pass in the lowest and highest number required as arguments (in brackets):

    RANDOM_INT(1, 5) #chooses a random integer between 1 and 5
    RANDOM_INT(20, 30) #chooses a random integer between 20 and 30

2.10 Structured programming and subroutines

■ It is good practice to break programs down into subroutines.
■ A subroutine is a section of code that is defined outside the main body of the program. It is given its own identifier which is then used to call the subroutine as many times as required.
■ When a subroutine is called, control passes from the main program to the subroutine. Once the subroutine has completed, control is passed back to the main program.
■ Here is a subroutine and a call from the main program:

    #subroutine definition with parameters
    SUBROUTINE timestable(tt, nums)
        FOR x ← 1 TO nums
            OUTPUT tt * x
        ENDFOR
    ENDSUBROUTINE

    #call the subroutine to run
    timestable(8, 10)
    timestable(9, 12)

This subroutine is called timestable. The values in brackets are parameters that can be passed from the main program into the subroutine.

Some subroutines return a value back to the main program when they return control. This value can then be stored, printed or otherwise used in the main program. For example:

    #subroutine definition that returns a value
    SUBROUTINE circle_area(radius)
        const pi ← 3.14159
        area ← pi * radius^2
        RETURN area
    ENDSUBROUTINE

    #how to use the subroutine
    new ← circle_area(10)
    OUTPUT new

The area of a circle is calculated in this subroutine and passed to the variable new in the main program. This is done by using the RETURN keyword.

Local variables

■ Variables used in the subroutine are called local variables.
■ Local variables only exist within the subroutine and cannot be accessed by the main program.
■ It is good practice to use local variables because any errors associated with them will be limited to the subroutine rather than across the whole program – this makes code easier to debug.

Advantages of the structured approach to programming

Structured programming is an approach that involves modularised programming, using clear well-documented interfaces and returning values from subroutines. Splitting programs down into multiple subroutines rather than having one large program is known as modularised programming. A well-documented interface for a subroutine means that parameters are passed in rather than values being asked for from the user, local variables are used to complete any processing and the overall result is returned back to the main program using the RETURN keyword.
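A subroutine with this kind of interface might look like the following Python sketch. The names TAX_RATE and price_with_tax are hypothetical and the 20% tax figure is an assumption made for the example; the point is only that the value comes in as a parameter, the working is done in a local variable and the result goes back through return.

```python
TAX_RATE = 1.2  # assumed multiplier for adding 20% tax


def price_with_tax(price):
    """Return the price with tax added.

    Interface: takes one number (the price before tax) as a parameter
    and returns one number (the price after tax). It never asks the
    user for input and never prints anything itself.
    """
    new_price = price * TAX_RATE  # local variable, exists only in here
    return new_price


# The main program decides what to do with each returned value.
for price in [100, 87, 35]:
    print(f"the price is £{price_with_tax(price):.2f}")
```

If the tax rate changed, only the single TAX_RATE line would need editing, and the subroutine could be reused unchanged in another program.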
The main advantages of the structured approach to programming include:

■ Reduces the overall size of the program as code does not need to be repeated in multiple places.
■ Makes the code much easier to maintain as it is easier to read and understand the purpose of each subroutine.
■ Reduces development and testing time as code is much easier to write and debug.
■ Allows reuse of code between programs, especially where pre-written and pre-tested subroutines can be used.

A function is a subroutine that returns a value. A procedure is a subroutine that does not return a value.

2.11 Robust and secure programming

A robust program will anticipate unusual user behaviour and still function correctly. A secure program is one that only allows authorised users to access it. Robust and secure programs can be written using the following techniques.

Data validation

This is used to ensure that users have entered data in the correct form, for instance:

■ of the correct type
■ within a sensible range
■ of the correct length
■ not empty.

Authentication

This is used to ensure that only authorised users access a system by establishing a user's identity. This can be done in a number of ways:

■ using usernames and passwords
■ through possession of an electronic key, device or account
■ using biometrics.

Two-factor authentication is where two of the above are checked together.

Testing

The purpose of testing is to ensure that the program functions as expected and meets all requirements.

Testing should not simply aim to prove that the program works, but also try to break it. Only by knowing that it cannot easily be broken can we be satisfied that it works. Test data should be chosen so that the system as a whole can be tested destructively, checking for errors wherever they may occur.
Test data should be chosen to include as many of the following as possible:

■ Normal test data is data of the correct type that would typically be expected from a user who is correctly using the system. This should be accepted by the program without causing errors.
■ Boundary test data is test data that is of the correct type but is on the very edge of being valid. Boundary test data should be accepted by the program without causing errors.
■ Erroneous test data is test data that is outside the expected values or of the incorrect type, and should be rejected by the system.

A test plan lists all of the tests that will be carried out, the expected result and the actual result in each case. For example:

Test data    Type of test data    Reason    Expected result    Actual result
             Normal
             Boundary
             Erroneous

A syntax error is one that breaks the grammatical rules of the programming language. Examples include misspelling a keyword, missing a bracket or using a keyword in the wrong way. Syntax errors will stop the program from running.

A logic error is one that causes the program to produce an unexpected or incorrect output but will not stop the program from running.

QUESTION PRACTICE

2 Programming

01 A program needs to perform the following tasks:
• Input two numbers from the user.
• Subtract the first number from the second number and output the result.

01.1 Complete the pseudo-code for this program. [2 marks]

    OUTPUT 'Enter first number'
    num1 ← USERINPUT
    OUTPUT 'Enter second number'
    num2 ← USERINPUT
    result ← ……………………………………………
    ………………………………………………………………………

01.2 Identify one variable used in this program. [1 mark]

01.3 Give one reason why you have identified this item as a variable. [1 mark]

02 Alisha is writing a program to say whether or not someone is old enough to vote. The program needs to perform the following tasks:
• Input an age from the user.
• Compare the number to 18 and output either "Old enough to vote" or "Too young to vote".

02.1 Complete the pseudo-code for this program. [4 marks]

    OUTPUT 'Enter your age'
    age ← USERINPUT
    ……………………… age ……………… 18 THEN
        ………………………………………………………………
    ELSE
        ………………………………………………………………
    ENDIF

02.2 Identify two basic programming constructs that have been used in this algorithm. [2 marks]

03 The following program outputs a series of numbers:

    FOR i ← 1 TO 5
        FOR k ← 1 TO 4
            OUTPUT (i * 2) + k
        ENDFOR
    ENDFOR

03.1 Give the first three numbers that will be printed by this algorithm. [1 mark]

03.2 Tick (✓) one box in each row to identify whether each programming construct has or has not been used in this program. [3 marks]

                 Has been used    Has not been used
    Sequence
    Selection
    Iteration

03.3 Line 3 of the program is changed to OUTPUT i - k. Give the first three numbers that will now be printed. [1 mark]

04 Write an algorithm that uses a condition-controlled loop to print out the numbers 5 to 1 in descending order. [3 marks]

05 Bob is writing a program to calculate the area of a circle using the formula: area = π × r²
The program needs to perform the following tasks:
• Input a whole number from the user.
• Check that the number is between 1 and 20.
• Repeat bullets 1 and 2 if the user enters a number that is too large or small.
• Calculate the area of the circle and output the result.

05.1 Complete the pseudo-code for this program. [4 marks]

    radius ← 0
    WHILE radius < 1 ………… radius > ……………
        OUTPUT 'Enter the radius'
        radius ← USERINPUT
    ENDWHILE
    area ← 3.14159 * radius ……………
    OUTPUT ………………………………………

05.2 Identify one item in the program that could have been written as a constant. [1 mark]

05.3 Give one reason why you have identified this item as a constant.
[1 mark]

06 A computer system stores data about train journeys, including the destination, cost, number of changes needed and whether first class seating is available.

    Destination    Cost (£)    Number of changes    First class
    Birmingham     120.00      0                    TRUE
    London          75.60      1                    TRUE
    Liverpool       98.40      3                    FALSE
    Newcastle      143.50      2                    FALSE
    Edinburgh      174.00      2                    TRUE

Identify a suitable data type for each field and justify your choice. [8 marks]
Destination:
Cost:
Number of changes:
First class:

07 The following shows values stored in an array called fruit at the start of a process:

         0        1            2            3
    0    Apple    Cherry       Banana       Pear
    1    Lemon    Orange       Raspberry    Damson
    2    Grape    Pineapple    Peach        Plum

The value of fruit[0][3] is Pear.

07.1 What is the value of fruit[1][3]? [1 mark]

07.2 What location stores the word 'Orange'? [1 mark]

07.3 Redraw the table after the following series of commands has been executed: [3 marks]

    fruit[0][0] ← 'Lime'
    fruit[2][0] ← 'Strawberry'
    fruit[2][2] ← ' '

08 Gill is writing a program to calculate an employee's weekly pay. It requires the user to input the number of hours worked in a week and the hourly rate of pay. Describe two examples of robust programming that Gill should consider when writing her program. [4 marks]

09 The program below should only allow values from 1 to 100 as valid inputs. If the data entered breaks this validation rule an error message is displayed.

09.1 Using AQA Pseudo-code, complete the following program to output "Invalid input" if the data does not meet the validation rule. [3 marks]

    OUTPUT 'Enter minutes played'
    mins ← USERINPUT
    IF mins < 1 ………………… mins ………………… THEN
        ……………………………… 'Invalid input'
    ENDIF

09.2 Explain what is meant by data validation. [1 mark]

10
10.1 Using either AQA Pseudo-code or a high-level programming language, write a program to authenticate a user. [7 marks]
The program should perform the following tasks:
• Take the username and password as inputs.
• Check that the username is Andy123 and the password is ComputerX.
• Repeat bullets 1 and 2 if either input is incorrect.
• When the username and password are correct say "Access Granted".

10.2 Explain the purpose of using authentication to confirm the identity of a user. [2 marks]

11 The following program takes two numbers as inputs from the user and then says which is the biggest.

    OUTPUT 'Enter number'
    m ← USERINPUT
    OUTPUT 'Enter number'
    n ← USERINPUT
    IF m > n THEN
        OUTPUT m + ' is biggest'
    ELSE IF n > m THEN
        OUTPUT n + ' is biggest'
    ELSE
        OUTPUT 'They are the same'
    ENDIF

11.1 Explain, using examples from the program, two ways in which the maintainability of the program can be improved. [4 marks]

12
12.1 Write a subroutine that takes a number between 1 and 12 as a parameter and returns the corresponding month of the year as a string. [5 marks]

12.2 Describe a local variable. [2 marks]

12.3 Describe one benefit to a programmer of using subroutines. [2 marks]

13 Raheem is writing a computer program that will take in the base and height of a triangle as inputs from the user and output the area using the formula: area = 0.5 × (base × height).

13.1 Using AQA Pseudo-code, complete the program: [4 marks]

    OUTPUT 'Enter the base'
    base ← USERINPUT
    OUTPUT 'Enter the height'
    height ← ……………
    …………… ← 0.5 ………… (base * height)
    …………… area

Raheem decides to rewrite the calculation as a subroutine called areaTriangle(), that takes the base and height as parameters and returns the area.

13.2 Rewrite the program above as a subroutine, using either AQA Pseudo-code or a high-level programming language you have studied. [4 marks]

14 Olivia has written a program to work out whether a number is a prime number.
Her first attempt is shown below:

    01 OUTPUT 'Enter a number'
    02 number ← USERINPUT
    03 IF number > 1 THEN
    04     FOR i ← 2 TO number − 1
    05         IF (number MOD 1) = 0 THEN
    06             OUTPUT number + ' is not a prime number'
    07             BREAK
    08         ELSE
    09             OUTPUT nubmer + ' is a prime number'
    10         ENDIF
    11     ENDFOR
    12 ELSE
    13     OUTPUT number + ' is not a prime number'
    14 ENDIF

The program contains a logic error in line 05.

14.1 State what is meant by a logic error. [1 mark]

14.2 Give a corrected version of line 05 that fixes the error. [1 mark]

The program also contains a syntax error in line 09.

14.3 State what is meant by a syntax error. [1 mark]

14.4 Give a corrected version of line 09 that fixes the error. [1 mark]

14.5 Olivia has been testing her program regularly as she writes it. State the name given to this type of testing. [1 mark]

15 The program below checks whether a number entered by the user is between 1 and 20. If the number is not within this range an error message is displayed.

    OUTPUT 'Enter a number'
    userNumber ← USERINPUT
    WHILE userNumber < 1 OR userNumber > 20
        OUTPUT 'Invalid input'
    ENDWHILE
    OUTPUT 'Number accepted'

15.1 Complete the following test plan for the program above. [3 marks]

    Test data    Test type    Expected result
    15           Normal       Value accepted
                 Erroneous    Invalid input message displayed
                 Boundary

15.2 Explain the purpose of testing the program in this way. [1 mark]

16 A teacher stores test results for each student in their class in a 2D array. The figure below shows part of the array, with five students and five sets of test results. The index labels are also shown.

                        Test ID
                   0     1     2     3     4
    Student ID 0   15    18     9    14    14
               1   15    18    17    11    12
               2   11    15    12    18    13
               3   12    10    11    16    12
               4   14    13    10    15    15

The teacher wants to output the result that student 2 achieved in the last test that they took (Test ID 4). She writes the following code:

    OUTPUT testScores[2][4]

The output is 13.
16.1 Write the code to output the score achieved by student 1 in the second test that they took. [1 mark]

16.2 State the output if the teacher runs the code: [1 mark]

    OUTPUT testScores[0][1]

16.3 State the output if the teacher runs the code: [1 mark]

    OUTPUT testScores[3][1] + testScores[4][3]

16.4 The teacher writes a program to work out the average mark achieved by a student across all five tests.

    total ← 0
    total ← total + testScores[1][0]
    total ← total + testScores[1][1]
    total ← total + testScores[1][2]
    total ← total + testScores[1][3]
    total ← total + testScores[1][4]
    average ← total / 5
    OUTPUT average

Refine the program to be more efficient. Write the refined version of the algorithm using either AQA Pseudo-code or a high-level language you have studied. [4 marks]

17 The program below gives the grade achieved for an exam mark entered by the user.

    m ← 0
    WHILE m < 1 OR m > 100
        OUTPUT 'Please enter your mark:'
        m ← USERINPUT
    ENDWHILE
    IF m > 80 THEN
        OUTPUT 'Grade A'
    ELSE IF m > 65 THEN
        OUTPUT 'Grade B'
    ELSE IF m > 50 THEN
        OUTPUT 'Grade C'
    ELSE
        OUTPUT 'Fail'
    ENDIF

17.1 Describe two ways in which the maintainability of this algorithm could be improved. [4 marks]

17.2 Complete the test plan for the program above. [4 marks]
    Test data    Test type    Expected result
    70                        'Grade B'
                 Erroneous
                 Boundary     'Grade A'

3 FUNDAMENTALS OF DATA REPRESENTATION

CHAPTER INTRODUCTION

In this chapter you will learn about:

3.1 Number bases
➤ Decimal, binary and hexadecimal
➤ Understand that computers use binary
➤ Explain why hexadecimal is used

3.2 Converting between number bases
➤ Converting between binary and decimal
➤ Converting between decimal and hexadecimal
➤ Converting between binary and hexadecimal

3.3 Units of information
➤ Bits
➤ Bytes
➤ Prefixes

3.4 Binary arithmetic
➤ Adding binary numbers
➤ Binary shifts
➤ Simple multiplication and division

3.5 Character encoding
➤ 7-bit ASCII and Unicode character sets
➤ Grouping of character codes

3.6 Representing images
➤ Pixels
➤ Bitmap images
➤ Image size and colour depth
➤ Factors affecting bitmap file size
➤ Calculating bitmap file size
➤ Conversion of binary data into a bitmap image
➤ Conversion of a bitmap image into binary data

3.7 Representing sound
➤ Digital representation of sound
➤ Sample rate and resolution
➤ Calculating sound file size

3.8 Data compression
➤ The need for compression
➤ Different ways to compress data
➤ Huffman coding
➤ Huffman trees
➤ Calculating bits required for Huffman coding
➤ Run-length encoding

3.1 Number bases

Tech term
Denary: An alternative name for the decimal system.

Decimal, binary and hexadecimal

The number system we use in everyday life is called the decimal system. In the decimal (or denary) system we are used to using ten symbols or values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. We use these symbols to write numbers. For example, the number 527 is made up of:
● 5 lots of 100,
● 2 lots of 10 and
● 7 lots of ones.
Written into a table this can be seen more clearly:

    100    10    1
      5     2    7

The column values are ten times larger than the previous value as we move from right to left, for example the 'hundreds' are ten times bigger than 'tens', and 'tens' are ten times bigger than 'ones'. Because it uses ten different symbols, decimal is known as a base 10 number system.

However, decimal is not the only number system. An alternative number system is called binary. In binary we just have two symbols or values: 0 and 1. This means that in the binary number system each column heading is twice as big as the previous one, as we move from right to left:

    128    64    32    16    8    4    2    1
      1     0     0     1    0    1    0    1

The column headings for the binary system are written in decimal, from right to left, as 'ones', 'twos', 'fours', 'eights', and so on. Because binary uses only two symbols it is known as a base 2 number system. An example of a binary number is: 10010101.

Computer scientists use another number system called hexadecimal (or hex for short). Hexadecimal is a base 16 number system, so it requires 16 separate symbols. It uses 0–9 for the first ten symbols, just like decimal. However, it requires extra symbols to represent the decimal numbers 10–15. It uses the letters A–F for this purpose.

    Base 10 (decimal)    Base 2 (binary)    Base 16 (hex)
     0                      0               0
     1                      1               1
     2                     10               2
     3                     11               3
     4                    100               4
     5                    101               5
     6                    110               6
     7                    111               7
     8                   1000               8
     9                   1001               9
    10                   1010               A
    11                   1011               B
    12                   1100               C
    13                   1101               D
    14                   1110               E
    15                   1111               F

The column values for base 16 (hex) are based on powers of 16, so the first two columns are:

    16    1

As hexadecimal is base 16, the second column heading is 16 times larger than the first column.

Binary representation in computers

Computers use millions of tiny switches to store and process data and each of these switches has just two states: either 1 (on) or 0 (off). 1 and 0 are the two symbols in the binary number system.
This means that, in a computer, all data – numbers, characters, sounds and images – are represented in binary. So, for instance, the binary number 01000100 could represent:
● part of an image
● a small segment of a sound recording
● the letter 'D'
● the number '68'.

So, a computer needs to know what kind of data each binary number represents.

Hexadecimal in computer science

Programmers often work with binary numbers, such as 11011100 and 11011000. These numbers can be hard to remember and communicate without introducing errors. However, these same two numbers can be represented in hexadecimal as DC and D8. These are far easier for programmers to use and remember. So, hexadecimal is used by programmers because it is simpler to work with and easy to convert to and from binary. You will learn how to convert between binary and hexadecimal in the next section.

Knowledge check
1 Why do computers work in binary?
2 Why do programmers use hexadecimal?

3.2 Converting between number bases

Converting between binary and decimal

Using the table we have already seen for binary, we can calculate the decimal equivalent of a binary number.

Worked example
What is the binary number 100111 in decimal?

First, copy the binary number 100111 into the table, always ensuring that you fill the table from the right so that the '1' column is filled with either a '0' or '1':

    128    64    32    16    8    4    2    1
                  1     0    0    1    1    1

We have:
➤ 1 lot of 32
➤ no lots of 16
➤ no lots of 8
➤ 1 lot of 4
➤ 1 lot of 2
➤ 1 lot of 1

This is equal to:
➤ 32 + 4 + 2 + 1 = 39

100111 in binary is 39 in decimal.

Notice that the 128 and 64 columns are blank. This is the same as if they contained '0'.

Key point
You are expected to be able to work with binary numbers with up to eight digits, that is, up to 11111111, but also with values containing fewer than eight binary digits.
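The column method from the worked example can also be sketched in Python. The function name binary_to_decimal is made up for this illustration, and the binary number is assumed to arrive as a string of 0s and 1s.

```python
def binary_to_decimal(bits):
    """Add up the column value of every 1 in a binary string."""
    total = 0
    column_value = 1                  # the right-hand column is worth 1
    for digit in reversed(bits):      # work from right to left
        if digit == "1":
            total += column_value
        column_value *= 2             # each column is twice the previous one
    return total


print(binary_to_decimal("100111"))    # 32 + 4 + 2 + 1 = 39
```

Missing left-hand columns are simply never visited, which matches treating the blank 128 and 64 columns as 0.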
To convert decimal numbers into binary we use the column heading values from the binary table. Starting at the left-hand side, we check whether each column value is smaller than or equal to our decimal number. If it is, we record 1 in that column and subtract the column value from the number. The result becomes the new number that we check against the remaining columns. If it is not, we record 0 in that column and carry on to the next column with the same number. We continue this process until we have filled in the 1s column. A worked example will make this clearer.

Worked example
Convert the decimal number 142 into binary.

Is 128 smaller than 142? YES, it is, so we record 1 in the first column from the left-hand side.

    128    64    32    16    8    4    2    1
      1

We subtract 128 from 142, to leave 14. We now check 14 against the next value, 64. 64 is not smaller than 14, so we record 0 in the second column.

    128    64    32    16    8    4    2    1
      1     0

We continue to check 14. We can see that 32 is not smaller than 14, and neither is 16, so we can record 0s in the next two columns.

    128    64    32    16    8    4    2    1
      1     0     0     0

We find that 8 is smaller than 14, and so we put a 1 in the column under 8. We subtract 8 from 14, leaving 6.

We now check the next column against 6. 4 is smaller than 6. We put a 1 in the column under 4 and subtract 4 from 6, leaving 2.

We now check the next column against 2. 2 is the same as 2 so we put a 1 in the column under 2. Subtracting 2 from 2 leaves 0.

We are left with 0 to check against the last column. The column value 1 is larger than 0, so the final entry is a 0 under the 1 column.

The final table looks like this:

    128    64    32    16    8    4    2    1
      1     0     0     0    1    1    1    0

We can check our answer by converting the binary number 10001110 back into decimal. We find that:

    10001110 = 128 + 8 + 4 + 2 = 142

We have confirmed that 10001110 is indeed the decimal number 142.
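The same column-by-column subtraction can be sketched in Python. The function name decimal_to_binary is hypothetical, and the sketch assumes a whole number from 0 to 255 so that eight columns are enough.

```python
def decimal_to_binary(number):
    """Convert a whole number from 0 to 255 into an 8-bit binary string."""
    bits = ""
    for column_value in [128, 64, 32, 16, 8, 4, 2, 1]:
        if column_value <= number:    # the column value fits: record a 1
            bits += "1"
            number -= column_value    # carry on with what is left over
        else:                         # the column value is too big: record a 0
            bits += "0"
    return bits


print(decimal_to_binary(142))         # 10001110
```

Each pass through the loop corresponds to one column check in the worked example, with number holding the amount still left to represent.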
Worked example
Convert the decimal number 83 into binary.

We follow the same process as before.

128 is larger than 83 so the entry in the 128 column is 0.

64 is smaller than 83, so we enter a 1 in the column under 64. We also note that 83 − 64 = 19 and we will now check 19 against the next columns.

The next column that is smaller than 19 is 16, so we enter 0 in the 32 column and 1 in the 16 column. We then note that 19 − 16 = 3 and we will now check 3 against the next columns.

The next column that is smaller than 3 is the 2 column, so we enter 0 in the 8 and 4 columns and a 1 in the 2 column, and note that 3 − 2 = 1.

We check this 1 against the 1 column, and 1 is of course equal to 1. So, we enter a 1 in the final column.

So, the binary number is:

    128    64    32    16    8    4    2    1
      0     1     0     1    0    0    1    1

We can check our answer by converting 01010011 back into decimal. We find that:

    01010011 = 64 + 16 + 2 + 1 = 83

We have confirmed that 01010011 is indeed the decimal number 83.

Note that we only really need seven binary digits to represent this binary value. So we can leave the column for 128 blank and give the answer as 1010011. (We do the same thing in decimal – for instance in this example we could have referred to the number 083. However, we don't need the left-hand zero as it doesn't provide any additional information, and so we normally drop it and just call the number '83'.)

Key point
You need to show your working in the examination so using a table and showing the key subtractions will demonstrate clearly how you arrived at the answer.

Knowledge check
3 Convert the following binary numbers to decimal.
  (a) 1001
  (b) 11101
  (c) 110001
  (d) 10001100
  (e) 11011011
  (f) 11111100
4 Convert the following decimal numbers to binary.
  (a) 20
  (b) 46
  (c) 75
  (d) 98
  (e) 147
  (f) 213

Converting between decimal and hexadecimal

To convert hexadecimal numbers to decimal we:
● Convert the individual symbols to their decimal equivalent.
● Multiply the decimal values by the column values – either 16 or 1.
● Add the results.

As a reminder, here are the decimal equivalents of hexadecimal:

    Base 10 (decimal)    Base 16 (hex)
     0                   0
     1                   1
     2                   2
     3                   3
     4                   4
     5                   5
     6                   6
     7                   7
     8                   8
     9                   9
    10                   A
    11                   B
    12                   C
    13                   D
    14                   E
    15                   F

Worked example
To convert AF in hexadecimal to decimal:

    16    1
     A    F

A is equivalent to 10 and F is equivalent to 15 in decimal.

    (10 × 16) + (15 × 1) = 175

AF in hexadecimal is 175 in decimal.

To convert decimal numbers to hexadecimal:
● We check if 16 will divide into the number.
● If it does, we write down how many times using the correct hexadecimal symbol in the 16s column.
● We then convert the remainder into its hexadecimal symbol and write it in the 1s column.

Worked example
Convert 189 in decimal to hexadecimal.

189 divided by 16 = 11 remainder 13.
The hex symbol for 11 is B.
The hex symbol for 13 is D.
189 in decimal is BD in hexadecimal.

You can check this:

    11 × 16 = 176
    189 − 176 = 13

Knowledge check
5 Convert the following decimal values to hexadecimal.
  (a) 52
  (b) 67
  (c) 165
  (d) 191
  (e) 201
6 Convert the following hexadecimal numbers to decimal.
  (a) 12
  (b) 58
  (c) 5D
  (d) AE
  (e) CA

Converting between binary and hexadecimal

We have discussed how hexadecimal is used by programmers because it is far easier for them to use, remember and communicate without introducing errors. Now we will show how easy it is to convert hexadecimal to and from binary. Here's a table to help you.
    Base 10 (decimal)    Base 2 (binary)    Base 16 (hex)
     0                   0000               0
     1                   0001               1
     2                   0010               2
     3                   0011               3
     4                   0100               4
     5                   0101               5
     6                   0110               6
     7                   0111               7
     8                   1000               8
     9                   1001               9
    10                   1010               A
    11                   1011               B
    12                   1100               C
    13                   1101               D
    14                   1110               E
    15                   1111               F

Key point
Another name for a 4-bit number is a nibble.

Consider the binary number 11001110. This can be represented using just two hexadecimal digits. To do so:
● First, split it into two nibbles (4-bit numbers): 1100 and 1110. Each 4-bit binary number is equivalent to one hexadecimal symbol. To represent the values 10 to 15 we use A through to F.
● Now work out the hexadecimal for each 4-bit number.
● In this case, 1100 is equivalent to the decimal number 12, which is the hex digit C.
● 1110 is equivalent to the decimal number 14, which is the hex digit E.
● 11001110 is, therefore, CE in hexadecimal.

4-bit binary numbers can represent 16 different values – from 0000 up to 1111, which is 0 to 15 in decimal. Hence any 4-bit binary number can be represented by just one hexadecimal digit, 0–F.

Worked example
(a) Converting 11011111 in binary to hexadecimal

The binary number is divided up into two nibbles and the equivalent hexadecimal symbol identified. This can be done by first converting the binary to decimal if necessary.

    1101    1111
       D       F

11011111 in binary is DF in hexadecimal.

(b) If we have a binary number with fewer than eight digits, the same process applies. We split up the number into nibbles (4-bit numbers), starting from the right-hand side. We add leading 0s to make it up to 8 bits.

For example, using the binary number 1011101:

    0101    1101
       5       D

1011101 in binary is 5D in hexadecimal.

We use a similar process to convert from hexadecimal to binary: we replace each hex symbol with the equivalent binary nibble.
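The nibble method can be sketched in Python as shown here. The function names binary_to_hex and hex_to_binary and the lookup string HEX_DIGITS are made up for this illustration; both functions assume their input contains only valid digits.

```python
HEX_DIGITS = "0123456789ABCDEF"       # position in the string = decimal value


def binary_to_hex(bits):
    """Split a binary string into nibbles and swap each for a hex symbol."""
    while len(bits) % 4 != 0:         # add leading 0s to complete a nibble
        bits = "0" + bits
    result = ""
    for start in range(0, len(bits), 4):
        nibble = bits[start:start + 4]
        result += HEX_DIGITS[int(nibble, 2)]
    return result


def hex_to_binary(hex_string):
    """Swap each hex symbol for its 4-bit binary nibble."""
    result = ""
    for symbol in hex_string:
        value = HEX_DIGITS.index(symbol.upper())
        result += format(value, "04b")   # always four binary digits
    return result


print(binary_to_hex("11001110"))      # CE
print(binary_to_hex("1011101"))       # 5D
```

Because each nibble maps to exactly one hex symbol, neither direction needs a full conversion through decimal.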
Worked example
(a) Converting B8 in hexadecimal to binary

       B       8
    1011    1000

B in hexadecimal is 11 in decimal, which is 1011 in binary.
8 in hexadecimal is 8 in decimal, which is 1000 in binary.
B8 in hexadecimal is 10111000 in binary.

(b) Converting 3A from hexadecimal to binary

       3       A
    0011    1010

3 in hexadecimal is 3 in decimal, which is 0011 in binary.
A in hexadecimal is 10 in decimal, which is 1010 in binary.
3A in hexadecimal is 00111010 in binary, or 111010 without the leading zeros.

Knowledge check
7 Convert the following binary numbers to hexadecimal.
  (a) 10011100
  (b) 110011
  (c) 11111111
  (d) 111001
  (e) 1001110
8 Convert the following hexadecimal numbers to binary.
  (a) 95
  (b) AB
  (c) 1D
  (d) A3
  (e) 56

Key point
You will be expected to work with and convert between all three number systems in the range:
00–FF in hex
0–255 in decimal
00000000–11111111 in binary.

3.3 Units of information

Tech term
Transistor: An electronic component that acts like a switch – transistors can be manufactured at such a small size that hundreds of millions can fit on a small computer chip.

A computer uses electronic circuits etched onto computer chips to store data and instructions. These circuits contain electronic switches made from tiny transistors. Each switch can be in one of two states: on or off. The two states are represented by the numbers 1 or 0. A computer uses combinations of these 1s and 0s to represent data and instructions.

As we have discussed, the binary number system only uses the two values 1 and 0 and therefore binary is used to describe the on–off status of all the switches in a computer.

One binary digit is called a bit. The symbol for this is b. A bit is the fundamental unit of information.

Computers often group 8 bits together as one unit of data. These 8 bits together are called a byte with the symbol B.
4 bits grouped together are called a nibble, which has no symbol.

In standard scientific notation the prefix 'kilo' means 1000 – for instance 1 kilometre is 1000 metres. We use kilo, and a whole set of other units, based on this scientific notation:

    8 bits (b)    1 byte (B)
    1000 B        1 kilobyte (KB)
    1000 KB       1 megabyte (MB)
    1000 MB       1 gigabyte (GB)
    1000 GB       1 terabyte (TB)
    1000 TB       1 petabyte (PB)

Worked example
If we have a file that is 2.5 MB, what is that in (a) kilobytes, and (b) bytes?

(a) 2.5 MB = 2.5 × 1000 = 2500 kilobytes
(b) 2.5 MB = 2.5 × 1000 × 1000 = 2 500 000 bytes

Key point
You need to show your working in the examination so show the key multiplications to demonstrate clearly how you arrived at the answer.

Beyond the spec
In some sources, you will find people referring to 1 kilobyte as 1024 bytes, 1 megabyte as 1024 kilobytes, etc. There are historical reasons for doing so but now there are different prefixes for this – for example, 1024 bytes is called 1 kibibyte. (1024 might seem like a strange number in decimal but is used because it is a 'neater' number in binary.)

3.4 Binary arithmetic

Adding binary numbers together

Adding binary numbers is very similar to the way we add decimal numbers. Let's look at the way we add two decimal numbers.

Worked example
Adding 357 and 264

Working from the right-hand column:
7 + 4 = 11, so we write down 1 and carry 1 to be added in the next column.
5 + 6 + 1 = 12, so we write down 2 and carry 1 to be added in the next column.
3 + 2 + 1 = 6, so we write down 6.

      3 5 7
    + 2 6 4
    -------
      6 2 1
      1 1        (carries)

So, the key thing to remember is that as soon as a column adds up to a number bigger than 9, we have to carry.

When adding binary numbers, we follow the same process. The difference is that as soon as a column adds up to a number bigger than 1, we have to carry.
We also need to note that:

    1 + 1 = 10 in binary (that is 1 + 1 = 2 in decimal, which is written as 10 in binary)
    1 + 1 + 1 = 11 in binary (that is 1 + 1 + 1 = 3 in decimal, which is written as 11 in binary)

To summarise, for binary addition:

    Sum          Result    Carry
    0 + 0        0         0
    0 + 1        1         0
    1 + 1        0         1
    1 + 1 + 1    1         1

Or alternatively:

      0        0        1        1
    + 0      + 1      + 1        1
    ---      ---      ---      + 1
      0        1      1 0      ---
                               1 1

Worked example
Let's look at an example adding two 4-bit numbers, 1101 and 1111.

Working from the right-hand column:
1 + 1 is 10 in binary so we write down 0 and carry 1 to be added in the next column.
0 + 1 + 1 is 10 in binary so we write down 0 and carry 1 to be added in the next column.
1 + 1 + 1 is 11 in binary so we write down 1 and carry 1 to be added in the next column.
1 + 1 + 1 is 11 in binary so we write down 1 and carry 1 to be added in the next column.
The carried 1 is the final digit in the sum, giving the answer 11100.

        1 1 0 1
    +   1 1 1 1
    -----------
      1 1 1 0 0
      1 1 1 1        (carries)

We can use the same method to add up binary numbers made of different numbers of bits.

Worked example
Let's look at adding together three binary numbers made up of a different number of bits: 100101 + 10010 + 10000110

Working from the right-hand column:
1 + 0 + 0 is 1 in binary so we write down 1.
0 + 1 + 1 is 10 in binary so we write down 0 and carry the 1 to the next column.
1 + 0 + 1 + 1 is 11 in binary so we write down 1 and carry the 1 to the next column.
We continue this process to complete the addition.

      0 0 1 0 0 1 0 1
      0 0 0 1 0 0 1 0
    + 1 0 0 0 0 1 1 0
    -----------------
      1 0 1 1 1 1 0 1
              1 1        (carries)

Key point
In your exam you will be expected to be able to add up to three binary numbers, using a maximum of 8 bits per number. The sum will never exceed 8 bits.
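The write-down-and-carry rule can be sketched in Python as below. The function name add_binary is made up for this illustration, and the numbers are assumed to be strings of 0s and 1s. Three numbers can be added by calling the function twice.

```python
def add_binary(a, b):
    """Add two binary strings column by column, carrying as on paper."""
    width = max(len(a), len(b))
    a = a.zfill(width)                     # pad the shorter number with 0s
    b = b.zfill(width)
    carry = 0
    result = ""
    for position in range(width - 1, -1, -1):   # right-hand column first
        column_total = int(a[position]) + int(b[position]) + carry
        result = str(column_total % 2) + result  # digit to write down
        carry = column_total // 2                # 1 if the column overflowed
    if carry == 1:                         # a final carry becomes a new digit
        result = "1" + result
    return result


print(add_binary("1101", "1111"))          # 11100
print(add_binary(add_binary("100101", "10010"), "10000110"))   # 10111101
```

A column total of 2 gives digit 0 and carry 1, and a total of 3 gives digit 1 and carry 1, which are the last two rows of the summary table.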
Knowledge check
Complete the following additions in binary.
9 1100 + 110
10 1011 + 1001
11 110011 + 100101 + 1001
12 110110 + 10101 + 10000000
13 100001 + 11000 + 1000000

Binary shifts

Moving the digits in a binary number left or right is called a binary shift.

Multiplication

Each time the value is shifted one place to the left, the number is multiplied by 2. For example, 40 in binary is 101000:

    128    64    32    16    8    4    2    1
                  1     0    1    0    0    0

If we shift the digits one place to the left, and add 0 to the right-hand 1 column, we get:

    128    64    32    16    8    4    2    1
            1     0     1    0    0    0    0

The digits of the original number (101000) now sit one place to the left, followed by the new 0. In decimal this new number has the value 64 + 16 = 80. This is 40 multiplied by 2.

If we shift another place to the left – which is two places to the left from the original number – we get:

    128    64    32    16    8    4    2    1
      1     0     1     0    0    0    0    0

In decimal, this new number is 128 + 32 = 160. This is the original number 40 multiplied by 4.

Division

Each time the value is shifted one place to the right the number is divided by 2. Starting again with the decimal number 40, which is 101000 in binary, if we shift the digits one place to the right we get:

    128    64    32    16    8    4    2    1
                        1    0    1    0    0

In decimal, the new number 10100 is 16 + 4 = 20, which is 40 divided by 2.

If we shift another place to the right we get:

    128    64    32    16    8    4    2    1
                             1    0    1    0

In decimal this is 8 + 2 = 10, which is 20 divided by 2 (or the original number 40 divided by 4).

Knowledge check
Apply the shifts described to the following binary numbers and state the decimal equivalents before and after the shift. Comment on what has happened to the value.
14 1100 shift 2 places to the right.
15 11010 shift 1 place to the left.
16 101 shift 3 places to the left.
17 110000 shift 3 places to the right.
18 111 shift 4 places to the left.
19 10000000 shift 4 places to the right.
20 10011 shift 3 places to the left.
21 101100 shift 2 places to the right.
22 What would you need to do to a binary number to a) multiply it by 8, b) divide it by 16?
23 What is the effect of shifting left by 3 then right by 2?

3.5 Character encoding

Using binary codes to represent characters
When you press the keys on a keyboard, the computer registers this as a binary code to represent each character. This character code can then be used to identify and display a character on screen or for printing.

It is important for all computer systems to agree on these codes and their meanings if the data is to make any sense. There are, therefore, agreed international standards that are used to represent the character set for a computer system.

Character sets and bits per character
The character set of a computer is all the characters that are available to it. The number of characters in the character set depends upon how many characters can be represented by the associated codes.

The first agreed standard was based on English with a limited number of extra symbols. Wider use of computers, and the need for many more languages and other symbols, has led to the development of more advanced coding standards for character sets.

ASCII
In the early 1960s, the American Standards Association agreed a set of codes to represent the main characters used in English. This is called ASCII (American Standard Code for Information Interchange). This system was designed to provide codes for the following:

All the main characters, i.e. 26 uppercase and 26 lowercase    52 characters
All the numeric symbols 0–9                                     10 characters
32 punctuation and other symbols plus 'space'                   33 characters
32 non-printable control codes                                  32 characters

In total, this is 127 characters. The decimal number 127 is 1111111 in binary, which is a 7-bit number. This means that each character can be represented by a different 7-bit number, from 0000001 to 1111111.
Initially the ASCII character set used 127 codes for the characters, with 0000000 meaning 'no character'. This gave a total of 128.

Some ASCII codes are:

7-bit binary code   Hex   Decimal   Character
0100000             20    32        'space'
1000001             41    65        A
1000010             42    66        B
1000011             43    67        C
1100001             61    97        a
1111001             79    121       y
1111010             7A    122       z
1111111             7F    127       'delete'

Beyond the spec
One additional bit is used at the beginning of the ASCII code, for error-checking purposes. This means that each ASCII character is represented by an 8-bit number and therefore that each character requires one byte. However, you will only be expected to perform calculations using the 7 bits of the ASCII codes that encode the first 128 characters. This is referred to as the 7-bit ASCII code. (Later extended versions of ASCII removed the need for error checking and used all 8 bits to represent 256 characters.)

Unicode
Unicode was first developed to use 16 bits rather than the 7 bits of ASCII. This provided the ability to store 2¹⁶ or 65 536 unique characters. Later developments of Unicode used even more bits, giving room for billions of possible codes, including graphical symbols and emojis. This allows Unicode to represent many different alphabets and special symbols, which is a major advantage over ASCII.

To ensure compatibility of all of these systems, the original ASCII character codes are the same within Unicode. ASCII is now considered to be a subset of Unicode.

The ASCII and Unicode codes for the main alphabetic characters are allocated to the uppercase characters in sequence, followed by the lowercase characters in sequence. For instance, A is 65, B is one more at 66, C is next at 67, and so on.
This means that if you are given the character code for one letter, you can work out the character code for another letter.

There are also ASCII codes for the decimal digits 0–9. These codes also run in order: for instance, the code for '1' in ASCII is 49, '2' is 50 and so on. If you are given the code for one digit, you can work out the code for another digit.

Lowercase characters start with 'a' as 97, 'b' as 98, and so on. This means that when we sort text, 'Z' (which is 90) comes before 'a'. For example, if the animals goat, bear, ape, zebra and deer were written as Goat, Bear, ape, Zebra, deer and sorted using ASCII values, they would be in the order:

Bear, Goat, Zebra, ape, deer

Character set   Number of bits   Number of characters   Examples
ASCII           7                128                    Upper and lowercase, numbers, punctuation, some control characters.
Unicode         16/32            65 536 / 2 billion+    As above plus all known language characters and different characters including wingdings and emojis.

Knowledge check
24 If the ASCII value of A is 65 what is the ASCII value of (a) F (b) G (c) J

3.6 Representing images

How an image is represented as a series of pixels and in binary
A simple image can be made up of black or white blocks. Binary numbers can represent these black and white blocks, using 1 for black and 0 for white, with 8 bits (one byte) per row. These blocks are the smallest element of an image and are called pixels. Pixel is short for picture element.

0 1 0 0 0 0 1 0
0 0 1 0 0 1 0 0
0 1 1 1 1 1 1 0
1 1 0 0 0 0 1 1
1 1 1 1 1 1 1 1
1 0 1 0 0 1 0 1
0 0 1 1 1 1 0 0
1 1 0 0 0 0 1 1

Figure 3.1 A simple black and white image

Figure 3.1 is just 8 pixels wide by 8 pixels high. As each row is represented by 1 byte, and there are eight rows, this image requires 8 bytes to store it. Most images are not 'blocky' like this.
This is because they are made up of many more pixels. For instance, the simple black and white drawing in Figure 3.2 is 100 × 152 pixels. This requires 15 200 bits to store it, which is 15 200/8 = 1900 bytes, or just under 2 kilobytes.

Figure 3.2 A higher resolution image

Tech term
Resolution: How many pixels there are in an image of a given width and height in mm. A higher resolution image has more pixels in the same width and height than a lower resolution image.

Beyond the spec
If there are two bitmap images of the same size and one is made up of more pixels, it is said to have a higher resolution. Higher resolution images have a larger file size because they are made of more pixels and so more data is required to store them.

Size of a bitmap image
The image size is the width of the image in pixels multiplied by the height of the image in pixels. So, a bitmap that is 5 pixels wide and 7 pixels high would be 5 × 7 pixels. A bitmap image is always measured as width × height in pixels.

Colour depth
For colour images we need to store more than just a 1 or 0 for each pixel: we need to be able to store extra data to represent a range of colours.

For instance, if we want each pixel to be one of four different colours, we would need to represent each pixel as one of four different values. We can use a 2-bit binary number to do this, with the values 00, 01, 10 and 11. The colour of each pixel will be represented by one of these four binary codes.
For instance, if 11 is black, 10 is green, 01 is red and 00 is white:

11 (black)   10 (green)   01 (red)   00 (white)

Figure 3.3 Binary representation of four colours

Then we can create a colour image as follows:

00 00 00 01 01 00 00 00
00 00 10 01 01 10 00 00
10 10 11 10 10 11 10 10
10 00 01 01 01 01 00 10
10 00 01 01 01 01 00 10
10 00 01 00 00 01 00 10
00 00 01 00 00 01 00 00
00 00 01 00 00 01 00 00

Figure 3.4 Image of four-colour space invader with binary codes

The bitmap to represent this image can be seen in the figure and each row can be written down as:

0000000101000000 (top row)
0000100101100000 (second row)
1010111010111010 (third row)
and so on.

With 2 bits for each pixel, we can store 2² = 4 colours. If we use more bits to represent each pixel, we can represent more colours:

● With 3 bits per pixel, each pixel can be one of 2³ = 8 colours. (In binary, these eight codes are: 000, 001, 010, 011, 100, 101, 110, 111.)
● With 8 bits per pixel, each pixel can be one of 2⁸ = 256 colours.
● With 16 bits per pixel, each pixel can be one of 2¹⁶ = 65 536 colours!

Colour depth is the number of bits used per pixel. The more bits per pixel, the larger the range of colours we can have in the image.

The more colours we have available, the better the representation of the image. However, the more colours we have, the more bits per pixel we need. This means that more data is required to store each pixel. Consequently, the higher the colour depth, the larger the file needed to store the image.

Look at the images in Figure 3.5 of a sign in Portmeirion. The original image has a bit depth of 8, which is 2⁸ or 256 colours, and is 1.2 MB in size. As we reduce the number of colours available the image becomes less well defined but the size of the file reduces, to 787 KB for eight colours and 542 KB for four colours.
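The link between colour depth and the number of colours, and the way a row of pixel codes becomes one string of bits, can be sketched in Python. This is only an illustration: the function names `colours_for_depth` and `encode_row` are assumptions introduced here, and the colour codes are the ones from Figure 3.3.

```python
def colours_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct colours available at a given colour depth."""
    return 2 ** bits_per_pixel


def encode_row(pixel_codes: list[int], bits_per_pixel: int) -> str:
    """Write one row of pixel colour codes as a single string of bits."""
    return "".join(format(code, f"0{bits_per_pixel}b") for code in pixel_codes)


# Colour codes from Figure 3.3: 00 white, 01 red, 10 green, 11 black.
top_row = [0b00, 0b00, 0b00, 0b01, 0b01, 0b00, 0b00, 0b00]
third_row = [0b10, 0b10, 0b11, 0b10, 0b10, 0b11, 0b10, 0b10]

for depth in (2, 3, 8, 16):
    print(depth, "bits per pixel gives", colours_for_depth(depth), "colours")

print(encode_row(top_row, 2))    # 0000000101000000
print(encode_row(third_row, 2))  # 1010111010111010
```

The two printed rows match the top and third rows of the space invader bitmap in Figure 3.4.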
Figure 3.5 The same image with 256 colours, eight colours and with just four colours

Calculating bitmap image size
To calculate the file size of an image file, we need to know the colour depth and the width and height of the image in pixels.

File size = image width (px) × image height (px) × colour depth

This can also be written as:

File size (bits) = W × H × D

To calculate the file size in bytes, we need to divide this file size in bits by 8:

File size (bytes) = (W × H × D) / 8

Worked example
What is the file size of an image with an 8-bit colour depth that is 1200 pixels high and 2000 pixels wide?

File size = 8 × 1200 × 2000 = 19 200 000 bits

We divide by 8 to get this in bytes:

19 200 000/8 = 2 400 000 bytes or 2.4 MB

Knowledge check
25 What do we mean by colour depth?
26 How many colours can be represented using a 4-bit colour depth?
27 If 1 represents black and 0 represents white, draw the 5 × 5 pixel image with the bit pattern 00100 01010 10001 11111 10001
28 If 1 represents black and 0 represents white, what is the bit pattern for this image? [Figure: black and white pixel image]
29 What is the file size for an image 200 pixels wide, 300 pixels high and with a colour depth of 4 bits per pixel?
30 A 10 pixel by 10 pixel image has 16 colours. Calculate the size of the image.

3.7 Representing sound

Tech term
Discrete: Individual distinct values.

Sound is a series of vibrations that move through the air as waves. These vibrations vary continuously and can take any value, which means they are analogue. To store sound on a computer, we need to convert this analogue signal into discrete digital values. To do this we convert the analogue data into binary numbers.

How sound can be sampled and stored in digital form
The continuously varying sound signal is sampled at set time intervals.
Each sample is a snapshot of the soundwave, represented by discrete digital values that can be stored by a computer.

In Figure 3.6, the horizontal x-axis represents time. The vertical y-axis represents the size (or amplitude) of the vibration. At each time interval, the sound is sampled and the amount of vibration is measured and stored as a digital value. From the graph, we can read the corresponding amplitude for the curve at each sample point. The first few are shown in the table.

Time (X)   Amplitude (Y)
1          30
2          50
3          30
4          10
5          0
6          30

Figure 3.6 Sound is sampled at set time intervals (x-axis: time, 0 to 16; y-axis: amount (amplitude) of vibration, 0 to 80)

Notice that in the example the amplitude of vibration can only be stored to the nearest ten and the sampling is only done once per second. When the computer uses these values to recreate the sound we get a shape that is similar to the original but not as smooth, and a lot of the detail in the original is missing. In this case the sampled sound will not be an accurate version of the original sound.

Figure 3.7 Digital sound replayed by the computer (x-axis: time, 1 to 16; y-axis: amount (amplitude) of vibration, 0 to 80)

Factors affecting size and playback quality of a sound file
The sample rate is the number of samples taken per second and is usually measured in hertz (Hz). One sample per second = 1 Hz. A typical commercial CD contains music sampled at 44.1 kHz (44 100 samples per second). If we sample more frequently, we will get a better approximation of the original sound. However, each sample requires a certain amount of data to store it, so a higher sample rate means we require a larger file to store the data.

The sample resolution is the number of bits used to store each sampled value.
The more bits we use, the more accurately we can represent the amplitude of vibration for that sample point, providing a better representation of the original sound. However, the more bits we use to store each data point, the larger the file needed to store the data. A typical CD recording has a resolution of 16, which means 16 bits are used for each sample. This means that the analogue data can take one of 2¹⁶ = 65 536 different digital values.

In Figure 3.8 the original sound wave has been sampled again but this time there are twice as many samples per second and the resolution has increased so that now the amplitude of vibration can be measured to the nearest '1'. You can see how much more closely this sample matches the original in Figure 3.6.

Figure 3.8 Higher sample rates and resolutions produce a better approximation to the original sound (x-axis: time, 1 to 16; y-axis: amount (amplitude) of vibration, 0 to 80)

The duration of the sampling is how many seconds of sound are sampled. The more seconds of sound that are sampled, the more data we need to store and the larger the file needed to store that data.

The size of the file needed to store sound data depends on these three factors:
● sample rate
● sample resolution
● duration in seconds.

Size of file (in bits) = sample rate × sample resolution × duration in seconds

If there is more than one channel (for example, stereo recordings use two channels), we need to multiply by the number of channels.

Worked example
A sample of a sound is made at 44.1 kHz for 1 minute at a resolution of 16 bits. Calculate the size of the resulting sound file.

44.1 kHz means 44 100 samples per second. Each sample consists of 16 bits.

So, the number of bits per second = 44 100 × 16 = 705 600 bits per second.
The sample lasts for 60 seconds, so the total number of bits = 705 600 × 60 = 42 336 000 bits

Divide by 8 to get the number of bytes: 42 336 000 / 8 = 5 292 000 bytes

Divide by 1 000 000 for megabytes: 5 292 000 / 1 000 000 = 5.29 MB

If this were a stereo recording, we would need to multiply by 2, the number of channels, to get 10.58 MB.

Knowledge check
31 What is meant by sample resolution?
32 How does the sample rate affect the sound quality and the size of the file used to store the sampled sound?
33 Calculate the file size for a 30 second sample at a resolution of 8 bits sampled at 1000 Hz.

3.8 Data compression

The need for compression
As we have seen, storing images and sound can lead to large file sizes. When large amounts of data are transmitted across the internet, for example sending images or videos from one phone to another, it can be very slow and may be expensive. If we want to store lots of data on a device with limited memory, for example music and videos on a mobile phone, we might run out of space.

However, we can compress data to make the file size smaller. This reduces transmission times between devices and allows more files to be stored.

There are two types of compression: lossy and lossless.

Lossy
Lossy compression is where some data is removed to make the file smaller. The data chosen for removal is that least likely to be noticed by the human senses. For example, certain frequencies of sound are inaudible or barely audible to the human ear, so these frequencies can be discarded without any significant differences being detected. In images, large areas of similar colour pixels are combined into one block of data so that the image still looks very similar to the original. For example, within an image we might replace 20 different shades of blue with just ten. Reducing the bit depth in an image will remove data and reduce the file size.
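To see why reducing the bit depth is lossy, here is a minimal Python sketch. It is an illustration only and not how any particular image format works: the function names `reduce_depth` and `expand_depth` and the sample pixel values are assumptions made up for this example. It keeps only the most significant bits of each 8-bit pixel value, using a binary shift to the right, and then tries to restore the original values.

```python
def reduce_depth(value: int, from_bits: int, to_bits: int) -> int:
    """Keep only the most significant bits of a pixel value (this loses data)."""
    return value >> (from_bits - to_bits)


def expand_depth(value: int, from_bits: int, to_bits: int) -> int:
    """Scale a reduced value back up; the discarded bits come back as zeros."""
    return value << (to_bits - from_bits)


original = [200, 201, 203, 64]                         # hypothetical 8-bit pixel values
reduced = [reduce_depth(p, 8, 2) for p in original]    # now 2 bits per pixel
restored = [expand_depth(p, 2, 8) for p in reduced]    # back to 8 bits per pixel

print(reduced)   # [3, 3, 3, 1]
print(restored)  # [192, 192, 192, 64]
```

Three slightly different shades (200, 201 and 203) have become the same value, and expanding back to 8 bits does not recover them. Each pixel now needs 2 bits instead of 8, but the removed detail is gone for good.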
Figure 3.9 The same image compressed to 364 KB and to 166 KB

In Figure 3.9, you can see that there is some loss of detail in the second image but the essential features of the original image are still clearly visible.

It is important to note that once a file has been compressed using lossy compression techniques, the process cannot be reversed and the file cannot be restored to its original condition.

Lossless
For some files, losing any information is simply not acceptable. For example, computer programs will only work if all the instructions are present. In text files, removing any of the words or characters would alter the sense of the document. For these files, lossy compression techniques are not suitable.

However, there are techniques for compressing files without losing any of the original information. This is called lossless compression. This involves storing enough data and information to be able to recreate the original file exactly as it was before compression.

Key point
You will not be asked to construct a Huffman tree in the examination but you might be asked to interpret one. To understand how to interpret one, it is helpful to understand how they are constructed.

Huffman coding
Huffman coding is a lossless compression method based on how frequently a particular piece of data occurs. It uses binary trees, which are structures that branch from a root node. In Huffman coding, this tree is used to represent the letters in a text file.

Here is an example using the phrase 'HOT HOT HOTTER'.

Step 1: Create a frequency table for each character, including the space.

H – 3   O – 3   T – 4   E – 1   R – 1   Space (Sp) – 2

Step 2: Sort these into increasing frequency, i.e. how many times each letter occurs in the phrase.

Character   E   R   Sp   H   O   T
Frequency   1   1   2    3   3   4

Step 3: List the characters as a series of nodes in order of increasing frequency, with the least frequent at the top.
E 1
R 1
Sp 2
H 3
O 3
T 4

Tech term
Branches: The different routes available from a node in a tree structure.

Step 4: Start to build the tree by replacing the top two nodes with a new node made from their combined frequencies. Put this into the list in the correct order. Keep track of the original two nodes by putting them in a new column and linking them with 'branches' to the new node. In this case, E and R both have a frequency of 1, so they are replaced with a node labelled as 2 (i.e. 1 + 1).

2 (E 1 + R 1)
Sp 2
H 3
O 3
T 4

Step 5: We now combine the next top two nodes in the first column to form a new node. As before, we replace them with a new node that is labelled with their combined frequencies and then insert that new node in the correct position. In this case the '2' node is combined with the node for 'space', which also has a frequency of 2. So, the new node has a combined frequency of 4 (i.e. 2 + 2). We need to insert the new node with the combined frequency of 4 into the list at the correct position. In this case it goes after H and O, both with value 3. Again, we keep track of the nodes that the new '4' node has replaced by linking them with branches.

H 3
O 3
4 (2 (E 1 + R 1) + Sp 2)
T 4

Step 6: We continue to combine the two least frequent nodes at the top of the original list into a single node and then insert it in the correct order. In this case we combine H and O into a new node, with combined value 6 (i.e. 3 + 3), and insert this into the list in the correct position: at the end of the list after T, which has the value 4.

4 (2 (E 1 + R 1) + Sp 2)
T 4
6 (H 3 + O 3)

Step 7: Again, we combine the top two nodes from the original column, in this case the '4' node and T, to make a new node of frequency 8 (i.e. 4 + 4). 8 is larger than 6 so the new node goes to the bottom of the list.
6 (H 3 + O 3)
8 (4 (2 (E 1 + R 1) + Sp 2) + T 4)

Step 8: Finally, we combine the final two nodes in the original column, adding their frequencies together, so that we are left with only one node in the original column. In this case we add the 6 and 8 nodes to make a new 14 node. This completes the diagram.

14 (6 (H 3 + O 3) + 8 (4 (2 (E 1 + R 1) + Sp 2) + T 4))

You can check that there are no errors by comparing the frequency in the root node (14) with the total sum of the frequencies in the table (1 + 1 + 2 + 3 + 3 + 4 = 14).

Step 9: We now label the branches starting at the root node, 14. We can label either branch as 1 or 0, as long as we are consistent. However, it is common to label the upper/right branch with a 1 and the lower/left branch with a 0. So in this case we label the first branch from 14 to 6 with a 1 and the branch from 14 to 8 with a 0. We continue to label each pair of branches in this way until they are all labelled with 1 or 0.

14
  1: 6
     1: H 3
     0: O 3
  0: 8
     1: 4
        1: 2
           1: E 1
           0: R 1
        0: Sp 2
     0: T 4

The tree can also be drawn with the root at the top and the branches below it. The labels, and so the codes, are the same either way.

We can now allocate a code to each letter in the text by following the path from the root node to the character. For example, E is 0111: from 14 we follow 0 to the 8 node, 1 to the 4 node, 1 to the 2 node and 1 to E.

Similarly, Space is 010, and H is 11.

Calculating bits required for Huffman coding
Using Huffman coding means that the most frequently used characters are encoded using fewer bits. The text 'HOT HOT HOTTER' uses:

Character                                E      R      Sp    H    O    T
Frequency                                1      1      2     3    3    4
Huffman code                             0111   0110   010   11   10   00
Number of bits                           4      4      3     2    2    2
Total (number of bits × frequency)       4      4      6     6    6    8

Using the Huffman codes, the phrase HOT HOT HOTTER is:

1110000101110000101110000001110110

The text can be represented by a total of 34 bits (4 + 4 + 6 + 6 + 6 + 8) using this Huffman coding.
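The encoding and the bit count can be checked with a short Python sketch. It does not build the tree; it simply uses the code table worked out in this example. The names `CODES`, `encode` and `decode` are assumptions introduced for illustration.

```python
# Huffman codes taken from the table for 'HOT HOT HOTTER'.
CODES = {"E": "0111", "R": "0110", " ": "010", "H": "11", "O": "10", "T": "00"}


def encode(text: str) -> str:
    """Replace each character with its Huffman code."""
    return "".join(CODES[char] for char in text)


def decode(bits: str) -> str:
    """Read bits one at a time until they match a code, then start the next code."""
    lookup = {code: char for char, code in CODES.items()}
    text = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:
            text.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bits left over that do not match any code")
    return "".join(text)


encoded = encode("HOT HOT HOTTER")
print(encoded)          # 1110000101110000101110000001110110
print(len(encoded))     # 34
print(decode(encoded))  # HOT HOT HOTTER
```

Decoding works bit by bit because no code in the table is the start of another code, so the first match is always the right one.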
In ASCII, each character is represented by a 7-bit binary number. Therefore, the total number of bits required to code the phrase in ASCII is the total number of characters in the phrase multiplied by 7. In this case the ASCII code would require:

14 characters × 7 bits = 98 bits.

(If we had created a fixed-length code to represent just these letters, we would have needed to use 3-bit codes to represent the 6 characters, giving 14 × 3 = 42 bits.)

The saving of Huffman coding versus ASCII coding is calculated as:

Saving in bits = number of bits for ASCII coding − number of bits for Huffman coding.

Method         Number of bits   Saving versus ASCII coding (in bits)
ASCII          98               0
Fixed length   42               56
Huffman        34               64

Using these Huffman codes, we can also represent new phrases made from the same letters, such as 'THE TREE':

0011011101000011001110111

By looking at the codes, we are also able to decode a phrase such as this by comparing the sequences to the codes generated for the characters.

001101110100110101000

00   11   0111   010   0110   10   10   00
T    H    E      Sp    R      O    O    T

The most frequently used symbols are allocated the smallest number of bits. Huffman coding provides significantly better compression of longer text files than for the short phrase in the example, and usually significantly better than using a fixed-length code.

Run-length encoding (RLE)
Run-length encoding (RLE) is a very simple method for compressing data by specifying the data and the number of occurrences of that data in sequence.

For example, repeated text such as AAAABBBBBBCCCCCCC would be stored by using the fact it is (4,A) (6,B) (7,C).

Using 7-bit ASCII codes (written in decimal) to represent the letters, this RLE sequence would be:

04 65 06 66 07 67

(Remember that 65 is the ASCII code for A, 66 is the ASCII code for B, etc.)

Each of these six numbers (three run lengths and three ASCII codes) takes up 7 bits, which means the RLE sequence takes up just 7 × 6 = 42 bits.
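The RLE steps above can be sketched in Python. This is a minimal illustration, assuming every run length and every character code is stored in 7 bits, as in the example (so a run can be at most 127 long); the function names `rle_encode` and `rle_decode` are assumptions introduced here.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[int, str]]:
    """Turn text into (run length, character) pairs."""
    return [(len(list(run)), char) for char, run in groupby(text)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the original text from (run length, character) pairs."""
    return "".join(char * count for count, char in pairs)


pairs = rle_encode("AAAABBBBBBCCCCCCC")
numbers = [n for count, char in pairs for n in (count, ord(char))]

print(pairs)             # [(4, 'A'), (6, 'B'), (7, 'C')]
print(numbers)           # [4, 65, 6, 66, 7, 67]
print(len(numbers) * 7)  # 42
print(rle_decode(pairs)) # AAAABBBBBBCCCCCCC
```

Because decoding gives back exactly the original text, RLE is a lossless method.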
If we coded the original text directly in ASCII, this would be the sequence:

65 65 65 65 66 66 66 66 66 66 67 67 67 67 67 67 67

Each ASCII code takes up 7 bits and there are 17 codes, which means coding the original phrase directly in ASCII would take up 7 × 17 = 119 bits.

RLE only works well when there are large sequences of repeated data. For more random distributions of characters, the method could easily make the 'compressed' file larger than the original.

RLE can also be used to compress bitmap image files by coding long runs of the same colour in the same way. For example, in this black and white bitmap image we have long runs of black pixels and white pixels. There are 4 black, 16 white, then 12 black.

If we use 1 to represent black and 0 to represent white, the image is 11110000000000000000111111111111 and we can see there are 4 '1's followed by 16 '0's followed by 12 '1's. We can use RLE to encode this as:

4 1, 16 0, 12 1

If we wanted to represent this in binary, we can use 1 bit to identify the colour and 7 bits to identify the number of pixels in that colour. So the first four black pixels can be written as 10000100, where the first bit (1) represents 'black' and 0000100 represents '4'.

Similarly, the next 16 white pixels are represented by 00010000, and the final 12 black pixels are represented by 10001100. So, this image can be represented by three 8-bit numbers (24 bits or 3 bytes):

10000100 00010000 10001100

In contrast, if we simply stored the image in the normal way (as discussed in section 3.6), then as there are 32 pixels, the image would require 32 bits or 4 bytes.

RLE works well with images if there are a lot of pixels of the same colour in a row.

Knowledge check
34 What is file compression?
35 Describe the difference between lossy and lossless compression.
36 Create a Huffman tree for the phrase 'PIED PIPER' and calculate the total number of bits used. How many bits would be used if this were stored as an ASCII file?
37 For this Huffman tree, with the left-hand paths labelled with 1 and the right-hand paths labelled with 0, identify the codes for S and W. [Figure: Huffman tree with leaf nodes S, Sp, E, L, T, W and H]
38 If 1 is black and 0 is white, create the RLE for this sequence of pixels. [Figure: sequence of black and white pixels]
39 Draw the sequence of pixels given by this RLE if 1 is black and 0 is white.
2 0, 3 1, 3 0, 4 1

RECAP AND REVIEW
3 FUNDAMENTALS OF DATA REPRESENTATION

Important words
You will need to know and understand the following terms: decimal, binary, hexadecimal, bit, byte, kilobyte, megabyte, gigabyte, terabyte, petabyte, binary shift, character set, ASCII, Unicode, pixel, image size, colour depth, analogue, sample, amplitude, sample rate, sample resolution, lossy compression, lossless compression, Huffman coding, run-length encoding (RLE)

3.1 Number bases
Programmers use three different number systems:
■ decimal (base 10)
■ binary (base 2)
■ hexadecimal (base 16).

Computers work in binary because:
■ Computers use millions of tiny switches to store and process data.
■ These switches have just two states, either 1 (on) or 0 (off), which can be represented in binary.

Therefore, all data (numbers, characters, sounds and images) are represented in a computer as binary.

■ Hexadecimal (or hex) is used in computer science because it is easy to convert binary to and from hex, and hex is easier for people to work with than binary.

3.2 Converting between number bases

Converting binary to decimal
Our everyday counting system is called decimal (or denary).
■ Decimal is a base-10 number system.
■ This means it uses ten symbols or values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

We write decimal numbers such as 348 as follows:

100   10    1
  3    4    8

In decimal each column heading in the table is ten times as big as the column to the right.

Binary is a base-2 number system. This means it uses two symbols or values: 0 and 1.
■ Each binary digit (0 or 1) is called a bit.
We write binary numbers as follows: 10100010

128   64   32   16    8    4    2    1
  1    0    1    0    0    0    1    0

■ In binary each column heading is twice as big as the one to its right.

To convert a binary number into a decimal one, add together the column heading values for every column with a 1 in the binary number. 10100010 is 128 + 32 + 2 = 162 in decimal.

Converting decimal to binary
Create a table with eight columns representing an 8-bit binary number.

128   64   32   16    8    4    2    1

■ Starting from the left-hand 128 column, find the first column heading that is smaller than or equal to the decimal number you are converting and write a 1 in that column.
■ Subtract that column heading value from the decimal number to get a remainder.
■ Repeat the process using the remainder, that is, finding the next column value that is smaller than or equal to the remainder. Write a 0 in every column that is skipped.
■ Eventually there will be a remainder of either 1 or 0 to be entered into the right-hand 1 column.

For example, the decimal number 84 is:

128   64   32   16    8    4    2    1
  0    1    0    1    0    1    0    0

Converting hexadecimal to decimal
Hexadecimal (hex) is a base-16 number system.
■ This means it uses 16 symbols or values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. It uses the letters A–F for the decimal values 10–15.
■ In hex, each column heading is 16 times as big as the one to its right.

16    1

To convert hex to decimal:
■ Convert each hex digit to its decimal equivalent.
■ Multiply the column headings by the equivalent decimal value.
■ Add the two values together.

For example, to convert CA to decimal:

16    1
 C    A

C is 12 in decimal and A is 10 in decimal.
16 × 12 = 192
1 × 10 = 10
192 + 10 = 202
CA is 202 in decimal.

Converting decimal to hexadecimal
■ Divide the decimal number by 16 and write down the correct hexadecimal symbol for the result in the 16s column.
■ Convert the remainder to the correct hexadecimal symbol and write that down in the 1s column.
For example, 175 in decimal is:
175/16 = 10 remainder 15. 10 is A in hex and 15 is F in hex.
The hexadecimal number is therefore:

16    1
 A    F

Converting binary to hexadecimal
■ Write the binary number as an 8-bit number.
■ Split the 8-bit binary number into two 4-bit numbers (called nibbles).
■ Convert each nibble to the corresponding hex symbol.

For example, convert 11110 to hex:
■ 11110 is a 5-bit number; this is 00011110 as an 8-bit number
■ first nibble is 0001
■ second nibble is 1110
■ 0001 in decimal is 1, which is 1 in hex
■ 1110 in decimal is 14, which is E in hex.
Hence 11110 is 1E in hex.

Converting hexadecimal to binary
To convert from hex to binary, we simply replace each hex symbol with the equivalent binary nibble. For example, convert 3F to binary:
■ 3 is 3 in decimal, which is 0011 in binary; this is the first nibble.
■ F is 15 in decimal, which is 1111 in binary; this is the second nibble.
So, 3F is 00111111 in binary. This could also be written as just 111111.

In the exam you will only need to convert between number systems in the following ranges:
■ 0 to 255 in decimal
■ 00000000 to 11111111 in binary
■ 00 to FF in hexadecimal.

3.3 Units of information
■ Each stored binary digit is called a bit (binary digit).
■ A group of 8 bits is called a byte.
■ Half a byte, 4 bits, is called a nibble.

4 bits (b)   1 nibble
8 bits (b)   1 byte (B)
1000 B       1 kilobyte (KB)
1000 KB      1 megabyte (MB)
1000 MB      1 gigabyte (GB)
1000 GB      1 terabyte (TB)
1000 TB      1 petabyte (PB)

It is important to use the correct symbol: lowercase b for bit and uppercase B for byte.

3.4 Binary arithmetic

Adding binary numbers together
When adding binary digits together there are several possibilities including:
■ 0 + 0 = 0
■ 0 + 1 = 1
■ 1 + 1 = 10 in binary, or 2 in decimal. In this case we write down 0 and carry 1 to the next column.
■ 1 + 1 + 1 = 11 in binary, or 3 in decimal. In this case we write down 1 and carry 1.
Binary shifts

■ Moving the binary digits to the left or right is called a binary shift.
■ Moving to the left multiplies the value by 2 for each place the value is shifted.
■ Moving to the right divides the number by 2 for each place the value is shifted.

128  64  32  16   8   4   2   1
              1   1   0   1   0

The binary number 11010 is 26 in decimal.

If we shift the binary number one place to the left, we get:

128  64  32  16   8   4   2   1
          1   1   0   1   0   0

which is 52 (= 26 × 2) in decimal.

If we shift the original binary number one place to the right, we get:

128  64  32  16   8   4   2   1
                  1   1   0   1

This number is 13 (= 26/2) in decimal.

3.5 Character encoding

■ Each character (e.g. A, b, !, etc.) is represented in a computer by a binary code.
■ The character set of a computer is a list of all the characters available to the computer.
■ It is important that computer systems all agree on these codes and there are some agreed standards.

ASCII

■ ASCII is a binary code able to represent the Roman alphabet, numbers, some symbols and some control characters.
■ ASCII characters are usually stored in 8 bits: 7 bits are used for the character code, with 1 bit available as an error check.
■ There are 2⁷ or 128 characters available.
■ Uppercase and lowercase letters have different codes.
■ For example, the letter D is represented by the binary code 1000100 in ASCII, which is the decimal number 68.

Unicode

■ Unicode originally used a 16-bit binary code to represent many additional non-Roman characters and a wide range of symbols. The 16-bit code has 2¹⁶ or 65 536 characters available.
■ Unicode has since been extended to use more bits, giving room for over a million characters.
■ The original ASCII codes are the same in Unicode so ASCII can be considered a subset of Unicode.

The binary codes for the main alphabetic characters are allocated to the uppercase characters in sequence, followed by the lowercase characters in sequence. For instance, A is 65, B is one more at 66, C is next at 67, and so on.
This means that if you are given the character code for one letter, you can work out the character code for another letter. Lowercase characters start with a as 97, b as 98, and so on. This means that when we sort text, Z (which is 90) comes before a.

3.6 Representing images

■ Images are represented on screen as a series of pixels.
■ A pixel is the smallest element of an image – these are the dots that make up the image on screen or in a printout.
■ Pixels are stored in a computer as binary codes.
■ The number of bits used for each pixel determines how many colours each pixel can represent.

Image size

The image size is expressed as: width in pixels × height in pixels

Colour depth

With 1 bit for each pixel we have just two possibilities: 0 or 1. This means a pixel can only be one of two colours. The following row of pixels is described by the binary code 11100000, where 1 means black and 0 means white:

1  1  1  0  0  0  0  0

Figure 3.10 Two-colour encoding with one bit

To store more than two colours we need more bits per pixel:
■ 2 bits to store 4 (2²) colours per pixel
■ 3 bits to store 8 (2³) colours per pixel
■ 8 bits to store 256 (2⁸) colours per pixel.

For instance, if we wanted each pixel to be either white, black, green or red then each pixel would be represented by 2 bits, where white is 00, red is 01, green is 10 and black is 11:

00  00  11  01  10  11  00  00

Figure 3.11 Four-colour encoding with two bits

Colour depth is the name for the number of bits used per pixel.

Calculating bitmap image size

■ The greater the number of pixels, the more data needs to be stored and the larger the file size. For instance, a 2-bit bitmap image measuring 10 × 20 pixels would be a larger file than a 2-bit bitmap image measuring 5 × 10 pixels.
■ The higher the colour depth the more data needs to be stored and the larger the file size.
For instance, a 5 × 10 bitmap image with 4-bit colour depth would be a larger file than a 5 × 10 bitmap image with a 2-bit colour depth.
■ A higher colour depth means a larger number of colours can be represented, giving a better quality image.

An image file size can be calculated using the following formula:

Size in bits = width in pixels (W) × height in pixels (H) × colour depth in bits (D)
Size in bytes = (width in pixels (W) × height in pixels (H) × colour depth in bits (D)) / 8

3.7 Representing sound

Sounds are a series of vibrations that vary continuously and can take any value – this means they are analogue. In order to store it on a computer, sound is sampled at regular intervals by a device that converts analogue to digital signals, and the digital values are stored as binary numbers.

■ A sample is a measure of the amplitude (or amount) of sound vibration at a particular moment in time.
■ The sample rate is the number of samples taken per second, measured in Hz (hertz). 1 Hz means one sample per second.
■ The sample resolution is the number of bits used to represent the amplitude of vibration for each sample.
■ The duration is the length of time that the sound is sampled for, measured in seconds.
■ The higher the sample rate, the more frequently the sample is taken, and the better the approximation to the original sound – but the larger the file needed to store the data.
■ The greater the sample resolution, the more accurate the measurement of vibration amplitude and the better the quality of the sound – but the larger the file needed to store the data.
■ The longer the duration, the larger the file needed to store the data.

The file size (in bits) is determined by: sample rate × resolution × duration in seconds. This can also be written as:

File size (bits) = rate × res × secs

where rate is sample rate, res is sample resolution and secs is duration in seconds.
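The image and sound file size formulas above can be tried out with a few lines of Python. The function names are illustrative, and the sound example (44 100 Hz, 16 bits, 10 seconds) uses made-up values chosen only to show the arithmetic.

```python
def image_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Size in bits = W x H x D."""
    return width * height * colour_depth


def image_size_bytes(width: int, height: int, colour_depth: int) -> float:
    """Size in bytes = (W x H x D) / 8."""
    return image_size_bits(width, height, colour_depth) / 8


def sound_size_bits(rate: int, res: int, secs: int) -> int:
    """File size (bits) = rate x res x secs, for a single channel."""
    return rate * res * secs


print(image_size_bits(10, 20, 2))      # 400 bits
print(image_size_bytes(10, 20, 2))     # 50.0 bytes
print(image_size_bits(5, 10, 2))       # 100 bits: fewer pixels, smaller file
print(image_size_bits(5, 10, 4))       # 200 bits: higher colour depth, larger file
print(sound_size_bits(44100, 16, 10))  # 7056000 bits
```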
For multiple channels, e.g. stereo, we need to multiply this by the number of channels.

File size (bits) = rate × res × secs × channels

3.8 Data compression

When transmitting files, storing very large files or storing a large number of files, we sometimes need to compress the data to make file sizes smaller.

Lossy compression:
■ Some of the data is removed to make the file smaller.
■ Algorithms remove data that is least likely to be noticed.
■ The original file cannot be restored from the compressed version.

Lossless compression:
■ None of the information is removed.
■ Algorithms look for patterns in the data so that repeated data items only need to be stored once, together with information about how to restore them.
■ The original file can be restored.

Huffman coding

Huffman coding uses a binary tree to represent data, allocating a binary code to each data element (such as a character). The most frequently occurring data elements are encoded with the shortest binary codes – this helps to reduce the overall encoded file size.

The right-hand paths are normally labelled as 1 and the left-hand paths as 0. The binary code for each character is created by reading the path to the character from the root of the tree. For example, in this Huffman tree, O would be encoded as 10, R would be 0110.

[Huffman tree diagram with leaves T, Sp, R, E, O and H]

Figure 3.12 Huffman trees are often drawn with the root at the top and branches below

Calculating bits required for Huffman coding

Once data has been encoded using Huffman coding, you can calculate the number of bits required to store the coding.

■ Use the Huffman tree to work out how many bits are needed for each character.
■ For each character, multiply the number of bits by the frequency of the character to get the total number of bits that character needs in the whole phrase.
■ Add all of these totals for each character together to work out the number of bits for the entire phrase.

For example:

Character  Frequency  Huffman code  Number of bits  Total (number of bits × frequency)
E          1          0111          4               4
R          1          0110          4               4
Sp         2          010           3               6
H          3          11            2               6
O          3          10            2               6
T          4          00            2               8

The total number of bits for the phrase HOT HOT HOTTER is: 4 + 4 + 6 + 6 + 6 + 8 = 34

If you calculate the number of bits needed for the same phrase in ASCII, you can compare this to the number of bits needed for Huffman coding. To calculate the number of bits needed for the phrase in ASCII:
■ Count how many characters there are in the phrase, including spaces.
■ Multiply this number by 7 (because each character in ASCII is represented by a 7-bit code).

Therefore, the number of bits required for the phrase HOT HOT HOTTER in ASCII is: 14 × 7 = 98

Huffman coding this phrase instead of ASCII coding has therefore saved: 98 − 34 = 64 bits.

Run-length encoding (RLE)

RLE compresses data by specifying how many times a character or pixel repeats, followed by the value of the character or pixel.

The text AAAABBBBBCCCCC is made of 14 characters. To store this in ASCII would take 7 × 14 = 98 bits. We can, however, code the same text in RLE as: 4 65 5 66 5 67. To store the RLE would take 7 × 6 = 42 bits.

For the black and white image below, we can use the same idea to count the number of each colour that occurs in a row:

Figure 3.13 Black and white image

In this 32-pixel image we have eight black pixels, 17 white pixels and seven black pixels. In binary, we can store this in three bytes, with the most significant bit being 1 or 0 to represent the colour and the remaining seven bits the number of pixels of that colour.

10001000 00010001 10000111

This uses just three bytes instead of four. Alternatively we could just represent this as: 8 1, 17 0, 7 1.
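Both calculations in this section can be checked with a short Python sketch. The function names are made up for illustration; the Huffman codes are the ones from the HOT HOT HOTTER example, and the pixel row is written as a string of 1s (black) and 0s (white), which is an assumption made for this example.

```python
def huffman_total_bits(codes: dict, text: str) -> int:
    """Add up the length of the Huffman code for every character in the text."""
    return sum(len(codes[character]) for character in text)


def rle_encode(data: str) -> list:
    """Return a list of (count, value) pairs, one pair for each run of repeats."""
    runs = []
    for value in data:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1         # same value as before: extend the run
        else:
            runs.append([1, value])  # different value: start a new run
    return [(count, value) for count, value in runs]


# Huffman codes from the worked example (a space stands for Sp)
codes = {"E": "0111", "R": "0110", " ": "010", "H": "11", "O": "10", "T": "00"}
phrase = "HOT HOT HOTTER"

huffman_bits = huffman_total_bits(codes, phrase)
ascii_bits = len(phrase) * 7  # 7 bits per ASCII character
print(huffman_bits, ascii_bits, ascii_bits - huffman_bits)  # 34 98 64

print(rle_encode("AAAABBBBBCCCCC"))  # [(4, 'A'), (5, 'B'), (5, 'C')]

# Eight black pixels, 17 white pixels, then seven black pixels
row = "1" * 8 + "0" * 17 + "1" * 7
print(rle_encode(row))  # [(8, '1'), (17, '0'), (7, '1')]
```

RLE only saves space when the data contains long runs. For text with few repeats, storing a count with every character would make the result larger than the original.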
QUESTION PRACTICE

3 Fundamentals of data representation

01
01.1 Convert the decimal value 77 to binary held in a single byte. [1 mark]
01.2 Convert the decimal value 59 to a hexadecimal value. [1 mark]
01.3 How many bits are needed to store a single hexadecimal digit? [1 mark]
01.4 Add together the following three binary numbers, showing your working. [3 marks]
1 1 0 1 1 1 1 0 0 1 0 1 0 0 1 0 1

02 For the binary number 11001:
02.1 Convert the binary to decimal. [1 mark]
02.2 Show the result of applying a binary shift of two places to the left. [1 mark]
02.3 Convert this result to decimal. [1 mark]
02.4 What is the effect of applying a binary shift of two places to the left? [1 mark]
02.5 What would be the result of applying a binary shift of 1 place to the right? [2 marks]

03
03.1 What is meant by the character set of a computer? [1 mark]
03.2 Explain how ASCII represents the character set of a computer. [2 marks]
03.3 Write down the sort order of the following list of words in a program using ASCII or Unicode to represent the character set: ‘Apple, grape, cherry, Damson’. [1 mark]
03.4 Describe two differences between using an ASCII character set and a Unicode character set. [2 marks]
03.5 In Unicode, the character G is represented by the numeric code 71. Select the character represented by the numeric code 68 from the following: [1 mark]
A: d
B: J
C: D
D: F
E: 4

04
04.1 How does the number of pixels in an image affect the size of the file? [1 mark]
04.2 Describe two differences between an image with a 1-bit colour depth and an image with a 2-bit colour depth. (Both images have the same number of pixels.) [2 marks]
04.3 How many colours can be represented using a 4-bit colour depth? Show your working. [2 marks]
04.4 For the two-colour bitmap image below, 0 represents white and 1 represents black.
Complete the Run Length Encoding to store the data for this image. [3 marks]
2 0 2 1 2 0

05
05.1 Explain how a continuously changing sound can be captured and stored on a computer. [3 marks]
05.2 Explain what is meant by lossy compression. [2 marks]
05.3 Explain what is meant by lossless compression and when it should be used. [3 marks]

06 This is a Huffman tree for the text string ‘PIED PIPER’.

[Huffman tree diagram with leaves E, I, P, D, Sp and R]

The bit pattern for R is 1101 because it is right, right, left, right to get to R from the root of the tree.

06.1 Complete the bit patterns for the rest of the characters in the string. [5 marks]

Character  Number of occurrences  Bit pattern
P          3
I          2
E          2
D          1
R          1                      1101
Space      1

06.2 Calculate how many bits are needed to store the text string using Huffman codes. [3 marks]
06.3 Calculate how many bits would be required to store the text string using ASCII codes. [2 marks]

4 COMPUTER SYSTEMS

CHAPTER INTRODUCTION

In this chapter you will learn about:

4.1 Hardware and software
4.2 Boolean logic
➤ Truth tables
➤ NOT, AND, OR and XOR gates
➤ Logic circuit diagrams
➤ Boolean expressions
4.3 Software classification
➤ Systems and application software
➤ Utility programs
➤ Functions of operating systems (OS)
4.4 Programming languages and translators
➤ Low and high-level languages
➤ Machine code and assembly language
➤ Program translators
4.5 System architecture
➤ Von Neumann architecture
➤ Components of a CPU
➤ Performance of a CPU
➤ Fetch–Execute cycle
➤ Main memory – RAM and ROM
➤ Secondary storage – solid state, magnetic and optical
➤ Cloud storage
➤ Embedded systems

4.1 Hardware and software

All computer systems are a combination of hardware and software.
Hardware refers to the physical components of a computer system, such as the CPU, keyboard, mouse, monitor, graphics card, primary and secondary storage. It is anything that can be seen or touched.

Software refers to all the programs that run on a computer, including the operating system. The software instructions tell the computer how to work, and the hardware actually carries out the processes and displays the results.

4.2 Boolean logic

George Boole (1815–1864) was an English mathematician who identified that all logical decisions could be reduced down to simple True and False values. In computer science, the Boolean data type, which can only be True or False, is named after him.

Figure 4.1 George Boole, the inventor of Boolean logic

Computers store everything using electronic switches known as transistors, with ON and OFF represented by the binary symbols 1 and 0. These same 1s and 0s can also be used to represent the True and False values of Boolean logic.

Key point
When discussing Boolean logic, 1 and 0 are usually used but you may see True and False, T and F or even On and Off used. We will use 1 and 0, which is also how any questions in your examination will be presented.

Truth tables

A truth table is a table used to display all possible inputs and associated outputs from a logic system. Inputs are usually labelled with the letters from the beginning of the alphabet and the output labelled using the letter P or Q.

As an example, consider the situation of learning to drive, where both a practical driving test and a theory test are needed in order to be given a driving licence. We can assign the input A to ‘passes the practical test’, input B to ‘passes the theory test’ and output P to ‘can be given a driving licence’. The table below shows a truth table for this logic system.
Practical test (A)  Theory test (B)  Driving licence (P)
0                   0                0
0                   1                0
1                   0                0
1                   1                1

Note how every possible combination of inputs is listed and the associated output for those inputs given. We can see that if someone fails both tests, they cannot be given a driving licence. If they pass one of the tests but fail the other, they are still unable to be given a driving licence. However, if both tests are passed then a driving licence can be given.

Some logic systems have more than two inputs. For example, a logic system with four inputs would be labelled with inputs A, B, C and D. The more inputs, the more rows are needed in the truth table; a four-input truth table would have 16 possible permutations of input values.

Truth tables for logic gates using the operators NOT, AND, OR and XOR

Logic gates are circuits within a computer that produce a Boolean (i.e. 1 or 0) output when given Boolean inputs. There are four logic gates that we need to know about for GCSE: NOT, AND, OR and XOR.

NOT gate

The NOT gate is sometimes known as an inverter. It only has one input and it outputs the opposite of the value input. If a 0 is supplied as an input then a 1 will be produced as the output. Conversely, if a 1 is supplied as the input then a 0 will be the output. This is shown in the truth table below:

A  P
0  1
1  0

Tech term
Inverter An alternative name for a NOT gate, because the output inverts the input.

The symbol used to denote a NOT gate is shown in Figure 4.2.

Figure 4.2 A NOT gate (input A, output P)

The Boolean expression shown above is written as P = NOT A. When writing Boolean expressions, the inputs are usually given letters such as A, B and C, and the output is usually represented by P or Q. The expression can also be written using standard symbols to represent logic gates. The overbar is used to represent a NOT gate, so this expression can also be written as P = Ā.
AND gate

An AND gate has two inputs and only produces a 1 as output if both of the inputs are 1s. If any other combination of inputs is given then the output will be 0.

A  B  P
0  0  0
0  1  0
1  0  0
1  1  1

The symbol used to denote an AND gate is shown in Figure 4.3.

Figure 4.3 An AND gate (inputs A and B, output P)

The Boolean expression above is written as P = A AND B. The . symbol is used to represent an AND gate, so this expression can also be written as P = A . B.

OR gate

An OR gate has two inputs and produces a 1 as output if either or both of the inputs are 1s. If both inputs are 0 then the output will be 0.

A  B  P
0  0  0
0  1  1
1  0  1
1  1  1

The symbol used to denote an OR gate is shown in Figure 4.4.

Figure 4.4 An OR gate (inputs A and B, output P)

The Boolean expression above is written as P = A OR B. The + symbol can be used to represent an OR gate, so this expression can also be written as P = A + B.

XOR gate

An XOR gate, or Exclusive OR gate, has two inputs and produces a 1 as output only if exactly one of the inputs is a 1. If both inputs are 0, or both inputs are 1, then the output will be 0.

A  B  P
0  0  0
0  1  1
1  0  1
1  1  0

The symbol used to denote an XOR gate is shown in Figure 4.5.

Figure 4.5 An XOR gate (inputs A and B, output P)

The Boolean expression above is written as P = A XOR B. The symbol ⊕ is used to represent an XOR gate, so this expression can also be written as P = A ⊕ B.

An easy way to remember the purpose of these logic gates is to think about what inputs produce a 1 output.
● NOT requires the input NOT to be 1.
● OR requires inputs A OR B (or both!) to be 1.
● AND requires inputs A AND B to be 1.
● XOR requires either A OR B (but not both) to be 1.

Truth tables and logic diagrams for logic circuits combining NOT, AND, OR and XOR gates

The four logic gates shown can be combined to form more complex logic circuits.
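The behaviour of the four gates can be modelled in a few lines of Python, which is a handy way to check a truth table. This is a minimal sketch rather than how gates are built in hardware: the function names simply mirror the gate names, and inputs are assumed to be the integers 0 and 1.

```python
def NOT(a: int) -> int:
    return 1 - a  # inverts the input: 0 becomes 1, 1 becomes 0


def AND(a: int, b: int) -> int:
    return a & b  # 1 only if both inputs are 1


def OR(a: int, b: int) -> int:
    return a | b  # 1 if either or both inputs are 1


def XOR(a: int, b: int) -> int:
    return a ^ b  # 1 only if exactly one input is 1


print("A B | AND OR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", AND(a, b), " ", OR(a, b), " ", XOR(a, b))

print("A | NOT")
for a in (0, 1):
    print(a, "|", NOT(a))
```

Because each gate is just a function, the output of one gate can be passed in as the input of another, for example NOT(AND(a, b)).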
AND and NOT gates

In this example, we could connect a NOT gate to the output of an AND gate.

Figure 4.6 An AND gate connected to a NOT gate (inputs A and B, output P)

The output for this can be worked out in stages. It is useful to think about the output of each gate in turn and we can even add this to our truth table. Here, R is the output from the first part of the logic diagram and P is the final output.

A  B  R = (A AND B)  P = (NOT R)
0  0  0              1
0  1  0              1
1  0  0              1
1  1  1              0

Another way of describing this system is: P = NOT (A AND B)

AND and OR gates

This logic circuit has three inputs, with the output of an AND gate feeding into one input of an OR gate. An additional input C is also given. Another way of describing this circuit is: P = (A AND B) OR C

Figure 4.7 An AND and OR gate (inputs A, B and C; R is the output of the AND gate; output P)

The truth table now requires more rows to cater for all of the possible input values. The first thing you need to do is fill in all of the different permutations of the inputs A, B and C.

A  B  C  R = (A AND B)  P = (R OR C)
0  0  0
0  0  1
0  1  0
0  1  1
1  0  0
1  0  1
1  1  0
1  1  1

Then you need to fill in the values for R for all permutations of A and B.

A  B  C  R = (A AND B)  P = (R OR C)
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  0
1  0  0  0
1  0  1  0
1  1  0  1
1  1  1  1

Once the R column is completed you can then fill in the values for P for all the permutations of R and C.

A  B  C  R = (A AND B)  P = (R OR C)
0  0  0  0              0
0  0  1  0              1
0  1  0  0              0
0  1  1  0              1
1  0  0  0              0
1  0  1  0              1
1  1  0  1              1
1  1  1  1              1

Key point
To make sure that you don’t miss out any permutations of input values, try counting up in binary, starting at 000 (if you have three inputs), then 001, 010, 011, and so on. Another check is that with n inputs, there should always be 2ⁿ rows in the truth table (not counting the heading). This means that for 3 inputs, we need 2³ = 2 × 2 × 2 = 8 rows.
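The tip about counting up in binary can be turned directly into code. The sketch below is illustrative only (the function names are made up): it lists every permutation of inputs by counting from 0 to 2ⁿ − 1, then fills in the R and P columns for P = (A AND B) OR C in the same stages used above.

```python
def input_rows(number_of_inputs: int) -> list:
    """List every permutation of inputs by counting up in binary."""
    rows = []
    for count in range(2 ** number_of_inputs):  # 2^n rows in total
        bits = format(count, f"0{number_of_inputs}b")  # e.g. 5 -> '101'
        rows.append(tuple(int(bit) for bit in bits))
    return rows


def and_or_table() -> list:
    """Truth table rows (A, B, C, R, P) for P = (A AND B) OR C."""
    table = []
    for a, b, c in input_rows(3):
        r = a & b  # first stage: R = A AND B
        p = r | c  # second stage: P = R OR C
        table.append((a, b, c, r, p))
    return table


print("A B C R P")
for row in and_or_table():
    print(*row)
```

Calling input_rows(4) gives the 16 rows needed for a four-input truth table.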
XOR and AND gates

This logic system has three inputs, with the output of the XOR gate feeding into one input of an AND gate. Input C is the other feed into the AND gate. Another way of describing this system is: P = (A XOR B) AND C.

Figure 4.8 An XOR and AND gate (inputs A, B and C; R is the output of the XOR gate; output P)

This truth table again requires more rows to cater for all of the possible input values. The first thing to do is fill in all of the different permutations of the inputs A, B and C.

A  B  C  R = (A XOR B)  P = (R AND C)
0  0  0
0  0  1
0  1  0
0  1  1
1  0  0
1  0  1
1  1  0
1  1  1

Then you need to fill in the values for R for all permutations of A and B.

A  B  C  R = (A XOR B)  P = (R AND C)
0  0  0  0
0  0  1  0
0  1  0  1
0  1  1  1
1  0  0  1
1  0  1  1
1  1  0  0
1  1  1  0

Once the R column is completed you can then fill in the values for P for all the permutations of R and C.

A  B  C  R = (A XOR B)  P = (R AND C)
0  0  0  0              0
0  0  1  0              0
0  1  0  1              0
0  1  1  1              1
1  0  0  1              0
1  0  1  1              1
1  1  0  0              0
1  1  1  0              0

Knowledge check
Create truth tables for the following expressions:
1 P = C AND (A OR B)
2 P = A OR (NOT B)
3 P = NOT (A OR B)
4 P = (B AND C) XOR (NOT A)
5 P = (NOT A) AND ((NOT B) OR C)
6 P = ((NOT A) AND (NOT B)) OR C

Boolean expression operations

We have seen the operators that represent Boolean expressions:
● . to represent the AND gate
● + to represent the OR gate
● ⊕ to represent the XOR gate
● Overbar to represent the NOT gate

We can combine these operators in expressions of multiple logic gates, for example:
● C AND (A OR B) could be written as C . (A + B)
● (A AND B) OR (NOT C) could be written as (A . B) + C̄

Creating logic circuits from expressions

So far, we have been given logic diagrams to work from. However, it is important that we can draw logic circuits when given the equivalent expression.
Worked example

Take the example P = NOT (A AND B) AND C. As with mathematics, the use of brackets helps us decide what has priority. In this case, A AND B is an expression in brackets and therefore should be looked at first. The logic diagram for this is:

[Logic diagram: inputs A and B feed an AND gate, giving output R]

Next, the NOT gate should be applied to this expression to give us NOT (A AND B). The logic diagram for this is:

[Logic diagram: inputs A and B feed an AND gate, whose output feeds a NOT gate, giving output P]

Note that the NOT gate is applied to the output of the expression A AND B, not on either of the inputs.

Finally, the expression is completed by joining the output of NOT (A AND B) in the previous stage to AND C. This gives us NOT (A AND B) AND C as shown in this logic diagram:

[Logic diagram: the output of NOT (A AND B) and input C feed an AND gate, giving output P]

The truth table for this logic diagram is as shown below.

A  B  C  P
0  0  0  0
0  0  1  1
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  0
1  1  1  0

Knowledge check
Create logic diagrams for the following expressions:
7 P = A OR (NOT B)
8 P = A OR (B OR C)
9 P = A AND (NOT B)
10 P = (A OR B) OR (C OR D)

Creating Boolean expressions from logic circuits

Sometimes we may want to create a logical expression from a logic diagram. To do this, we need first to identify the different logic gates used in the circuit.

Worked example

In the example below we can see that there is an OR gate and an AND gate:

[Logic diagram: inputs A and B feed an OR gate; its output and input C feed an AND gate, giving output P]

We need to work through the circuit logically, working through the inputs in turn and then following these through to the output. Here, A OR B is fed through to the AND gate so we can use brackets to show that this needs to be done first: P = (A OR B) AND C

Worked example

In the following example there is an OR gate and a NOT gate.

[Logic diagram: input A and the inverted input B feed an OR gate, giving output P]

Here, input A is fed directly into an OR gate. However, input B passes through a NOT gate before feeding into the OR gate. Therefore, we describe this as P = A OR (NOT B).
Knowledge check
Create Boolean expressions for the following logic circuits:
11 [Logic circuit diagram with inputs A and B and output P]
12 [Logic circuit diagram with inputs A and B and output P]

Beyond the spec

Reducing real-life problems down to Boolean logic statements can help us to decide what inputs are needed to produce certain outputs.

Imagine a family looking to book a holiday. They have a budget of £2,000 for the holiday and they would like this to be somewhere with a pool. However, if a holiday in the USA was available at this price, with or without a pool, they would be happy.

We can construct this situation as a logic statement. Let P be the outcome of being happy with the holiday. The inputs can also be defined:
● A = costs £2,000 or less
● B = has a pool
● C = is in the USA.

A must be met; the family only has this much in its budget. However, either B or C can be true for the family to be happy. This situation is equivalent to the logic statement P = A AND (B OR C)

The truth table for this logic statement is therefore:

A  B  C  B OR C  P = A AND (B OR C)
0  0  0  0       0
0  0  1  1       0
0  1  0  1       0
0  1  1  1       0
1  0  0  0       0
1  0  1  1       1
1  1  0  1       1
1  1  1  1       1

This then allows us to apply these logical rules to check if certain holidays are appropriate.

4.3 Software classification

There are two main categories of software used on computers: systems software and application software.

Systems software

Systems software controls the hardware inside the computer and provides an interface for users to interact with it. It comprises the operating system and utility software.

Utility software

Utility software is a collection of programs, each of which does a specific housekeeping task to help maintain a computer system. Most computers have utility software installed alongside the operating system.

Utility software is not essential for the computer to work, but it helps to configure the system, analyse its performance and make changes to ensure that it is running efficiently.
Types of utility software include:
● encryption
● defragmentation
● data compression
● backup.

Encryption software

Encryption is the scrambling of data into a form that cannot be understood if it is accessed by unauthorised users. It is used to protect data from unauthorised access. The encryption process uses an algorithm and a key to transform the plaintext into ciphertext. The same algorithm and key are needed to decode the information.

Tech term
Ciphertext Encrypted text.

Modern operating systems have built-in encryption utilities that enable the user to encrypt specific files or entire drives – for example BitLocker on Windows and FileVault on macOS. Files on systems like these are automatically decrypted when they are accessed by an authorised user.

Defragmentation

When data is stored on a magnetic hard disk drive it is saved to different areas of the disk depending on where there is space available. If the data file is larger than the free space available in one part of the disk, it is split into separate blocks of data that are saved in different places. This is called fragmentation.

Over time, the contents of a hard disk drive become increasingly fragmented and it begins to affect performance. This is because hard disk drives are mechanical devices and the read/write head has to move to the correct physical location on a disk to read its contents (see section 4.5). When the data is fragmented, the disk needs to be accessed more frequently to read all of the data.

Defragmentation is the process of organising and moving the separate parts of the data files so that they can be stored together in one location on the hard disk drive. Since the heads do not have to travel between areas of the drive to access data, it makes the data faster to access. Defragmentation also groups all of the free disk space together so that new files can be stored in one place.
This improves the performance of the computer.

Fragmented data: Over time, the contents of a hard drive become fragmented.
Defragmented data: Defragmentation reorganises and moves the data blocks. Data that belongs to the same file is grouped together in adjacent blocks. Free space is grouped together at the end of the disk.

Figure 4.9 The process of defragmentation

It is important to note that there is no need to perform defragmentation on solid-state drives. This is because SSDs do not have any moving parts, so having data split up around the memory locations does not affect the read/write times.

Data compression

Data compression uses algorithms to reduce the size of files so that they take up less storage space. There are two types of compression: lossy and lossless.

Tech terms
Lossy compression Reduces file size by deleting some data.
Lossless compression Reduces the file size but does not delete any data.

Compressed files can be transmitted much more quickly over the internet as the file size is smaller and therefore requires less bandwidth. Compression can also be useful when emailing files as attachments as there is usually a limit to the size of a file which can be transmitted. Compressing, or zipping, the file can reduce it to an acceptable size.

Backup

Data stored in secondary storage is potentially at risk from loss or damage and important data should be backed up. Special software on the system is able to perform backup operations.

Knowledge check
13 Define what is meant by utility software.
14 Give three examples of utility software.
15 Explain why encryption software is used.
16 State one problem caused by a fragmented disk.
17 Identify two situations where compression software might be used.
Application software

Application software is the end-user programs that are designed to perform specific tasks, such as word processing, spreadsheets, photo editing, and so on, or used for entertainment such as playing games or watching videos. There are also many programs written for specific applications, such as company payroll systems, pupil management systems, air traffic control and traffic control.

Operating systems

Operating systems are found in almost all computing devices, from video game consoles to mobile phones, tablets to desktops, and data servers to supercomputers. They control the general operation of the system or device and provide a way for users to interact and run programs.

Well-known operating systems include:
● Windows
● macOS
● Linux
● Ubuntu
● Android
● iOS.

The operating system (OS) manages the hardware in a computer and provides an environment for applications to run. It is essential to the function of any computer.

The functions controlled by the OS include the management of:
● processor(s)
● memory
● input and output devices
● applications
● security.

Processor management

Modern operating systems, such as Windows and macOS, allow several programs to be run simultaneously – this is known as multitasking. For example, a user may be working on a word-processed document, while listening to music that is being streamed via the internet.

Tech term
Multitasking This is when a computer performs more than one task at the same time.

Processor management involves the allocation of processor time to each task that needs to be carried out by the different programs and processes running on a computer at any one time. Processor time is shared between applications so that they appear to be running simultaneously.

If a computer has a multi-core processor then the operating system will allocate tasks to each core wherever possible.
Memory management

The memory manager controls the use of RAM by allocating blocks of free space in the main memory to each program as it starts up. Memory is typically broken up into fixed-size blocks, called pages. The operating system works out how much memory a program is likely to require and allocates sufficient pages to hold the program and its files. When a program is closed, the allocated pages are freed up so that they can be used by another program. The memory management system keeps a record of where the data for each program is being held so that it can be fetched when it is needed.

Input/output device management

Tech terms
Peripherals: Input or output devices connected to a computer.
Device drivers: Programs that allow a peripheral to communicate with a computer.
API: A set of rules that allows communication between, for instance, an application and operating system.
Patch: An update to a piece of software.

There are a number of input devices, such as a keyboard or camera, and output devices, such as a monitor or printer, connected to a computer. These are known as peripherals and are controlled using device drivers, which are small programs that enable the devices to communicate correctly with the computer. Each input/output device has its own device driver and these enable the device manager to control the sending of data between the computer and the devices. The driver for a particular device can usually be downloaded automatically via the internet, and each driver can be updated independently if required.

Application management

Applications cannot function without an operating system. The operating system runs a program in order to install new applications and then interacts with them via the application programming interface (API). The API allows the application to communicate with the operating system, which allocates memory space for the application to be loaded and queues jobs that need to be allocated processor time.
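The paging scheme described under Memory management above can be sketched as follows. This is a simplified model rather than how any particular operating system works: the class name MemoryManager, its methods and the sizes used are all assumptions made for the illustration.

```python
import math


class MemoryManager:
    """A toy memory manager that hands out fixed-size pages of RAM."""

    def __init__(self, total_pages, page_size):
        self.page_size = page_size
        self.free_pages = list(range(total_pages))
        self.page_table = {}  # program name -> page numbers it holds

    def load(self, program, size_needed):
        """Allocate enough whole pages to hold the program."""
        pages_needed = math.ceil(size_needed / self.page_size)
        if pages_needed > len(self.free_pages):
            raise MemoryError(f"Not enough free pages for {program}")
        allocated = self.free_pages[:pages_needed]
        self.free_pages = self.free_pages[pages_needed:]
        self.page_table[program] = allocated  # record where the data is held
        return allocated

    def close(self, program):
        """Free the program's pages so another program can use them."""
        released = self.page_table.pop(program)
        self.free_pages = sorted(self.free_pages + released)


manager = MemoryManager(total_pages=8, page_size=4)
print(manager.load("editor", size_needed=10))   # needs 3 pages
print(manager.load("browser", size_needed=16))  # needs 4 pages
manager.close("editor")
print(manager.free_pages)                       # the editor's pages are free again
```

The page_table dictionary plays the part of the record that the memory management system keeps of where each program's data is held.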
Security management

There are a number of ways in which the operating system helps to keep the system secure from threats:
● Access to the computer is controlled through user accounts, which can be created or deleted as required.
● Users are required to enter passwords to access the software and files on the computer.
● Different access rights and privileges can be set for each user, controlling what they can access and edit, such as administrator rights (which would allow changes to be made to the system, such as installing new software) or standard user rights.
● Updates to the OS can be downloaded automatically to ensure that any security loopholes are patched.
● Data stored on the secondary storage can be encrypted to ensure that it remains secure.

Knowledge check
18 State two functions of an operating system.
19 Explain how an operating system manages memory in a computer system.
20 Explain what is meant by device drivers.
21 Explain how an operating system manages the processor of a computer system.
22 Identify two ways in which the operating system helps to keep the system secure.

4.4 Classification of programming languages and translators

As we will see below, every computer contains a processor that fetches, decodes and executes instructions. These instructions are given to the computer by programmers, but the programming language that the instructions are given in could take many forms.

Levels of programming language

You may be able to name many programming languages; as part of the AQA GCSE Computer Science course you are required to have practical experience of programming in at least one textual language. You may have heard of Python, C#, Visual Basic and many more, but the surprising fact is that a processor cannot execute instructions in any of these languages – they all have to be translated into machine code before they can be executed.
This is because machine code is in a binary form and processors can only execute instructions in binary. For example, a machine code program executed by the processor would have to look like this:

10110101 10010010 01101101 10110001 00100111 11100101 01101101 10110001

Figure 4.10 A simple program written in machine code

In fact, early computers were programmed in this way. Binary 1s and 0s could be input into the computer by physically moving switches or connecting wires, as with ENIAC, the Electronic Numerical Integrator And Computer, which was first used in 1945.

Figure 4.11 The early computer ENIAC (Electronic Numerical Integrator And Computer)

Other computers allowed programs to be input using punched paper tape or cards to represent the binary 1s and 0s. Entering instructions in binary was obviously very time-consuming and mistakes were hard to spot. Every different type of computer also had its own version of machine code and so programs written on one computer would not necessarily run on another computer.

Low-level languages

Tech terms
Abstraction: Removing unnecessary detail. The programmer of a high-level language does not need to know exactly how a processor performs a particular task. High-level language instructions are an abstract representation of what is actually happening in the processor itself.
Mnemonics: In assembly language, a text code that represents a machine code operation.

Machine code is a low-level language. This means that it can be run directly by the processor with no translation needed. It also has little or no abstraction; each instruction deals directly with the computer’s hardware and so a machine code programmer would need to understand the inner workings and hardware structure of the processor. Something as simple as adding two numbers would take numerous steps to load data into registers, perform the addition and then store the answer in memory.
The programmer would need to specify each of these steps. Low-level programs are also hardware-dependent – they will not run on a different type of computer or processor as the machine code instruction set for that computer or processor is likely to be different.

One simple way to make low-level programming slightly easier is to replace the binary machine code instructions with short mnemonics (for example, replacing a binary instruction such as 00100111 with ADD). This is known as assembly language. It requires a program called an assembler to convert the mnemonics back into their binary equivalents before the processor can execute them. While assembly language is easier to use than machine code, it is another example of a low-level language. Because each assembly language mnemonic represents one machine code instruction, machine code and assembly language have a one-to-one (1:1) correspondence.

The ‘Little Man Computer’ is an excellent introduction to programming using assembly language. You can find an LMC simulator on a number of websites – simply search for ‘little man computer’.

One advantage of a low-level language is the sheer speed of program execution. Using machine code allows a programmer to make their program run as fast as possible with no unnecessary routines. Programmers can also be very efficient in their use of primary and secondary storage.

Low-level languages such as machine code were once the only way to program a computer. Nowadays, assembly language is often used to develop software for embedded systems (where speed on relatively low-powered hardware is important) or to write software that controls specific hardware components, such as device drivers.

High-level languages

Languages like Python, C# and Visual Basic are examples of high-level languages. There are many different high-level languages, each with its own specific use.
For example, PHP and JavaScript are used to create web applications, whereas Visual Basic is commonly used to create desktop applications for Windows computers.

Key point
Java and JavaScript are two very different languages. They may share the same first part of their name, but then again so do ham and hamster or car and carpet.

Figure 4.12 A simple program written in Python

Programs written in high-level languages use English-like syntax, with keywords such as WHILE or IF. This means that they are much easier for programmers to use and errors can be spotted much more easily.

High-level programming languages also use abstraction to hide away details of the underlying processor operations that are needed to execute a line of code. For example, the line x = 1 + 2 in a high-level language actually requires the processor to do three things:
● load the first value into the accumulator (1)
● add the second value to it (+2)
● store the result at the memory address referenced by the label x.

By hiding this underlying detail, programming becomes much easier. It also means that high-level languages are hardware-independent – which means they can run on many different types of computer. However, programs written in high-level languages cannot be executed directly by the processor in a computer. They must first be translated into machine code, a process that takes time to carry out.
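The three processor steps listed above for the line x = 1 + 2 can be acted out with a very small simulation. This is only a sketch: the mnemonics LOAD, ADD and STORE and the function name run_low_level are hypothetical and do not belong to any real processor's instruction set.

```python
def run_low_level(program):
    """Carry out a list of (mnemonic, operand) steps using one accumulator."""
    accumulator = 0
    memory = {}
    for mnemonic, operand in program:
        if mnemonic == "LOAD":
            accumulator = operand          # load a value into the accumulator
        elif mnemonic == "ADD":
            accumulator += operand         # add a value to the accumulator
        elif mnemonic == "STORE":
            memory[operand] = accumulator  # store the result under a label
        else:
            raise ValueError(f"Unknown mnemonic: {mnemonic}")
    return memory


# The single high-level line x = 1 + 2, written out as three low-level steps.
steps = [("LOAD", 1), ("ADD", 2), ("STORE", "x")]
print(run_low_level(steps))
```

One line of high-level code has become three separate steps, which is the detail that abstraction hides from the programmer.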
Advantages and disadvantages of low-level and high-level languages

Low-level language
Advantages:
● Can be run directly by the CPU
● Very fast to execute
Disadvantages:
● Hardware dependent so will only run on the specific hardware it has been developed for
● More difficult to read and write code as it is not like English

High-level language
Advantages:
● Once programs have been translated, they can be run on different hardware
● Easy to read and write code as the language is very similar to English
● Allows the programmer to focus on what needs to be done rather than how the computer works
Disadvantages:
● Needs to be translated to be run by the CPU
● Each language has its own specific syntax and keywords
● Slower to execute as needs to be translated

Key point
High and low in this context refer to the level of abstraction away from the processor. Low-level languages are so called because they are very closely related to how the processor works. High-level languages are much further away.

Knowledge check
23 Name three high-level languages.
24 Give two advantages that high-level languages have over low-level languages.
25 Give one advantage that low-level languages have over high-level languages.

Program translators

A translator is a piece of software that converts high-level code into low-level machine code that can then be executed by the processor. Without translators, high-level programming languages like Python would not exist.

Beyond the spec
Grace Hopper (1906–1992) was an American computer scientist who created the first ever translator in 1952. By 1956, she had programmed the UNIVAC (Universal Automatic Computer) to translate twenty English-like statements into machine code.

Figure 4.13 Grace Hopper and colleagues programming the UNIVAC

The characteristics of a compiler, an interpreter and an assembler

Program translators come in three types: compilers, interpreters and assemblers. A compiler works through the high-level code, translating every line into machine code.
A single high-level instruction can result in many machine code instructions. After the compilation process has completed, an executable file is produced that the processor can run directly. The executable file can be saved and run in the future without needing to compile it again. Most applications that a user would buy or download will have been compiled; the executable file is distributed so that users can run it. This also has the advantage of not allowing users to view or modify the high-level program code.

In contrast, an interpreter translates one line of high-level code and then immediately runs this before moving on to translate and run the next line of code. Interpreters call appropriate machine code subroutines within their own code to carry out commands. No executable file is produced and when the program is run again in the future, the interpreter must translate every line of code again. If code that is run using an interpreter is distributed, all of the source code for the program must be shared. Therefore, interpreters are generally used when developing programs, or when a program needs to be run on different types of hardware.

Another important difference between compilers and interpreters is the speed of execution of programs. Because compilers translate everything first, the program itself runs quickly. Conversely, because interpreters only translate one line at a time, this translation is happening while the program is running, which slows it down.

An assembler is needed to convert assembly language into machine code. Assembly language has a 1:1 correspondence with machine code. This means that each instruction in assembly language translates into one instruction in machine code.

Compiler
What does it do? Translates every line of a high-level program into machine code to produce an executable file.
When is it appropriate to use? Used to create an executable file that can be run at any time. As the source code is not shared, the program cannot be modified easily. This makes it ideal for distributing commercial software.

Interpreter
What does it do? Translates one line of high-level code and then executes it immediately.
When is it appropriate to use? Used when developing programs as it makes it easy to spot where errors have occurred. This is because the program will stop running and will give an error message indicating what the problem is.

Assembler
What does it do? Translates low-level assembly language into machine code.
When is it appropriate to use? Used to produce an executable file from assembly language that can be run at any time.

Key point
Compilers, interpreters and assemblers are all hardware-dependent. Different processors or computers require different translators, even if the high-level language that is being translated is the same.

Figure 4.14 Python interpreter available for PC (Windows, Linux) and Mac (OS X)

Knowledge check
26 Give two ways that code can be translated from high-level code into machine code.
27 State the role of an assembler.

4.5 Systems architecture

A computer system consists of hardware and software working together to process data. Hardware is the name for the physical components that make up the computer system. Software is the name for the programs that provide instructions for the computer, telling it what to do. A computer system receives information as an input, processes and stores that information, and then outputs the results of that processing.

Figure 4.15 Input–process–output (diagram labels: Input, Process, Output, Storage). Processing and storage is the job of the CPU

Von Neumann architecture

CPU architecture means how the different components in the Central Processing Unit (CPU) are laid out and communicate with each other.
The Von Neumann architecture describes a computer in which the data and instructions are stored in the same area of memory and are indistinguishable from each other. All instructions are stored as binary codes; for example, the instruction to add a number to the accumulator might have the code 1001. When it accesses some data in memory, the CPU must decide if the binary value 1001 represents an instruction to add a value to the accumulator, or if it represents the value 9. The CPU relies on the logic of the program to decide what to do. If it expects an instruction it will try to decode it and carry it out; if it expects a value it will treat it as a number. Von Neumann architecture is the fundamental design concept behind all modern computer systems.

The components of a CPU

The CPU is made up of billions of transistors, which are like very small ‘on–off’ switches. The arrangement of transistors creates logic circuits that process input data, carry out instructions and control the components of the computer.

Figure 4.16 A CPU

The CPU is made up of a number of components:

Arithmetic Logic Unit (ALU): The ALU is responsible for the following:
● Arithmetic operations such as add, subtract, multiply and divide.
● Logical operations and comparisons such as AND, OR and NOT, and the result of less than, greater than, equal to comparisons.
● Binary shift operations (moving the binary digits in a binary value left or right).
The ALU carries out the calculations and logical decisions required by the program instructions that the CPU is processing.

Control unit (CU): The control unit’s purpose is to coordinate the activity of the CPU. It does this by:
● obtaining then decoding instructions from memory
● sending out signals to control how data moves around the parts of the CPU and memory to execute these instructions.

Clock: A vibrating crystal that generates digital pulses at a constant speed.
Typical modern computers work at speeds of up to 4 GHz (or 4 billion clock cycles per second). The clock synchronises all the CPU activity.

Registers: Memory locations within the CPU that hold data temporarily and can be accessed very quickly. Their role in the CPU is to accept, store and transfer data and instructions for immediate use by the CPU.

Buses: Communication channels through which data moves. They enable data and control signals to move around the CPU and main memory.

Beyond the spec
There are three main buses inside a computer:
Data bus: This carries data between the CPU and memory.
Control bus: This carries control signals around the CPU and memory.
Address bus: This carries memory addresses for locations to be read from or written to.

This is a simplified diagram of a CPU:

Figure 4.17 The Central Processing Unit (diagram labels: Memory; CPU containing Registers, Control unit and ALU; Input/Output; Address bus, Control bus and Data bus)

The performance of the CPU

Clock speed

The CPU is constantly fetching and executing instructions and the speed at which it does this is determined by the clock. Each ‘tick’ of the clock represents one step in the Fetch–Execute cycle (see below). The faster the clock speed, the more instructions can be executed every second. Typical modern computers work at speeds of up to 4 GHz (or 4 billion clock cycles per second).

Size of cache memory

Cache memory is used to hold data that needs to be accessed very quickly by the CPU. Its role in the CPU is to store instructions and data that are used repeatedly or are likely to be required next by the CPU. Accessing cache memory is very quick. It is located between the main memory and the CPU and can therefore be accessed much faster than main memory.
To improve performance, the CPU control unit will first look in the cache for data or instructions, to see if they have been copied from main memory. If they are not in the cache memory then the CPU will locate them in the main memory, copy the data or instructions to cache and then to the CPU.

Figure 4.18 Cache memory is used to store data waiting to be processed (diagram labels: CPU, Cache, Main memory; Request for data; If data not in cache: request data from main memory; Data copied to cache; Data sent to CPU)

The more data that can be stored in the cache rather than main memory, the faster and more efficient the process. Data that is likely to be required will be transferred to cache, ready to be used. The more cache memory, the more likely it is that the required data will already have been copied across and will not need to be fetched from main memory. Cache memory is very expensive and while a mid-range laptop may have 8 GB of RAM (main memory) it is likely to have just a few MB of cache.

Number of processor cores

Another factor that can affect the performance of the CPU is the number of processor cores. Each core can fetch and execute instructions independently so a multiple-core processor can handle several instructions at the same time. While these multiple cores can work on separate programs or parts of a program at the same time, this is only possible if the program has been written to take advantage of this and the task that the program is attempting can be split up.

Knowledge check
28 What is meant by a quad-core processor?
29 What is meant by 2.3 GHz when describing a CPU?
30 Describe how cache memory is used by the CPU.
31 Describe three characteristics of a CPU that affect its performance.

The Fetch–Execute cycle

The processor continually:
● fetches instructions from memory
● decodes these instructions
● and then executes them.
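The three steps in the list above can be sketched as a loop. In this illustration, instructions and data share one list called memory, in the spirit of the Von Neumann architecture described earlier. The opcodes LOAD, ADD, STORE and HALT, the memory layout and the function name fetch_execute are all assumptions made for the sketch, not a real instruction set.

```python
def fetch_execute(memory):
    """Repeatedly fetch, decode and execute instructions held in memory."""
    program_counter = 0  # register: address of the next instruction
    accumulator = 0      # register: holds the value being worked on
    while True:
        # FETCH: copy the instruction at the current address into the CPU.
        instruction = memory[program_counter]
        program_counter += 1
        # DECODE: split the instruction into an opcode and an operand.
        opcode, operand = instruction
        # EXECUTE: carry out the decoded instruction.
        if opcode == "LOAD":
            accumulator = memory[operand]
        elif opcode == "ADD":
            accumulator += memory[operand]
        elif opcode == "STORE":
            memory[operand] = accumulator
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"Unknown opcode: {opcode}")


# Addresses 0 to 3 hold instructions; addresses 4 to 6 hold data.
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", None), 7, 5, 0]
result = fetch_execute(memory)
print(result[6])  # the sum of the values at addresses 4 and 5
```

The program counter is increased after every fetch, so each time the Execute step finishes the loop moves on to the next instruction.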
Figure 4.19 The Fetch–Execute cycle (diagram labels: Fetch, Decode, Execute)

This is called the Fetch–Execute cycle. It works as follows:

Fetch: Each instruction in a computer program is stored in a particular location (or address) in memory. The data contained in the address of the next instruction is FETCHED and placed in a register.

Decode: The control unit DECODES the instruction to see what to do.

Execute: The decoded instruction is EXECUTED. This might mean performing a calculation using the ALU, reading from or writing to main memory – or something else.

Once the Execute part of the cycle is complete, the next Fetch–Execute cycle begins.

Knowledge check
32 Describe the purpose of the CPU in a computer.
33 Describe the Fetch–Execute cycle.
34 What is the purpose of the registers in the CPU?
35 State two arithmetic and two logical operations carried out by the Arithmetic Logic Unit (ALU).

Memory and secondary storage

For a computer system to be useful it needs storage: main memory for data and programs that are currently in use and secondary storage for data and programs that can be accessed when required.

Memory

A computer system needs to have memory for any data that it needs to access quickly. This includes the start-up instructions, the operating system, programs that are running and any associated data. The memory in a computer is made up of main memory, cache memory and registers. Cache memory and registers have already been discussed. They are small but have the fastest access times.

Main memory is directly accessible by the CPU. There are two main types of main memory: RAM and ROM.

Random access memory (RAM)

RAM is required to hold the operating system, applications that are running and any associated data while the computer is on and in use.
When a program is loaded it is copied from secondary storage, such as a hard disk drive (HDD), into RAM. Any data associated with the program will also be stored in RAM so that the CPU can access both the data and the instructions. Data is transferred into RAM because accessing data in secondary storage is very slow compared to accessing data in RAM.

With more RAM available, more data and applications can be stored in it. Because RAM has fast data access times this leads to better performance of the system. In practice, a system with more RAM can have more programs open at the same time without any noticeable decrease in performance. A typical laptop will now have around 8 GB of RAM available.

Figure 4.20 Data transfer speeds (data access speeds increase from secondary storage, which has the slowest access times, through main memory and cache memory to registers, which have the fastest access times)

RAM is volatile, meaning it needs electrical power to operate. Any data stored in RAM is lost when the power is turned off. RAM is read/write, which means it can be read from or written to by the computer.

Read-only memory (ROM)

In a typical computer system, ROM stores a small program and all the data needed to get the system up and running, ready to load the operating system from the secondary storage. This special program stored on ROM is called the Bootstrap Loader and we say the process ‘boots’ the computer – this means that it starts it from scratch.

Computers would not be so useful if they had to be switched on all the time. ROM is non-volatile memory and does not require power to maintain its contents. ROM is read-only. This means that the data stored in ROM is fixed and cannot be overwritten once it is created. The content is written to ROM either at the manufacturing stage or through a special process that can write to these devices.
The differences between RAM and ROM

A comparison of RAM and ROM

RAM
● Is volatile and needs power to maintain the content
● Is read and write – data can be read from and written to RAM by the computer
● Holds the operating system and any programs and data currently in use by the computer

ROM
● Is non-volatile and does not require power to maintain the content
● Is read-only – the computer cannot overwrite its content
● Holds the data and instructions required to start up (boot) the computer

Knowledge check
36 State two differences between RAM and ROM.
37 What is held in RAM while the computer is working?
38 What is held in ROM on the computer?

Secondary storage

The need for secondary storage

Computer systems would be of little value if we lost all of our data and programs every time we switched them off. We need to store the operating system, data, images, programs, documents and various other files so that they are available the next time we switch on the computer. This kind of data requires a lot of space so we need a low-cost, high-capacity, non-volatile storage medium. This is known as secondary storage. Secondary storage needs to keep this data safe and must be robust and reliable. The main difference between main memory and secondary storage is that main memory is directly accessible by the CPU, whereas secondary storage is not.

Knowledge check
39 Why do we need secondary storage?
40 What is stored on secondary storage in a computer system?

Magnetic storage

Magnetic storage uses the principle of magnetism to store data. Hard disk drives (HDDs) are magnetic and are the most common type of secondary storage. They are made of a stack of rigid disks (called platters) on a single spindle that rotates. Each platter is coated in a magnetic material that is effectively made up of billions of separate tiny magnets, which can either point ‘north’ or ‘south’.
Each bit of data is represented by these tiny magnets – north for ‘1’ and south for ‘0’. A set of ‘heads’ move across the platters, reading or writing data by sensing or changing the north/south alignment of the magnets.

Figure 4.21 A hard disk drive showing the platters and heads

The magnetic hard disk is a reliable and cost-effective storage solution, providing high capacity at low cost. This makes it an ideal choice for large amounts of storage. Internal and external hard disk drive capacities are currently measured in terabytes (a million megabytes – see section 3.3). Large hard disk drives are, however, less portable than solid-state drives or optical disks and are subject to damage if dropped or brought near to strong electric or magnetic fields.

Several drives can be combined in larger commercial systems to provide a significant amount of storage at a reasonable cost. At the other end of the scale, there are small portable hard disk drives that can easily be moved between computers.

Figure 4.22 An external hard drive connected to a laptop

Solid-state storage

Tech terms
Flash memory: Memory made from electrical circuits.
Latency: A delay before data can be transferred.
Fragmentation: Data saved in different physical locations across a magnetic hard disk drive.

Solid-state storage uses a technology called flash memory that uses electronic circuits to store data. It has very fast data access times, largely because there are no moving parts. Solid-state storage is, however, relatively expensive compared to hard disk drives and typically has a lower capacity. Solid-state storage is widely used for portable devices such as cameras (e.g. memory cards) and comes in a range of physical sizes and capacities to suit a wide range of applications. It is frequently used to back up or transfer data between devices (e.g. USB pen drives). Solid-state flash memory is used as the basis for solid-state drives (SSDs).
SSDs have begun to replace magnetic hard disk drives (HDDs) because they have many advantages over them:
● In order to read data, magnetic HDDs have to line up the correct portion of the disk with the position of the read/write head. This means the magnetic disk platter has to rotate to the correct position and the head has to move across the platter. This in turn means there is a delay before data can be read or written. This delay is called latency. SSDs have lower latency times because there are no moving parts and access to the data does not require a platter to rotate or the read/write head to move. This improves access to the data and the performance of the device.
● The lack of moving parts means that SSDs have much lower power requirements, generate far less heat and make no noise.
● HDDs can suffer from data being fragmented over the surfaces of the platters, producing very slow access speeds. This is not the case with SSDs.
● SSDs are significantly lighter, smaller and thinner than HDDs, making them particularly suitable for small, thin portable devices such as tablet computers.
● Since there are no moving parts, SSDs are not susceptible to problems caused by sudden movements, making them ideal in hostile environments or in portable devices.

Given the expense of SSDs, they are often combined with a magnetic disk drive to form a hybrid system. Frequently accessed data, such as the operating system, is stored on the SSD, while larger, less frequently required data files are stored on the magnetic disk. This provides the speed advantage of the SSD with the capacity advantage of the HDD at a reasonable cost compared to high-capacity SSDs.

Figure 4.23 Various solid-state devices

Optical storage

Data can also be stored by using the properties of light. Typical optical storage media include CDs, DVDs and Blu-Ray disks.
These are optical devices because they are written to and read from using laser light. The surface of each disk is covered in billions of small indentations. When light is shone onto the surface, it is reflected differently depending on whether or not it strikes an area with an indentation. The difference in reflections is detected and interpreted as a ‘1’ or ‘0’, to represent a bit of data.

Some optical storage media are read-only but others can be written to by creating pits on the surface of the disk using laser light.

Typically, a CD will hold around 700 MB of data and costs pennies. CDs are used to distribute data and programs or make semi-permanent copies of data.

The DVD is very similar to the CD but has a larger capacity of 4.7–8.5 GB. This means that a DVD can store more data than a CD, such as standard resolution movies. A DVD has a faster access time than a CD and costs a little more, but is still only pennies.

Blu-Ray is similar but with significantly larger capacity (25–50 GB) and access speeds. Blu-Ray disks can be used to store large amounts of data and the much higher access speed makes them particularly suitable for high-resolution movies and console games. They are slightly more expensive than DVDs but still reasonably inexpensive.

CD: typical cost 18p; capacity 700 MB
DVD: typical cost 60p–80p; capacity 4.7 GB single layer, 8.5 GB dual layer
Blu-Ray: typical cost £1.80–£3.00; capacity 25 GB single layer, 50 GB dual layer

The table below summarises the advantages and disadvantages of the three types of storage.
Magnetic storage
Advantages:
● Large capacity
● Small drives can be portable (but this increases risk of damage)
● Reliable for medium term storage of 5–7 years
● Low cost per GB
Disadvantages:
● Slower access speeds than solid-state
● Large hard drives are not portable
● Not as durable – shocks and strong electric/magnetic fields can easily damage a hard drive

Solid-state storage
Advantages:
● Very fast access speed
● Very portable
● Very durable
● Likely to be reliable over a very long period of time
Disadvantages:
● Lower capacity than magnetic storage (though larger than optical storage)
● More expensive per GB than magnetic drives

Optical storage
Advantages:
● Very portable
● Very durable – can withstand shocks and extreme conditions (but not scratches)
● Reliable for many, many years if looked after
● Cost of individual disks is small
Disadvantages:
● Limited capacity
● Slowest access times
● Cost per GB is expensive on CDs

Cloud storage

The cloud is a generic term that refers to storage, services and applications that are accessed via the internet rather than being stored locally on your computer, tablet or phone. The cloud is effectively a network of servers, some that store data and others that run applications. These servers use magnetic or solid-state storage and are housed in giant data centres around the world. Users do not actually need to know the geographical location where their data is stored.

Examples of cloud storage services and applications include:
● file storage and sharing, e.g. Google Drive, Dropbox, iCloud Drive, OneDrive
● applications, e.g. Google Docs, Office 365, Gmail.

Advantages of cloud storage

One of the main advantages of cloud storage is that files and applications can be accessed from anywhere in the world with an internet connection. Cloud applications are always the latest, most up-to-date version and users do not have to update anything themselves. This reduces the need for network managers and technical support staff.
The amount of storage space is flexible, and users can buy additional storage when they run out of space. All data that is stored in the cloud is regularly backed up and kept secure by the hosting company, and data can be shared easily with colleagues anywhere in the world.

Disadvantages of cloud storage

However, there are some drawbacks, most notably that an internet connection is required to access files and services. Users have little control over the security of their data and it is possible that data stored in the cloud could be targeted more easily by hackers than data stored locally. It can also be unclear who legally owns the data that is stored. Cloud providers can change their terms and prices with little notice, and ongoing fees may become expensive in the long run.

Knowledge check
41 State what is meant by cloud computing.
42 Identify two services that can be accessed via the internet.
43 Explain two disadvantages of storing your data in the cloud.

Embedded systems

An embedded system is a computer system that has a dedicated function as part of a larger device. The main components of a computer are either manufactured onto a single chip (a microcontroller) or separate circuits for processing and memory are combined into a larger device.

Figure 4.24 A microcontroller

When a computer device is required to perform a single or fixed range of tasks it can be engineered to reduce its size and complexity in order to focus only on these tasks. Dedicated software will be programmed into the device to complete the necessary tasks and nothing else. The reduction of complexity of the hardware and the dedicated nature of the software will make the device more reliable and cost effective than using a general-purpose computer.

The embedded system will typically include some ROM to store the dedicated program and some RAM to store user inputs and processor outputs.
For example, in a washing machine, the ROM will store all of the data describing the separate washing cycles; the RAM will store the user’s selected options (inputs) and the data used to display choices and progress of the washing cycle (outputs).

Embedded systems have the following characteristics:
● Low power so they can operate effectively from a small power source such as in a mobile phone.
● Small in size so they can fit into portable devices such as a personal fitness device.
● Rugged so that they can operate in harsh environments such as car engine management systems or in military applications.
● Low cost, making them suitable for use in mass-produced, low-cost devices such as microwave ovens.
● Dedicated software to complete a single task or limited range of tasks, such as in computer aided manufacture or control systems.

Embedded systems contain many of the same components as non-embedded systems, but differ in two aspects:
● The software in an embedded system will be custom-written for the task and will not be typical general-purpose software found in non-embedded computer systems.
● The software in an embedded system will be written to non-volatile memory rather than being loaded into RAM, as in non-embedded computer systems.

Examples of embedded systems

Embedded systems are found within common household devices such as:
● washing machines
● set top boxes
● telephones
● televisions
● home security and control systems etc.

Embedded systems are also widely used within larger and more complex systems such as:
● car engine management
● airplane avionics
● computer-controlled manufacturing
● military applications such as guidance systems.

These systems are frequently connected to the internet via Wi-Fi to exchange data with third parties, for example water meters, energy smart meters, home security and heating monitoring systems.
Embedded systems are particularly useful for those with physical disabilities as they can make items more accessible. This includes voice control for gadgets in the home and systems that adapt motorised vehicles so they can be operated using limited physical movements.

Figure 4.25 An x-ray image of an engine control unit in a motorcycle

Knowledge check
44 Explain why embedded systems have both RAM and ROM.
45 Identify one input and one output from the embedded system in a microwave oven.
46 Give two examples of systems that use embedded computer systems and explain why it is the most appropriate type of computer system to use in each case.

RECAP AND REVIEW
4 COMPUTER SYSTEMS

Important words
You will need to know and understand the following terms: hardware, software, Boolean logic, Boolean data, truth table, NOT gate, AND gate, OR gate, XOR gate, logic circuits, systems software, application software, utility software, operating system (OS), programming language, low-level language, machine code, assembler, high-level language, translator, compiler, interpreter, Central Processing Unit (CPU), Von Neumann architecture, Arithmetic Logic Unit (ALU), Control unit (CU), clock, registers, buses, clock speed, cache memory, processor cores, Fetch–Execute cycle, RAM, ROM, volatile, read/write, non-volatile, read-only, secondary storage

4.1 Hardware and software
■ Hardware refers to the physical components of a computer system.
■ Software refers to all the programs that run on a computer.

4.2 Boolean logic

Boolean logic uses two values – True and False. This is a Boolean data type. These are represented in computer systems using binary 1 and 0 values.

Truth tables

Truth tables show all possible input permutations and the corresponding outputs for a logic system. If a logic system has n inputs, it will have 2ⁿ possible input permutations. This equals the number of rows in the truth table.
For example, with three inputs it will have 2³ = 2 × 2 × 2 = 8 rows. Inputs are labelled with letters from the start of the alphabet: (A, B, etc.). Outputs are typically labelled as P, Q and later letters from the alphabet.

Truth tables for logic gates using the operators NOT, AND, OR and XOR

NOT gate

A  P
0  1
1  0

[Diagram: NOT gate with input A and output P]

The Boolean expression for the NOT gate can be written as: P = NOT A or P = Ā (A with an overbar)

AND gate

A  B  P
0  0  0
0  1  0
1  0  0
1  1  1

[Diagram: AND gate with inputs A and B and output P]

The Boolean expression for the AND gate can be written as: P = A AND B or P = A ∙ B

OR gate

A  B  P
0  0  0
0  1  1
1  0  1
1  1  1

[Diagram: OR gate with inputs A and B and output P]

The Boolean expression for the OR gate can be written as: P = A OR B or P = A + B

XOR gate

A  B  P
0  0  0
0  1  1
1  0  1
1  1  0

[Diagram: XOR gate with inputs A and B and output P]

The Boolean expression for the XOR gate can be written as: P = A XOR B or P = A ⊕ B

Important words (continued)
hard disk drive (HDD), magnetic storage, solid-state storage, solid-state drive (SSD), optical storage, cloud storage, embedded system

Truth tables and logic diagrams for logic circuits combining NOT, AND, OR and XOR gates

These four logic gates can be combined to form more complex logic circuits. For instance, take the following expression:

P = NOT (A AND B) AND C

This can be represented by the following diagram:

[Logic circuit diagram with inputs A, B and C and output P]

The truth table for this is:

A  B  C  P
0  0  0  0
0  0  1  1
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  0
1  1  1  0

Creating Boolean expressions from logic circuits and vice versa

Given a logic circuit diagram, we can write down the equivalent Boolean expressions. To do so we need to work through the circuit logically, working through the inputs in turn and then follow these through to the output. Similarly, given a Boolean expression we can work through each term, starting with those in brackets, to construct each logic gate diagram before combining them together.
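The truth table for P = NOT (A AND B) AND C can also be checked by generating it in code. The following is a minimal Python sketch rather than anything required by the specification; the function names `circuit` and `truth_table` are illustrative assumptions. It loops over all 2ⁿ input permutations using `itertools.product` from the standard library.

```python
from itertools import product


def circuit(a, b, c):
    """Evaluate P = NOT (A AND B) AND C for one set of inputs."""
    return (not (a and b)) and c


def truth_table(func, n_inputs):
    """Return one row per input permutation: the inputs followed by the output."""
    rows = []
    # product([0, 1], repeat=n) yields all 2**n permutations in counting order
    for inputs in product([0, 1], repeat=n_inputs):
        output = int(bool(func(*inputs)))
        rows.append((*inputs, output))
    return rows


if __name__ == "__main__":
    print("A B C P")
    for row in truth_table(circuit, 3):
        print(" ".join(str(bit) for bit in row))
```

Swapping `circuit` for a different expression gives the truth table for that expression, so a table worked out by hand can be compared row by row against the printed one.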
4.3 Software classification

Systems software controls the hardware inside the computer and provides an interface for users. One example is utility software, which includes things such as:
■ encryption software
■ defragmentation software
■ data compression software
■ backup software.

Application software is the end-user programs that are designed to perform specific tasks, such as word processing, spreadsheets, gaming, video editing, and so on.

The operating system (OS) controls the general operation of the computer system or device and provides a way for users to interact and run programs. The OS controls things such as:
■ processor(s)
■ memory
■ input and output devices
■ applications
■ security.

Common examples of operating systems are:
■ Windows
■ macOS
■ Linux
■ Ubuntu
■ Android
■ iOS.

4.4 Classification of programming languages and translators

Low and high-level programming languages

Examples of low-level languages are machine code and assembly language. Machine code is written in binary. Assembly language is written using mnemonics, such as ADD, LDA, and so on, which are converted to binary by an assembler.

Programs written in a low-level language:
■ can be run directly by the CPU if they are written in machine code (assembly language needs assembling first)
■ use instructions that correspond directly to the CPU hardware
■ are hardware-dependent – different hardware will have different sets of instructions
■ are very fast to execute
■ are difficult to write code for because they use a syntax quite different from English.

Examples of high-level languages are C#, Python and VB.net.
Programs written in a high-level language:
■ need to be translated in order to run on a CPU
■ use instructions that are specific to the language
■ are not hardware-dependent – the languages can run on different hardware after translation
■ are slower to execute because they need to be translated
■ are easier to write code for because they use a syntax similar to English.

Program translators

A translator is a piece of software that converts program code into low-level machine code. There are three common types of translator: compilers, interpreters and assemblers.

A compiler works through the high-level code, translating every line into machine code, and produces an executable file that is specific to the hardware.
■ The original high-level code is not shared with the end user.
■ Once a program has been compiled, the executable file executes quickly and does not need compiling again for the hardware on which it will run.

An interpreter translates one line of high-level code and then immediately runs this before moving on to translate and run the next line of code. Interpreters call machine code subroutines within their own code.
■ No executable file is produced – any hardware that has a suitable interpreter can run the code.
■ The original high-level code must be shared with the end user.
■ Each time a program is executed it must be re-interpreted. This slows down the speed of execution.

An assembler is needed to convert assembly language into machine code.
■ Assembly language has a 1:1 correspondence with machine code – one instruction in assembly language translates into one instruction in machine code.

4.5 Systems architecture
■ A computer system consists of hardware and software.
■ Hardware is the term that describes the physical components of a computer.
■ Software is the name for the programs that provide instructions for the computer.
■ The computer inputs, processes and stores, and outputs data.
■ The computer hardware works with the software to process data.
■ The hardware that processes the data is called the Central Processing Unit (CPU).

Von Neumann architecture

CPU architecture refers to the internal logical structure and organisation of the computer hardware.
■ In Von Neumann architecture, data and instructions are stored in the same memory.
■ Nearly all modern computer systems are based on Von Neumann architecture.

The components of a CPU

The purpose of a CPU is to carry out a set of instructions that is contained in a computer program. It does this using the following components:
■ Arithmetic Logic Unit (ALU): Carries out all arithmetic calculations (e.g. add, subtract etc.), logical decisions (e.g. AND, OR etc.) and comparisons (e.g. equal to, greater than etc.).
■ Control unit (CU): Decodes instructions and sends signals to control how data moves around the parts of the CPU and memory to execute these instructions.
■ Clock: A vibrating crystal that generates digital pulses at a constant speed of vibration that synchronise the CPU activity.
■ Registers: Memory locations within the CPU that hold data. A register may hold an instruction, a storage address, or any kind of data. The data in registers can be accessed very quickly.
■ Buses: Communication channels through which data moves. Buses connect the CPU with main memory and any input/output devices.

The performance of the CPU

Clock speed
■ The CPU works at high speeds governed by the clock.
■ The faster the clock, the more instructions can be completed per second.
■ Clock chips currently operate at speeds of up to 4 GHz, or 4 billion clock cycles per second.

Cache memory
■ Cache memory is small in size but can be accessed very quickly by the CPU.
■ Having more cache will provide the CPU with faster access to data.

Number of processor cores
■ Each core can perform separate Fetch–Execute cycles.
■ If a CPU has multiple cores it may be able to process more instructions simultaneously.

The Fetch–Execute cycle

The CPU continually fetches, decodes and executes instructions.
■ Fetch – an instruction in the form of data is retrieved from main memory.
■ Decode – the CPU decodes the instruction.
■ Execute – the CPU performs an action according to the instruction.

Memory and secondary storage

A computer needs memory and storage.

Memory

A computer needs memory for data it needs to access quickly. This includes:
■ start-up or boot instructions
■ the operating system
■ programs that are running
■ any data associated with the operating system or programs.

Memory in a computer is made of main memory, cache memory and registers.
■ Registers are tiny memory locations within the CPU with very fast access times.
■ Cache memory sits between the processor and main memory (RAM). It is very small in size but has faster access than RAM.
■ There are two main types of main memory: RAM and ROM.

RAM

Random access memory (RAM) is volatile, which means it needs constant power to maintain it. If the power is turned off, the RAM loses its contents.
■ RAM holds the operating system, and any applications and data currently in use by the computer.
■ The CPU can access RAM quickly – much faster than it can access secondary storage such as a hard disk drive.
■ The more RAM in a computer, the more programs and data it can run at the same time and the better the computer’s performance.
■ RAM is read/write, which means it can be read or written to.

ROM

Read-only memory (ROM) is non-volatile, which means it does not need power to maintain it. If the power is turned off, ROM keeps its contents.
■ ROM provides storage for data and instructions needed to start up the computer (also known as the boot process).
■ ROM content is read-only, which means it cannot be overwritten. Information in ROM is normally written at the manufacturing stage.
RAM | ROM
Is volatile and needs power to maintain the content | Is non-volatile and does not require power to maintain the content
Is read and write – data can be read from and written to RAM by the computer | Is read-only – the computer cannot overwrite ROM
Holds the operating system and any programs and data currently in use by the computer | Holds the data and instructions required to start up (boot) the computer

Secondary storage

Secondary storage is needed to permanently store files and programs. It needs to be:
■ non-volatile – i.e. it doesn’t lose data when switched off
■ low cost
■ high capacity
■ reliable.

Magnetic storage

Magnetic hard disk drives (HDDs) are the most common type of magnetic secondary storage.
■ They use a stack of magnetic platters (or disks) that rotate.
■ A moving read/write head moves across the surface of the platters to read and write data.
■ Magnetic disks are reliable and cost effective and provide high-capacity storage at low cost.

Solid-state storage

Solid-state storage uses electronic circuits to store data. Solid-state storage is used in portable hand-held devices and increasingly in computers in the form of solid-state drives (SSDs).
■ SSDs use flash memory and have no moving parts.
■ No moving parts means access to data is faster than for a magnetic hard disk drive.
■ No moving parts also means power requirements are low, no noise is generated and very little heat is produced.
■ SSDs are robust, lightweight and compact, making them ideal for use in portable devices.
■ SSDs have a smaller storage capacity than magnetic hard disk drives and the cost per unit of storage is higher.
■ SSDs are commonly used in tablet computers, mobile phones, cameras and general-purpose computers.

Optical storage

Optical devices use the properties of light to store data.
■ The most common optical storage media are optical disks – CDs, DVDs and Blu-Ray disks.
■ They work by shining laser light onto the surface of the rotating disk and reading the reflections as 1s or 0s.
■ Blu-Ray disks have a much higher capacity than CDs and DVDs, making them ideal for storing and distributing high definition video films or large amounts of data.
■ Optical media are low cost and robust, making them an ideal way to distribute data.

Advantages and disadvantages of each medium type

Magnetic storage
Advantages:
● Large capacity
● Small drives can be portable (but this increases risk of damage)
● Reliable for medium term storage of 5–7 years
● Low cost per GB
Disadvantages:
● Slower access speeds than solid-state
● Large hard drives are not portable
● Not as durable – shocks and strong electric/magnetic fields can easily damage a hard drive

Solid-state storage
Advantages:
● Very fast access speed
● Very portable
● Very durable
● Likely to be reliable over a very long period of time
Disadvantages:
● Lower capacity than magnetic storage (though larger than optical storage)
● More expensive per GB than magnetic drives

Optical storage
Advantages:
● Very portable
● Very durable – can withstand shocks and extreme conditions (but not scratches)
● Reliable for many, many years if looked after
● Cost of individual disks is small
Disadvantages:
● Limited capacity
● Slowest access times
● Cost per GB is expensive on CDs

Cloud storage
■ The cloud is a generic term that refers to services and applications that are stored or run on remote servers accessed via the internet rather than being stored locally on your computer, tablet or phone.
■ These remote servers use magnetic or solid-state storage technology.

Advantages of cloud storage
■ Files and applications can be accessed from anywhere in the world with an internet connection.
■ The amount of storage space is flexible, and users can buy additional storage when they run out of space.
■ All data that is stored in the cloud is regularly backed up and kept secure by the hosting company.
■ Data can be shared easily with colleagues anywhere in the world.

Disadvantages of cloud storage
■ An internet connection is required to access files and services.
■ Users have little control over the security of their data.
■ Data stored in the cloud could be targeted more easily by hackers than data stored locally.
■ Cloud providers can change their terms and prices with little notice.

Embedded systems
■ An embedded system is a computer system that has been designed for a dedicated function as part of a bigger system.
■ Embedded systems are often manufactured as a single chip.
■ The dedicated hardware and software make embedded systems more robust and reliable than general-purpose computers.

Embedded systems often have the following characteristics:
■ Designed and engineered to perform a limited set of tasks to reduce size and improve performance.
■ Programs are often uploaded at the manufacturing stage, directly to the device.
■ There are often very limited options to modify these programs.
■ Low power consumption so that they can operate from a small power source.
■ Small in size to fit in portable devices.
■ Rugged so that they can operate in hostile environments.
■ Low cost, making them suitable for mass-produced products.

Examples of embedded systems

Embedded systems are found in most consumer products such as:
■ washing machines
■ microwave ovens
■ home security systems
■ home heating controls
■ car engine management systems
■ set top boxes
■ telephones
■ televisions.

QUESTION PRACTICE
4 Computer systems

01 The figure below shows a logic circuit.

[Logic circuit diagram: inputs A, B and C; gates G1, G2 and G3; intermediate signals D and E; output P]

01.1 State the type of logic gate labelled G1. [1 mark]
01.2 State what a NOT gate does.
[1 mark]
01.3 Complete the following truth table for the logic circuit shown in the figure above by filling in the missing cells. [3 marks]

A  0 0 0 0 1 1 1 1
B  0 0 1 1 0 0 1 1
C  0 1 0 1 0 1 0 1
D  0 0 0 0
E  1 0 1 0
P  1 0 1 0 0 0 0

01.4 Give the Boolean expression for the logic diagram shown in the figure. [1 mark]

02
02.1 Describe three functions performed by an operating system. [6 marks]
02.2 Identify two types of utility software. [2 marks]
02.3 State two different types of application software. [2 marks]

03
03.1 Describe what is meant by a ‘high-level language’. [2 marks]
03.2 Explain why the computer needs to translate the program code before it is executed. [1 mark]
03.3 Describe two differences between how a compiler and an interpreter would translate high-level program code. [4 marks]

04
04.1 Explain why a developer would normally use a low-level language when writing programs for an embedded system. [2 marks]
04.2 Describe two differences between high-level and low-level languages. [4 marks]

05 A computer is advertised as:

Clock speed | 1.3 GHz
Cache size | 2 MB
Number of cores | 4
RAM | 4 GB

05.1 What is meant by the clock speed of 1.3 GHz? [2 marks]
05.2 How does the number of cores affect the performance of the computer? [2 marks]
05.3 Explain one reason why cache size can affect the performance of the computer. [2 marks]
05.4 Describe what happens during the Fetch–Execute cycle. [3 marks]

06 Von Neumann architecture is the model for most modern computers.
06.1 What are the two main features of Von Neumann architecture? [2 marks]
06.2 (i) What is the purpose of the ALU? [2 marks]
(ii) State two examples of processes carried out by the ALU. [2 marks]
06.3 What is the purpose of the control unit? [2 marks]

07 Julie has a computer with 2 GB of RAM and 500 GB of secondary storage.
07.1 (i) State the full names of RAM and ROM. [2 marks]
(ii) Describe two differences between RAM and ROM.
[4 marks]
(iii) Describe what is held in RAM while a computer is working. [3 marks]
(iv) Describe what is held in ROM on a computer. [2 marks]
07.2 (i) Explain what secondary storage is and why it is necessary in a computer. [4 marks]
(ii) Identify three types of secondary storage. [3 marks]

08 Embedded systems are used in a range of devices.
08.1 What is an embedded system? [2 marks]
Home heating monitoring systems use embedded systems. These systems will have both RAM and ROM.
08.2 State one item of data stored in RAM and one item of data stored in ROM. [2 marks]
08.3 Describe two inputs and one user interface output for such a system. [6 marks]
Embedded systems are also used in heart monitors and pacemakers implanted inside people's chests to monitor and control their heart rate.
08.4 Identify and describe two features that make an embedded system appropriate for use in a heart monitor or pacemaker. [6 marks]

5 FUNDAMENTALS OF COMPUTER NETWORKS

CHAPTER INTRODUCTION
In this chapter you will learn about:
5.1 Computer networks
➤ The advantages and disadvantages of computer networks
➤ The main types of network: PAN, LAN and WAN
➤ Wired and wireless networks
➤ LAN topologies – star and bus
➤ Network protocols, including: Ethernet, Wi-Fi, TCP, UDP, IP, HTTP, HTTPS, FTP, SMTP, IMAP
➤ The 4-layer TCP/IP model: application, transport, internet and link layers
➤ Network security, including: authentication, encryption, firewalls and MAC address filtering

5.1 Computer networks

A computer network is two or more computers or devices that are linked together, either using cables or wirelessly. This means that they can communicate with one another and can share resources.
As well as computers and printers, devices that may be connected to a network include:
● smartphones
● tablets
● gaming consoles
● fitness trackers
● smart watches
● internet connected home appliances.

Advantages of computer networks

One of the main advantages of computer networks is the ability for computers to share peripheral devices such as printers and scanners. They also allow a single internet connection to be shared, so network users can access email. This can reduce the costs of the network as less hardware is needed.

Another advantage of computer networks is the ability to exchange data between computers without needing to use physical media such as memory sticks or external hard drives. This can be particularly useful in a school or office setting and can be achieved using shared drives and folders. This also allows files to be backed up centrally.

Using networks in larger organisations like schools and businesses allows computers to be managed centrally by a network manager. This enables them to update software remotely and manage security centrally through the use of firewalls and anti-malware software. They can also control access to files and resources for different users, and user activity can be monitored.

Users are able to log in to any computer on the network and still access all of the same resources. This means that files can be accessed from wherever they are needed, and separate computers are not needed for every single user.

Disadvantages of computer networks

Additional hardware is needed to set up a network, which can be expensive, and larger networks will need to be overseen by a network manager. If one machine on a network gets infected with malware, it can quickly spread to other machines on the network if up-to-date anti-malware software and other security measures are not in place.
It is also possible that hackers may target a network specifically in order to gain access to several computers. If the network includes a file server to store folders and files, no one will be able to access their files or do any work if the file server goes down.

Types of computer network

There are three main types of network that you need to know about.

Personal area network (PAN)

A personal area network (PAN) is used to connect personal devices, such as a mobile phone or laptop to wireless headphones, and only spreads over a small area. The most common type of technology used in a PAN is Bluetooth. This uses short-range radio waves to connect devices over a distance of just a few metres.

Local area network (LAN)

A local area network (LAN) is the type of network found in homes, schools and single-site companies and organisations. LANs cover a small geographic area as the computers and devices are usually located on one site. The hardware is usually owned and maintained by the organisation that uses it, and it will often use both wired and wireless connections.

Figure 5.1 Basic LAN network (labelled components: workstations, switch, printer, router, server, internet)

Wide area network (WAN)

A wide area network (WAN) is a type of network used by large organisations such as banks, with offices in different cities that need to be linked together. A WAN typically covers a wide geographic area – it may even be worldwide – and links together the LANs for each different site. The connections between the sites are usually hired or leased from a telecommunication company and may include cable, satellite and telephone lines. Therefore, WANs are usually under collective or distributed ownership as different parts of the network infrastructure are owned and maintained by different organisations.
Figure 5.2 Basic WAN network (labelled components: LANs, network servers, gateway router, LAN switch, network users, WAN)

The biggest example of a WAN is the internet.

Knowledge check
1 State two advantages of using a computer network.
2 State two disadvantages of using a computer network.
3 Describe the characteristics of a LAN.
4 Give one example of a type of technology used in a PAN.
5 Identify two differences between a LAN and a WAN.

Wired and wireless networks

In order to be able to communicate, devices on a network need to be connected to one another. This can be achieved using wired connections or wireless connections.

Wired networks

Wired networks use either copper network cables or fibre-optic cables to physically connect the devices. They are often found in office networks where the devices tend to be in fixed places and do not move around.

Tech terms
Bandwidth: How much data can be transmitted, normally measured in bits per second.
Ethernet port: A special socket on a computer that allows it to be connected to a wired network using the Ethernet protocol (see page 176).

Copper cable

Inside copper cables there are individual copper wires, which are arranged in pairs. Each pair of wires is twisted together to reduce interference from other signals and therefore improve transmission. Data is transmitted using electrical impulses.

There are different ratings that indicate how quickly the cable can reliably transmit data and over what range. The bandwidth is generally between 100 megabits per second (Mbps) and 1 gigabit per second (Gbps), for a distance of up to about 100 metres.

Most PCs have built-in Ethernet ports, so connecting computers using copper cable can be a cost-effective option if the bandwidth is adequate. Copper cable is also already found in telephone networks, and using this existing infrastructure saves the cost of installing new cables specifically for computer networks.
Copper cable is relatively cheap to install within a small geographic area, such as a LAN within an office building.

Fibre-optic cable

Fibre-optic cables are made up of many thin glass strands (or fibres), which transmit data as pulses of light. As they use light to transmit data, they do not suffer from electrical interference. Fibre-optic cables do not break easily as they are strong, flexible and do not corrode.

Fibre-optic cables have a very high bandwidth of up to 100 terabits per second (Tbps) and are capable of transmitting data over distances of 100 kilometres or more. For this reason, they are often used to connect WANs across large geographic areas. The cables that cross the oceans to connect different continents to the internet are fibre-optic cables.

Fibre-optic cable is more expensive to install than copper cable, so it tends to be reserved for transmitting data over long distances and connecting WANs.

Beyond the spec
A megabit is equal to one million bits, a gigabit is one thousand million bits and a terabit is one trillion bits. Megabits should not be confused with megabytes. Remember there are 8 bits in a byte.

Wireless networks

Wireless networks make use of radio waves to connect devices, and include technologies such as Wi-Fi and Bluetooth. The strength of a radio wave decreases as it moves further away from its transmission source, so the radio waves that are used in wireless networks are only suitable for relatively short distances of up to 100 metres. Radio waves are also subject to interference from other radio signals of the same or similar frequencies, and are partially blocked by physical objects such as walls.

However, radio waves are ideal for mobile devices as the device will be able to connect to the network as long as it is in range of a wireless access point (WAP). Wireless networks generally have a bandwidth of about 300 Mbps.
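Because bandwidth is quoted in bits per second while file sizes are usually quoted in bytes, a file size has to be multiplied by 8 before a transfer time can be estimated. The following is a minimal Python sketch of that arithmetic; the function name `transfer_time_seconds` is an illustrative assumption, and the calculation assumes ideal conditions (decimal units, no interference and no protocol overhead), so real transfers will take longer.

```python
BITS_PER_BYTE = 8


def transfer_time_seconds(file_size_megabytes, bandwidth_megabits_per_second):
    """Estimate the ideal time to send a file, ignoring overheads and interference."""
    # Convert megabytes to megabits so the units match the bandwidth
    file_size_megabits = file_size_megabytes * BITS_PER_BYTE
    return file_size_megabits / bandwidth_megabits_per_second


if __name__ == "__main__":
    # A 700 MB file (roughly one full CD) over three of the bandwidths quoted above
    links = [("100 Mbps copper", 100), ("1 Gbps copper", 1000), ("300 Mbps wireless", 300)]
    for label, mbps in links:
        seconds = transfer_time_seconds(700, mbps)
        print(label, round(seconds, 1), "seconds")
```

For example, 700 MB is 5600 megabits, so at 100 Mbps the ideal transfer time is 5600 ÷ 100 = 56 seconds.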
Advantages and disadvantages of wired and wireless networks

Advantages of wired networks
● Data transmission is generally faster than on wireless networks.
● The connections are stable and reliable, and suffer little from interference.
● It is easier to use security measures such as firewalls as data packets pass through a central point.
● It is relatively hard to hack into a wired network from outside the network.

Disadvantages of wired networks
● Computers can only be used in fixed locations.
● Wired networks can be costly to install due to the specialist hardware required.
● Specialists may be needed to maintain the network.

Advantages of wireless networks
● Wireless networks are generally cheap to set up as they do not require cables to be installed throughout a site.
● Little specialist knowledge is required to set up a wireless network as most devices connect automatically.
● Users can connect and move around freely as long as they are in range of a wireless access point.
● Additional devices can be added to the network very easily, and no additional hardware is required.

Disadvantages of wireless networks
● The speed of data transmission is slower than wired networks.
● Wireless connections can be obstructed by walls and other obstacles.
● Wireless connections are less stable than wired connections and can ‘drop off’ unexpectedly.
● Wireless networks are generally less secure than wired networks as data packets can be intercepted and their contents accessed if the information has not been encrypted before transmission.

Knowledge check
6 Identify two ways in which a desktop computer may be connected to a home network.
7 Identify a type of medium suitable for connecting WANs across large geographic areas.
8 State two advantages of using a wireless network.
9 State two disadvantages of using a wireless network.
Network topologies

The way in which devices in a network are arranged and connected together is called the network topology. Any device connected to a network is referred to as a node. There are two common LAN topologies that you need to know about.

Star network topology

In a star network topology, each computer or client is connected individually to a central node, which can be a file server or a switch or hub.

Tech terms
Switch/hub: Used to connect devices in a wired network so that data can be received and forwarded to the correct destination device.
Node: The name for any device connected to a network.
Data collision: When two separate data packets are in the same place at the same time.

Figure 5.3 Star network topology

A star topology is the most common network layout, and it tends to be fast and reliable because each client has its own connection to the central node. As data is only directed from the central node to the intended computer, it helps to keep network traffic to a minimum, and in turn this reduces data collisions. The switch can screen data packets, rejecting any that are corrupt, which can increase security on the network, and it is easy to add new devices as they simply need to be connected to the switch. If the connection to one device on the network fails it is less likely that the rest of the network will be affected.

However, star networks require a lot of cabling, as every computer is connected individually, which can be expensive. If the central server or switch fails then so will the entire network.

Star networks tend to be found in large organisations such as schools and businesses. They are also found in home networks, especially those that are wireless, with all of the devices connecting to a central router with a built-in wireless access point.
Star networks are an ideal choice when trying to create a reliable network as a single computer failure does not affect the rest of the network. It is also relatively easy to add extra devices, so the network has some size flexibility.

Bus network topology

In a bus network topology, each node is connected directly to a single cable known as the backbone, with terminators at either end.

Figure 5.4 Bus network topology (diagram labels: server, workstation, printer, and a terminator at each end of the backbone)

All of the devices share this cable to transmit data to one another, with the data being sent to all of the other nodes. Each node then has to decide if it should accept the data by inspecting the destination MAC address (see page 180).

Only one node can successfully transmit data at any one time, as when multiple transmissions are sent simultaneously, the data will collide and have to be re-sent. If this happens, both computers will wait a random amount of time before trying to re-send their information, which slows the network. If for any reason the backbone fails, the whole network will fail. Because of this, standalone bus topologies are not widely used, although they may be used within hybrid network topologies.

Bus topologies are suitable for temporary networks in a single room, where the amount of data to be transmitted is low. They are cheap because much less cabling is required compared to a star topology. It is relatively easy to connect nodes to a bus network, and they are also easy to dismantle when no longer required.

Knowledge check
10 State what is meant by a network topology.
11 Describe a star topology.
12 Explain one advantage and one disadvantage of using a bus topology.

Network protocols

All networks work in essentially the same way. A device prepares a data signal to send to another device, and this is transmitted along cables or wirelessly until it reaches its destination address.
This transmission is controlled by protocols, which are essentially sets of rules that all manufacturers and devices use.

Tech terms
Channel: Each frequency band (2.4 GHz and 5 GHz) is further broken down into smaller frequency ranges, which wireless devices can receive and transmit data at. Using different channels allows multiple devices to use the same Wi-Fi signal without interfering with each other.
Data packet: A chunk of data that is sent across a network.
Header: The part of a data packet that contains details about the data packet, to make sure it is transmitted correctly and can be reassembled when received.
Payload: The part of a data packet containing the data that the application wishes to transmit.
Checksum: A value calculated from the data, which is used to check for errors in data that has been transmitted.

Ethernet

Ethernet is the traditional set of protocols used to connect devices in a wired LAN. Ethernet is not one protocol but a family of protocols. Ethernet protocols define how data should be physically transmitted between different devices, using MAC addresses (see page 180) to determine which device the data should be sent to. Ethernet protocols also define what should happen if collisions occur on the network. Ethernet protocols can be used on copper or fibre-optic cables.

Wi-Fi

Wi-Fi is a set of protocols that define how network devices can communicate wirelessly using radio waves. Wi-Fi is not one protocol but a family of protocols. Wi-Fi is a trademark, and the Wi-Fi standards determine the frequency band and channel that should be used, data transmission rates and how devices should be authenticated when they attempt to join a network.

Most Wi-Fi standards transmit data using radio waves in one of two frequency bands, either 2.4 GHz or 5 GHz. Signals transmitted on the 2.4 GHz frequency have a greater range but lower data transmission rates than those using the 5 GHz frequency.
The generic term for a wireless network is a wireless LAN (WLAN).

Transmission Control Protocol (TCP)

Transmission Control Protocol (TCP) is a protocol that splits the data from applications into smaller data packets that can be sent across a network. Each packet is made up of a header and payload. The header contains the sequence number of the packet and a checksum to allow the recipient device to check that it has been sent correctly. The payload contains the actual data from the application that needs to be sent. When a data packet is correctly received, an acknowledgement message is sent back to the sender.

User Datagram Protocol (UDP)

TCP is concerned with the accurate delivery and receipt of data packets that are ordered and error checked. However, the User Datagram Protocol (UDP) is used by apps to deliver data more quickly by removing the acknowledgement to the sender that the data was safely received. Data is sent in a constant stream from the sender, without having any checks to make sure that the recipient has received the data correctly. It is frequently used for activities such as live broadcasts and online games where speed of delivery is vital. If packets are lost, the video or audio will continue to play, but will just become a little distorted for a moment.

Internet Protocol (IP)

The Internet Protocol (IP) defines how data packets should be sent between networks. Every device connected to a network is assigned a unique IP address. This represents the location of the device on the network, just like your postal address indicates where your home is. This address consists of four 8-bit sections (32 bits in total), each of which represents a number between 0 and 255. Each section is separated by a full stop. An example IP address is 194.83.249.5. An IP header is added to each data packet containing the source and destination IP addresses for that packet.
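The structure of an IP address described above can be checked with a short sketch. The Python below is only an illustration, and the function name `ip_to_bits` is an assumption made for this example. It splits a dotted address into its four sections, checks that each one fits into 8 bits (0 to 255) and shows the resulting 32-bit pattern.

```python
def ip_to_bits(address):
    """Convert a dotted IP address such as '194.83.249.5' to a bit string."""
    sections = address.split(".")
    if len(sections) != 4:
        raise ValueError("an address must have four sections")
    bit_groups = []
    for section in sections:
        value = int(section)
        if not 0 <= value <= 255:
            raise ValueError("each section must be between 0 and 255")
        # Show each section as exactly 8 binary digits.
        bit_groups.append(format(value, "08b"))
    return ".".join(bit_groups)


print(ip_to_bits("194.83.249.5"))  # 11000010.01010011.11111001.00000101
```

An address such as 194.83.300.5 would be rejected, because 300 cannot be stored in 8 bits.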
Routers use this information to determine whether the packet’s destination is on the local network or whether the packet needs to be passed on to another network. Switches on the destination network direct the data to the correct node or device.

Tech term
Secure socket layer (SSL): An encryption protocol.

Hypertext Transfer Protocol (HTTP)

Hypertext Transfer Protocol (HTTP) is used by web browsers (clients) to request that a web server sends a web page and its associated resources to the user’s browser.

Hypertext Transfer Protocol Secure (HTTPS)

Hypertext Transfer Protocol Secure (HTTPS) is a secure version of HTTP that adds secure socket layer (SSL) encryption to the communications. It is used, for example, for internet banking and online shopping.

File Transfer Protocol (FTP)

File Transfer Protocol (FTP) is used to transfer computer files between a client and a server. It is commonly used for uploading web pages to web servers.

Simple Mail Transfer Protocol (SMTP)

Simple Mail Transfer Protocol (SMTP) is used to send email to an email server or between servers, for example where the sender and recipient have different email service providers.

Internet Message Access Protocol (IMAP)

Internet Message Access Protocol (IMAP) is a protocol used for accessing email messages. It allows multiple devices to have synchronised access to the same inbox, with the messages retained on the email server. Messages can be organised into folders or flagged as important, and this information is updated when a device next connects to the server, so that the same mailbox view can be seen on several different devices. IMAP is the more commonly used email protocol these days because of its ability to synchronise email accounts across several devices.

Knowledge check
13 Explain what is meant by a network protocol.
14 Identify the most appropriate protocol to be used when uploading a file from a computer to a web server.
15 Identify the most appropriate protocol to use when web page communications between a client and host need to be encrypted.
16 Explain the purpose of SMTP.
17 TCP and IP are two protocols used in network communications. State what the initials TCP and IP stand for and describe the function of each protocol.

The 4-layer TCP/IP model

In order to simplify the network communication process, the different activities involved in sending data packets are divided into layers. Each layer is concerned with a different task, and the relevant protocols are assigned to each layer. The layers are organised into a distinct order in which their rules need to be applied. Collectively, this is called the 4-layer TCP/IP model. It is made up of:
● the application layer
● the transport layer
● the internet layer
● the link layer.

When data is sent, it passes down the 4-layer TCP/IP model, starting at the application layer. As it passes from one layer to the next, the data is encapsulated and additional information is added. When the data reaches its destination, the process is reversed to de-encapsulate the data, starting with the link layer.

Application layer
This is where the network applications, such as web browsers or email, operate.
Protocols: HTTP, HTTPS, FTP, SMTP, IMAP

Transport layer
This layer sets up the communication between the two hosts and they agree settings such as ‘language’ and ‘size of packets’.
Protocols: TCP, UDP

Internet layer
This is where data is addressed and packaged for transmission. It then routes the packets across the network.
Protocols: IP

Link layer
This is where the network hardware such as the NIC (network interface card) and OS device drivers are located.
Protocols: Wi-Fi, Ethernet

Figure 5.5 The 4-layer TCP/IP model

There are several advantages to organising the protocols in this way.
Firstly, it is possible for one layer to be developed or changed without affecting any of the other layers. Software and hardware manufacturers can specialise in understanding one layer without needing to know how every other layer works. This means that devices made by different manufacturers are compatible, giving the consumer more choice. It is also easier to identify and correct networking errors and problems.

Beyond the spec
The link layer is sometimes referred to as the ‘network access layer’ or ‘network interface layer’. However, you will not need to know these alternative layer names for assessment.

Knowledge check
18 Identify two of the layers within the TCP/IP model.
19 What is the purpose of the internet layer within the TCP/IP model?
20 Describe two advantages of using the 4-layer TCP/IP model.
21 Name three protocols which operate at the application layer.

Network security

It is very important that networks are kept secure, as they are more vulnerable to hacking than standalone devices. If a hacker is able to gain access to a network through a single device, they can use this to then gain access to other devices on the same network. This may allow them to steal sensitive data, or to install malware on the network servers.

Network security is primarily concerned with preventing unauthorised people from accessing the network. There are several approaches that are used to secure networks.

Authentication

Authentication is the process of confirming that a user is who they say they are. There are several ways in which this can be done:
● Usernames and passwords are perhaps the most common method. A username and a secret password are chosen by the user and when these are entered a computer system can check that they match those of a known user.
Systems are often set up so that the user account is locked after a certain number of failed attempts, to prevent people trying lots of passwords to get into someone else’s account.
● Possession of an electronic key, device or account is used as authentication since only one person will have access to that particular device. Some computer systems will check that an email address or phone number belongs to you by sending an email or text containing a secret code.
● Biometrics is the use of measurements relating to biological traits. If your school uses fingerprint scanners to identify you, you will have experience of this. Banks are increasingly using voice recognition to authenticate users who use telephone services.

Two-factor authentication is where two of the above methods are checked together. For example, you may log in to a system with a username and password (first method of authentication) and are immediately sent a text message or email to respond to (second method of authentication).

Once a user has been authenticated, user access levels can be used to ensure that users can only access the files and other resources that they need to.

Encryption

Encryption is used to prevent data from being of any use if it is intercepted and read. Any data sent on the internet is potentially vulnerable to being intercepted, and it is particularly important that data such as credit and debit card numbers used for online shopping are kept secure. Data that is transmitted wirelessly is particularly susceptible to being intercepted, and so encryption is used to disguise the contents.

Encryption relies on the use of a key to encrypt and decrypt the data. An algorithm uses the key to transform the data. A very simple form of encryption is called the Caesar cipher. This method displaces letters by a known amount. For example, if the letters were displaced by 4 then the number 4 would be the ‘key’.
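A displacement cipher of this kind can be sketched in a few lines of Python. This is a minimal illustration rather than a secure method. The function names `caesar_shift`, `caesar_encrypt` and `caesar_decrypt` are assumptions made for this example, and only capital letters are displaced.

```python
import string

ALPHABET = string.ascii_uppercase  # the letters 'A' to 'Z'


def caesar_shift(text, key):
    """Displace each capital letter by `key` places, wrapping round after Z."""
    result = ""
    for character in text:
        if character in ALPHABET:
            position = (ALPHABET.index(character) + key) % 26
            result += ALPHABET[position]
        else:
            result += character  # leave spaces and punctuation unchanged
    return result


def caesar_encrypt(plaintext, key):
    return caesar_shift(plaintext, key)


def caesar_decrypt(ciphertext, key):
    # Decryption displaces the letters back by the same key.
    return caesar_shift(ciphertext, -key)


print(caesar_encrypt("COMPUTING", 4))  # GSQTYXMRK
print(caesar_decrypt("GSQTYXMRK", 4))  # COMPUTING
```

Because there are only 25 useful keys, anyone who intercepts the message can simply try each displacement in turn, which is why the Caesar cipher is only suitable as a teaching example.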
A displacement of 4 would produce the following look-up table:

plaintext letter  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
cipher letter     E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

A plaintext message such as ‘COMPUTING’ would be encrypted as GSQTYXMRK. The recipient would need to know the key ‘4’ in order to decrypt the message.

The Caesar cipher is a simple form of symmetric encryption. This means that both the sender and receiver share the same key.

However, most websites use asymmetric encryption. This involves the use of a pair of keys, a public key and a private key, which are uniquely paired when they are created. Your public key can be shared with anyone and can be used by them to encrypt a message to send to you. However, only your private key can be used to decrypt the message.

Firewall

A firewall is a network security device that monitors incoming and outgoing network traffic. It is primarily designed to stop unwanted internet traffic from gaining access to a network, and decides whether to allow or block specific traffic based on a defined set of security rules. For example, executable files, or data sent from specific IP addresses, might be blocked to prevent malware being installed on the system.

In a similar way, firewalls can also prevent devices on a network from accessing certain websites. For example, a company may block access to social media sites that are not needed for work purposes.

A firewall uses ports to let data in and out of the network, and these can be opened and closed to control the flow of data.

Many operating systems include a firewall as part of the software. However, a firewall can also be a separate hardware device, and is often also built into a router. The capabilities of firewalls have changed dramatically in recent years and will continue to do so.
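The idea of a firewall checking traffic against a defined set of security rules can be illustrated with a small sketch. The Python below is a simplified model, not how any particular firewall product works. The rule set, the example addresses and the function name `allow_packet` are all hypothetical and chosen only for illustration.

```python
# Hypothetical rule set for illustration only.
BLOCKED_SOURCE_ADDRESSES = {"203.0.113.7"}  # example address to be blocked
OPEN_PORTS = {80, 443}                      # ports left open (web traffic)
BLOCKED_FILE_EXTENSIONS = (".exe",)         # executable files are rejected


def allow_packet(source_address, destination_port, file_name=""):
    """Return True if the traffic passes every rule, otherwise False."""
    if source_address in BLOCKED_SOURCE_ADDRESSES:
        return False  # data from a blocked IP address
    if destination_port not in OPEN_PORTS:
        return False  # the port is closed
    if file_name.lower().endswith(BLOCKED_FILE_EXTENSIONS):
        return False  # executable file
    return True


print(allow_packet("198.51.100.20", 443))             # True
print(allow_packet("203.0.113.7", 443))               # False
print(allow_packet("198.51.100.20", 23))              # False
print(allow_packet("198.51.100.20", 80, "game.exe"))  # False
```

Opening or closing a port in this model simply means adding a number to, or removing it from, the `OPEN_PORTS` set.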
MAC address filtering

A Media Access Control (MAC) address is a unique number that identifies the actual device that is connected to a network. The MAC address is part of the network interface controller (NIC) inside the device and is assigned when the NIC is manufactured; it is built into the hardware and is not designed to be changed. A MAC address is made up of 48 bits, shown as six groups of two hexadecimal digits, for example: b8:09:8a:b8:57:17.

Key point
Remember: one hexadecimal digit represents 4 bits of data. So six groups of two hexadecimal digits is 12 hexadecimal digits, which represents 12 × 4 = 48 bits.

MAC address filtering can be used to allow specific devices to access, or be blocked from, a network. This can be done in one of two ways:
● A whitelist can be created, which will only allow devices on the list to connect to the network. This is the more secure approach but requires each device to be added to the list individually.
● A blacklist can be used, which will deny access for any devices listed.

Authentication, encryption, firewalls and MAC address filtering can all work together to increase the security of a network. Authentication ensures that only authorised users are able to access resources once they have successfully logged in. On wireless networks, data transmissions are encrypted to ensure that even if they are intercepted, they cannot be understood. The actual devices used to access a network are controlled via MAC address filtering. A whitelist means that only known and approved devices can be used to access the network, even if the user is authorised. Alongside these methods, the firewall monitors all incoming and outgoing traffic, filtering out any data that doesn’t meet given criteria and thus stopping access to the computer or network.

Knowledge check
22 Identify three methods which can be used to help keep networks secure.
23 Give an example of how two-factor authentication may be used when signing up to a website.
24 Explain why sensitive data is encrypted before it is sent on the internet.
25 Explain the purpose of an IP address.
26 Explain the purpose of a MAC address.

RECAP AND REVIEW

5 FUNDAMENTALS OF COMPUTER NETWORKS

Important words

You will need to know and understand the following terms: computer network, personal area network (PAN), local area network (LAN), wide area network (WAN), internet, wired networks, copper cable, fibre-optic cable, wireless networks, network topology, star network topology, bus network topology, protocols, Ethernet, Wi-Fi, Transmission Control Protocol (TCP), User Datagram Protocol (UDP), Internet Protocol (IP), Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol Secure (HTTPS), File Transfer Protocol (FTP), Simple Mail Transfer Protocol (SMTP), Internet Message Access Protocol (IMAP), 4-layer TCP/IP model, layers, application layer, transport layer, internet layer, link layer, network security, authentication, encryption, firewall, Media Access Control (MAC) address.

5.1 Computer networks

A computer network is formed when two or more computers and devices are linked together so that they can communicate and share resources.

Advantages of computer networks include:
■ computers can share peripheral devices, such as printers and scanners
■ data can be exchanged between computers directly
■ files can be backed up centrally
■ software updates and security can be managed centrally
■ access to files and resources can be controlled for different users
■ user activity can be monitored.

Disadvantages of computer networks include:
■ additional hardware is required
■ larger networks will need to be overseen by a network manager
■ malware can spread quickly within a network
■ a hacker could gain access to several computers
■ if a central file server goes down no one will be able to access their files.
Types of computer networks

A personal area network (PAN) uses technologies such as Bluetooth (short-range radio waves) to connect personal devices, such as a mobile phone and wireless headphones, over a distance of just a few metres.

A local area network (LAN) consists of devices connected together in a single building or site, such as a school or office, using wired and wireless connections. They are often owned and controlled by a single person or organisation.

A wide area network (WAN) is formed by connecting together LANs. They usually cover a wide geographic area and use telephone lines, fibre-optic cables and even satellites, and so are often under collective or distributed ownership. The largest example of a WAN is the internet.

Wired and wireless networks

Networks can use both wired and wireless connections to link devices together.

Wired networks

Wired networks use copper or fibre-optic cables to connect the devices. They tend to be found where the devices are fixed in one place and do not move around.

Copper cable:
■ consists of twisted copper wires
■ transmits information using electrical impulses
■ is relatively cheap to install within a small geographic area, such as a LAN within an office building
■ has a bandwidth between around 100 Mbps and 1 Gbps, for a distance of up to about 100 metres.

Copper cable is already found in telephone networks and so can be convenient and cost effective if a high bandwidth is not required.

Fibre-optic cable:
■ consists of thin glass strands/fibres
■ transmits information using light pulses
■ is more expensive to install and is used to transmit data over long distances and to connect WANs
■ has a very high bandwidth of up to 100 Tbps.

Wireless networks

Wireless networks use radio waves to connect devices and are ideal for mobile devices. The two most common wireless network technologies are Wi-Fi and Bluetooth.
Advantages of wireless networks include:
■ They are generally cheap to set up.
■ Little specialist knowledge is required.
■ Users can connect and move around freely as long as they are in range of a wireless access point (WAP).
■ Additional devices can be added easily.

Disadvantages of wireless networks include:
■ The speed of data transmission is slower than wired networks.
■ Connections can be obstructed by walls and other obstacles.
■ Connections are less stable than wired connections.
■ They are generally less secure if the data is not encrypted before transmission.

Network topologies

A network topology is the way in which devices are arranged and connected together.

In a star network topology, each computer is connected to a central node, which can be a switch or a server. It is the most common network layout.

Star networks:
■ reduce data collisions as data is only sent between the central node and intended computer
■ increase security because the central node can screen data packets
■ are robust: if the connection to one device fails the rest of the network is unaffected
■ require a lot of cabling
■ are found in larger hybrid wired networks and most home wireless networks.

In a bus network topology, each node is connected to a single cable called the backbone, with terminators at each end. All devices share this cable to transmit data, with data sent to all of the other nodes. Bus topologies are generally only suitable for temporary networks in a single room.

[Diagram of a bus network: server, workstation and printer connected to a backbone with a terminator at each end]

Bus networks:
■ are more prone to data collisions as each device sends and receives data using the backbone
■ are less secure as each device sees all data packets
■ are less robust: if the backbone fails the whole network fails
■ require less cabling
■ are found in hybrid networks and very small wired networks.
Network protocols

Protocols are sets of rules or standards that all manufacturers and devices use to communicate with one another.

Two families of protocols are used to connect devices within a LAN.

Ethernet: Ethernet is a set of protocols used to connect devices in a wired LAN.
Wi-Fi: Wi-Fi is a set of protocols that is used to connect devices using radio waves. Wi-Fi is a trademark and the generic term for wireless networks is WLAN.

Other protocols are used to transmit data over a network.

TCP: Transmission Control Protocol splits the data from applications into smaller data packets that can be sent across a network.
UDP: User Datagram Protocol is similar to TCP but does not check that data has been received correctly, and is used where speed of delivery is vital, such as streaming video over the internet.
IP: Internet Protocol adds a header to each data packet, which includes the source and destination IP addresses.

There is a range of standard protocols used by applications such as web browsers and email clients.

HTTP: Hypertext Transfer Protocol defines the rules to be followed by a web browser and a web server when requesting and supplying information.
HTTPS: Hypertext Transfer Protocol Secure uses SSL to encrypt communications between a web browser and a web server to ensure that they are secure.
FTP: File Transfer Protocol defines the rules for transferring files between computers.
SMTP: Simple Mail Transfer Protocol defines the rules for sending email messages from a client to a server, and then from server to server.
IMAP: Internet Message Access Protocol allows multiple devices to have synchronised access to mail on a mail server. Messages are kept on the server rather than downloaded and can be organised and flagged.

The 4-layer TCP/IP model

Protocols are assigned to layers, each of which has a specific purpose, to enable communication to take place.
Layer: Application layer
Function: Where network applications, such as web browsers and email, operate
Relevant protocols: HTTP/HTTPS, FTP, IMAP, SMTP

Layer: Transport layer
Function: Sets up communications between the two hosts and agrees settings such as ‘language’ and ‘size of packets’
Relevant protocols: TCP, UDP

Layer: Internet layer
Function: Data is addressed and packaged for transmission, and then routed across the network
Relevant protocols: IP

Layer: Link layer
Function: Where network hardware, such as the NIC, and OS device drivers are located
Relevant protocols: Wi-Fi, Ethernet

Network security

Network security is concerned with preventing unauthorised access to networks. A combination of different methods is usually used to control access to networks, and then to protect data sent across networks.

Authentication is the process of confirming that a user is who they say they are. The simplest method involves the user inputting a username and password. Additional factors can be used to make the authentication process more robust, such as the use of biometrics or a code sent to a device in the user’s possession.

Encryption is used to disguise the contents of data when it is transmitted to prevent it from being of any use if it is intercepted. It involves the use of a key to encrypt and then decrypt the data. It is especially important that encryption is used on wireless networks.

Firewalls monitor incoming and outgoing data on a network and decide whether to allow the data through or to block specific traffic based on a defined set of security rules.

MAC address filtering allows devices to access, or be blocked from accessing, a network based on the physical MAC address embedded within the device’s network adapter. (A MAC address is a unique number assigned to a device’s network adapter when it is manufactured and it is not designed to be changed.)

QUESTION PRACTICE

5 Fundamentals of computer networks

01
01.1 Draw a simple diagram to show a bus topology for five computers.
[2 marks] 01.2 State two advantages of connecting the computers together to form a network. [2 marks] 01.3 Identify two ways in which the computers could be connected together. [2 marks] 01.4 State whether the computers will be connected in a LAN or a WAN and justify your answer. [2 marks] 02 02.1 Computers on a network communicate with each other using protocols. Define the term network protocol. 02.2 Identify the protocol used by multiple devices to synchronise access to mail on a mail server. [2 marks] [1 mark] 02.3 Identify two protocols used in the transport layer to transmit data over a network. [2 marks] 02.4 Complete the table by identifying one protocol found in each layer of the TCP/IP model. [4 marks] Layer Transport Link Internet Application Protocol 03 03.1 Describe a star topology. [2 marks] 03.2 Identify and describe three reasons why a large company would choose to use a star topology. [6 marks] 03.3 Explain how MAC address filtering can be used to protect a network. [3 marks] 03.4 Identify and describe two other methods that can be used to help keep a network secure. [4 marks] 04 04.1A small business has three employees, each with a computer, and a printer that they share. They wish to connect these together to form a LAN. Identify which topology would be most suitable for them to use and justify your choice. 04.2 A school network uses a star topology. Explain why this is more appropriate than a bus topology. 
[04.1: 3 marks] [04.2: 3 marks]

6 CYBER SECURITY

CHAPTER INTRODUCTION
In this chapter you will learn about:
6.1 Fundamentals of cyber security
6.2 Cyber security threats
➤ Social engineering, malware, pharming, weak passwords, misconfigured access rights, removable media, outdated software
➤ Penetration testing
6.3 Methods to detect and prevent cyber security threats
➤ Biometric measures
➤ Password systems
➤ CAPTCHA
➤ Email confirmations
➤ Automatic software updates

6.1 Fundamentals of cyber security
Cyber security is about keeping networks and computers, and the files, data and programs stored on them, safe from attack, damage and unauthorised access. Various processes, practices and technologies, such as anti-malware software and firewalls, are used to keep systems secure.

Cyber security is primarily concerned with preventing unauthorised access to networks and devices. However, it is important that additional measures are in place to stop or limit the damage or theft of data if someone does manage to gain unauthorised access.

6.2 Cyber security threats
Threats to networks and computer systems can come both from internal and external sources, and there are a number of different ways in which networks can be attacked. These include the use of malware, social engineering and other direct attacks on a network.

Social engineering techniques
Social engineering is a form of security attack that involves tricking or manipulating people into giving away confidential information or access details. Fear is often used to put people off-guard and make them more likely to comply with instructions. Techniques include blagging, phishing and shouldering. We will be looking at these in more detail later in this chapter.
Malicious code
Malicious code, also known as malware, is any kind of malicious program that is installed on a computer system with the intention to cause damage and disrupt its functionality or to steal information. The main types of malware include computer viruses, Trojans and spyware. We will be looking at these in more detail later in this chapter.

Pharming
Pharming is a form of attack where users of a website are directed to a fake version of the website. There are two ways in which this might happen. Malware installed on a computer can change the IP address associated with a domain name to that of the bogus site, or malware can infect the DNS server itself so that everyone is directed to the bogus site. In both cases, the user will enter the genuine website address into their browser, but will be directed to the fake website, which usually looks exactly like the real one. Here, the user’s login details and personal data will be captured and these details can be used by the criminal to access the user’s real account. This is a particular concern for online banking and e-commerce websites.

Tech terms
Domain name: A website address, for example hoddereducation.co.uk.
DNS server: A server that contains details of the IP addresses associated with domain names.
Brute force attack: When hackers use thousands of random combinations of characters every second in order to guess passwords.
Dictionary attacks: When hackers use known words and previously leaked passwords to guess passwords. It is for this reason that you should never use just one standard word as your password.

Weak and default passwords
Passwords are commonly used to help prevent unauthorised access to a network or computer. However, they are only effective if they remain secret and are not easy to systematically guess by brute force attacks. They should not be used for multiple accounts and should never be written down.
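To see why the length of a password and the range of characters it uses both matter against a brute force attack, the number of possible passwords can be worked out with simple arithmetic. The sketch below is illustrative only: the function names and the guessing rate of 1,000 guesses per second are assumptions chosen for the example, not measured figures.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of different passwords of exactly `length` characters."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size: int, length: int, guesses_per_second: int) -> float:
    """Time needed to try every combination at an assumed guessing rate."""
    return search_space(alphabet_size, length) / guesses_per_second


# Assumed guessing rate, for illustration only.
RATE = 1_000

# Four lowercase letters: 26 possible characters in each position.
short_space = search_space(26, 4)   # 456,976 combinations

# Eight characters drawn from 26 lowercase + 26 uppercase + 10 digits = 62.
long_space = search_space(62, 8)

print(short_space, worst_case_seconds(26, 4, RATE))
print(long_space, worst_case_seconds(62, 8, RATE))
```

Each extra character multiplies the number of combinations by the size of the character set, which is why a longer password with a wider mix of characters takes so much longer to guess.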
A brute force attack is where a hacker systematically tries combinations of letters, numbers and symbols in order to discover a password. These attacks are often automated, enabling thousands of combinations to be tried every second. Dictionary attacks work in the same way but use lists of known passwords and standard words in the dictionary.

A weak password is one that can be easily discovered or detected by people who should not know it. Examples of weak passwords include words picked out of the dictionary, simple patterns of letters from a computer keyboard, car registration numbers, or the dates of birth and names of family members or pets. Long passwords that use a combination of letters, numbers and symbols will take longer to guess in a brute force attack.

Most computer systems and devices provide the user with a default password when they are set up. Many systems prompt or force the user to change the password when they first log in. However, a number of devices, such as routers, do not routinely require the password to be changed and the password can often be found in the instruction manual or even on the device itself. If a default password is not changed, it leaves the system far more vulnerable to unauthorised access.

Misconfigured access rights
Users of a network are often arranged into user groups. Each group has different access rights that determine what software, hardware and files they are permitted to access. For example, on a school network, staff may be able to access certain folders that pupils cannot.

Tech term
Insider attack: When unauthorised access is attempted by someone who has some authorised access to the network.

User access levels are an important way of avoiding attacks caused by the careless actions of users. Preventing normal users from installing new software means that malware is much less likely to be installed even if a user is lured into clicking on a suspicious link.
In addition, access to confidential information can be limited to only those who need it, which helps to protect against insider attacks.

If the access rights are not set up correctly then a user may be able to access emails and files belonging to another user, or they may be prevented from accessing their own emails and files.

Removable media
Removable media are storage devices such as USB memory sticks, external hard drives and CDs or DVDs. These devices pose two separate threats to computer systems: data theft and virus infection.

Data theft can include both intentional theft of data, where an employee deliberately copies sensitive data onto removable media to pass on to a third party, and unintentional loss of data, where removable media containing unencrypted data are lost and fall into the wrong hands.

If removable media contain malware, the malware will attempt to install itself when the media are connected to a computer. It can then quickly spread to all other devices on the network unless suitable security measures, such as anti-malware software, are in place. For these reasons, many organisations prevent the use of removable media on their computer systems.

Unpatched and/or outdated software
Patching is the process of updating software to fix a problem that has been identified or to add new features. Any software can be patched or updated, but it is particularly important that software such as operating systems and browsers is kept up to date as it is more vulnerable to being hacked or attacked by malware.

Many programs will automatically update to ensure that they have all necessary patches installed, so that they run smoothly and are protected from new threats. Old versions of operating systems may need to be upgraded to the latest versions to ensure that they are still supported with patches if a new threat is identified.
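As a rough illustration of the decision an automatic updater has to make, the sketch below compares an installed version number with the latest patched version. The `major.minor.patch` numbering scheme and the function names are assumptions made for this example; real update systems vary.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as '10.2.1' into (10, 2, 1) so it compares numerically."""
    return tuple(int(part) for part in text.split("."))


def needs_patch(installed: str, latest: str) -> bool:
    """True if the installed version is older than the latest available one."""
    return parse_version(installed) < parse_version(latest)


print(needs_patch("10.2.1", "10.10.0"))  # True: 10.2.1 is older than 10.10.0
print(needs_patch("3.4.0", "3.4.0"))     # False: already up to date
```

The version strings are converted to numbers first because comparing them as text would wrongly place "10.10.0" before "10.2.1".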
Penetration testing
Penetration testing is used to test a system or network in order to identify vulnerabilities in its security that an attacker could exploit. Testers take on the role of hackers and try to gain unauthorised access in a controlled attack.

Good penetration testing also assesses the security awareness of users to see how likely they are to fall for social engineering ploys, and demonstrates the effectiveness of network security policies. It may also include checking the organisation’s ability to respond to security incidents and to recover any data that has been lost or compromised following an attack.

A white-box penetration test is designed to simulate a malicious insider who has knowledge of the target system and is likely to have basic credentials to gain access. The testing is usually designed to check specific vulnerabilities that have already been identified.

The aim of a black-box penetration test is to simulate an external hacking or cyberwarfare attack, where the attacker has no knowledge of any usernames, passwords or other normal means of access for the target system, and does not have any knowledge of how the system works. This type of testing can be quite time-consuming as the tester will need to use some brute force techniques to find issues and gain access. As the tester will not know the full functionality of the system, the testing may not check all areas or detect all vulnerabilities.

Knowledge check
1 Explain what is meant by cyber security.
2 Explain what is meant by pharming.
3 Identify three different types of malware.
4 Explain what is meant by a ‘weak’ password and give one example.
5 Explain what is meant by a ‘strong’ password.
6 Identify two threats that removable media pose to a network.
7 Describe the difference between white-box and black-box penetration testing.
Social engineering
The people who use computer systems are often the weakest point in security as they can be influenced and exploited by others into making errors. Social engineering involves manipulating people into doing things that they would not normally do by exploiting their natural inclination to trust what they are being told.

Blagging
Blagging, also known as pretexting, is often done by phone but can also be carried out face to face. Here, the criminal invents a scenario to persuade the victim to divulge information or perform actions that they would be unlikely to do in ordinary circumstances. Often, they will pretend to be from an official organisation such as a bank, insurance company or the police, or to be another employee of a company or a network administrator.

Phishing
Phishing uses fake emails, Short Message Service (SMS) messages and websites to trick people into giving away their sensitive data and information (when an SMS message is used it is sometimes referred to as smishing). Emails or messages usually claim or appear to be from a bank or building society, an e-commerce site or an email provider. The messages often ask the user to verify their account by clicking on a link or taking some other similar action. Links often then take the user to a fake version of the website where login details, and sometimes credit and debit card details, can be captured. The cybercriminal is then able to use these details to access the real account to steal a person’s identity or money.

Figure 6.1 Phishing email

Shouldering
Shouldering, or shoulder surfing, involves finding out login details, passwords and personal identification numbers (PINs) by watching people enter them. This could happen by looking over someone’s shoulder as they enter their PIN at a cashpoint or checkout, or even by using recording equipment.
For this reason, we are encouraged to use our other hand to cover the keypad as we type in our PIN. When logging in via a computer screen, each character is replaced by an asterisk (*) to help mask what is being typed.

Preventing social engineering
The most important way to protect against social engineering is to educate users so that they are aware of the different ways in which they may be targeted and know what steps to take to avoid becoming a victim. Users should always be suspicious of any unexpected emails, especially if they require urgent action.

Steps that can be taken to avoid becoming a victim of social engineering include:
● Checking whether an email is genuine by making sure that the sender’s email address is correct, which we can do by hovering over the sender’s name with the cursor.
● Looking out for typing errors and poor grammar that indicate an email is not authentic.
● Where possible, verifying that an email is legitimate by using another communication method, such as a text message, to confirm the sender.
● Not clicking on a link in an email that claims to be from a bank or other company. Instead, navigating to the website directly in a separate browser window.
● Never downloading files from sources you don’t know or trust.
● Always using your hand to cover the keypad when typing in a PIN at a cashpoint or when using a chip-and-PIN machine.
● Never divulging a PIN over the phone, even if the caller claims to be from your bank.

Knowledge check
8 Define social engineering.
9 Identify three different social engineering techniques.
10 Explain how shouldering is carried out.
11 Blagging is a form of social engineering. Describe how it is used to gain personal data.

Malicious code
Malicious code, or malware, is software that has been written with the intention to cause damage and disrupt the functionality of a computer system or to steal data.
It is usually installed without the user’s knowledge. Malware is the term used to refer to any kind of hostile or intrusive software.

Viruses
A virus is a computer program that is hidden within another program. The virus code is only run when the host program is executed. Viruses can delete data or change system files so that data becomes corrupted. Some viruses fill up the hard drive/SSD so that the computer runs very slowly or becomes unresponsive.

Viruses can replicate themselves and insert themselves into other programs, which can then be passed on. They are often spread through attachments to emails, but may also be spread through files, programs or games downloaded from a web page or by loading an infected memory stick or CD/DVD.

To avoid the risk of introducing viruses to a computer system, it is important that users avoid opening emails and attachments from unknown sources, or using unidentified memory sticks.

Anti-malware software is designed to detect and remove malware. It protects systems in several ways:
● It performs real-time scans of incoming network traffic to detect whether it has been infected with a virus.
● It performs periodic scans of the whole system looking for malicious applications.
● If a virus or other malware is detected or manages to install itself, it is quarantined. This prevents it from running and allows users to attempt to clean or remove it.

Anti-malware software needs to be able to get regular updates from the internet as it relies on using up-to-date definitions of the viruses and malware that are known about and how to identify them by their code.

Trojans
Trojans are programs that users are tricked into installing under the pretence that they are legitimate and useful. They are often free and the malicious code hidden inside them only becomes apparent once the software has been installed.
Some Trojans are just annoying, changing the desktop layout and adding new icons, but they can also delete files and use back doors to send screenshots and key presses to a hacker’s computer, allowing the hacker to access your personal information.

To avoid installing Trojans, it is important to only download and install programs from trusted websites and companies.

Spyware
Spyware is malware that comes packaged with other software, such as free software that a user downloads. It gathers information about a user and sends it to the originator. It includes programs such as keyloggers that record all the user’s keystrokes, allowing the originator to obtain passwords and other login details.

Most anti-virus software will also detect spyware, but it is possible to get software specifically designed to detect and remove spyware. As with other forms of malware, users should always be cautious about downloading and installing free software.

Knowledge check
12 Match the type of malware to the description of how it is spread.
Types of malware:
● Virus
● Spyware
● Trojan
Descriptions:
● Malware disguised as legitimate software
● Malware that comes packaged with other software
● Malware that is spread through infected files
13 Explain how anti-malware software helps to protect a system.

6.3 Methods to detect and prevent cyber security threats
A range of different prevention methods can be used to guard against different types of cyber security threats.

Biometric measures
Biometric security makes use of a person’s unique physical features, such as their fingerprints, face, voice or even retina, to authenticate them and allow them access to a system. Fingerprint scanners are a common security feature on many mobile phones, tablets and laptops, and facial recognition is also used to unlock a range of devices. Biometric features cannot be forgotten and are very difficult to steal, and so are generally more secure than passwords.
Password systems
As discussed earlier, passwords are a common method used to help prevent unauthorised access to a network or computer. It is very important that passwords are strong and that they are not written down anywhere.

Some systems ask for certain characters from a password, such as the 2nd, 3rd and 7th characters, rather than requiring users to enter the whole password. This helps to prevent spyware such as keyloggers from capturing full password details.

In addition to entering their username and password, two-step authentication requires the user to enter a unique code sent to their phone or generated by an external app. The code is usually only valid for a short period of time, and requires the user to have a phone in their possession, making it much harder for a hacker to gain access.

CAPTCHA
A CAPTCHA is a form of challenge–response test used to determine whether a user is human. It involves asking the user to complete a task that a software bot cannot.

The most common type of CAPTCHA is the text CAPTCHA, where the user is shown a series of distorted characters as an image, which they have to type into a form. It is usually possible to hear an audio version of the characters instead.

Picture recognition CAPTCHAs show the user a series of images from which they have to select those that contain a particular object, such as all those that include cars.

Other types of CAPTCHA simply require the user to tick a box to confirm that they are not a robot.

CAPTCHAs are effective in preventing fake registrations on websites, and other forms of spam. However, some people find them quite challenging to read, and may be put off using a website as a result.

Beyond the spec
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
Alan Turing was a computer scientist who came up with a method to assess a computer’s ability to display human-like intelligence – this is called the Turing test.

Email confirmations
When signing up to a new website or service, users often receive an email asking them to confirm that they want to set up the account by clicking on a link. The account will not be registered and activated until the user has clicked on the link and finished the registration process.

This system is beneficial to both the user and the website. It helps to confirm that the email address is valid and prove the identity of the user as they need to be able to access their email in order to complete the process. It also alerts users to attempts to use their email address fraudulently.

Automatic software updates
In order to keep a system as secure as possible, it is important that software, especially operating systems and browsers, is kept up to date. Many programs include an option to ‘automatically update and install’ patches and other updates as soon as they are available. This ensures that the most recent versions of software are always being used, and removes the risk of users forgetting or ignoring the updates.

Knowledge check
14 Identify and describe two methods that can be used to ensure that users are real when signing up to a website.

RECAP AND REVIEW
6 CYBER SECURITY

Important words
You will need to know and understand the following terms:
cyber security
social engineering
malware
pharming
access rights
removable media
patching
penetration testing
white-box penetration testing
black-box penetration testing
blagging
phishing
shouldering
anti-malware software
virus
Trojan
spyware
biometric
password systems
CAPTCHA

6.1 Fundamentals of cyber security
Cyber security is about keeping networks, computers and the files, data and programs stored on them safe from:
■ attack
■ damage
■ unauthorised access.
Various processes, practices and technologies are used to protect systems.

6.2 Cyber security threats
There is a range of different cyber security threats to computer systems and networks:

Type of threat | What it does
Social engineering techniques | ● People are tricked or manipulated into giving away confidential information or access details. ● Fear is often used to put people off-guard. ● Techniques include blagging, phishing and shouldering.
Malicious code (Malware) | ● Malicious programs are installed on a computer system. ● They cause damage and disrupt functionality or steal information. ● Types of malware include computer viruses, Trojans and spyware.
Pharming | ● Users of a website are redirected to a fake version of the website. ● Login details are captured to enable the criminal to access the real account.
Weak and default passwords | ● Weak or default passwords are used to gain access to a network or computer.
Misconfigured access rights | ● Users are able to access emails and files belonging to another user.
Removable media | ● Can be used to copy and steal data from a system. ● Can also introduce malware to a system.
Unpatched and/or outdated software | ● Can make a system vulnerable to hacking or being attacked by malware.

Penetration testing
Penetration testing tests a system or network in order to identify vulnerabilities in its security that an attacker could exploit. Testers take on the roles of hackers to test how easy it is to gain access to systems or resources without the knowledge of usernames, passwords or other normal means of access.

A white-box penetration test is designed to simulate a malicious insider who has knowledge of the target system and is likely to have basic credentials to gain access.
A black-box penetration test is designed to simulate an external hacking or cyber-warfare attack, where the attacker has no knowledge of any usernames, passwords or other normal means of access for the target system.

Social engineering

Type of social engineering | Methods
Blagging (or pretexting) | ● The criminal invents a scenario to engage the targeted victim. ● Victims are persuaded to divulge information or perform actions that would be unlikely in ordinary circumstances.
Phishing | ● Fake emails, SMS messages or websites are used to trick people into giving away their personal data.
Shouldering | ● Observing over the shoulder of a person as they enter details such as their password or PIN.

Malicious code
Malware is software that has been written with the intention to cause damage and disrupt the functionality of a computer system or to steal data. Malware refers to any kind of hostile or intrusive software.

Anti-malware software, such as anti-virus software, is designed to detect and remove malware. However, users should avoid downloading and installing programs from untrusted websites and companies, and avoid opening email attachments from unknown sources, to prevent malware being introduced into systems in the first place.

There are several different types of malware:

Type of malware | What it does
Virus | ● Is hidden inside, or attached to, another file or program. ● Is only run when the host program is executed. ● Deletes or corrupts data and files.
Trojan | ● Looks like legitimate software. ● Slows the computer and creates back-door access for hackers.
Spyware | ● Is often bundled with free software. ● Logs activity and keystrokes and sends these back to a criminal.
6.3 Methods to detect and prevent cyber security threats
A range of different prevention methods can be used, including:

Security measure | What it is
Biometric measures | ● The use of physical features such as fingerprints or facial recognition to authenticate users. ● Found on many mobile devices such as phones, tablets and laptops.
Password systems | ● Very common method to help prevent unauthorised access to a system or network. ● Strong passwords should be used. ● Some systems ask for certain characters from a longer password. ● Two-step authentication is used for more secure login procedures.
CAPTCHA | ● A form of challenge–response test to determine if a user is human. ● May involve a user re-typing text displayed as an image, or identifying a subset of images from a larger group.
Email confirmations | ● New users to a website or service are sent an email asking them to verify their identity, or that they wish to proceed. ● Helps to confirm that an email is valid and alerts users to fraudulent use of their email.
Automatic software updates | ● The ability for software, especially operating systems and browsers, to install patches and other updates as soon as they become available.

QUESTION PRACTICE
6 Cyber security

01 A range of prevention methods can be used to guard against different types of cyber security threat.
01.1 Discuss the benefits of the use of user access levels for the network administrator. [4 marks]
01.2 Explain how penetration testing can be used to ensure that the network is secure. [2 marks]

02 There are a number of different ways in which a network can be attacked.
02.1 Explain two ways in which the use of memory sticks may be a security threat. [4 marks]
02.2 State three rules that a password policy should include. [3 marks]
02.3 Explain two other security measures that can be taken to help keep a system secure. [4 marks]

03 Users of a system are one of the biggest security risks.
03.1 State what is meant by social engineering. [2 marks]
03.2 Describe how phishing is carried out. [2 marks]

7 RELATIONAL DATABASES AND STRUCTURED QUERY LANGUAGE (SQL)

CHAPTER INTRODUCTION
In this chapter you will learn about:
7.1 Relational databases
➤ The concept of a database
➤ The concept of a relational database
➤ The concepts of tables, records, fields, primary keys and foreign keys
➤ The use of a relational database to eliminate data inconsistency and data redundancy
7.2 Structured Query Language (SQL)
➤ Using SQL to retrieve data from a relational database using the keywords SELECT, FROM, WHERE and ORDER BY
➤ Using SQL to insert data into a relational database using the INSERT keyword
➤ Editing and deleting data from a relational database using the UPDATE and DELETE keywords

7.1 Relational databases

Databases
A database is a persistent store of organised data. Persistent means that the data in the database will remain there until it is specifically deleted. Organised means that the database is structured into records, fields and tables.

A record is a data structure that allows multiple data items to be stored in different fields, using field names to identify each item of data. To create a record, we must first define the field names that will make up each record. For a record about a student, these might be:
● StudentID
● FirstName
● Surname

We can then store data under these field names in a database management system using a table.

Table 7.1 Table called ‘Student’ showing four records
StudentID | FirstName | Surname
027341 | Bradley | Jenkins
027342 | Jamie | Cable
027343 | Charlotte | Pegg
027344 | Imogen | Fletcher

The above table has three fields and four records. The StudentID field, however, has a special purpose – it is the field that uniquely identifies the record. No two students may have the same StudentID.
The field that uniquely identifies a record is known as the primary key field and each table should have one.

So far, this table is well organised. However, if we extend the table to also store data about the registration class that each student is in, we quickly run into problems.

Table 7.2 Table called ‘StudentRegistration’ now including registration classes
StudentID | FirstName | Surname | RegClass | Teacher | Room
027341 | Bradley | Jenkins | 9A | Mr Craddock | Room 7
027342 | Jamie | Cable | 9A | Mr Craddock | Room 8
027343 | Charlotte | Pegg | 10B | Mr Thompson | Library
027344 | Imogen | Fletcher | 10B | Mr Thompson | Library

Now that the table stores data about both students and registration classes, we run into the problems of data redundancy and data inconsistency.
● Data redundancy is where data is repeated in a database table. For example, we know that group 9A is taught by Mr Craddock and 10B is taught by Mr Thompson, but this is shown more than once. If we extended the number of records in the table to include data about all of the students in the school, this same information would be stored once per student in each class. This makes the database file much larger than it needs to be.
● Data inconsistency leads on from data redundancy. If data is stored multiple times, what if it isn’t always the same? An example of this is shown in the table above – is Mr Craddock’s class in Room 7 or Room 8? The more redundant data there is, the more likely it is that errors are introduced and that some of that data will be inconsistent.

Relational databases
A relational database is made up of more than one table. This helps to remove the problems of data redundancy and data inconsistency. Using the previous example, we can split the ‘StudentRegistration’ table up into two separate tables – one to store data about students, and one to store data about their registration classes.
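Before looking at the separate tables, it is worth seeing how the inconsistency in Table 7.2 could be detected automatically. The following sketch holds the four records in a Python list and reports any registration class that has been recorded with more than one room. The variable and function names are invented for this illustration.

```python
# The four records from Table 7.2 (StudentRegistration).
student_registration = [
    ("027341", "Bradley", "Jenkins", "9A", "Mr Craddock", "Room 7"),
    ("027342", "Jamie", "Cable", "9A", "Mr Craddock", "Room 8"),
    ("027343", "Charlotte", "Pegg", "10B", "Mr Thompson", "Library"),
    ("027344", "Imogen", "Fletcher", "10B", "Mr Thompson", "Library"),
]


def inconsistent_rooms(records):
    """Return each RegClass that appears with more than one Room, with those rooms."""
    rooms_by_class = {}
    for _student_id, _first, _surname, reg_class, _teacher, room in records:
        # Collect the distinct rooms recorded against each registration class.
        rooms_by_class.setdefault(reg_class, set()).add(room)
    return {
        reg_class: sorted(rooms)
        for reg_class, rooms in rooms_by_class.items()
        if len(rooms) > 1
    }


print(inconsistent_rooms(student_registration))
# {'9A': ['Room 7', 'Room 8']}
```

A check like this is only needed because the room is stored once per student. Storing each class's room in one place removes the possibility of the two copies disagreeing.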
Table 7.3 Student table
StudentID | FirstName | Surname | RegClass
027341 | Bradley | Jenkins | 9A
027342 | Jamie | Cable | 9A
027343 | Charlotte | Pegg | 10B
027344 | Imogen | Fletcher | 10B

Table 7.4 Registration table
RegClass | Teacher | RoomCode
9A | Mr Craddock | Room 7
9B | Mr Rouse | Room 5
9C | Mrs Pearcey | Room 2
10A | Miss Walters | Room 21
10B | Mr Thompson | Library

Key point
A relational database uses multiple tables, linked through primary and foreign keys. The benefit of this type of database is that data is not duplicated, which also means that data cannot be inconsistent.

Each table stores data about a different type of object, person, thing or idea. In this example, the first table stores information about students and the second table stores information about registration classes.

The primary key of the Registration table is the RegClass field – no two registration classes can have the same value for this field. To link the tables together, the primary key from the Registration table is included in the Student table; such a field is known as the foreign key of the Student table.

Relational databases may have more than two tables. Some complex database systems may have over a hundred tables! The important idea is that each entity (that is, a type of real-world thing) has its own table and each table refers to only one entity.

With this in mind, we could expand the student database system with another table, this time to tell us more about the rooms that are used by each registration group. A room table could use the name of the room as the primary key and tell us all we need to know about each room. For example:

Table 7.5 Room table
RoomCode | Location | IT_facilities
Room 2 | Geography block | True
Room 5 | Maths block | False
Room 7 | IT block | True
Room 21 | IT block | True
Library | Sixth form centre | False

In this example, the primary key of RoomCode is the foreign key in the Registration table.
This also gives the benefit of being able to store data about rooms that are not used for registration lessons.

Knowledge check
1 State the meaning of the following terms:
(a) Foreign key
(b) Primary key
(c) Table
(d) Record
(e) Relational database

7.2 Structured Query Language (SQL)

The use of SQL to search for data

Structured Query Language (SQL) is used to access data stored in a database. There are four main keywords in SQL that you need to be familiar with when performing searches:

● SELECT identifies the fields to return from the database.
● FROM identifies which table(s) the data will be returned from.
● WHERE allows the programmer to include criteria, with only matching records being returned.
● ORDER BY allows the programmer to put the data returned by the query into either ascending (ASC) or descending (DESC) order.

SELECT and FROM are compulsory to use in an SQL query. WHERE and ORDER BY are optional – if they are not included then all records from the table specified will be returned in no specific order.

Key point
Some special symbols called wildcards can be used in searches. The * wildcard can be used with a SELECT keyword as a shortcut to indicate that all fields from the table should be returned.

Tech term
Wildcards: Special characters that can substitute for letters in searches.

Worked example
Using the Student table from the previous section as a guide, the FirstName and Surname fields for all students can be shown by using this SQL query:
SELECT FirstName, Surname
FROM Student

FirstName  Surname
Bradley    Jenkins
Jamie      Cable
Charlotte  Pegg
Imogen     Fletcher

To find the same data, but sorted into ascending order of surname, an ORDER BY keyword can be added:

SELECT FirstName, Surname
FROM Student
ORDER BY Surname ASC

FirstName  Surname
Jamie      Cable
Imogen     Fletcher
Bradley    Jenkins
Charlotte  Pegg

To show all fields, but only those students in Registration class 10B, a WHERE clause would be used:

SELECT *
FROM Student
WHERE RegClass = "10B"

StudentID  FirstName  Surname   RegClass
027343     Charlotte  Pegg      10B
027344     Imogen     Fletcher  10B

To show the StudentID, FirstName and RegClass for all students called either Jamie or Charlotte, sorted by descending order of their StudentID, this SQL query would be used:

SELECT StudentID, FirstName, RegClass
FROM Student
WHERE FirstName = "Jamie" OR FirstName = "Charlotte"
ORDER BY StudentID DESC

StudentID  FirstName  RegClass
027343     Charlotte  10B
027342     Jamie      9A

Knowledge check
2 The table called Address_book contains the following data:

First_name  Last_name  Telephone    Email
Manjit      Wilson     02223334445  mw@notreal.cod
Kyle        Mills      02232232232  km@notexist.cot
Harry       Smith      01223123123  harry@home.vid
Sheila      Jones      01212121212  SJ@home.vid

(a) What is returned by the following query?
SELECT First_name, Telephone
FROM Address_book
(b) Write a query to return the First_name and Email for the Last_name ‘Mills’.
(c) Write a query to return all the information for the entry with the Email ‘SJ@home.vid’.
(d) Write a query to return the data of everyone in the address book, sorted into ascending alphabetical order of Last_name.

Searching for data using multiple tables

Key point
The AQA GCSE Computer Science specification states that students will be expected to deal with a maximum of two tables in any SQL query.

SQL can also be used to link together tables and extract data from these tables.
To do this, an extra WHERE clause is used to join the primary key in one table to the foreign key in the other table.

Table 7.6 Student table

StudentID  FirstName  Surname   RegClass
027341     Bradley    Jenkins   9A
027342     Jamie      Cable     9A
027343     Charlotte  Pegg      10B
027344     Imogen     Fletcher  10B

Table 7.7 Registration table

RegClass  Teacher       RoomCode
9A        Mr Craddock   Room 7
9B        Mr Rouse      Room 5
9C        Mrs Pearcey   Room 2
10A       Miss Walters  Room 21
10B       Mr Thompson   Library

In the Student table above, the RegClass field is a foreign key that links to the RegClass field as the primary key in the Registration table. Therefore, the additional WHERE clause that would be needed to link and search these two tables is:

WHERE Student.RegClass = Registration.RegClass

Note that the table names are used before the field names; this is because the field names are identical so it is important to differentiate them. A query to link the tables together and show details from both linked tables would look like this:

SELECT FirstName, Surname, Student.RegClass, Teacher
FROM Student, Registration
WHERE Student.RegClass = Registration.RegClass

This would return the following data, as Mr Craddock is the teacher for 9A and Mr Thompson is the teacher for 10B:

FirstName  Surname   RegClass  Teacher
Bradley    Jenkins   9A        Mr Craddock
Jamie      Cable     9A        Mr Craddock
Charlotte  Pegg      10B       Mr Thompson
Imogen     Fletcher  10B       Mr Thompson

Beyond the spec
There is another way to join tables together, using a JOIN keyword. AQA GCSE Computer Science students are not expected to know about the JOIN keyword – but if you are familiar with it and want to use JOIN in your answers, this will be accepted.

Notice that the Teacher field is contained in the Registration table and the other fields are from the Student table. If only certain records are required, the WHERE clause can specify further criteria using the AND keyword.
For instance:

SELECT FirstName, Surname, Student.RegClass, Teacher
FROM Student, Registration
WHERE Student.RegClass = Registration.RegClass
AND Student.RegClass = "9A"

FirstName  Surname  RegClass  Teacher
Bradley    Jenkins  9A        Mr Craddock
Jamie      Cable    9A        Mr Craddock

This is the same search as before but this time only students in RegClass 9A are returned from the search. Because RegClass appears in both tables, the table name is again placed before the field name in the extra criterion.

The use of SQL to insert, update and delete data

Structured Query Language (SQL) also allows us to insert new data into a table using the INSERT INTO and VALUES keywords. The INSERT command must specify the table name and the fields into which you want to enter data. The VALUES command specifies the values that are to be inserted into these fields. It is important that the fields in the INSERT command are given in the same order as the data in the VALUES command.

Worked example
Using the Room table from the previous section as a guide, a new room can be inserted into the table using the following SQL:

INSERT INTO Room (RoomCode, Location, IT_facilities)
VALUES ("Room 27", "Science corridor", True)

The Room table would then have the following data:

RoomCode  Location           IT_facilities
Room 2    Geography block    True
Room 5    Maths block        False
Room 7    IT block           True
Room 21   IT block           True
Library   Sixth form centre  False
Room 27   Science corridor   True

Existing data can also be updated using the UPDATE and SET keywords. It is important when using this to also include a suitable WHERE clause to decide which records will be updated, or all data in the table may be changed!
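To see why the WHERE clause matters, the sketch below runs one UPDATE with a WHERE clause and one without, and compares how many records each changes. It is a minimal illustration under some assumptions: Python's built-in sqlite3 module stands in for the database, True and False are stored as 1 and 0, text uses single quotes (which SQLite prefers), and the ‘Humanities block’ and ‘Unknown’ locations are made-up values.

```python
import sqlite3

# Build an in-memory copy of the Room table, including Room 27.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Room (RoomCode TEXT PRIMARY KEY, Location TEXT, IT_facilities INTEGER)"
)
rooms = [
    ("Room 2", "Geography block", 1),
    ("Room 5", "Maths block", 0),
    ("Room 7", "IT block", 1),
    ("Room 21", "IT block", 1),
    ("Library", "Sixth form centre", 0),
    ("Room 27", "Science corridor", 1),
]
conn.executemany(
    "INSERT INTO Room (RoomCode, Location, IT_facilities) VALUES (?, ?, ?)", rooms
)

# UPDATE with a WHERE clause: only the matching record changes.
with_where = conn.execute(
    "UPDATE Room SET Location = 'Humanities block' WHERE RoomCode = 'Room 2'"
).rowcount

# UPDATE without a WHERE clause: every record in the table changes.
without_where = conn.execute("UPDATE Room SET Location = 'Unknown'").rowcount

# After the second UPDATE, every room shares the same Location value.
locations = [row[0] for row in conn.execute("SELECT DISTINCT Location FROM Room")]

print(with_where, without_where, locations)
```

The first UPDATE changes a single record, while the second overwrites the Location of all six rooms, including the change just made to Room 2.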
Worked example
To update all of the rooms in the IT block to instead be in the Computer Science block, the following SQL can be used:

UPDATE Room
SET Location = "Computer Science block"
WHERE Location = "IT block"

The Room table would then have the following data:

RoomCode  Location                IT_facilities
Room 2    Geography block         True
Room 5    Maths block             False
Room 7    Computer Science block  True
Room 21   Computer Science block  True
Library   Sixth form centre       False
Room 27   Science corridor        True

To update Room 5 to instead be called Maths Room 1, the following SQL can be used:

UPDATE Room
SET RoomCode = "Maths Room 1"
WHERE RoomCode = "Room 5"

The Room table would then have the following data:

RoomCode      Location                IT_facilities
Room 2        Geography block         True
Maths Room 1  Maths block             False
Room 7        Computer Science block  True
Room 21       Computer Science block  True
Library       Sixth form centre       False
Room 27       Science corridor        True

Data can also be deleted from a table using the DELETE FROM keywords. As with the UPDATE keyword, it is important when using this to also include a suitable WHERE clause to decide which records will be deleted.

Worked example
To delete all of the rooms that do not have IT facilities, the following SQL can be used:

DELETE FROM Room
WHERE IT_facilities = False

The Room table would then have the following data:

RoomCode  Location                IT_facilities
Room 2    Geography block         True
Room 7    Computer Science block  True
Room 21   Computer Science block  True
Room 27   Science corridor        True

As previously mentioned, the use of a WHERE clause is vital when using the delete command unless you intend to delete all records. The following SQL, without using WHERE, would delete all records and leave the Room table empty:

DELETE FROM Room

Knowledge check
3 What do the following SQL command words do?
(a) SELECT
(b) WHERE
(c) FROM
(d) INSERT INTO

RECAP AND REVIEW

7 RELATIONAL DATABASES AND STRUCTURED QUERY LANGUAGE (SQL)

Important words
You will need to know and understand the following terms:
database
record
field names
table
relational database
primary key
foreign key
data redundancy
data inconsistency
Structured Query Language (SQL)

7.1 Relational databases

Databases
● A database is a persistent store of organised data.
● A record is a data structure that allows multiple data items to be stored.
● Field names are used to identify each item of data.
● Data is organised using field names and stored in a table.

Relational databases
A relational database uses more than one table. Each table stores data about one entity (a person, object or thing). For example, a car rental database may have a table for cars, a table for customers and a table for staff members.

Each table in a relational database has a primary key field – this is the field that uniquely identifies the record – no other record will have the same data in this field.

To link tables together, a foreign key is used. A foreign key is the primary key from one table stored in another table. For example, a car rental table will have the primary key of CarRegistrationNumber but may also have a foreign key of StaffID. This will link to the staff table to show who is responsible for that car.

A relational database is used to remove the problems of data redundancy and data inconsistency.
● Data redundancy is where the same data is repeated in a database table.
● Data inconsistency is where repeated data is contradictory, so users of the database do not know what the actual correct value should be.

7.2 Structured Query Language (SQL)

The use of SQL to search for data
Structured Query Language (SQL) is a language used to access data stored in a database.
● SELECT identifies the fields to return from the database.
● FROM identifies which table(s) the data will be returned from.
● WHERE is an optional command that allows the programmer to include criteria, with only matching records being returned. It is also used to join and search across multiple tables.
● ORDER BY is an optional command that allows the programmer to sort the data returned into ascending or descending order using the ASC or DESC keywords.

For example:

SELECT names, points, bonus
FROM games
WHERE points > 100
ORDER BY names DESC

Searching for data using multiple tables
An additional WHERE clause is used to join together the primary key in one table to the foreign key in another related table. If both field names are the same, the table name should be placed before the field name, separated by a full stop (e.g. Student.RegClass).

For example, the following SQL query will join together the games and players tables, using playerid as the primary key in the players table and the foreign key in the games table.

SELECT *
FROM players, games
WHERE players.playerid = games.playerid

Additional criteria can be added using an AND keyword, for example to find all games that ended with zero points:

SELECT *
FROM players, games
WHERE players.playerid = games.playerid
AND points = 0

The use of SQL to insert, update and delete data
The INSERT INTO and VALUES keywords are used to insert new records into a table. For example:

INSERT INTO games (name, points, bonus)
VALUES ("Seth", 100, 20)

The UPDATE, SET and WHERE keywords are used to update data in a table. For example:

UPDATE games
SET points = 50, bonus = 15
WHERE name = "Zoe"

The DELETE FROM and WHERE keywords are used to delete data from a table. For example:

DELETE FROM games
WHERE points < 10

QUESTION PRACTICE

7 Relational databases and Structured Query Language (SQL)

01 The table below shows the data table ‘sales’.
RecordID  ItemName  Supplier  Price  QuantityStock
231       Fork      Hodden    0.25   3000
311       Tray      Bilton    8.50   10
232       Plate     Hodden    0.12   1200
421       Spoon     Caddon    1.20   55
331       Dish      Bilton    6.35   15

01.1 What will be returned by the queries: [6 marks]
(i) SELECT ItemName, Price FROM sales WHERE Supplier = "Bilton"
(ii) SELECT RecordID, ItemName, Supplier FROM sales WHERE Price < 1.00
(iii) SELECT RecordID, ItemName FROM sales WHERE QuantityStock > 1000

01.2 Write a query to print out the RecordID, ItemName and Supplier for all items where the stock level is less than 50. [4 marks]

02 The data table ‘Cities’ is shown below.

City        Country  Population  Currency
Birmingham  UK       1100        UKP
Birmingham  USA      212         USD
Detroit     USA      689         USD
Paris       France   2244        EURO
Barcelona   Spain    1602        EURO

What is returned by the query:
02.1 SELECT City, Population FROM Cities WHERE Currency = "UKP" [1 mark]
02.2 SELECT City, Country FROM Cities WHERE Population > 1500 [2 marks]

Write a query to return:
02.3 The city and population for all cities where the currency is the euro. [4 marks]
02.4 All the data about any city with a population less than 1000. [4 marks]

8 ETHICAL, LEGAL AND ENVIRONMENTAL IMPACTS OF DIGITAL TECHNOLOGY ON WIDER SOCIETY

CHAPTER INTRODUCTION
In this chapter you will learn about:
8.1 The impacts and risks of digital technology on society:
➤ Ethical impacts
➤ Legal impacts
➤ Environmental impacts
➤ Data privacy

Tech term
Immoral: Not conforming to accepted standards of behaviour.

8.1 Ethical, legal and environmental impacts and risks of digital technology

Ethical impacts
Ethics refer to what people believe to be right and wrong. While most people probably agree about most things that constitute ethical behaviour, there is not always a single agreed definition. The ethical use of computer technology is about acting in a responsible way and ensuring that no harm is caused to others. Ethics are not the same as legalities – something may be immoral but not illegal.
However, a good legal system will be based on an ethical approach.

Ethics are to some extent a personal thing but there are codes of ethics laid down by various organisations including associations of computing professionals. The BCS (British Computer Society) has some fairly typical ethical standards that it recommends computing professionals should adhere to. They include not undertaking work that is beyond the individual’s capability, not bringing the profession into disrepute, avoiding injuring others and not taking bribes.

Another organisation, The Computer Ethics Institute, lists ten commandments for computer ethics:
1 Thou shalt not use a computer to harm other people.
2 Thou shalt not interfere with other people’s computer work.
3 Thou shalt not snoop around in other people’s computer files.
4 Thou shalt not use a computer to steal.
5 Thou shalt not use a computer to bear false witness.
6 Thou shalt not copy or use proprietary software for which you have not paid.
7 Thou shalt not use other people’s computer resources without authorisation or proper compensation.
8 Thou shalt not appropriate other people’s intellectual output.
9 Thou shalt think about the social consequences of the program you are writing or the system you are designing.
10 Thou shalt always use a computer in ways that ensure consideration and respect for your fellow humans.

Legal impacts
The widespread use of computers has had a legal impact, meaning that new laws have had to be constructed in response to technological changes. Computers are associated with a wide range of existing and new criminal activities including:
● unauthorised access to data and computer systems for the purpose of theft or damage
● identity theft
● software piracy
● fraud
● harassment, such as trolling.

Various laws describe the rules that computer users must obey.
These laws are designed to prevent the misuse of computer systems and they can vary from country to country. Breaking these laws can be a criminal offence.

Beyond the spec

Legislation relevant to Computer Science

Tech term
GDPR: A Europe-wide law enforcing an individual’s rights over their data.

The Data Protection Act 2018
Computers hold vast amounts of data and it is important this data is collected, stored and processed in ways that protect the individual. The Data Protection Act (DPA) 2018 sets out rules for handling this personal data and is the UK’s implementation of the General Data Protection Regulation (GDPR). Every organisation holding personal data, apart from those with specific exemptions, must register with the Information Commissioner’s Office and disclose what data they are holding, why they are collecting it and how it will be used.

The seven principles of the DPA are:
● Lawfulness, fairness and transparency: There must be valid reasons for collecting and using personal data.
● Purpose limitation: The purpose for processing the data must be clear from the start.
● Data minimisation: Data being processed must be adequate, relevant and limited to what is necessary.
● Accuracy: Steps must be taken to ensure data is accurate, up to date and not misleading.
● Storage limitation: Data must not be kept for longer than necessary.
● Security: There must be adequate security measures in place to protect the data held.
● Accountability: The data holder must take responsibility for how the data is used and for compliance.

Exemptions are granted to specific sectors including national security, scientific research, financial services and the Home Office.
Computer Misuse Act 1990
Under the provisions of the Computer Misuse Act 1990 it is a criminal offence to make any unauthorised access to computer material:
● … on its own
● … with intent to commit further offences (for example blackmail)
● … with the intent to modify the computer material (for example distributing viruses).

The first provision refers to unauthorised access (commonly called hacking). The provision on modification refers to anything that impairs the performance of a computer system including the distribution of viruses.

Copyright, Designs and Patents Act 1988
The Copyright, Designs and Patents Act 1988 protects the intellectual property of an individual or organisation. Under the act it is illegal to copy, modify or distribute software or other intellectual property without the relevant permission. This act also covers video and audio where peer-to-peer streaming has had a significant impact on the income of the copyright owners. Using the internet to download free copies of copyright material (e.g. software, films, books, music) is illegal since no money or credit will have been passed on to the original creator.

Environmental impacts
The use of technology also has an environmental impact. Most modern computers consume low levels of electricity but are often left running permanently. Data centres, which are large facilities that store all sorts of data (like Instagram accounts, YouTube videos, etc.), account for around 2% of all energy used on the planet – this is the same amount as for air travel. In addition, energy is of course also used to manufacture a computer in the first place.

As with all consumer electronics, computers are made from valuable physical resources such as metals and minerals, some of which are very rare and non-renewable. Computers also include some highly toxic material, such as airborne dioxins, polychlorinated biphenyls (PCBs), cadmium, chromium, radioactive isotopes and mercury.
Their disposal raises environmental issues and needs to be handled with great care. Unfortunately, old computer equipment is often shipped to countries with lower environmental standards, to reduce the cost of disposal, and they end up in landfill sites. In some cases, children pick over the waste to extract metals that can be recycled and sold, thus exposing them to significant danger.

Figure 8.1 Discarded computer equipment is often picked over to extract metals

However, there are also a number of positive effects on the environment from computer use, including:
● A reduction in the use of paper, with on-screen reading replacing the need for physical copies of documents.
● Laptops and the internet allow people to work from home, reducing the need for travel which reduces energy consumption and CO2 emissions.
● Computers are essential tools for scientific research into the development of renewable energy and more energy-efficient devices.
● Smart metering constantly monitors and accurately reports energy and water use.
● Computers are also essential for the management of renewable and low energy-use technology.

Privacy issues
Most people agree they have a right to some degree of privacy. However, we often provide lots of personal information to organisations whenever we access the internet, particularly when signing up to services with accounts. Organisations may even share this information with third parties – which we may have accidentally agreed to when we set up an account.

Personal details and details of activities are often willingly shared on social media. People do not always realise how much personal information they are sharing and exactly what can be seen by whom. For instance:
● Whenever we check in on social media the location and time is logged.
● Many apps track the location of your mobile phone.
● Whenever we take a picture with our phone’s camera, the location and time are also logged. When such images are uploaded to social media sites, the companies are able to access this information and automatically scan images to try and work out who was in the picture through facial recognition algorithms.

Knowledge check
1 Is it reasonable for organisations to demand access to and monitor social network pages where the content is posted from private computers?
2 Discuss the environmental impact of computer use.
3 Identify two ways that individuals might be monitored in their daily life.
4 What issues may result from unwise posts on a social media site?

The impacts of digital technology on wider society
The widespread use of computer technology in all aspects of daily life has brought many benefits for the individual and society. Computer systems are now involved in most human activities. You will need to consider the impacts of the following technology applications.

Cyber security
Cyber security is the technology used to protect networks, individual users and data from attack or damage. The nature of these attacks and the measures to deal with them are covered in Chapter 6. The Computer Misuse Act 1990 makes hacking and intentional damage to computers or data illegal.

Ethical impacts
● Individual privacy and security
● Identity theft
● Loss of or damage to personal or corporate data
Legal impacts
● Legislation: Computer Misuse Act
● Hacking and distribution of viruses are criminal acts
Environmental impacts
● Hacking industrial and public utility systems may have severe environmental consequences

Mobile technologies
Mobile technologies are very useful and a key part of modern life, but they can be tracked, and calls and messages can be intercepted and used by the authorities. This can be seen as a positive when it comes to tracking criminal activity but as a threat to personal privacy.
Ethical impacts
● Sharing personal data can have unforeseen and potentially harmful consequences
● Personal privacy issues with tracking and use of data
Legal impacts
● Tracking criminal behaviour or tracking individuals
● Trolling and other illegal or abusive activities
Environmental impacts
● Mobile devices use large amounts of rare and harmful materials
● Many devices are not sent for recycling, increasing the demand for these materials

Wireless networking
Most places have extensive provision for wireless technology, most commonly to provide internet access for mobile devices but also for smart monitoring of all kinds of infrastructure. Smart metering is seen as having a positive impact on energy use. Wireless technology can be vulnerable to attack, with messages (including sensitive information such as passwords) susceptible to being intercepted. Attacks on wireless infrastructure networks could cause widespread disruption of public services. See Chapter 5 for more details about wireless networking and the use of encryption to protect data on these networks.

Ethical impacts
● Wireless networking can be vulnerable to eavesdropping, providing access to sensitive and personal data
Legal impacts
● Accessing data from wireless networks would be an offence under the Computer Misuse Act
Environmental impacts
● Wireless networking can be vulnerable to attack, causing widespread disruption

Cloud storage
Cloud storage is file storage that is accessed via the internet, meaning that data can be accessed from any computer anywhere in the world. The drawbacks of cloud storage include the need for internet access and the individual’s lack of control over the security of the files. The security of data is the responsibility of the cloud service provider and the storage may be in countries with different data protection laws.
See Chapter 4 for more information about cloud storage.

Ethical impacts
● Many people store personal information which may be vulnerable to hacking and misuse
Legal impacts
● Many data centres are outside the UK, EU or USA and different data protection standards may apply
Environmental impacts
● Large data centres use a significant amount of energy to store vast quantities of data

Hacking
Hacking is the unauthorised access to computer material and is covered by the Computer Misuse Act 1990. Data stored on cloud servers may be more vulnerable to hacking than that stored on physical media on the user’s own computer.

Ethical impacts
● The privacy of an individual may be compromised from hacking, illegally accessing personal data
Legal impacts
● The Computer Misuse Act makes hacking illegal
Environmental impacts
● Hacking of public utilities can compromise systems, causing widespread disruption to public services

Wearable technologies
There are an increasing number of wearable technology devices such as activity monitors and smart watches. These devices track movements and health data indicators, which are often shared with friends, family and others via websites or apps.

Ethical impacts
● Personal privacy may be compromised through tracking of a wearer’s activity
Legal impacts
● Illegally accessing this data is outlawed in the UK under the Computer Misuse Act
Environmental impacts
● Wearable devices, like all mobile devices, use rare and harmful materials, meaning careful recycling is important

Computer-based implants
Computer-based implants are often used for medical reasons, to assist with some conditions. These include:
● to help with partial hearing
● heart monitoring and regulation
● enhancements to help the partially sighted
● to control robotic limbs.
Ethical impacts
● Currently few devices exist that present any ethical issues
Legal impacts
● Currently few devices exist that present any legal issues
Environmental impacts
● As with all mobile devices, these use rare and harmful material requiring specialised recycling

Autonomous vehicles
An autonomous vehicle is another name for a driverless car. The driverless car is becoming a reality – there are many current trials of autonomous vehicles. While autonomous vehicles can communicate with each other and cooperate effectively to provide a safe environment, there are issues with these vehicles dealing with the less predictable behaviour of human drivers. There are issues around the legal responsibility for any potential incidents – is the occupant of the car responsible for what the computer system does, or is it the manufacturer of the car or the programmer of the device? It must be remembered that most car accidents currently are due to human error.

Ethical impacts
● Autonomous vehicles use artificial intelligence (AI) to monitor other vehicles and road users; what happens to that data?
Legal impacts
● In the case of an accident, who is responsible, the owner of the vehicle or the developer of the AI system used to operate the vehicle?
Environmental impacts
● Where the majority of the vehicles are autonomous, they can communicate and cooperate, providing a more efficient means of transport

Knowledge check
5 Write down two advantages and two disadvantages of using social media every day.
6 Identify two advantages of monitoring an individual’s internet searches for an online store.

Key point
These topics are often asked as open-discussion questions in which you are expected to give a balanced argument. Think about both sides of the issue and present a conclusion based on the balance of the evidence. These questions may well be asked within a particular context, so make sure your answers reflect this context rather than simply writing down a list of bullet points.
It is important to consider the point of view you are expected to take and respond based on evidence seen from that person’s or organisation’s perspective. What may be an advantage from one point of view may be a disadvantage from another.

RECAP AND REVIEW

8 ETHICAL, LEGAL AND ENVIRONMENTAL IMPACTS

Important words
You will need to know and understand the following terms:
ethics
cyber security
hacking
mobile technologies
wireless networking
cloud storage

8.1 Ethical, legal, environmental impacts and risks of digital technology

Ethical impacts
Ethics refer to what is right and wrong and how people should behave. The ethical use of computer technology is about acting in a responsible way and ensuring that no harm is caused to others. Ethics are not the same as legalities – something may be immoral but not illegal. Computer organisations such as BCS have codes of conduct that prescribe suitable ethical behaviour for their members.

Legal impacts
Laws are drawn up to govern activities and control computer crime. Criminal activities associated with computers include:
■ unauthorised access to data and computer systems for the purpose of theft or damage
■ identity theft
■ software piracy
■ fraud
■ harassment, such as trolling.

Laws relevant to Computer Science include:
■ The Data Protection Act 2018: Sets out seven key principles that should be central to processing personal data.
■ The Computer Misuse Act 1990: This act makes it a criminal offence to access or modify computer material without authorisation and includes hacking and the distribution of malware.
■ The Copyright, Designs and Patents Act 1988: Protects the intellectual property of an individual or organisation, making it illegal to copy, modify or distribute software or other intellectual property such as music and video without permission.

Environmental impacts
The negative environmental impacts of widespread computer use include:
■ There are large global energy requirements to run computer systems and data centres.
■ Rare and non-renewable metals and minerals need to be used.
■ Some computer components are made from toxic materials that are a hazard to the environment and to human health if not disposed of properly.
The positive environmental impacts of widespread computer use include:
■ Homeworking reduces the need to travel, which reduces CO₂ emissions.
■ More on-screen documents mean a reduction in the use of paper and other resources.
■ Computers enable scientific research, which leads to more environmentally friendly technologies, such as electric cars, the design of solar panels and so on.

Privacy issues
Using computers raises concerns about individual rights to privacy. Ways in which individuals are monitored include:
■ Companies can monitor exactly what their workforce are doing on their computers.
■ Use of CCTV and facial recognition.
■ Automatic number plate recognition (ANPR).
■ Websites can track a lot of information about your internet activities: your location, your browser, your IP address, your operating system, what websites you have visited and what you have searched for. This data might be used to provide insights, for example to target advertising.
■ Mobile phone companies are able to track an individual’s location from their mobile phone, even if they are not using it.
■ Mobile phone call records are also stored and can be accessed by law enforcement agencies if requested.
■ With the wrong privacy settings, social media activity is available for anyone to see.

The impacts of digital technology on wider society
You will be expected to think about all of the issues above in relation to the following:
■ cyber security
■ mobile technologies
■ wireless networking
■ cloud storage
■ hacking
■ wearable technologies
■ computer-based implants
■ autonomous vehicles.
QUESTION PRACTICE
8 Ethical, legal and environmental impacts of digital technology on wider society
01 There has been a lot of discussion about the benefits of facial recognition and the invasion of privacy brought about by the use of facial recognition. Discuss the impacts of the use of facial recognition around a busy transport hub. [6 marks]
02 Lots of people use wearable devices to track activity. Describe the advantages and disadvantages to the individual when using these devices. [6 marks]
03 Driverless cars are becoming a reality. Discuss the main issues related to the use of driverless cars, considering the ethical and legal aspects. [6 marks]
04 Identify two computer-based implants and describe the benefits and drawbacks to the individual of these devices. [6 marks]
05 Many people use a cloud-based storage system to back up their mobile telephone. Discuss the advantages and disadvantages of cloud storage for the individual. [4 marks]

GLOSSARY
4-layer TCP/IP model: A way of organising protocols according to their function.
Abstraction: Removing unnecessary or irrelevant detail from a problem.
Bit (b): A binary digit, 0 or 1, symbol b.
Black-box penetration test: A test designed to simulate an external attack where the hacker has no knowledge of any access credentials.
Access rights: The permissions that a user on a computer system has to view, write, modify and delete files, change configurations, and add or remove applications. Different types of users may have different permissions.
Blagging: Also referred to as pretexting, the use of an invented scenario to obtain sensitive data.
Algorithm: A sequence of instructions.
Boolean logic: Form of algebra in which all values are True or False.
Amplitude: The fluctuation of a sound wave that determines the loudness of the sound.
Boolean: A data type that can either be True or False.
Named after George Boole, an English mathematician and logician.
Analogue: Continuously changing values.
Boolean operators: Such as AND, OR, NOT, used to compare multiple conditions.
AND gate: An AND gate output is 1 only if the two inputs are also both 1.
Boundary test data: Values that sit at the very end of the expected range.
Anti-malware software: Software installed on a computer system to protect it against malware and unauthorised access.
Buses: Communication channels through which data moves around the CPU.
Application layer: Layer in the TCP/IP model concerned with network applications.
Bus network topology: A network arrangement where each device is connected by a single cable known as the backbone.
Application software: The programs used to carry out different tasks on a computer, such as word processors, media players and browsers.
Byte (B): A group of 8 bits, symbol B.
Arithmetic Logic Unit (ALU): The component of a CPU that performs arithmetic and logical operations.
Call: Instruct the program to stop execution of the current line and instead execute a subroutine. After the subroutine has finished, execution returns to the main program.
Arithmetic operators: Such as +, −, *, / etc. Used to perform mathematical operations.
Array: A series of items of data that are grouped together under one identifying name or label. 1D arrays label each item of data using one index number. 2D arrays label each item of data using two index numbers.
Cache memory: A small area of memory that can be accessed quickly. Used to store commonly used instructions or data.
CAPTCHA: A challenge–response test used to determine if a user is human.
Central Processing Unit (CPU): A collection of billions of electronic switches that process data, execute instructions and control the operation of the computer.
ASCII: A 7-bit code to represent a set of characters available to a computer.
Character: A single item from the computer character set, for example ‘h’, ‘?’.
Assembler: Converts assembly language into machine code.
Character code: A numeric code that represents one character in a character set such as ASCII.
Assignment: Variables and constants are assigned values, often using the ‘=’ operator.
Authentication: Establishing a user’s identity.
Binary: A number system based on 2, using the digits 0 and 1 only.
Binary shift: Moving the binary digits one place to the left multiplies the number by 2, one place to the right divides the number by 2.
Biometric: The use of a person’s unique physical features to authenticate them.
Character set: The complete set of characters available to a computer.
Clock: The CPU clock synchronises all components on the computer motherboard.
Clock speed: The number of cycles per second measured in Hz.
Cloud: Storage, services and applications that exist on the internet.
Cloud storage: Data stored on remote servers accessed via the internet.
Colour depth: The number of bits used for each pixel.
Commenting: Comments written by programmers. They are ignored by the computer when the program is run but can make code much easier to understand.
Comparison operators: Such as >, <, ==, etc., used to compare values.
Fetch–Execute cycle: The basic operation of the CPU. It continually fetches, decodes and executes instructions stored in memory.
Fibre-optic cables: Cables made up of thin glass strands which transmit data using light.
Field: A single item of data in a record.
Compiler: A translator that decodes all of the lines of code in a high-level language to produce an executable file which can then be run by the computer. Programs only need compiling once.
Field name: The label given to each data field in a record.
Computer network: Two or more computers or devices that are linked together to communicate and share resources.
Firewall: Software and/or hardware that inspects and controls all inbound and outbound network traffic.
Concatenation: Joining two or more strings together to form one new string.
Flowchart: A visual way of representing inputs, processes and outputs of an algorithm.
Constant: A label for an area of memory that stores a value that does not change during execution of a program.
Foreign key: A field that relates to a primary key field in another table.
Control unit (CU): A component of a CPU that coordinates the activity of the CPU.
Hacking: Unauthorised access to computer material.
Copper cables: Cables made up of eight individual copper wires used for wired connections in a network.
Cyber security: Technology used to keep networks, users, computers and data safe from attack, damage and unauthorised access.
Database: A persistent and organised store of data.
Data inconsistency: Where data is stored twice in a database but with different values so the true value is unknown.
Data redundancy: Where the same data appears twice in a database, wasting space.
File Transfer Protocol (FTP): A protocol for transferring files over the internet.
Hard disk drive (HDD): A non-volatile, magnetic storage device.
Hardware: The physical components of a computer.
Hexadecimal: A number system based on 16, using digits 0–9 and letters A–F to represent the decimal values 0–15.
High-level language: A programming language which is written using English-like statements that are easier for programmers to work with. High-level languages, such as Python, Java, C# and VB.Net, must be translated into machine code before a computer can run them.
Data structure: A collection of data values given one name.
Huffman coding: A minimal variable-length character coding based on the frequency of each character.
Data validation: Limiting data that can be used by checking against rules.
Hypertext Transfer Protocol (HTTP): A protocol used to transfer data between a web browser and web servers.
Decimal: A number system based on 10.
Declare: Tell the language that the variable is to be used and give it a data type.
Hypertext Transfer Protocol Secure (HTTPS): A secure version of HTTP that adds Secure Socket Layer (SSL) encryption to the data.
Decomposition: Breaking a problem into sub-problems.
Identifier: Another word for the name of a variable or constant.
Definite iteration: Loop for a specified number of times, for example FOR loops.
Image size: The width in pixels x height in pixels for an image.
Efficiency: The number of steps (and therefore time) required for an algorithm to solve a problem.
Embedded system: A computer system that forms part of an electronic device.
Encryption: The process of encoding a message or information so that only authorised persons can access and understand it.
Indefinite iteration: Loop until a particular condition is met, for example WHILE loops.
Input–Process–Output: Problems and algorithms can be broken down into inputs, processes and outputs.
Integer division (e.g. DIV): An operator that returns the whole number after a division, for example 11 DIV 2 = 5.
Integers: Whole numbers, for example 302.
Erroneous test data: Values that are of the wrong type, for example a string when the value should have been an integer.
Internet: A global network of networks that connects computer systems together across the world.
Ethernet: A protocol used to connect devices in a LAN over a wired connection.
Internet layer: Layer in the TCP/IP model concerned with addressing and routing data packages.
Ethics: Moral principles that govern a person’s behaviour.
Internet Message Access Protocol (IMAP): A protocol for accessing email messages on a mail server without having to download them to your device.
Internet Protocol (IP): A set of rules for sending and receiving data across the internet.
Interpreter: A translator that decodes one line of high-level language code and then runs it before moving on to the next line. Programs need to be interpreted each time they are run.
Iteration: Repetition of sections of code.
Layers: A way of separating the different activities involved in communication over the internet.
Modulus division (e.g. MOD): An operator that returns the remainder after a division, for example 11 MOD 2 = 1.
Nested iteration: Iteration inside another iteration construct (such as a FOR loop within another FOR loop).
Nested selection: Selection inside another selection statement (such as an IF within an IF statement).
Network security: Methods used to prevent unauthorised access to computer networks.
Network topology: The way in which devices are arranged and connected together in a network.
Nibble: A group of 4 bits, half a byte.
Legal: Actions permitted or denied by force of law.
Normal test data: Values that sit within the expected range.
Length (string): A string-handling operation that counts how many characters are contained in a string.
NOT gate: A NOT gate output reverses the input.
Link layer: Layer in the TCP/IP model concerned with forwarding data packets within a network.
Local area network (LAN): A network of computers in a small geographic area, such as a single building.
Local variables: Variables that are only accessible inside the module where they are located.
Logic circuit: A circuit designed to perform complex functions using basic logic gates.
Logic error: Something that causes the program to behave in an unexpected way.
Operating systems: Software that communicates with the hardware and allows other programs to run.
Operators: Symbols or words in a program that are reserved for particular actions.
Optical storage: Storage devices that use laser light to read and write data.
OR gate: An OR gate output is 1 if one or the other, or both, of the two inputs are 1.
Parameters: Values that a main program sends to subprograms for them to use.
Looping: Another name for iteration.
Password systems: A memorised set of characters used to confirm the identity of a user.
Lossless compression: Compression technique that does not lose any of the original data and the original file can be recovered.
Patching: A software update designed to update, fix or improve a program.
Lossy compression: Compression technique that removes some of the original data. The original cannot be recovered.
Penetration testing: Testing a computer network for vulnerabilities that an attacker could exploit. It is an authorised activity also known as ethical hacking.
Low-level language: A programming language whose lines of code directly correspond to the CPU’s hardware processes. This means it has little or no abstraction, for example machine code, assembly language.
Machine code: A language using binary codes that a computer can respond to directly.
Malware: Software programs designed to cause damage to a computer system or steal information. It is short for ‘malicious software’.
Media Access Control (MAC) address: A number programmed into the network interface controller (NIC) that uniquely identifies each device on a network.
Memory management: The process in which the operating system assigns blocks of memory to programs that are running in order to optimise system performance.
Mobile technology: Technology that the user carries with them, such as smartphones, tablets and smartwatches.
Modularised programming: Splitting a program into separate modules to make implementation and maintenance easier.
Personal area network (PAN): A computer network connecting devices near to an individual for their personal use.
Pharming: A form of attack where users are directed to a fake website.
Phishing: The use of fake emails or messages to obtain sensitive data.
Pixel: The smallest element of an image. Pixels are the dots that make the image on the screen.
Prefixes: kilobyte (KB), megabyte (MB), gigabyte (GB), terabyte (TB), petabyte (PB). Naming convention based on multiples of 1000.
Primary key: A field that holds a unique value; its value is not shared by any other record.
Processor cores: Multiple processor cores may be able to process multiple instructions simultaneously.
Program code: Instructions given to a computer in a high-level language.
Programming language: A language used to write instructions for a computer.
Shouldering: Also referred to as shoulder surfing, obtaining information such as a PIN by watching the person enter it.
Protocol: Set of rules that determine how data is transmitted between devices on a network.
Simple Mail Transfer Protocol (SMTP): A protocol used to send emails from an email client, such as Outlook, to a mail server.
Pseudo-code: Similar to a high-level programming language but without strict syntax rules.
Social engineering: Methods used to trick people into divulging sensitive or confidential information.
RAM: The main memory of a computer that stores data, applications and the operating system while in use. When the power is turned off, RAM loses its data.
Solid-state drive (SSD): A non-volatile storage device using solid-state storage.
Random numbers: A number returned by a program that cannot be predicted in advance. A series of random numbers would not display any patterns at all.
Read and write: Data can be both read and written by the computer.
Solid-state storage: Storage device using electronic switches to store data.
Sorting algorithm: bubble sort, merge sort. Different algorithms that can sort a list of unordered data.
Software: The programs that run on the computer.
Read-only: Data cannot be written by the computer.
Spyware: Malicious software, such as keyloggers, that is designed to gather data from a computer.
Real numbers: Numbers with a decimal point, for example 4.0, 302.81. Also known as floating point numbers.
SQL commands: SELECT, FROM, WHERE. Used to interrogate databases.
Record: A series of data fields.
Registers: Very small memory locations within the CPU that temporarily store data and can be accessed very quickly.
Star network topology: A network arrangement where each computer or client is connected to a central point, usually a switch or hub.
Relational database: A database consisting of multiple linked tables.
String: A collection of items from the computer character set, for example ‘hello’.
Relational operators: See Comparison operators.
String conversion: Converting a string to another data type (such as an integer or Boolean).
Removable media: Storage devices that can be removed from a computer while it is running.
Returning a value: To pass a value back from a function.
String handling: Modifying or manipulating strings, such as joining them together (concatenation) or finding their length.
Robust program: A program that will not crash or malfunction even when unexpected input is given by a possibly malicious user.
Structured Query Language (SQL): The language used to pass instructions to databases.
ROM: Read-only memory used to store data and programs to initialise a computer system.
Substring: A part of a string, for example the first three characters of HELLO would be HEL.
Run-length encoding (RLE): Data compression technique that converts consecutive identical values into a code consisting of the character and the number of characters in that sequence.
Syntax error: Something that breaks the grammatical rules of the programming language.
Sample: A digital representation of an analogue signal at a moment in time.
Systems software: The files and programs that make up the operating system of a computer.
Table: A collection of records in a database.
Sample rate: The number of times a sound is sampled per second, measured in Hz.
Test data: Values that are used to check that a program behaves correctly.
Sample resolution: The number of bits per sample.
Testing: Checking that a program does what it should for every possible input and output.
Searching algorithm: linear search, binary search. Different algorithms that can search a list of data.
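The two searching algorithms named in the entry above can be sketched in Python. This is a minimal illustration under stated assumptions, not a listing from the book: the function names linear_search and binary_search, the sample list and the convention of returning -1 when nothing is found are all choices made for this example. Binary search assumes the list is already sorted.

```python
def linear_search(items, target):
    """Check each item in turn from the start; return its index, or -1 if absent."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1  # reached the end of the list without a match


def binary_search(sorted_items, target):
    """Repeatedly halve the search range of a sorted list; return the index, or -1."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


data = [3, 7, 11, 18, 24, 30]  # hypothetical sorted sample data
print(linear_search(data, 18), binary_search(data, 18))
```

Linear search works on any list, while binary search needs sorted data but discards half of the remaining items on every comparison.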
Secondary storage: Non-volatile storage for files and programs.
Secure program: A program that only allows access to authorised users.
Selection: The decision-making process in a program.
Sequence: Execution of statements in a program one after another.
Trace table: Records the values of variables and outputs for each line of code, used when checking a program to make sure it is correct.
Translator: A program that decodes a high-level language into machine code. There are three types of translators: assemblers, compilers and interpreters.
Transmission Control Protocol (TCP): A protocol that splits the data from applications into smaller data packets that can be sent across a network.
Transport layer: Layer in the TCP/IP model concerned with controlling communication between two hosts.
Trojan: Malicious software that presents as legitimate software.
Truth table: A table listing all possible inputs and outputs for a logic circuit.
Unicode: A character set using 16/32 bits to represent a range of language symbols. There are several billion possible character codes available to Unicode.
User Datagram Protocol (UDP): A protocol used for streaming data across the internet.
Utility software: A collection of programs, each of which does a specific housekeeping task to help maintain a computer system.
Variable: A label for an area of memory that stores a value that can change during execution of a program.
Virus: Malicious software hidden within another program that can replicate itself.
Volatile and non-volatile: Volatile means data is lost when the power is removed. Non-volatile memory retains data even when the power is turned off.
Von Neumann architecture: The most common organisation of computer components, where instructions and data are stored in the same place.
White-box penetration test: A test simulating a malicious insider who has knowledge of the target system and is likely to have basic access credentials.
Wide area network (WAN): A network of computers that spans a large geographic area, often between cities or even across continents.
Wi-Fi: A set of protocols that allows devices to communicate using radio waves.
Wired networks: Networks using cables to connect devices together.
Wireless access point (WAP): A device that allows devices to connect to a network using Wi-Fi.
Wireless networks: Computer networks that use radio waves to connect to devices, instead of cables.
XOR gate: An XOR gate output is 1 if one or the other, but not both, of the two inputs are 1.

KNOWLEDGE CHECK ANSWERS

1 Fundamentals of algorithms
1 (a) Decomposition is breaking a problem down into smaller sub-problems. (Once each sub-problem is small and simple enough, it can be tackled individually.)
(b) Abstraction is removing or hiding unnecessary details from a problem (so that the important details can be focussed on or more easily understood).
2 Steps may include:
• generating the text for the message
• encryption method
• generating or selecting the key
• encrypting the text
• sending the encrypted text
• sharing the key
• decrypting the text.
3 Answers might include:
• The details of the game are removed.
• Players’ details are limited to name.
• Winner and loser are identified without further details.
• No ranking involved.
4 Trace table:
Line 01: NumOne = 8
Line 02: NumTwo = 5
Line 04: Output = 5
Comments: Error – outputs smaller number, should output NumOne in line 04 and NumTwo in line 06. Either change lines 04 and 06 or change the condition in line 03 to ≤
5 It compares pairs of values starting at the beginning of the list. If they are out of order, they are swapped and a flag is set to say a swap has taken place. It repeats checking the list and swapping values that are out of order until the list is in order and no swaps are made.
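The bubble sort described in answer 5 can be sketched in Python as follows. This is an illustration only, not the book's own listing: the function name bubble_sort and the sample list are assumptions chosen for this example. The swapped variable plays the role of the flag mentioned in the answer.

```python
def bubble_sort(values):
    """Return a sorted copy of values using bubble sort with a swap flag."""
    items = list(values)   # work on a copy so the caller's list is unchanged
    swapped = True
    while swapped:         # keep making passes until a pass makes no swaps
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                # the pair is out of order, so swap it and set the flag
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items


print(bubble_sort([9, 6, 2, 5, 8]))  # hypothetical sample data
```

The final pass makes no swaps; that is how the algorithm knows the list is in order and can stop.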
Pass 1: 6, 2, 5, 8, 9 swap made
Pass 2: 2, 5, 6, 8, 9 swap made
Pass 3: 2, 5, 6, 8, 9 no swap, list is sorted

During the divide stage, the list to be sorted is split into two sub-lists of approximately half each. This division is then repeated on the sub-lists until each list contains only a single item. The divide stage splits a list of x items into x lists of one item each.

It compares each value with 18 in turn from the start of the list until it finds a match or reaches the end of the list.

C

It searches through all of the list. If the last item in the list is compared against the search term and does not match then it declares that the search term is not in the list.

2 Programming
1 (a) Integer (b) Character or string (c) Boolean (d) String (e) Real (f) Real
2 Sequence is the execution of statements one after the other, in order. Selection is the construct used to make decisions in a program. Iteration is the construct used to repeat sections of code. Iteration is commonly called looping.
3 A While loop does something while a condition is met. A Repeat Until loop does something until a condition is met. A Repeat Until loop will always run at least once.
4 (a) 11 (b) 21 (c) 4 (d) 2 (e) 4 (f) 4 (g) 4 (h) 44
5 (a) True (b) True (c) False (d) True
6 (a) Blueberry (b) Lime (c) fruit[1][2]
7 (a) 16 (b) ‘Co’ (c) ‘g is fun’ (d) SUBSTRING(13,15,text) or SUBSTRING(13,LEN(text)-1,text)
8 Functions return a value or multiple values to the main program; procedures do not return a value.
9 Subroutines allow programs to be split up into multiple sections. They make the code easier to read and maintain, reduce the size of the code and allow reuse of code. They make it easier to debug code, as the scope of any error is limited to the subroutine.
10 Validation involves checking values against a set of rules to see if the data is sensible and as expected. It does not check that the data is correct.
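The idea in answer 10 can be illustrated with a short Python sketch that applies a type check followed by a range check. The function name validate_age, the limits of 0 and 100 and the wording of the messages are assumptions made for this example, not part of the book's answer.

```python
def validate_age(text, low=0, high=100):
    """Apply a type check then a range check to text.

    Returns (True, age) if both checks pass, otherwise (False, reason).
    """
    try:
        age = int(text)                  # type check: must be a whole number
    except ValueError:
        return False, "not a whole number"
    if age < low or age > high:          # range check: must lie within the limits
        return False, "out of range"
    return True, age


print(validate_age("42"))
print(validate_age("abc"))
print(validate_age("150"))
```

An entry of "42" passes both checks even if the person is really 24, which is why validation shows only that data is sensible, not that it is correct.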
11 • A type check, to make sure that numbers are entered.
• A range check, to make sure that the person is not over 100.
• The use of drop-down lists so that users can only select from a pre-set range of values.
• A length check to ensure the data contains the right number of digits.
12 It is not easy to validate a name because, apart from checking for any unusual characters, such as %, the length of someone’s name can vary greatly and there is no set range that it would fall into.
13 (a) When the user enters −10, the WHILE loop repeats and asks the user to enter another value as the condition (withdraw ≤ 0) is True. The program will not exit the loop until a suitable value is entered. In addition, the message ‘you must withdraw a positive amount’ is displayed.
(b) When the user enters 110, the WHILE loop repeats and asks the user to enter another value as the condition (withdraw > balance) is True. The program will not exit the loop until a suitable value is entered. In addition, the message ‘you cannot withdraw more than your balance’ is displayed.
(c) When the user enters 60, the WHILE loop ends as both conditions (withdraw ≤ 0 or withdraw > balance) are False. The balance is then changed to be 40 (100 − 60) and the new balance of 40 is output.
14 Robust programming is ensuring that the program still functions correctly under less than ideal conditions.
15 Consider the data that a user could input (including incorrect, unexpected or deliberately invalid data), dealing with these appropriately.

Ensuring that all instructions and error messages are unambiguous and easy to follow.

Two factor authentication requires the use of two different types of authentication. These are usually selected from something you know (such as a password), something you have (such as a code sent to a mobile phone in your possession), or biometrics (such as fingerprint scanners).
For example, you may be asked to enter your password and then enter a code sent to you in a text message.

Normal test data is data that is of the correct type that you would expect a user to input. It should be accepted and not cause any errors.
Boundary test data is of the correct type and is at the very edge of what should be accepted, allowing the program to run without causing errors.
Erroneous test data is of the correct type but should not be accepted as it is not within the expected range.

Any appropriate password that is less than eight characters or more than 15 characters long (e.g. ‘bob’ or ‘please change me this password is too long’).

20
Test data | Type of test data | Expected result
25 (or any value over 18) | Normal | Accepted
18 | Boundary | Accepted
15 (or any value under 18, or any value of the wrong data type) | Erroneous | Rejected

21 A logic error will still allow the program to run but will produce an unexpected result, whereas a syntax error will stop a program from running in the first place as it breaks the rules of the language.
22 This would cause a syntax error because FOR is a keyword and so provides the start of a new construct rather than completing the IF statement condition.
23 This would cause a logic error. The program would still run, but the result would be incorrect.

3 Fundamentals of data representation
1 Computers use switches to store and process data and each of these switches has just two states: either 1 (on) or 0 (off). 1 and 0 are the two numbers in the binary number system.
2 Binary can be hard for computer scientists to remember and communicate without introducing errors, but binary easily converts into Hex which is far easier for programmers to use and remember.
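The conversion mentioned in answer 2 is easy because each hex digit stands for exactly one nibble (4 bits). The Python sketch below shows this under stated assumptions: the function name binary_to_hex and the sample value are chosen for this example and are not taken from the book.

```python
def binary_to_hex(bits):
    """Convert a string of binary digits to hexadecimal, one nibble at a time."""
    # Pad on the left with zeros so the length is a multiple of 4 bits.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    digits = "0123456789ABCDEF"
    # Each group of 4 bits has a value from 0 to 15, which selects one hex digit.
    return "".join(digits[int(padded[i:i + 4], 2)]
                   for i in range(0, len(padded), 4))


sample = "10011100"               # hypothetical sample value
print(binary_to_hex(sample))      # nibbles 1001 and 1100
print(int(sample, 2), int(binary_to_hex(sample), 16))  # same value in decimal
```

Python's built-in int(text, base) is used here to read a string in base 2 or base 16, so the last line confirms that the binary and hex forms represent the same number.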
3 (a) 9 (b) 29 (c) 49 (d) 140 (e) 219 (f) 252
4 (a) 10100 (b) 101110 (c) 1001011 (d) 1100010 (e) 10010011 (f) 11010101
5 (a) 34 (b) 43 (c) A5 (d) BF (e) C9
6 (a) 18 (b) 88 (c) 93 (d) 174 (e) 202
7 (a) 9C (b) 33 (c) FF (d) 39 (e) 4E
8 (a) 10010101 (b) 10101011 (c) 11101 (d) 10100011 (e) 1010110
9 10010
10 10100
11 1100001
12 11001011
13 1111001
14 1100 (decimal 12) becomes 11 (decimal 3): divided by 4
15 11010 (decimal 26) becomes 110100 (decimal 52): multiplied by 2
16 101 (decimal 5) becomes 101000 (decimal 40): multiplied by 8
17 110000 (decimal 48) becomes 110 (decimal 6): divided by 8
18 111 (decimal 7) becomes 1110000 (decimal 112): multiplied by 16
19 10000000 (decimal 128) becomes 1000 (decimal 8): divided by 16
20 10011 (decimal 19) becomes 10011000 (decimal 152): multiplied by 8
21 101100 (decimal 44) becomes 1011 (decimal 11): divided by 4
22 (a) shift left 3 places (b) shift right 4 places
23 Effectively left by 1 so multiply by 2
24 (a) 70 (b) 71 (c) 74
25 The number of bits per pixel
26 24 or 16
27 28
111110
100000
100000
111000
100000
100000
29 240000 bits or 30 KB
30 10 × 10 × 16 = 1600 bits
1600 / 8 = 200 bytes
31 The number of bits used to store each sampled value.
32 The higher the sample rate, the closer the sound is to the original, but the larger the file needed to store the data.
33 30 KB
34 Reducing the size of a file.
35 With lossy compression, some data is lost and the original file cannot be recovered; with lossless compression, no data is lost and the original data is available.
36 The tree will be similar to this, but there are other possibilities. In this case:
Chr | Code | Bits | Number of chrs | Total bits
P | 10 | 2 | 3 | 6
I | 01 | 2 | 2 | 4
E | 00 | 2 | 2 | 4
D | 111 | 3 | 1 | 3
R | 1101 | 4 | 1 | 4
Space | 1100 | 4 | 1 | 4
Total bits required: 25
To store PIED PIPER in ASCII we have 10 characters at 7 bits per character = 70 bits.
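The bit totals in the Huffman answer above can be checked with a short Python sketch. The codes are the ones listed in that answer; the variable names and the decode helper are assumptions added for this example.

```python
# Huffman codes as listed in the answer for PIED PIPER.
codes = {"P": "10", "I": "01", "E": "00", "D": "111", "R": "1101", " ": "1100"}
message = "PIED PIPER"

encoded = "".join(codes[ch] for ch in message)
huffman_bits = len(encoded)        # total bits using the Huffman codes
ascii_bits = 7 * len(message)      # 7 bits per character in ASCII


def decode(bits, code_table):
    """Rebuild the text by matching one code at a time (hypothetical helper)."""
    lookup = {code: ch for ch, code in code_table.items()}
    text, current = "", ""
    for bit in bits:
        current += bit
        if current in lookup:      # no code is a prefix of another, so this is safe
            text += lookup[current]
            current = ""
    return text


print(huffman_bits, ascii_bits)
print(decode(encoded, codes))
```

Because no code is the start of any other code, the decoder can read the bits from left to right without needing separators between characters.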
[Huffman tree diagram for answer 36, with leaves E, I, P, D, Space and R]
37 S = 10, W = 0011
38 13 1, 12 0, 7 1 (or in binary 10001101 00001100 10000111)
39 [answer not recoverable from the extracted text]

4 Computer systems
[Answers 1 to 10: truth tables and logic circuit diagrams, not recoverable from the extracted text]
11 P = NOT (A AND B)
12 P = (NOT A) OR (NOT B)
13 Programs that are not essential to the operation of the computer, but which are involved in maintaining a computer system.
14 • Encryption software
• Defragmentation
• Data compression
• Backup
15 To protect data from unauthorised access. The data is scrambled into a form that cannot be understood if it is accessed by unauthorised users.
16 The performance of a system is slowed as the disk needs to be accessed more frequently to read all of the data.
17 • To reduce the size of files so that they take up less storage space.
• To reduce the size of files so that they can be transmitted more quickly over the internet.
18 • Provide a user interface.
• Control the use of the RAM.
• Share processor time between different programs and processes.
• Control peripheral devices.
• Control who can access the computer and what resources they can use.
• File management to allow users to organise their work into folders and subfolders.
19 The memory manager controls whereabouts a program and its data will be stored in RAM. When a program is finished or the data is no longer needed, it frees up the space for reuse.
20 These are programs which act as a translator to allow the CPU and devices to communicate correctly.
21 The processor manager schedules which processes are to be executed by the CPU.
22 • User authentication in order to access software and files.
   • Use of different privileges and rights for different types of user.
   • Automatic updating of the OS to ensure that security loopholes are patched.
   • Malware protection from inbuilt utilities.
   • Automatic encryption of data stored on the secondary storage.
23 Any three from: Python / C# / C++ / Visual Basic / Ruby / Pascal / Fortran / Java / JavaScript or other suitable alternatives.
24 • They use English-like syntax which makes them easier for programmers to use.
   • They use abstraction to hide the details of the underlying instructions that need to be completed by the processor.
25 They enable programs to be run very quickly.
26 • Using an interpreter. This translates and runs the code one line at a time.
   • Using a compiler. This translates the whole program into machine code and produces an executable file.
27 An assembler converts assembly language into machine code.
28 A processor with four processor cores (able to deal with four simultaneous processes).
29 The clock speed is 2.3 GHz. (It is able to carry out 2.3 billion cycles per second.)
30 • It is used to hold data that needs to be accessed very quickly.
   • It sits between the CPU and main memory.
   • It is faster than accessing main memory.
   • The CPU looks to cache for the necessary data or instruction.
   • If the data is not in cache, it is found in main memory then transferred to cache.
31 • Clock speed determines how many operations are carried out per second.
   • Cache memory holds frequently required data, so more cache means less time spent accessing main memory.
   • More cores allow more processes to complete simultaneously, so more cores means more speed.
32 To process data, carry out instructions and control the components of the computer.
33 The processor
   • fetches instructions from memory
   • decodes these instructions
   • executes the instructions.
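The fetch, decode and execute steps in answer 33 can be sketched as a loop in Python. Everything here is a toy assumption made for illustration: the instruction names LOAD, ADD and HALT, the (opcode, operand) format and the function run are hypothetical and do not describe any real processor.

```python
def run(memory):
    """Run a toy program held in memory as (opcode, operand) pairs."""
    program_counter = 0   # address of the next instruction
    accumulator = 0       # holds the running result

    while True:
        # FETCH: copy the instruction at the address in the program counter
        instruction = memory[program_counter]
        program_counter += 1          # point to the next instruction

        # DECODE: split the instruction into its operation and its data
        opcode, operand = instruction

        # EXECUTE: carry out the operation
        if opcode == "LOAD":
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError("Unknown opcode: " + str(opcode))


program = [("LOAD", 5), ("ADD", 7), ("HALT", 0)]
result = run(program)
print(result)
```

Each pass through the loop is one complete cycle, so a higher clock speed simply means more passes per second.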
34 To hold data currently in use by the CPU and addresses to fetch data from or store data in.
35 Any two from: arithmetic operations (add, subtract), logical operations (AND, OR, NOT) or binary shift.
36 • RAM is volatile, meaning it needs electrical power to operate – any data stored in RAM is lost when the power is turned off.
   • ROM is non-volatile memory, which means it does not require power to maintain its contents.
   • RAM is read-and-write.
   • ROM is read-only.
   • RAM holds the operating system and applications/data in use when the computer is switched on.
   • ROM holds the data and instructions required to boot the computer up.
37 The operating system, applications that are running and any associated data while the computer is on and in use.
38 The instructions and data needed to get the system up and running and ready to load the operating system from secondary storage.
39 Because RAM is volatile, no data is stored in RAM once power is removed. We need secondary storage to store various files on our computers so that they are available the next time we switch on the computer.
40 The operating system, data, images, programs, documents.
41 Cloud computing refers to the use of storage, services and applications that are accessed via the internet rather than being stored locally on a device.
42 • File storage, such as Dropbox, iCloud Drive.
   • Applications, such as Office 365, Google Docs.
43 • An internet connection is required to access the data.
   • There is little control over the security of the data or where it is stored.
   • Terms and process of data storage can be changed with little notice.
   • Fees may become expensive in the long term.
44 ROM is non-volatile and does not require power to maintain its contents. It holds data and instructions to operate the device. RAM is required to store user selections or any output generated by the device.
45 • User selection for time, power level, program.
   • Display of user selections, timer countdown, ‘ping’.
46 Many examples including those in the book: washing machines, dishwashers, microwaves, set-top boxes, telephones, televisions, home security, water meters, energy smart meters, home security or heating monitoring systems, missile guidance, vehicle management, CAM, digital cameras and portable entertainment devices. (There are several other examples.)
   Justification based on device selected, but can include:
   • power requirements, for example battery-operated devices
   • size, for example in portable devices
   • rugged, for example in missiles or car engines
   • low cost for domestic devices
   • dedicated software – limited need for user input and output, limited range of programs/options.

5 Fundamentals of computer networks
1 • Can share hardware devices.
  • Can share an internet connection.
  • Can exchange data between computers.
  • Files can be stored centrally.
  • Software can be managed centrally.
  • Security can be managed centrally.
  • Backups can be managed centrally.
2 • Additional hardware is usually needed to set up.
  • Networking hardware can be expensive.
  • Malware can spread easily between devices connected to the network.
  • If it uses a central file server, no one will be able to access files if it goes down.
  • Larger networks will need to be overseen by a network manager.
3 A LAN is a network in a small geographic area, such as a home, a school, an office or a group of buildings on a single site. The hardware is usually owned and maintained by the organisation that uses it. It will often use both wired and wireless connections.
4 Bluetooth
5 • A LAN is a network covering a small geographic area; a WAN covers a wide geographic area.
  • The networking hardware in a LAN is usually owned and maintained by the organisation that uses it; the connections used in a WAN are usually hired or leased from a telecommunications company.
6 Using ethernet cables or wirelessly using Wi-Fi.
7 Fibre optic cable
8 • They are generally cheap to set up.
  • Most devices connect automatically.
  • Users can move around as long as they are in range of the WAP.
  • Additional devices can be added easily.
9 • The speed of data transmission is slower than wired networks.
  • Connections can be obstructed by obstacles.
  • Connections are often less stable than wired connections.
  • Data packets can be intercepted and read if they have not been encrypted.
10 The physical or logical arrangement of nodes that make up a communication network.
11 Each computer or client is connected individually to a central point, usually a switch or hub.
12 Advantages:
   • Cheap to set up as less cabling is required compared to other topologies.
   • Relatively easy to connect devices.
   Disadvantages:
   • Only one node can transmit data at a time.
   • If the backbone fails, the whole network will fail.
13 A set of rules that allows devices to transmit data to one another.
14 FTP
15 HTTPS
16 Used to send emails to an email server, or from one email server to another.
17 TCP stands for Transmission Control Protocol. It splits data into smaller packets and adds the header, which includes the packet number, number of packets and checksum. IP stands for Internet Protocol. It defines how data should be sent across networks, by including source and destination IP addresses within each data packet.
18 • Application layer
   • Transport layer
   • Internet layer
   • Link layer
19 It addresses and packages data for transmission and then routes the packets across the network.
20 • It is possible for one layer to be developed or changed without affecting any of the other layers.
   • Software and hardware manufacturers can specialise in understanding one layer without needing to know how every other layer works.
   • Devices made by different manufacturers can be compatible, giving the consumer more choice.
   • It is easier to identify and correct networking errors and problems.
21 • HTTP
   • HTTPS
   • FTP
   • IMAP
   • SMTP
22 • Authentication
   • Encryption
   • Firewall
   • MAC address filtering
23 A user sets up their username and password. They then have to enter their email address or mobile phone number. A code is sent to their email address or mobile phone which they have to enter on the website in order to gain access.
24 Encryption disguises the contents so that they cannot be read if intercepted.
25 • An IP address identifies a network or device on the internet.
   • An IP address is added to identify the source and the destination.
   • The IP address is used to determine where to send the data.
26 A MAC address uniquely identifies each device that is connected to a network.

6 Cyber security
1 It is the different ways in which networks and devices are protected against unauthorised access.
2 It is where users are directed to a fake website in order to obtain their login details.
3 Virus / Trojan / Spyware
4 A password that can be easily discovered or detected by other people, such as names of family or pets / car registration numbers / simple patterns of letters from a keyboard, for example qwerty.
5 A long password (eight characters or longer) that includes a mix of uppercase letters, lowercase letters, numbers and special symbols.
6 • They may introduce malware onto a network.
  • Data may be copied onto the device and stolen.
7 White-box testing simulates an employee trying to hack into a system from the inside, with knowledge of the system. Black-box testing simulates external hacking with no knowledge of usernames or passwords or how the system operates.
8 It is the process of tricking or manipulating people into giving away confidential information or access details.
9 • Blagging
  • Phishing
  • Shouldering
10 Shouldering involves watching people as they enter their login details or PIN.
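The description of a strong password in answer 5 above can be turned into a simple check. This is a minimal sketch under stated assumptions: the function name is_strong is illustrative, "special symbols" is taken to mean the characters in Python's string.punctuation, and real systems usually apply further rules (for example, rejecting common passwords).

```python
import string


def is_strong(password, min_length=8):
    """Return True if the password matches the description in answer 5."""
    return (
        len(password) >= min_length
        and any(ch.isupper() for ch in password)             # uppercase letter
        and any(ch.islower() for ch in password)             # lowercase letter
        and any(ch.isdigit() for ch in password)             # number
        and any(ch in string.punctuation for ch in password) # special symbol
    )


print(is_strong("qwerty"))        # the weak example from answer 4
print(is_strong("Tr4in!Blue9"))   # long, with all four kinds of character
```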
11 A criminal inventing a scenario to trick a victim into divulging information that they wouldn’t normally divulge.
12 Virus: Malware that is spread through infected files
   Spyware: Malware that comes packaged with other software
   Trojan: Malware disguised as legitimate software
13 • It performs real-time scans of incoming network traffic to detect whether it is infected with a virus.
   • It performs periodic scans of the whole system looking for malicious applications.
   • If it detects a virus or other malware, it quarantines it to prevent it from running and allows users to attempt to clean or remove it.
14 • CAPTCHAs: Users are required to select a sub-set of images / enter text that has been distorted.
   • Email confirmation: Users are sent an email with a link they need to click on to confirm that their email address is valid.

7 Relational databases and Structured Query Language (SQL)
1 (a) A foreign key is a key from one table stored in another table to create a link between the tables.
  (b) A primary key is a field that uniquely identifies a record in a table.
  (c) A table stores data about a different type of object, person, thing or idea.
  (d) A record is a data structure that allows multiple data items to be stored in different fields, using field names to identify each item of data.
  (e) A relational database is a database made up of more than one table linked together.
2 (a) Manjit   02223334445
      Kyle     02232232232
      Harry    01223123123
      Sheila   01212121212
  (b) SELECT First_name, Email FROM Address_book WHERE Last_name = 'Mills'
  (c) SELECT * FROM Address_book WHERE Email = 'SJ@home.vid'
  (d) SELECT * FROM Address_book ORDER BY Last_name ASC
3 (a) Identifies the fields to return from the database.
  (b) Allows the programmer to include criteria, with only matching records being returned.
  (c) Identifies which table(s) the data will be returned from.
  (d) Inserts new data into a table using the table name and fields to specify where the data is to be entered.

8 Ethical, legal and environmental impacts of digital technology on wider society
1 For example:
  Yes
  • The activity may reflect on the company.
  • It may identify the individual’s opinions and activities that are incompatible with the company.
  No
  • Private posts from private computers.
  • Individual has the right to their own opinions and the right to free speech.
2 For example:
  • Use of electricity by data centres.
  • Use of rare substances within the technology depleting resources.
  • Energy used to manufacture devices.
  • Toxic materials used and their disposal.
3 For example:
  • CCTV on the streets and in public places as well as private homes with CCTV, corporate buildings with CCTV and the workplace.
  • Mobile phones can be tracked and are tracked by various apps.
  • Online activity in the workplace and by various websites, for example to monitor searches to target advertising.
  • Online monitoring of social media activity to provide a profile for organisations.
4 For example:
  • Social media posts are viewable by a wide audience often well beyond friends and acquaintances, and may influence how the individual is seen by potential employers or members of specific groups or the general public.
  • Families may see posts intended for close friends; employers may see unguarded moments from social activities.
5 For example:
  • Sharing recent activities with friends.
  • Keeping friends up to date with what you like and are doing.
  • Unguarded moments available to employers.
  • Comments may solicit abuse, trolling.
6 For example:
  • Target advertising more effectively.
  • Promote special offers.
  E.g. searches for shoes may solicit social media advertising for various shoe brands or online retailers can provide better targeted promotions.
QUESTION PRACTICE ANSWERS

1 Fundamentals of algorithms
01 1 mark for naming, 1 mark for description and 1 mark for statement for a maximum of two principles:
   • Decomposition [1] is breaking a problem down into smaller sub-problems [1]. Once each sub-problem is small and simple enough, it can be tackled individually [1].
   • Abstraction [1] is removing or hiding unnecessary details from a problem [1] so that the important details can be focussed on or more easily understood [1].
   • Algorithmic thinking [1] involves deciding on the order that instructions are carried out [1] in order to identify decisions that need to be made by the computer [1].
02 1 mark (start and) stop – mark awarded if just stop/end included.
   1 mark for ‘input choice’.
   1 mark each for decision boxes for A, B, and C (3 marks maximum).
   Note the decision box for D allows for choices other than A, B, C or D. It can be omitted without penalty for this question.
   1 mark for YES and NO correctly labelled.
   1 mark each for procedures ADD, DELETE, CHANGE (3 marks maximum).
   1 mark for all procedures linked back to the correct place for another input.
   Example showing a representative answer:
   [Flowchart: Start → Input choice → Choice A? (Yes: Add) → No: Choice B? (Yes: Delete) → No: Choice C? (Yes: Change) → No: Choice D? (Yes: Stop; No: back to Input choice). Add, Delete and Change each link back to Input choice.]
03 1 mark for each input line (2 marks).
   1 mark for casting as integer.
   1 mark for using suitable prompts.
   1 mark for calculation of points.
   1 mark for output line with suitable message.
   For example:
       OUTPUT 'Please enter number of wins: '
       win ← USERINPUT
       OUTPUT 'Please enter number of draws: '
       draw ← USERINPUT
       win_int ← STRING_TO_INT(win)
       draw_int ← STRING_TO_INT(draw)
       points ← (3 * win_int) + draw_int
       OUTPUT 'Total points earned: '
       OUTPUT points
04
04.1 Line 1 syntax error (number), should be initialised to 0.
     Line 3 always references array index 1 rather than the variable number.
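The two faults named in 04.1 (a counter that is not initialised to 0, and a fixed array index used in place of the counter) can be illustrated in Python. This is a sketch only: the function collect_names and its parameters are hypothetical, and a list of ready-made entries stands in for USERINPUT so that the example runs without a keyboard.

```python
def collect_names(entries, limit=15):
    """Store up to `limit` names, one per position, and count them."""
    names = []
    number = 0                            # counter starts at 0
    while number < limit and number < len(entries):
        names.append(entries[number])     # position follows the counter,
                                          # not a fixed index such as 1
        number = number + 1
    return names, number


names, total = collect_names(["Asha", "Ben", "Carys"])
print("The total number of names entered is")
print(total)
```

If a fixed index were used instead of the counter, every new name would overwrite the same position and only the last one entered would survive.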
04.2
       number ← 0
       WHILE number < 15
           name[number] ← USERINPUT
           number ← number + 1
       ENDWHILE
       OUTPUT 'The total number of names entered is'
       OUTPUT number
04.3 FOR loop:
       FOR number ← 0 TO 14
           OUTPUT name[number]
       ENDFOR
     WHILE loop alternative:
       number ← 0
       WHILE number < 15
           OUTPUT name[number]
           number ← number + 1
       ENDWHILE
05
       x     y     output
       0     0
       1     1
       3     2
       6     3
       10    4
       15    5
       21    6     21

   1 mark each for rows 0,0 and 1,1. (2 marks)
   1 mark for remaining column for X.
   1 mark for remaining column for Y.
   1 mark for 21,6 as last row.
   1 mark for 21 output.
06 Process shown in full, marks for the stages indicated.
       Elephant Dog Cat Dolphin Sheep Frog
   First pass
       Dog Elephant Cat Dolphin Sheep Frog [1]
       Dog Cat Elephant Dolphin Sheep Frog [1]
       Dog Cat Dolphin Elephant Sheep Frog
       Dog Cat Dolphin Elephant Sheep Frog
       Dog Cat Dolphin Elephant Frog Sheep
   Second pass [1]
       Cat Dog Dolphin Elephant Frog Sheep
       Cat Dog Dolphin Elephant Frog Sheep
       Cat Dog Dolphin Elephant Frog Sheep
       Cat Dog Dolphin Elephant Frog Sheep [1]
07 1 mark allocated to every correct swap. Marks indicate where a swap has been identified.
       0. Pear, Apple, Grape, Banana, Strawberry, Raspberry
       1. Apple, Pear, Grape, Banana, Strawberry, Raspberry [1]
       2. Apple, Grape, Pear, Banana, Strawberry, Raspberry [1]
       3. Apple, Grape, Banana, Pear, Strawberry, Raspberry [1]
       4. Apple, Grape, Banana, Pear, Strawberry, Raspberry
       5. Apple, Grape, Banana, Pear, Raspberry, Strawberry [1]
       6. Apple, Grape, Banana, Pear, Raspberry, Strawberry
       7. Apple, Banana, Grape, Pear, Raspberry, Strawberry [1]
08
08.1 1 mark per stage
       Pizza | Apple | Banana | Chips | Sandwich | Crisps | Egg | Pasty
       Apple, Pizza | Banana, Chips | Crisps, Sandwich | Egg, Pasty
       Apple, Banana, Chips, Pizza | Crisps, Egg, Pasty, Sandwich
       Apple, Banana, Chips, Crisps, Egg, Pasty, Pizza, Sandwich
08.2 Any of the following up to a maximum of 2 marks:
     • A bubble sort passes over the data several times [1].
     • The merge sort uses just one pass [1] over the data and is more efficient for large data sets [1].
09
09.1 1 mark per stage. For the first three stages, the algorithm compares the item searched for (Harry) to the item pointed to and moves on because the item is not found. For the fourth item, Harry is found and so the algorithm ends.
       Jeremy, Adrian, Ben, Harry, Frank, James   (item checked: Jeremy)
       Jeremy, Adrian, Ben, Harry, Frank, James   (item checked: Adrian)
       Jeremy, Adrian, Ben, Harry, Frank, James   (item checked: Ben)
       Jeremy, Adrian, Ben, Harry, Frank, James   (item checked: Harry)
09.2 The list is not sorted and so a binary search could not have been carried out.
10
10.1
       AD245, BE767, FR226, HA102, HC224, JA233, KE124, MA267, PE334
       AD245, BE767, FR226, HA102
       HA102
     1 mark for identifying the midpoint (HC224) and discarding the right-hand side of the list.
     1 mark for identifying the midpoint of the new sublist (either BE767 or FR226) and discarding the left-hand side of the list.
     1 mark for continuing until only HA102 remains.
10.2 Three steps required. Accept two steps required if left-hand side of middle chosen for midpoint on second step.
10.3 Eight steps required.

2 Programming
01
01.1 1 mark for each correct answer.
       OUTPUT 'enter first number'
       num1 ← USERINPUT
       OUTPUT 'enter second number'
       num2 ← USERINPUT
       result ← num2 – num1
       OUTPUT result
01.2 Any one of:
     • num1 [1]
     • num2 [1]
     • result [1]
01.3 It is storing a value which is used elsewhere in the program that could change. [1]
02
02.1 1 mark for each correct answer.
       OUTPUT 'enter your age'
       age ← USERINPUT
       IF age ≥ 18 THEN
           OUTPUT 'Old enough to vote'
       ELSE
           OUTPUT 'Too young to vote'
       ENDIF
02.2 Two from:
     • Sequence [1]
     • Selection [1]
     • Assignment [1]
03
03.1 3, 4, 5 [1]
03.2 1 mark for each correct answer.
                   Has been used    Has not been used
       Sequence    ✓
       Selection                    ✓
       Iteration                    ✓
03.3 0, –1, –2 [1]
04 • Use a condition-controlled loop [1]
   • Loop runs 5 times [1]
   • Correct output is produced [1]
   For example:
       number ← 5
       WHILE number > 0
           OUTPUT number
           number ← number – 1
       ENDWHILE
05
05.1 1 mark for each correct answer.
       radius ← 0
       WHILE radius < 1 OR radius > 20
           OUTPUT 'enter the radius'
           radius ← USERINPUT
       ENDWHILE
       area ← 3.14159 * radius * radius
       OUTPUT area
05.2 The value of PI (3.14159). [1]
05.3 It is a value that will not change (while the program is running). [1]
06 Destination: String [1] as the data is text [1].
   Cost: Real [1] as the value includes decimal places [1].
   Number of changes: Integer [1] as the value is a whole number [1].
   First class: Boolean [1] as the value can only be TRUE or FALSE [1].
07
07.1 Damson [1]
07.2 fruit[1,1] [1]
07.3 1 mark for each row:
             0            1            2            3
       0     Lime         Cherry       Banana       Pear
       1     Lemon        Orange       Raspberry    Damson
       2     Strawberry   Pineapple    Plum
08 Any of the following up to a maximum of 4 marks:
   • Input validation [1] to ensure that only numerical data is entered for the number of days worked/for the hourly rate of pay [1].
   • Range check [1] to ensure that the number of days worked is between 0 and 7 [1].
   • Presence checks [1] to ensure that a value is entered for each piece of information [1].
09
09.1 1 mark for each correct answer:
       OUTPUT 'Enter minutes played'
       mins ← USERINPUT
       IF mins < 1 OR mins > 100 THEN
           OUTPUT 'Invalid input'
       ENDIF
09.2 Data validation is the process of checking data against pre-defined rules to ensure that it is sensible and as expected. [1]
10
10.1 • Takes username as input. [1]
     • Takes password as input. [1]
     • Checks if username is correct. [1]
     • Repeats bullet 1 if username is incorrect. [1]
     • Checks if password is correct. [1]
     • Repeats bullet 2 if password is incorrect. [1]
     • Outputs "Access Granted" if both username and password are correct. [1]
     Example program in AQA Pseudo-code:
       username ← ''
       password ← ''
       WHILE username ≠ 'Andy123'
           username ← USERINPUT
       ENDWHILE
       WHILE password ≠ 'ComputerX'
           password ← USERINPUT
       ENDWHILE
       OUTPUT 'Access Granted'
10.2 Authentication is used to confirm the identity [1] of someone using a computer system (to ensure that they are who they say they are) and ensure that they should be granted access [1].
11 Any of the following ways up to a maximum of 4 marks:
   • Lines 6, 8 and 10 should be indented [1] to clearly show what is happening in each part of the if/else statement [1].
   • The variable names m and n [1] should be changed to more meaningful names [1].
   • Comments should be added [1] to explain the purpose of the if/else statements [1].
12
12.1 • Defines a function month [1] with number as a parameter [1].
     • Includes a list of the months as names. [1]
     • Uses the list to find the correct month name. [1]
     • Returns the month name. [1]
     Example:
       SUBROUTINE month(number)
           monthList ← ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
           monthName ← monthList[number – 1]
           RETURN monthName
       ENDSUBROUTINE
12.2 A temporary store of data [1] that is only available inside the module that it is defined in [1].
12.3 Allows reuse of code [1] without copying and pasting [1], improved maintainability [1] by making code easy to read/follow [1] and shorter if called multiple times [1].
13
13.1
       OUTPUT 'Enter the base'
       base ← USERINPUT
       OUTPUT 'Enter the height'
       height ← USERINPUT
       area ← 0.5 * (base * height)
       OUTPUT area
13.2 • Define a function areaTriangle [1] that takes two parameters, base and height [1].
     • Calculate the area of the triangle. [1]
     • Return the area. [1]
       SUBROUTINE areaTriangle(base, height)
           area ← 0.5 * (base * height)
           RETURN area
       ENDSUBROUTINE
14
14.1 A logic error means that the program will run but will not give the expected output. [1]
14.2 IF number MOD i = 0 [1]
14.3 A syntax error is an error in the grammar of the program that will stop it from running. [1]
14.4 OUTPUT number + ' is a prime number' [1]
14.5 Iterative testing. [1]
15
15.1 1 mark for each correct answer:
       Test data                                                 Test type   Expected result
       15                                                        Normal      Value accepted
       0 (any value that is not an integer between 1 and 20)     Invalid     Invalid input message displayed
       1 (or 20)                                                 Boundary    Value accepted
15.2 This type of testing ensures that only valid data is accepted, that the program works correctly, and that the program is able to deal correctly with inappropriate entries. [1]
16
16.1 OUTPUT testScores[1][1] [1]
16.2 9 [1]
16.3 25 [1]
16.4 • Declare the variable total. [1]
     • Use a loop to iterate through the values for the second index value. [1]
     • Iterate through the values 0 to 4. [1]
     • Calculate and output the average. [1]
     Example program:
       total ← 0
       FOR i ← 0 TO 4
           total ← total + testScores[1][i]
       ENDFOR
       average ← total / 5
       OUTPUT average
17
17.1 Two of the following ways up to a maximum of 4 marks:
     • Indentation [1] to show what is happening in each part of the if/else statement [1].
     • Commenting [1] to explain what each part of the program is doing [1].
     • Change variable from m [1] to something more meaningful such as mark [1].
17.2 1 mark for each correct answer:
       Test data                                              Test type   Expected result
       70                                                     Valid       ‘Grade B’
       101 (Any integer less than 1 or greater than 100)      Invalid     ‘Please enter your mark: ’
       81 or 100                                              Boundary    ‘Grade A’

3 Fundamentals of data representation
01
01.1 01001101 [1]
01.2 3B [1]
01.3 4 [1]
01.4 [Binary addition working with carry row; column layout not recoverable. Result: 1110010]
     2 marks for correct answer: 1 mark for right hand 4 bits (0010), 1 mark for left hand 3 bits (111).
     1 mark for showing correct working.
02
02.1 25 [1]
02.2 1100100 [1]
02.3 100 [1]
02.4 Multiplies the number by 4 [1]
02.5 The result is 1100 or 12 in decimal. We have lost the (LSB) / last 1 [1], causing loss of precision [1]. This is in fact the same as performing integer division, i.e. 25 DIV 2.
03
03.1 The complete set of characters available to the computer. [1]
03.2 7-bit [1] binary codes [1] (are allocated to each character in the set).
03.3 Upper case characters precede lower case in ASCII: Apple, Damson, cherry, grape [1]
03.4 • The ASCII character set includes upper and lower English characters plus some punctuation and control codes – ASCII character codes take up 7 bits. [1] (ASCII is a subset of Unicode.)
     • The Unicode set includes a full range of foreign language characters plus technical and graphical symbols – Unicode takes up at least 16 bits. [1]
     (Basically a comparison between the limited characters in ASCII and full range in Unicode.)
03.5 C: D
04
04.1 The more pixels in an image file, the larger the file. [1]
04.2 1-bit two colours, 2-bit four colours [1]. File size for 2-bit is twice as large [1].
04.3 There are 2^4 possible binary codes [1] so 16 possible colours (as 2^4 = 16) [1].
04.4 1 mark per row:
       1011201110
       1011201110
       104110
       212021
       104110
05
05.1 • Analogue data is sampled at set intervals. [1]
     • The amplitude of the sound is turned into a digital value. [1]
     • The digital values are stored in binary. [1]
05.2 Data that can be removed without affecting the file significantly (accept examples such as audio frequencies beyond normal hearing / blocks of similar colours) is permanently removed [1] to make the file smaller [1]. The original file cannot be restored [1].
05.3 Lossless compression uses algorithms to look for patterns in the data [1] so that repeated data items need to be stored only once, together with information about how to restore them [1].
     Used for text/programs [1].
06
06.1
       Character   Number of occurrences   Bit pattern
       P           3                       10
       I           2                       01
       E           2                       00
       D           1                       111
       R           1                       1101
       Space       1                       1100
06.2
       Character   Number of occurrences   Bit pattern   Number of bits   Bits × freq
       P           3                       10            2                6
       I           2                       01            2                4
       E           2                       00            2                4
       D           1                       111           3                3
       R           1                       1101          4                4
       Space       1                       1100          4                4
       Total                                                              25
     1 mark for number of bits
     1 mark for bits × freq
     1 mark for total
06.3 Total number of characters 10 [1] × 7 bits per character = 70 bits [1].

4 Computer systems
01
01.1 AND (gate) [1]
01.2 Inverts / flips / switches an input and makes the output become the opposite of the input. [1]
01.3
       A   B   C   D   E   P
       0   0   0   0   1   1
       0   0   1   0   0   0
       0   1   0   0   1   1
       0   1   1   0   0   0
       1   0   0   0   1   1
       1   0   1   0   0   0
       1   1   0   1   1   1
       1   1   1   1   0   1
01.4 P = (A AND B) OR (NOT C) [1]
02
02.1 1 mark for each function (to a maximum of 3 marks) and 1 mark for descriptions (to a maximum of 3 marks):
     Processor management
     • Schedules which processes are to be executed.
     Memory management
     • Controls which parts of memory are being used by which process.
     Input/Output device management
     • Installs and interacts with device drivers.
     • Allows devices to send and receive data.
     Application management
     • Installs new applications.
     • Controls access to applications.
     Security management
     • Uses user accounts to control access to the computer system.
     • Requires passwords to prove the identity of those accessing the system.
     • Automatically updates the OS.
     • Protects against malware.
02.2 Two of the following:
     • Encryption software
     • Defragmentation software
     • Data compression software
     • Backup software
02.3 Two of the following:
     • Word processor
     • Spreadsheet
     • Web browser
     • Database
     • Media player
     • Graphic design / CAD
03
03.1 Two of the following:
     • A high-level language uses English-like keywords and syntax [1] which makes it easier to read and write [1].
     • High-level languages have to be translated into machine code in order to be executed. [1]
03.2 High-level code cannot be executed directly [1]. OR The program needs to be translated into machine code in order to be executed by the processor [1].
03.3 • A compiler translates all of the code in one go [1]. It then produces an executable file that can be saved and run again at any time [1].
     • An interpreter translates and executes the code one line at a time [1]. When the program is run again, each line of code has to be re-translated as no executable file is produced [1].
04
04.1 Two of the following:
     • The code can be run directly by the processor.
     • The code will execute very quickly.
     • The system is likely to be low-powered so low-level languages will run more quickly.
04.2 Two of the following comparisons:
     • High-level languages use English-like words and syntax [1]. Low-level languages use binary or mnemonics [1].
     • High-level languages can be run on a range of different types of computer [1]. Low-level languages are hardware dependent and can only run on one specific type of computer [1].
     • High-level languages abstract the details of the processor [1]. Low-level languages refer directly to the computer’s hardware, so programmers need to understand how it works [1].
     • High-level languages need to be translated into machine code before they can be run [1]. Low-level languages can be run directly by the processor [1].
05
05.1 1 mark per bullet up to 2 marks:
     • The number of FDE cycles per second.
     • The clock frequency.
     • 1.3 billion cycles / processes / instructions per second.
05.2 1 mark per bullet:
     • Each core can process separate streams of data.
     • The more cores, the more data can be processed simultaneously (and the greater the performance).
05.3 1 mark per bullet up to 2 marks:
     • Cache is faster than RAM.
     • Frequently required data is held in cache.
     • The more cache, the more data is available to be accessed quickly.
05.4 1 mark per bullet up to 6 marks:
     • The address of the next instruction is copied from the program counter …
     • … and placed in the MAR.
     • The control unit FETCHES the data that is stored at that address in the MAR …
     • … and copies it to the MDR.
     • The program counter is incremented to point to the next instruction.
     • The control unit DECODES the instruction for the data in the MDR.
     • The decoded instruction is EXECUTED.
06
06.1 1 mark per bullet up to 2 marks:
     • Data and instructions are stored in the same place / memory.
     • Data and instructions are in binary.
     • Data and instructions are indistinguishable from each other.
06.2 (i) 1 mark per bullet:
     • To carry out arithmetic or logical operations/calculations/instructions/decisions …
     • … for instructions being processed by the CPU.
     (ii) 1 mark per bullet up to 2 marks:
     • Arithmetic operations (such as add, subtract, multiply and divide).
     • Logical operations (such as AND, OR and NOT).
     • The result of (‘less than’, ‘greater than’, ‘equal to’) comparisons.
     • Binary shift operations.
06.3 1 mark per bullet up to 3 marks:
     • To coordinate the activity of the CPU (by) …
     • … obtaining instructions from memory
     • … decoding instructions
     • … sending out signals to control how data moves around the parts of the CPU and memory (in order to execute these instructions).
07
07.1 (i) Random Access Memory, Read Only Memory
     (ii) RAM is volatile, meaning it needs electrical power to operate. Any data stored in RAM is lost when the power is turned off. RAM is read-and-write. RAM holds the operating system and applications/data in use when the computer is switched on.
     ROM is non-volatile memory, which means it does not require power to maintain its contents. ROM is read-only. ROM holds the data and instructions required to boot the computer up.
(iii) The operating system, applications that are running and any associated data while the computer is on and in use.
(iv) The instructions and data needed to get the system up and running and ready to load the operating system from secondary storage.
07.2 (i) Because RAM is volatile, no data is stored in RAM once power is removed. We need secondary storage to store various files on our computers so that they are available the next time we switch on the computer.
(ii) Magnetic hard disk, solid state drive, optical disc, (hybrid disk) or equivalents/examples.
08
08.1 1 mark per bullet:
• dedicated function / special function …
• … built into another device.
08.2 1 mark for: (RAM might contain) user selections / current state / output messages (i.e. temporary data). 1 mark for: (ROM might contain) dedicated program / system settings.
08.3 Depends on the system described, but 2 marks per expanded bullet point to a maximum of 4 marks for input, 2 marks for output:
Inputs
• Buttons to select program / enter time / select options.
• Sensors to detect system status / temperature / weight / water level / door open or closed, etc.
• Remote connections to time signals / Wi-Fi, radio or NFC links to remote devices or controls.
Outputs
• Displays to show time / selections / status, etc.
• Noises to signal errors / completion / confirmation of choices, etc.
08.4 1 mark per feature plus 2 marks for description (examples only – there are other possibilities):
• Low power requirements [1]
  o Battery operated. [1]
  o Must work for a long time on one battery [1] …
  o … to avoid the need to be removed frequently [1].
• Robust [1]
  o Placed within body of a patient. [1]
  o This is a harsh environment. [1]
• Small size [1]
  o Placed via surgery. [1]
  o Must be small enough to fit within the space available. [1]

5 Fundamentals of computer networks
01
01.1 Diagram should include:
• central bus [1]
• five computers each connected individually to the bus [1].
(See Figure 5.4.)
01.2 Two of the following:
• Can share hardware devices.
• Can share an internet connection.
• Can exchange data between computers.
• Files can be stored centrally.
• Software can be managed centrally.
• Security can be managed centrally.
• Backups can be managed centrally.
01.3 Wired (ethernet) [1] and Wireless (Wi-Fi) [1].
01.4 LAN. Connected together in a small geographic area. / Uses own hardware.
02
02.1 A set of rules [1] that allows devices to transmit data to one another / communicate with one another [1].
02.2 IMAP
02.3
• TCP
• UDP.
02.4 1 mark per answer:
Layer          Protocol
Transport      TCP / UDP
Link           Wi-Fi / Ethernet
Internet       IP
Application    HTTP / HTTPS / FTP / SMTP / IMAP
03
03.1 Each computer or client is connected individually to a central point [1], usually a switch or hub [1].
03.2
• A star topology is fast and reliable [1] as each device has its own connection to the central node [1].
• Data is only directed to the intended device [1] which keeps network traffic to a minimum [1].
• It is easy to add new devices [1] as they simply need to be connected to the switch [1].
03.3 A white list can be created [1] so that only devices with known MAC addresses can be allowed to connect to the network [1]. This prevents unauthorised users from gaining access. / A black list can be used to deny access to any devices listed. [1]
03.4 1 mark per method and 1 mark per description, up to a maximum of 4 marks:
Firewall
• Monitors incoming and outgoing network traffic.
• Can block specific traffic based on pre-defined security rules.
Authentication
• Use of user names and passwords linked to user accounts to access the system.
Anti-malware
• To detect and prevent malicious software from entering the system.
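As a rough illustration of answer 03.3, MAC address filtering with a white list might be sketched in Python as follows. This is a minimal sketch rather than how any particular switch or router implements it: the addresses are made up, and the names ALLOWED_MACS, normalise and may_connect are invented for the example.

```python
# Hypothetical white list of known devices; these addresses are made up.
ALLOWED_MACS = {"00:1a:2b:3c:4d:5e", "00:1a:2b:3c:4d:5f"}


def normalise(mac):
    """Lower-case the address and use ':' separators so comparisons are consistent."""
    return mac.strip().lower().replace("-", ":")


def may_connect(mac, allowed=ALLOWED_MACS):
    """Return True only if the device's MAC address is on the white list."""
    return normalise(mac) in allowed


# A known device written with '-' separators, and an unknown device.
print(may_connect("00-1A-2B-3C-4D-5E"))
print(may_connect("aa:bb:cc:dd:ee:ff"))
```

A black list would work the other way round: the check would return False for any address in the set and True otherwise.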
04
04.1 1 mark for the topology and any of the following justifications up to a maximum of 2 marks:
• Star [1]
• Using wireless networking: Easy to set up [1] and no specialist knowledge required [1]. No wiring required [1] and devices can be moved around as necessary [1].
04.2 Any of the following up to a maximum of 3 marks:
• A school covers a reasonable area [1] and wiring a bus network would be extremely complicated [1].
• A school will have many users [1]. A bus network would prove very slow because collisions will require data to be sent repeatedly [1].
• In a bus network, one failed device causes the whole network to stop working [1] / with a star, one failed device does not stop the network [1].

6 Cyber security
01
01.1 Any of the following up to a maximum of 4 marks:
• User access levels determine what software, hardware and files [1] different users are allowed to access [1].
• They help to avoid attacks caused by the careless actions of users [1], for example, preventing normal users from installing software which may contain malware [1].
• Confidential information can be limited to only those who need it [1], helping to protect against insider attacks [1].
01.2 Testers take on the role of hackers to try to exploit weaknesses in the system [1]. Any vulnerabilities can then be addressed [1].
02
02.1
• Memory sticks may be infected with malware [1]. When put into devices the malware may be introduced to the network [1].
• Sensitive data may be written to the memory stick [1], allowing data to be stolen from the organisation [1].
02.2 Any three of the following:
• Specify a minimum length for the password, usually at least eight characters. [1]
• Require both upper and lowercase letters. [1]
• Require the use of at least one number. [1]
• Require the use of at least one special character. [1]
02.3 Any two of the following:
• Use biometrics [1] to authenticate users [1].
• Ensure that software is automatically updated [1] to ensure that patches are applied immediately [1].
• Use password systems where certain characters from a password have to be entered [1] to help prevent spyware from capturing password details.
03
03.1 Where someone is tricked/manipulated [1] into giving away information / access to a system [1].
03.2 The victim receives a fake email / SMS message [1] and responds by replying or clicking on a link [1] allowing private information to be captured [1].

7 Relational databases and Structured Query Language (SQL)
01
01.1 (i) Tray 8.50 [1]
Dish 6.35 [1]
(ii) 231 Fork Hodden [1]
232 Plate Hodden [1]
(iii) 231 Fork [1]
232 Plate [1]
01.2
SELECT RecordID, ItemName, Supplier
FROM sales
WHERE QuantityStock < 50
1 mark for correct structure. 1 mark for SELECT items. 1 mark for correct table. 1 mark for correct condition.
02
02.1 Birmingham 1100 [1]
02.2 Paris France [1]
Barcelona Spain [1]
02.3
SELECT City, Population
FROM Cities
WHERE Currency = "EURO"
1 mark for correct structure. 1 mark for SELECT items. 1 mark for correct table. 1 mark for correct condition.
02.4
SELECT *
FROM Cities
WHERE Population < 1000
1 mark for structure. 1 mark for use of *. 1 mark for table. 1 mark for condition.

8 Ethical, legal and environmental impacts of digital technology on wider society
01 Discussion question. Marks based on the quality of the response. Areas that may be discussed include:
• Identify and track suspicious characters.
• Match faces to a national database to identify criminals or terrorists to inform the authorities.
• Monitor the crowd and put in place measures to deal with large numbers of passengers.
• Personal privacy being invaded.
• Faces being added by the system to a database will infringe personal freedoms.
02 Points may include:
• Monitor activity to help the individual monitor their personal fitness.
• Monitor health data indicators, for example heart monitors to monitor and alert doctors to a heart condition.
• Physical activity can be shared with friends and family to encourage participation in healthy activities.
• Personal privacy compromised: tracking can identify when and where a person is/has been, which leaves them open to criminal activity.
03 Discussion question. Marks based on the quality of the response. Areas that may be discussed include:
• Dealing with a mix of driverless cars and human drivers, who are less predictable.
• Responsibility for any accident – the developer of the system or the owner of the vehicle?
• AI uses data about other vehicles – how is that data secured or used?
• Advantages of driverless cars when all/many cars are driverless – more efficient use of the roads because they can communicate and cooperate.
• Driverless taxis would be efficient and inexpensive versus job losses.
04 Points may include:
• Control systems for robotic limbs provide users with better and more precise control of the limbs.
• Heart monitors and regulators to monitor heart conditions for diagnostic purposes; to alert doctors to a problem; to apply a solution to a problem.
• Cochlear implant to provide a sense of sound to a person with profound hearing loss.
• Brain implant for partially sighted people transmits video images to the brain. Video images from cameras attached to glasses bypass the eyes to restore some sight.
05 2 marks for advantage point plus expansion, 2 marks for disadvantage point plus expansion. Points may include:
• Instant access to data from any location with internet access.
• Data is stored away from the phone, so the backup is secure and a lost phone's data can be restored at any time.
• Inexpensive backup compared to home computer, though home computer may not be an additional cost.
ACKNOWLEDGEMENTS
The Publishers would like to thank the following for permission to reproduce copyright material. Apple product screenshot(s) reprinted with permission from Apple. Microsoft product screenshot(s) used with permission from Microsoft. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. Python is a registered trademark of the Python Software Foundation.
Photo credits
p. 2 © Tovovan/stock.adobe.com; pp. 99, 104 all © George Rouse; p. 124 © Wellcome Collection. Attribution 4.0 International (CC BY 4.0); p. 138 © Bettmann/Corbis via Getty Images; p. 141 © Division of Medicine and Science, National Museum of American History, Smithsonian Institution; p. 144 © Ymgerman/Shutterstock.com; p. 149 © mingis/istock/thinkstock; p. 150 © Gianni Furlan/Hemera/thinkstock; p. 151 © sergojpg/stock.adobe.com; p. 153 © Sergey Jarochkin/123RF; p. 154 © samunella/stock.adobe.com; p. 214 © Peter Essick/Aurora Photos/Cavan/Alamy Stock Photo.
Every effort has been made to trace all copyright holders, but if any have been inadvertently overlooked the Publishers will be pleased to make the necessary arrangements at the first opportunity.