Chapter 1 Data Processing and Information
Eng. Hossam Badawy

1.1 Data and Information

Data: Raw numbers, letters, symbols, sounds or images with no meaning on their own.
o Example: 11431
o Context: Postal code

Information: Data that has been processed and given context and meaning, so that it can be understood on its own.
o Example: 11431
o Context: Postal code
o Meaning: This is a postal code in Egypt.

Difference between static and dynamic information sources

1.2 Sources of Data

| | Direct data | Indirect data |
|---|---|---|
| Definition | Collected from a primary source; it must be used for the same purpose for which it was collected. | Collected from a secondary source; it already existed for another purpose. |
| Examples | Asking the retailer for the price of the shoes. Questionnaires, interviews, observation and data logging. | Asking someone who bought the shoes about the price. Electoral register and third parties. |
| Relevance | The data will be relevant because exactly what is needed has been collected. | Additional unwanted data will exist, or wanted data may not exist. |
| Reliability | The original source is known and so can be trusted. | The original source may not be known, so it cannot be assumed to be reliable. |
| Time | It can take a long time to gather original data rather than using data that already exists. | The data is immediately available. |
| Age | The data is likely to be up to date because it has been collected recently. | The data may be out of date because it was collected at a different time. |
| Bias | Bias can be eliminated by asking a specific question or searching for a specific thing. | The data can be biased because of its source. |

1.2.1 Direct data sources

Questionnaires: A set of questions on a specific subject or issue, designed to gather data from the people being questioned. Questionnaires are easy to distribute, complete and collect, as most people are familiar with the process. They can be on paper or on a computer.

Interviews: A formal, arranged meeting, usually between two people.
The interviewer asks questions in a structured or unstructured way. Structured interviews are similar to questionnaires: a set of questions is asked in the same order every time. Unstructured interviews vary from one interviewer to another and follow no fixed script.

Observation: A method of collecting data in which observers gather data about a given situation simply by watching what happens.

Data Logging: Data logging means using a computer and sensors to collect data. The data is then analysed and saved, and the results are output, often in the form of graphs and charts. Data logging systems can gather and display data as an event happens. The data is usually collected over a period of time, either continuously or at regular intervals, in order to observe particular trends. It involves recording data from one or more sensors, and the analysis usually requires special software.

Data logging is commonly used in scientific experiments, in monitoring systems where information must be collected faster than a human possibly could, in hazardous environments such as volcanoes and nuclear reactors, and in cases where accuracy is essential. Examples of the data a data logging system can collect include temperatures, sound frequencies, light intensities, electrical currents and pressure.

1.2.2 Indirect data sources

Electoral register: Also known as the electoral roll, it is a list of the people who are allowed to vote in a certain election. Sometimes there is an open version of this roll that anyone can purchase and use for any purpose. It contains information such as names, addresses, ages and other personal details.
Third parties: A way of collecting personal data. When you deal with a business that offers a service or a product, you usually accept terms that allow it to sell your personal details to other businesses. Other organisations buy this data for their own use. The data is usually collected through dealings with businesses, social media or even mobile activity.

1.3 Quality of Information

The quality of information is determined by the attributes below:

| | Attribute | Definition | Examples of bad information quality |
|---|---|---|---|
| C | Completeness | All information that is required must be provided. | Giving an address without the building number. |
| L | Level of detail | There needs to be the right amount of information: not too little and not too much. | Ordering a cheese pizza without mentioning the size. Searching through all the IGCSE exam dates to find your own exam date. |
| A | Age | Information must be up to date. | The next bus arrival time has not been updated. |
| R | Relevance | Information must be relevant to its purpose. | Being given a bus schedule when you need to catch a train. |
| A | Accuracy | Information must be accurate to be considered good quality. | 10.51 instead of 105.1. "Mess" instead of "miss". |

1.4 Accuracy of Data

Validation: The process of checking whether data matches acceptable rules.

Types of validation:
o Length: Data is of a defined length or within a range of lengths.
  Example: The password is at least 8 characters long.
o Range: Data is within a defined range.
  Example: The age of IGCSE students is between 15 and 18.
o Limit: Like the range check, but with only one boundary.
  Example: The minimum age for driving is 18.
o Presence: Ensures that data has been entered.
  Example: A mandatory field when filling in a form.
o Type: Ensures that data matches the defined data type.
  Example: If a price is entered, it must be numeric.
o Format: Ensures that data matches the defined format.
  Example: The date is in the format DD/MM/YYYY.
o Lookup: Checks whether the entered data exists in a list.
  Example: Choosing only from the grades A, B, C or D.
o Consistency: Checks whether the data entered in one field is consistent with another field.
  Example: If "Mr" is entered in the title field, the gender field must be Male; if "Ms" is entered, the gender field must be Female.
o Check digit: Used to check whether an entered identification number obeys an existing rule. The result of a calculation on the number is compared with a check digit placed at the end of the identification number.
  Example: For the barcode "2413-6", the check digit "6" must equal the sum of the even digits in the barcode (2 + 4 = 6).

Proofreading: The process of checking information to make sure that it contains no spelling or grammar mistakes, that the formatting is consistent and that the document is accurate.

Verification: The process of checking whether the entered data matches the original source.
o Visual checking: The user reads the entered data and compares it with the original source.
o Double data entry: Data is entered into the system twice and the two versions are compared. If they do not match, one of them is incorrect.
o Parity check: This check makes sure that data is transmitted correctly between devices. With even parity, an extra bit is added to every byte so that the total number of 1's is even.

  Steps of a parity check:
  1. The data to be sent is converted to bytes and then to bits. For example, the word "Hey":
     a. H is converted to 72, then to 01001000
     b. e is converted to 101, then to 01100101
     c. y is converted to 121, then to 01111001
  2. Each byte is checked to see whether it contains an even or an odd number of 1's.
  3. If the number is even, a 0 is added to the end of the byte; if it is odd, a 1 is added.
  4. The data sent is therefore:
     a. H has two 1's, so it becomes 010010000
     b. e has four 1's, so it becomes 011001010
     c. y has five 1's, so it becomes 011110011
  5. The data is now ready to be sent. The receiver checks that each group of bits has an even number of 1's. If not, the receiver knows that the data has been corrupted.

  A parity check is not effective if the error is a transposition of bits. For example, if 011001010 arrives as 101001010 (the first two bits swapped), no error is detected because the number of 1's is still even.
o Checksum: A value calculated from the data, using one of many possible algorithms. The checksum is calculated on the whole file being sent, not on every byte as in the parity check, and is attached to the file. The receiver recalculates the checksum using the same algorithm. If there is any difference, the file has probably been corrupted.
o Hash total: Calculated by adding up the values of one field in a file. For example, this file stores Student IDs in alphanumeric form:

  | Student ID | Exam Mark |
  |---|---|
  | 012 | 50 |
  | 023 | 45 |
  | 045 | 40 |

  To calculate a hash total on the Student ID field, the IDs are converted to numbers and added up: 12 + 23 + 45 = 80. The total 80 is sent to the receiver along with the file. The receiver recalculates the total on the same field, and if both hash totals are equal the file was transmitted correctly. Hash totals are usually calculated on fields, such as the Student ID, that would otherwise serve no purpose when added up.
o Control total: Calculated in the same way as a hash total, but only on numeric fields, and it produces a meaningful value that can be used for other purposes. For example, the exam marks in the table above add up to 50 + 45 + 40 = 135. The teacher can also use this control total to calculate the average exam mark of the class: 135 / 3 = 45. The control total is attached to the file and sent. The receiver recalculates it and checks that the two values are the same. If they differ, the file was probably changed during transmission and contains errors.

1.5 Encryption

Encryption is the process of converting readable, understandable data into unreadable data by scrambling it, so that it is much harder to understand and process. Encrypted data can still be intercepted, but it cannot be understood without the decryption key.

Symmetric encryption
Also known as "secret key encryption". This is the oldest method of encryption. It requires both the sender and the recipient to possess the same secret key, which is used for both encryption and decryption. The secret key needs to be sent to the recipient.

Asymmetric encryption
Also known as public-key cryptography. It overcomes the problem of symmetric encryption keys being intercepted by using a pair of keys: a public key, which is available to anybody who wants to send data, and a private key, which is known only to the recipient.

Symmetric encryption requires much less processing time than asymmetric encryption. However, in symmetric encryption a hacker who gains access to the secret key can both encrypt and decrypt messages. In asymmetric encryption the private key cannot be deduced from the public key, so security is better. Many security companies now use asymmetric encryption to exchange the secret key (the one used in symmetric encryption) and then continue with symmetric encryption to send and receive data.

Encryption protocols
An encryption protocol is the set of rules setting out how the algorithms should be used to secure information. The most popular protocol used when accessing web pages securely is Transport Layer Security (TLS).
TLS is an improved version of the Secure Sockets Layer (SSL) protocol and has now, more or less, taken over from it, although the term SSL/TLS is still sometimes used to bracket the two protocols together.

Purposes of SSL/TLS:
o Enable encryption in order to protect data.
o Ensure the integrity of the data, to make sure it has not been corrupted or altered.
o Improve customer trust. If customers know that a company is using the SSL/TLS protocol to protect its website, they are more inclined to do business with that company.
o Make sure that the people or companies exchanging data are who they say they are (authentication).
o Ensure that a website meets PCI DSS rules. The Payment Card Industry Data Security Standard (PCI DSS) was set up so that company websites could process bank card payments securely and to help reduce card fraud. It does this by setting standards for the storage, transmission and processing of the bank card data that businesses deal with.

SSL/TLS and digital certificates
Any website whose address begins with HTTPS is hosted on a web server secured with the SSL/TLS protocols. A website needs to be verified before a client communicates with it, and SSL/TLS uses the site's digital certificate to do this. Digital certificates are issued by an entity called a Certificate Authority (CA). They contain the domain name of the web server (the name of the website), the CA's digital signature and the public key of the website. Digital certificates prevent fraudsters from creating fake websites, because a valid certificate can only be obtained after the CA has run a number of checks. If hackers are able to break into the CA's systems, they can create fake certificates and have a better opportunity to crack encryption.
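As a rough illustration of the certificate checks described above, the sketch below uses Python's standard `ssl` module to build a client-side TLS configuration and inspect its defaults. It makes no network connection, and the variable name `context` is an arbitrary choice for this example.

```python
import ssl

# Build a client-side TLS configuration using the library's recommended defaults.
context = ssl.create_default_context()

# The client rejects any server whose certificate cannot be validated
# against a trusted Certificate Authority (CA).
print(context.verify_mode == ssl.CERT_REQUIRED)

# The domain name in the certificate must match the website being visited.
print(context.check_hostname)
```

With these defaults, a client refuses to talk to a server that presents a certificate for a different domain or one that is not signed by a trusted CA.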
SSL/TLS in Client–Server Communication
Transport Layer Security (TLS) is used for applications that require data to be securely exchanged over a client–server network, such as web browsing sessions and file transfers. In order to open an SSL/TLS connection, a client needs to obtain the server's public key. For our purposes, we can consider the client to be a web user or a web browser and the server to be the website. The public key is found in the server's digital certificate. When a browser (client) wants to access a website (server) that is secured by SSL/TLS, the client and the server must carry out an SSL/TLS handshake. A handshake, in IT terms, happens when two devices want to start communicating.

Handshake steps:
1. The client sends a message to the server telling it which version of SSL/TLS it uses, together with a list of the types of encryption that the client can use.
2. The server responds with a message stating the type of encryption that will be used.
3. The server sends the client its digital certificate to prove its authenticity.
4. The client runs a number of checks to make sure that the certificate is authentic.
5. The client sends a random string of bits (0's and 1's) that is used by both the client and the server to calculate the shared secret key used to encrypt the rest of the session. The string is encrypted using the server's public key.
6. The client then sends an encrypted message to the server telling it that the handshake is complete.

Uses of Encryption

Hard disk encryption
The principle of hard-disk encryption is fairly straightforward. When a file is written to the disk, it is automatically encrypted by specialised software. When a file is read from the disk, the software automatically decrypts it while leaving all other data on the disk encrypted. The encryption and decryption processes are understood by the most frequently used application software, such as spreadsheets, databases and word processors.
The whole disk is encrypted, including data files, the OS and any other software on the disk. Full (or whole) disk encryption is your protection should the disk be stolen or just left unattended. Whether the disk is still in the original computer or has been removed and put into another computer, it remains encrypted and only the keyholder can make use of its contents.

There are, however, drawbacks to encrypting the whole disk. If an encrypted disk crashes or the OS becomes corrupted, you can lose all your data permanently or, at the very least, disk data recovery becomes problematic. It is also important to store encryption keys in a safe place, because as soon as a disk is fully encrypted, no one can make use of any of the data or software without the key. Another drawback is that booting up the computer can be slower.

Email
When sending emails, it is good practice to encrypt messages so that their content cannot be read by anyone other than the person they are being sent to. Many people might think that having a password to log in to their email account is sufficient protection. Unfortunately, emails tend to be susceptible to interception and, if they are not encrypted, sensitive information can become readily available to hackers.

Email encryption parts:
1. The first is to encrypt the actual connection to the email provider. This prevents hackers from intercepting and acquiring login details and from reading any messages sent (or received) as they leave (or arrive at) the email provider's server.
2. Then, messages should be encrypted before they are sent so that even if a hacker intercepts a message, they will not be able to understand it. They could still delete it on interception, but this is unlikely.
3. Finally, since hackers could bypass your computer's security settings, it is important to encrypt all your saved or archived messages.
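Part 1 of the list above, encrypting the connection to the email provider, can be sketched with Python's standard `smtplib`, `ssl` and `email` modules. This is a minimal sketch under assumptions: the function names `build_message` and `send_over_tls`, the addresses and the subject line are hypothetical, and the server details would come from your own email provider.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender: str, recipient: str, body: str) -> EmailMessage:
    """Create a simple plain-text email (addresses are placeholders)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Meeting notes"
    msg.set_content(body)
    return msg


def send_over_tls(msg: EmailMessage, host: str, port: int,
                  user: str, password: str) -> None:
    """Send msg over a connection that is upgraded to TLS before login."""
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as server:
        server.starttls(context=context)  # encrypt the connection first
        server.login(user, password)      # login details now travel encrypted
        server.send_message(msg)


# Building the message needs no network access; sending it does.
message = build_message("alice@example.com", "bob@example.com", "See you at 10.")
```

This protects only the connection. Encrypting the message itself (part 2) needs a separate scheme such as OpenPGP or S/MIME, which is outside this sketch.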
Encryption in HTTPS websites
HTTP (Hypertext Transfer Protocol) is the basic protocol used by web browsers and web servers. Unfortunately, it is not encrypted, so internet traffic can be intercepted, read and understood. Hackers could intercept any private information, including bank details, and then use it to commit fraud. HTTPS (Hypertext Transfer Protocol Secure), however, enables users to browse the World Wide Web securely. To do this, it uses the HTTP protocol with SSL/TLS encryption overlaid. HTTPS websites have a digital certificate issued by a trusted CA, which means that users know the website is who it says it is. One indicator that you are using a secure site is the HTTPS:// prefix at the start of the URL. There should also be a padlock icon next to the URL.

Advantages and disadvantages of encryption

Pros:
o Personal data and credit card information remain safe.
o Encrypted information can no longer be changed or used for theft.
o Encrypted information cannot be used by hackers or sold to rival companies.

Drawbacks:
o It takes more time to load encrypted data, and additional processing power is required.
o It takes more time to load web pages with encrypted data.
o It takes up some memory on both the client and the server side.
o Ransomware can be used by hackers to encrypt the data on a user's computer system and demand money to decrypt it.

1.6 Processing
Data must be processed so that it can become information. Data can include personal data, transaction data, sensor data and much more. Data processing is when data is collected and translated into usable information. Data processing starts with data in its raw form and translates it into a more readable format such as diagrams, graphs and reports.
The processing is required to give data the structure and context necessary for it to be understood by other computers and then used by employees throughout an organisation.

Batch Processing
Input data is stored in a file called a transaction file and, at a scheduled time, the computer processes all the collected data in one go.
Examples:
o Producing utility bills
o Clearing of bank cheques
o Updating of a stock database

Online Processing
Here the user is in direct communication with the computer system and the data is processed almost immediately.
Examples:
o Electronic funds transfer (ATMs)
o Online stores
o Point of sale (cashiers)

Real Time Processing
This is a form of online processing in which the data is transferred to the CPU and processed immediately, without any delay. This type of processing involves the use of sensors and can be found in measuring and control systems.
Examples:
o Greenhouses
o Intensive care units
o Weather stations
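The difference between batch and online processing can be sketched in a few lines of Python. The stock levels, item names and function names below are invented for illustration and do not come from any real system. The point is only the timing: batch work waits in a transaction file, while online work is applied as it arrives.

```python
# Hypothetical stock database: item name -> quantity in stock.
stock = {"pens": 100, "books": 40}

# Batch processing: transactions are collected in a transaction file
# (here a list) and applied in one go at a scheduled time.
transaction_file = [("pens", -10), ("books", -5), ("pens", -20)]


def run_batch(stock_levels, transactions):
    """Apply every stored transaction in a single run."""
    for item, change in transactions:
        stock_levels[item] += change
    transactions.clear()  # the transaction file is emptied after the run


def process_online(stock_levels, item, change):
    """Online processing: apply one transaction as soon as it arrives."""
    stock_levels[item] += change
    return stock_levels[item]


run_batch(stock, transaction_file)
print(stock)                               # {'pens': 70, 'books': 35}
print(process_online(stock, "books", -1))  # 34
```

Until `run_batch` is called, the stock database does not reflect the waiting transactions, which is why batch processing suits jobs such as utility bills that do not need an immediate result.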