Programming and Software Development

Year 10 Information Technology
Semester Two Theory
Penleigh and Essendon Grammar School

Contents

PROGRAMMING AND SOFTWARE DEVELOPMENT
1. BASIC PROGRAMMING CONCEPTS
2. DATA TYPES
3. DATA OPERATORS
4. ALGORITHMS
5. CONTROL STRUCTURES
6. PROGRAMMING LANGUAGES
7. TESTING, ERROR DETECTION AND CORRECTION
8. DOCUMENTATION
REFERENCES

Programming and Software Development

Programming is the process of writing programs and developing software. A program is a collection of instructions that, when executed, will complete a task on the computer. People who write programs are called programmers or software developers. Programmers write programs using programming languages. There are many programming languages available, such as Java, Visual BASIC and C++. Each language has its own set of rules (syntax) which must be strictly followed. However, there are basic programming concepts common to all programming languages.

1. Basic Programming Concepts

When writing a program it is necessary to consider the input, processing and output. Input is the data needed to solve the problem, processing is the calculations required and output is the way of giving the solution. A program is written in a programming language using the following concepts.

a. Keywords

In computer programming, a keyword is a word or identifier that has a particular meaning to the programming language. Keywords help to identify the syntax of the language (syntax being the rules that structure the programming grammar). In most modern editors, keywords are automatically displayed in a particular text colour to remind programmers that they are keywords. An example of a keyword in REALBasic is MouseDown. This word already has a special meaning in REALBasic and cannot be used as the name of a variable, constant or object.

b. Variables

A variable is an item of data that may take different values, one at a time.
Variables are storage containers used to hold data of the same type. To declare a variable, you must give it a name (identifier) and state its data type.

c. Constant

A constant is an item of data with only one specific value. Constants are either numbers or strings (groups of characters).

d. Assignment Statement

An assignment statement gives a value to a variable, such as: x = 9. The general form of an assignment statement is: variable = expression. The expression can contain other variables, such as: y + 1. This will result in the assignment statement: x = y + 1.

e. Identifier

An identifier is the name of anything in a program, such as a variable. There are some restrictions on the use of identifiers, such as always starting with a letter.

Programmers must give meaningful names to the objects (controls) that they use, for ease of understanding code. The following naming convention must be followed. A lowercase prefix at the start of the name quickly identifies the object type. An underscore is then entered to separate the parts of the identifier. No spaces are permitted. It is then followed by the name given to the object, which should begin with a capital letter.

For example, if the purpose of an edit field (control) is for the user to type in a person’s salary then you might name it ef_Salary. Whenever you come across this in your code you would know it was an edit field from the ef prefix, and the ‘Salary’ part would give a strong clue as to its purpose.

Examples of prefixes in REALBasic include:

    Control Type     Prefix    Example
    Pushbutton       pb        pb_Save
    StaticText       st        st_Caption
    Listbox          lb        lb_People
    EditField        ef        ef_Password
    Checkbox         cb        cb_Enabled
    Combobox         cmb       cmb_State
    PopupMenu        pm        pm_State
    Radio Buttons    rb        rb_Option

f. Function

A function is a reserved word for a particular purpose. Some functions are very powerful, such as IF and WHILE. Programming languages mainly have two types of functions:
1. Built-in functions, which are supplied as part of the language
2. User-defined functions, which you write yourself

g. Subprograms

When programming, a problem may be best solved when it is broken down into smaller parts. Each part is then solved separately and the parts are combined to produce the final solution. This is called structured programming. A structured program consists of a collection of smaller programs. Each of these smaller programs is called a subprogram. A subprogram is a self-contained section of code that performs a particular task.

Subprograms have the added advantage of being reusable. Programmers may use the same subprogram many times in different problems. This reduces the time required to program. Programmers keep a ‘library’ of subprograms they have written.

Questions:

1 Define the term ‘program’.
2 Explain the difference between:
  a. a constant and a variable
  b. a function and an identifier
3 Describe an advantage of subprograms.
4 In Exercise 2 of the REALBasic booklet you change the colour of some text using Push Buttons. You use the following code.

    Sub Action()
      st_message.textcolor = &cFF0000
    End Sub

  a. What Keywords are used?
  b. What is the Identifier?
  c. What is the Assignment Statement?

2. Data Types

A data type is the kind of data that can be stored in a variable. To create a variable it must be declared. Variable declaration involves stating the data type of the variable and an identifier, or unique name, for the variable. The following is an example of a variable declaration in REALBasic:

    Dim vi_Age as Integer

Here Dim and as are the keywords for declaring variables, vi_Age is the identifier and Integer is the data type.

Most programming languages have the following data types:

a. Integer
An integer is a whole number without fractional parts.

b. Real Numbers
A real number, also known as a floating-point number, is a number with a fractional part.

c. Boolean
A Boolean is used to store values that have one of two possible states, such as true or false.

d. Character
A character is used to store one character, such as a letter, number or symbol.

e. String
A string is used to store more than one character.

3. Data Operators

Data operators (or operators) are used to represent an action to be performed, such as a calculation. Operators are classified as:

a. Arithmetic operator
Performs a calculation such as addition (+), subtraction (-), division (/), multiplication (*), powers (^) and modulus (%). Calculations are carried out using the standard order of operations.

b. Relational operator
Compares two values and returns a boolean (true or false) result. Relational operators include less than (<), greater than (>), less-than-or-equal-to (<=), greater-than-or-equal-to (>=), equal-to and not-equal-to.

c. Logical operator
Combines boolean values and returns a boolean (true or false) result. Logical operators include AND, OR and NOT.

Questions:

1 Complete the table by stating the data type for each example.

    Example    Data Type
    6.7656
    6
    True
    Betty
    .5
    F

2 Explain the difference between an arithmetic operator and a relational operator.
3 Using REALBasic coding as in the example in Section 2 (Data Types), declare the following as variables:
  a. An integer for test results.
  b. Whether or not you have been selected in the hockey team.
  c. The letter grade for your semester result in your reports.

4. Algorithms

An algorithm is a series of steps designed to solve a problem in a finite time. An algorithm can be used to solve many types of problems. Algorithms are not programs but are an important part in the development of a program. All algorithms have the following characteristics:
• They are finite, that is, they always have an end.
• They use precise steps.
• They are designed to describe a problem in a way that will always be understood to mean the same thing.

Algorithms can be used to describe simple daily actions or explain a particular task. For example:

Algorithm to make a phone call:
    Pick up the phone.
    Dial number.
    Deliver the message.
    Hang up the phone.

In this example the algorithm presents a solution in a definite number of steps. Each step is short enough so that it can be easily carried out. The steps must also be performed in a particular order (sequence) to solve the problem; for example, you cannot hang up the phone before delivering the message.

Algorithms are represented in a number of different ways. These are referred to as methods of algorithm description. There are many different methods of algorithm description, such as pseudocode and flowcharts.

a. Pseudocode

Pseudocode uses indented lines and keywords to describe an algorithm. Pseudocode is written using a word processor and is similar to many programming languages. The flow of control in pseudocode is always from the top to the bottom. The keywords are highlighted in capital letters (or bold) to emphasise them and to indicate the type of action being performed. The most common keywords are shown in Table 1. These keywords are grouped together in pairs. For example, for every BEGIN there is an END, for every IF there is an ENDIF. Indentation is used to show the structure of the algorithm.

The rules for using pseudocode include:
• Keywords are written in capitals.
• Basic keywords come in pairs, for example, for every BEGIN there is an END, for every IF there is an ENDIF.
• Indenting is used to show the structure in the algorithm.
• The names of subprograms are underlined.

Table 1: Keywords used in Pseudocode [table not reproduced]

Problem: Write pseudocode to calculate the area of a rectangle given a length of 3 cm and a width of 5 cm.

Solution:

    BEGIN
        Make length 3 cm
        Make width 5 cm
        Set area to length × width
        Display area
    END

b. Flowcharts

Flowcharts are a pictorial method of describing algorithms using a set of symbols, connecting lines and arrows. The rules for flowcharts are:
• There is only ever one way into a flowchart structure and one way out.
• A single flowchart should fit on one page.
  If the flowchart doesn’t fit on one page, then subprograms should be used.
• The main direction of flow is from top to bottom or left to right. Lines and arrows called flowlines indicate this flow of control. Flowlines do not need an arrow if the flow of control is following these main directions.

Table 2: Symbols and their meanings for a flowchart [table not reproduced]

Problem: Draw a flowchart to calculate the area of a rectangle given a length of 3 cm and a width of 5 cm.

Solution: [flowchart not reproduced]

Questions:

1. Write answers to the following questions.
   a. What is an algorithm?
   b. How is pseudocode different from flowcharts?
   c. Why must the steps in an algorithm be performed in a particular order?
   d. Why are the keywords in pseudocode highlighted?
   e. What is the purpose of indentation in pseudocode?
   f. What are the disadvantages of a flowchart?

2. True or false? Indicate with a T or an F in the far column.
   a. Algorithms are a never-ending series of steps to solve a problem.
   b. The steps to solving a problem can be carried out in any order.
   c. Algorithms are only written to solve computer problems.
   d. ‘Good’ algorithms are produced when the problem to be solved is thoroughly understood.
   e. There is only one correct solution to every problem.
   f. Once a problem is solved there is no need to modify the solution.

3. The following algorithms have errors in their sequence. Find these errors and number the steps to correctly solve the problem.

   a. Algorithm to read a book:
        Read book
        Open book to first page
        Close book
        Get book

   b. Algorithm to run a bath:
        If bath is full, turn off the taps
        If water is too hot, increase the amount of cold water
        Put plug into bath
        If water is too cold, increase the amount of hot water
        Turn on the hot water tap
        Turn on the cold water tap

5. Control Structures

Programmers solve a problem by designing an algorithm and then coding the algorithm into a programming language. Algorithms and programming languages consist of control structures.
Control structures are the building blocks of the program. There are three basic control structures:
a. Sequencing
b. Selection
c. Repetition

a. Sequencing

Sequencing is the most common form of control structure. Each step of the algorithm is carried out in order of its position. Each step is done only once.

Analyse the example in the table below to understand the rules for writing sequences in both pseudocode and flowcharts.

Table 3: SEQUENCING - Design an algorithm to wash your hands.

Pseudocode Solution:

    BEGIN
        Wet hands
        Clean hands with soap
        Rinse off soap
        Dry hands
    END

Algorithm Solution: [flowchart not reproduced]

In pseudocode, the steps are placed between BEGIN and END (see the table above). The sequence of four steps is indented to show structure and to improve the readability of the algorithm. The flow of control is top to bottom, starting at the first step and finishing at the last step.

Flowcharts always start and finish with a terminal symbol (oval). The steps are placed between these symbols and joined by flowlines. The direction of flow is down the page between the terminal symbols.

Questions:

Write answers to the following questions.

1. A pseudocode algorithm is provided which will allow the user to input two whole numbers and output the sum of the two numbers.

    BEGIN
        Get 2 numbers
        Store in num1 and num2
        Calculate sum (num1 + num2)
        Output sum
    END

   Write the flowchart algorithm to solve the same problem.

   Answer:

2. The flowchart shows a sequence of steps involved in buttering and eating a slice of bread. Write the pseudocode algorithm to solve the same problem.

   Flowchart Answer: [flowchart not reproduced]

   Pseudocode Answer:

b. Selection

Selection is used to make a logical decision. It requires a choice to be made between two or more options. The choice is made depending on the answer to a condition. There are two types of selection: binary and case. For the purposes of this course, we will only examine binary selection.
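
As a rough illustration of binary selection, the sketch below makes a two-way choice from a single true-or-false condition. It is written in Python rather than REALBasic, purely for illustration, and the function name `lights_action` and the parameter `is_night` are assumptions made for this example, not part of the course material.

```python
def lights_action(is_night):
    """Return the action to take, chosen by a binary (two-way) selection."""
    # The condition has only two possible answers: True or False.
    if is_night:
        # Executed when the condition is true.
        return "turn lights on"
    else:
        # Executed when the condition is false.
        return "turn lights off"


print(lights_action(True))
print(lights_action(False))
```

Exactly one of the two branches runs each time the function is called.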
Binary selection involves two choices. Analyse the example in the table below to understand the rules for writing binary selection in both pseudocode and flowcharts.

Table 4: BINARY SELECTION - Design an algorithm for turning on the lights.

Pseudocode Solution:

    BEGIN
        IF night THEN
            turn lights on
        ELSE
            turn lights off
        ENDIF
    END

Algorithm Solution: [flowchart not reproduced]

In pseudocode the keywords IF ... THEN ... ELSE are used for binary selection. The condition is put after the IF keyword. There are only two possible answers to the condition, true or false. If the condition is true then the process after the THEN keyword is executed. If the condition is false then the process after the ELSE keyword is executed. The ELSE statement is not always required and can be omitted.

In a flowchart the selection is made using a decision symbol (diamond). The condition is placed inside this symbol and the answer must be true or false. It is very important that the two flowlines from the decision symbol are labelled with true or false to determine which path to follow. The two flowlines join together to complete the binary selection.

Questions:

1. Complete these algorithms using binary selection.

   a. A set of instructions to follow when driving towards a set of traffic control lights.

      Pseudocode algorithm:

    BEGIN
        IF lights are green THEN
            drive through intersection
        ELSE
            ____________
        ____________
    __________

   b. Flowchart algorithm for the same problem.

      Answer:

c. Repetition

A repetition is also known as an iteration or loop. It allows a number of steps to be repeated until some condition is satisfied. The steps to be repeated are referred to as the body of the loop. It is very important that each loop contains a condition to stop the loop going on forever. There are two types of repetition: pre-test and post-test.

In pre-test loops the condition is tested at the start of the loop. If the condition is false the first time, the processes will NEVER be carried out.
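
A pre-test loop can be sketched with Python's `while` statement, which checks its condition before every pass through the body. Python is used here only for illustration, and the function name `countdown` is an assumption made for this example.

```python
def countdown(start):
    """Collect the numbers from start down to 1 using a pre-test loop."""
    values = []
    # The condition is tested BEFORE the body runs.
    while start > 0:
        values.append(start)
        start = start - 1  # Moves towards the stopping condition.
    return values


print(countdown(3))  # [3, 2, 1]
print(countdown(0))  # [] because the condition is false the first time
```

Calling `countdown(0)` shows the key property of a pre-test loop: when the condition is false at the start, the body is never carried out.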
Pre-test loops end when the condition is false. Pre-test loops are also known as guarded loops because the loop is only operated when the condition is met.

Analyse the example in the table below to understand the rules for writing pre-test loops in both pseudocode and flowcharts.

Table 5: PRE TEST LOOP - an algorithm for how to use a seat belt in a car

Pseudocode Solution:

    BEGIN
        WHILE car is moving
            keep seat belts on
        ENDWHILE
    END

Algorithm Solution: [flowchart not reproduced]

In a pre-test repetition or guarded loop the condition is checked at the beginning of the loop before the steps to be repeated are executed. In pseudocode the keywords used for a pre-test repetition are WHILE ... ENDWHILE. The condition is put after the WHILE keyword and the body of the loop between the WHILE and ENDWHILE keywords.

In a flowchart the pre-test repetition is made using a decision symbol and flowlines. The condition is placed inside the decision symbol and checked before the body of the loop.

In post-test loops the condition is tested at the end of the loop. The body of the loop will be executed the first time through, whether the condition is true or false, as the body of the loop is executed before the condition is tested. Post-test loops end when the condition is true but they always do the loop at least once.

Remember: the post-test loop requires only one read statement but the pre-test loop requires two. A very general rule is to use pre-test loops when the number of loops is not known and post-test loops when the number of loops is known.

Analyse the example in the table below to understand the rules for writing post-test loops in both pseudocode and flowcharts.

Table 6: POST TEST LOOP - Design an algorithm to cut the grass.

Pseudocode Solution:

    BEGIN
        REPEAT
            Use lawn mower
        UNTIL grass is cut
    END

Algorithm Solution: [flowchart not reproduced]

In pseudocode the keywords used for a post-test repetition are REPEAT ... UNTIL (see the table above).
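
Python has no REPEAT ... UNTIL statement, so the sketch below imitates a post-test loop with `while True` and a `break` placed at the end of the body. This is only one way to approximate the idea, and the names `mow_lawn` and `strips_left` are assumptions made for this example.

```python
def mow_lawn(strips_left):
    """Count mowing passes using a post-test (REPEAT ... UNTIL) style loop."""
    passes = 0
    while True:
        # Body of the loop: always runs at least once.
        passes = passes + 1
        strips_left = strips_left - 1
        # The condition is tested AFTER the body, like UNTIL.
        if strips_left <= 0:
            break
    return passes


print(mow_lawn(3))  # 3
print(mow_lawn(0))  # 1, the body still runs once
```

Even when there is nothing left to mow, the body runs once before the condition is checked, which is the defining behaviour of a post-test loop.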
The body of the loop is underneath the REPEAT keyword and the condition is after the UNTIL keyword.

In a flowchart the post-test repetition is made with a decision symbol and flowlines. The body of the loop is executed before the condition is tested in the decision symbol.

Questions:

1. The pseudocode algorithm for a particular problem is given as:

    BEGIN
        Input the numbers 10 to 0 (in that order)
        Read 2 numbers
        WHILE both numbers <> 0
            Calculate mean
            Print mean
            Read 2 numbers
        ENDWHILE
        Print end message
    END

   a. Write an explanation of this loop in general English.
   b. When does the loop end?
   c. Draw and complete the flowchart algorithm for the same problem.

   Answer:

2. A flowchart algorithm is provided for the problem ‘Making a Telephone Call’ but there are errors. Correct the solution and then write the pseudocode for the problem.

   Incorrect Flowchart: [flowchart not reproduced]

   Correct Flowchart:

   Correct Pseudocode:

6. Programming Languages

Once you have written the solution to the problem in either pseudocode or flowcharts, you may begin your programming. Programming languages are used to create the instructions in a program that can be understood by the computer. Each programming language has its own set of rules that must be strictly followed. The rules of the programming language are called its syntax. Programming languages are divided into two groups:
• Low Level Languages
• High Level Languages

a. Low Level Languages

Low-level languages are the lowest level of computer languages and depend on the hardware of the computer. Programs written using low-level languages are often called machine code or assembly code. They process calculations much faster than high-level languages. The computer directly understands machine code. It can be executed or carried out rapidly and no translation is needed.
However, it is very difficult to write and most people find it very hard to work with long strings of unstructured binary data or bit streams such as ...00011011011100010 110010000111011...

b. High Level Languages

High-level languages use English-like codes where each statement corresponds to several lines of machine code. Programming languages such as Java, Visual BASIC and C++ are high-level languages. A compiler or interpreter translates a high-level program into machine code so the computer can implement the solution.

An interpreter takes one instruction at a time and finds, from its instruction library, the equivalent machine code instructions that it then obeys. Each time the program is run, each instruction is translated into machine code, making program execution slow. An incremental compiler is another type of interpreter that uses an interpreter for the mainline of the program and compiles the modules as they are written. This helps to speed up program execution.

A compiler changes the complete program from a high-level language into machine code but does not execute the program until required. The translated program, which is the one the computer understands, is kept for future use. This makes the process quicker than when using an interpreter.

Questions:

1. True or false?
   a. Each programming language has the same rules that must be strictly followed. ______
   b. Most programmers use a high-level language. ______
   c. A class describes how objects behave and the kind of information in the object. ______
2. Explain the difference between a low-level language and a high-level language.
3. Explain why a compiler is a lot faster to use than an interpreter.
4. What is the purpose of a programming language?

7. Testing, Error Detection and Correction

Most programmers strive for the perfect program; however, few are able to achieve it. It is rare for a complex program to be written without errors. Errors in a program are called bugs.
A bug is an error that makes the program run incorrectly. The process of finding a bug is called debugging. Debugging is often a time-consuming and challenging task.

a. Testing

Test data is used to detect and correct any potential errors in a program. Except for simple programs, test data will only cover a small percentage of all the possible sets of data. The programmer selects test data that will cater for the ‘worst-case’ situation. This is data the programmer predicts will cause a problem. It is often data outside the boundaries of acceptable data, such as entering a decimal number instead of an integer.

Test data is also designed to check for expected outcomes. For example, if a user enters 4 the program should display the result for this value. The selection of test data depends on the programmer’s understanding of the program. Programmers use test data throughout the development of a program.

b. Error Types

To compile a program is to change the source code you have written into machine code that the computer can understand. During the process, errors may be found in the code. Error detection involves identifying and describing the error. Error correction fixes the source of the error to create a workable program. There are three basic types of errors:

Logic errors result from an incorrect series of steps to solve the problem. The program with a logic error produces incorrect or unexpected results. Logic errors can occur if the algorithm does not solve the problem correctly. The algorithm should be tested before coding to eliminate logic errors. It is often a difficult task to find and correct logic errors.

Syntax errors are made when the programmer has failed to follow the rules (syntax) of the programming language. A syntax error may be a spelling error or a symbol that cannot be translated. When the program is compiled or interpreted, an error message will appear if the program contains any syntax errors.
Correcting any syntax errors is usually a simple task.

Run-time errors occur when it is impossible for the computer to carry out the instruction. For example, if a calculation attempted to divide a number by zero it would be a run-time error. The instruction has the correct syntax but it is not possible to carry out the instruction. Incorrect data can often produce a run-time error.

c. Error Detection and Correction

Software debugging tools are available for most programming languages. A debugger is a program that will perform the desk check electronically. Debuggers are often used with a breakpoint to watch the variables in a section of code. Debuggers are only tools to find problems and do not provide the solution to the problem.

A breakpoint is a roadblock in the execution of the program. When the program reaches a breakpoint it stops. Breakpoints are useful in isolating sections of the code and analysing them.

Desk checking involves the programmer checking each line of code. Desk checking takes place after the algorithm has been written and again after it has been coded in the programming language. The programmer executes the program the same way as the computer. Desk checking provides a way to see exactly what code is being executed and the flow of execution through the program. A desk check usually involves watching the variables. A list is constructed containing the names of variables and their values. The list of variables is updated after each step of the desk check. If somebody other than the programmer performs the desk check it is called a peer check.

d. Desk Checking in Practice

Desk checking uses test data in a table to check all input and output. Test data should include all the expected inputs and some unexpected input as well. Input test data is used that will produce known results. The test data should include:
1. Typical data, which will test the commonly used program paths;
2. Unusual but valid data, which will test the program paths used to process exceptions; and
3. Incorrect, incomplete, or inappropriate data, which will test the program’s error routines.

Examine the following problem to see how to perform a desk check.

Problem: An algorithm has been written to count the numbers from 1 to 5 and display them on the screen.

Test data for this problem needs no inputs but should show the expected outputs.

[algorithm and desk check table not reproduced]

When the output does not match the expected output, there are one or more errors. In this case, setting count to 0 at the start would solve the error. Another desk check should be carried out to make sure.

Questions:

1. What am I?
   a. An error made when a programmer has failed to follow the rules of programming.
   b. An error that occurs when it is impossible for the computer to carry out the instruction.
   c. An error detection that involves putting a roadblock in the execution of the program.
   d. Documentation that involves writing an easy-to-read program.
2. Complete the following sentences:
   a. When a program is _______ an error message will appear if the program contains a syntax error.
   b. A desk check usually involves watching the _______.
   c. A _______ is a program that will perform the desk check electronically.
3. What is the purpose of test data?
4. Describe the three basic types of errors.
5. What is desk checking?
6. The following tables are provided to desk check the given algorithms. Complete the tables.
   A. [algorithm and table not reproduced]
   B. [algorithm and table not reproduced]

8. Documentation

Documentation is a written description to explain the development and operation of a program. It is not part of the actual code. Documentation is an important aspect of writing programs as it helps the programmer to understand what is going on. It should be written during the development of the program; however, it is often neglected and left until last. This results in inadequate documentation, making the program difficult to understand and modify.
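
The sketch below hints at the difference documentation makes. Both functions perform the same calculation, but only the second uses meaningful names and a comment. It is written in Python rather than REALBasic, purely for illustration, and the names `f` and `triangle_area` are assumptions chosen for this example.

```python
def f(x, y):
    return x * y / 2


def triangle_area(base, height):
    """Return the area of a triangle."""
    # Area of a triangle = half of base times height.
    return base * height / 2


print(f(4, 3))
print(triangle_area(4, 3))
```

A reader meeting `f` for the first time has to work out what it is for, while `triangle_area` states its purpose in its name and comment.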
The documentation required in a program falls into three main categories:

Intrinsic documentation involves writing an easy-to-read program. It involves using correct programming techniques and meaningful variable names such as ‘height’ instead of ‘x’.

Internal documentation consists of any comments or remarks within the program code to describe its purpose. The code below shows internal documentation in REALBasic using a double forward slash.

    Sub Action()
      st_message.textcolor = &cFF00FF // Changes text to colour Fuchsia
    End Sub

External documentation consists of any written support material. This may include a problem statement, input data, output data, processes, algorithm, test data and a listing of the program, user manuals and installation guides.

Questions:

1. Why is documentation an important aspect of writing programs?
2. Explain the difference between internal and external documentation.
3. Inadequate ________________ makes a program difficult to understand and modify.
4. Edit the program below by inserting appropriate documentation in the right hand column.

    BEGIN
        Set X to 0
        Set Count to 1
        REPEAT
            Set X to X + 1
            Print X
            Increment Count by 1
        UNTIL Y=4
        Print Count
    END

References

Powers, G. K. (2004). Information and Software Technology.
Wilson, C. (2007). Exploring Information and Software Technology (4th ed.).