CARIBBEAN EXAMINATIONS COUNCIL
CARIBBEAN SECONDARY EDUCATION CERTIFICATE
INFORMATION TECHNOLOGY

A GUIDE TO PROBLEM-SOLVING AND PROGRAM DESIGN
RESOURCE MATERIALS FOR INFORMATION TECHNOLOGY SYLLABUS

Copyright © 2008 Caribbean Examinations Council
The Garrison, St. Michael 20, BB 11158, Barbados

CSEC Information Technology
Prepared by: P. Francis-Cobley

Contents

Introduction
    The Role of the Computer Programmer
    How Instructions are Given to the Computer

Chapter 1 Defining the Problem
    1.0 Understanding the Problem
    1.1 The Defining Diagram
    1.2 The Problem with Problem Specifications

Chapter 2 Finding a Solution to the Problem
    2.0 Introduction
    2.1 The Concept of Variables
    2.2 Choosing Variable Names
    2.3 A More Complex Problem
    2.4 Initialization of Variables
    2.5 Summary

Chapter 3 Evaluate Alternative Solutions
    3.0 Introduction
    3.1 Determine the most Efficient Solution

Chapter 4 Represent the Solution as an Algorithm
    4.0 What is an Algorithm?
    4.1 Algorithmic Structure
    4.2 Control Structures
    4.3 Algorithmic Language
    4.3.1 How To Write Pseudocode
    4.3.2 Flowcharts
    4.4 The Structured Programming Concept
    4.5 Do’s and Don’ts when Writing Pseudocode

Chapter 5 Test the Algorithm for Correctness
    5.0 Tracing the Algorithm
    5.1 Choosing Appropriate Test Data

Chapter 6 List Processing Using Arrays
    6.0 What is an Array
    6.1 Accessing the Elements of an Array
    6.1.1 Initializing Arrays
    6.1.2 Reading Values into an Array
    6.1.3 Displaying Array Values
    6.1.4 Traversing Arrays
    6.2 Point to Note when Manipulating Arrays

Chapter 7 The Top-Down Design Methodology
    7.0 What is Top-Down Design
    7.1 Hierarchy Charts
    7.2 How to Sub-divide a Problem into Modules
    7.3 Steps in Modularization
    7.4 Representing Modules in Pseudocode & Flowchart
    7.5 Communication between Modules
    7.6 Advantages of the Top-Down Design Method

Chapter 8 From Algorithms to Pascal Programs
    8.0 Introduction
    8.1 Translate Algorithm into a Specific Programming Language
    8.2 Structure of a Pascal Program
    8.3 Pascal in a Nutshell
    8.4 Translating Pseudocode into Pascal Code
    8.5 Summary

Chapter 9 Program Execution on the Computer
    9.0 Steps in Executing a Program on the Computer
    9.1 Types of Errors
    9.2 Debugging

Chapter 10 Programming Style and Quality
    10.0 Program Quality
    10.1 Programming Style

Appendix A – Programming Exercises
Appendix B – Suggested Reading

FOREWORD

This document is designed to be used as a guide for teaching the Problem-Solving and Program Design component of the CSEC Information Technology (IT) Syllabus. However, the content is sufficiently general that it can be used for any introductory programming course. This document provides a step-by-step approach to problem-solving by using the computer. The course is designed to equip students with problem-solving skills that will be useful in any career that they may choose.
The course provides an excellent foundation for those who wish to become good programmers. Advanced programming concepts such as multi-dimensional arrays, file processing, records, sets, pointers, recursion and some aspects of modular programming are not dealt with at this level.

A NOTE TO TEACHERS

This document is intended as a guide for the teaching of the problem-solving component of the CSEC IT syllabus. The format is a simple, step-by-step approach to the concept of problem-solving. The material is deliberately presented in a simplistic manner. The aim is to de-mystify the programming concept. Many people have a phobia of programming because there is a misconception that it is a difficult subject. Computer programming is no more difficult than a foreign language. The difficulty lies in the way the subject is taught.

The most important phase in computer programming is the problem-solving phase. Too often, this phase is given cursory treatment, and the majority of course time is spent on teaching and debugging code on the computer. The problem here is that the student is often trying to get the program to work without having first developed a working solution in the form of an algorithm. This is like trying to give someone directions to a destination without knowing how to get there yourself. You cannot tell the computer what to do if you do not know how to do it. Computers do not solve problems; programmers do. Therefore, programmers must be able to solve problems before they write code.

This document is geared towards helping the student develop an approach to problem-solving which is simple and non-technical. The phobia of computer programming often lies in the fact that it is seen as a highly technical skill and many people are techno-phobic. The approach outlined herein is one that anyone can follow. One will notice that in the early chapters, the emphasis is more on finding a solution than on writing formal algorithms.
The formal way to write an algorithm is presented in Chapter 4. The idea here is not to minimize in any way the importance of formal algorithms (whether in pseudocode or flowchart), but rather to get students to focus on defining the problem and figuring out how to solve it, without having to worry about the formal representation of their solutions at the same time.

It is like teaching a baby how to talk. We do not introduce babies to formal sentences from day one. They first learn words, then we teach them how to put multiple words together to facilitate meaningful conversations. So, for example, a baby will learn the words “Mommy” and “milk” and will communicate to us in the following way: “Mommy milk”. This is perfectly understandable to a mother. The baby is asking for some milk. A pedantic linguist will say that this is not a proper sentence; it is simply two nouns. However, from the mother’s point of view, this is an important step in the communication process. As the child grows, his/her language will become more refined as he/she learns more words and how to form sentences. Ultimately, the child will learn formal English and be able to communicate using perfectly phrased sentences.

When it comes to teaching programming, we should adopt a similar nurturing approach. We should not expect students to grasp everything about problem-solving in a short time. We do not want students to become bogged down with semantics and refinements in the initial stages of the process. We want them to be able to take a problem, define it, do a manual solution and then write down the instructions in their own words. Once they have become comfortable with devising solutions, we then show them how to refine their solutions and ultimately how to choose the best solution and write it in a formal way. This is what algorithm development is all about.
This is a simple, step-by-step approach, which hopefully, will become a habit after a while. It is very important that teachers spend a considerable amount of time on the problem-solving section of the syllabus. It is recommended that teachers spend between 15 – 20 hours on problem-solving and algorithm development. The majority of the time should be spent on practicing a wide range of programming problems. Problems should be chosen with a view to exposing the students to various programming features, such as, condition statements, loops, the use of sentinel values, list processing, modularization, and so on. It is only through constant application of problem-solving principles that students will learn how to design effective algorithms. Solving problems requires logical thinking. Unfortunately, a teacher cannot teach a student how to think logically. What the teacher can do is provide the student with a set of guidelines that must be followed in order to decipher a problem and figure out how to solve it. Human beings all have the capacity to think logically, some with a greater capacity than others. The ones with the greater capacity are the ones who will be able to figure out the solutions quickly. Others may take a lot longer, but, given time, they will be able to figure it out also. This document outlines a disciplined approach to problem-solving. By ‘disciplined’, we mean that the steps outlined must be followed every time the student is faced with a problem. Through constant practice, the student will find that problem-solving/programming is not so difficult after all. As with all subjects, teachers should endeavour to approach problem-solving with enthusiasm and try to stimulate interest in the subject matter by choosing simple problems that the students can relate to. Problem-solving classes should be highly interactive. They should not be “chalk and talk” affairs. 
Start off the subject by discussing the solutions to everyday problems and try to involve every student in the class. Some examples are given in the Introduction. Interactive sessions have many advantages, but can also be disadvantageous sometimes, as some students tend to get carried away with the discussions and, before you know it, the period has ended and the substantive material for that session has not been covered. It is tempting to compensate by resorting to “chalk and talk” in the next session.

The problem with “chalk and talk” in a problem-solving course is that you do not get the students’ full attention, because they are too busy writing notes. They cannot learn to write algorithms by reading class notes. They have to learn through practice. They also learn from making mistakes and figuring out how to recover from those mistakes. This is why it is so important that the students be allowed to interact in the problem-solving sessions, with the guidance of the teacher.

One tried and proven way to get the students’ full attention is to prepare the relevant class notes as handouts and give them out after the class. Let the students know ahead of time that you will be giving them the handouts, so that it is not necessary for them to take copious notes during the session. Use the class sessions to work examples and point out pitfalls. Follow the simple step-by-step approach and involve the students in the discussion of the various solutions.

Finally, keep the problems simple. We want students to see how easy it is to develop solutions. We do not want them to become bogged down with complex problems. This could be a deterrent and will only serve to feed the phobia of programming. Remember, the course is introductory. We do not expect the students to become expert programmers upon completion of this course. The course will give the students a solid foundation in problem-solving, upon which they can build.
Those who have the aptitude for programming can go on to become expert programmers; others can apply the same problem-solving skills to any real-life problem that they might encounter later on in their studies or careers.

INTRODUCTION

The Role of the Computer Programmer

Computers are designed to solve problems speedily and accurately. There is no problem that can be solved by a computer that could not be solved by humans as well. It is just that it would take considerably longer for humans to solve it and the degree of accuracy would not be the same.

Although computers are used to solve problems, they do not have brains. They cannot think. They cannot reason, although recent advances in artificial intelligence may seem to suggest otherwise. A computer is a moron: it simply does exactly what we tell it to do. This is why the role of the computer programmer is so important.

There is nothing magical about the way computers solve problems. A computer simply follows (executes) a set of instructions given to it by the programmer and produces the specified results. The computer programmer creates the instructions for the computer to follow. If the computer produces undesirable results, it is not the fault of the computer; it is the programmer’s fault.

As an analogy, consider the case of a food processor. The food processor has buttons that indicate the various functions that can be performed: chop, grate, puree, liquefy, and so on. Let’s suppose that you want to chop some carrots to make a stew. You place the carrots in the food processor and press the chop button, but instead of chopping the carrots, the food processor purees them. The result is a stew that looks like mush. Who would you blame for this undesired result: the food processor, the designer or yourself (the user)?
In this case, the user did what he was supposed to do (that is, press the chop button), and the food processor did what it was supposed to do (that is, carry out the instructions associated with the chop button). The designer/manufacturer is at fault here. Clearly, the incorrect instructions were linked to the chop button.

In a similar way, a computer carries out the instructions given to it by the programmer. The programmer must, therefore, ensure that the correct instructions are given at all times, and that the instructions are precise and unambiguous. Otherwise, the results might be undesirable and, in some critical situations such as airline navigation, the result could be fatal.

How are Instructions given to the Computer?

Instructions are given to the computer in the form of computer programs. A computer program is a finite set of precise instructions, written in a programming language. Before we write a computer program, we first have to find a way to solve the problem at hand. After we have figured out how to solve the problem, we then translate the solution into a language that is meaningful to the computer.

Giving instructions to a computer can be challenging at times and requires a certain amount of skill. This is because giving precise, unambiguous instructions is not inherent in human nature. Humans tend to make assumptions when giving instructions and they expect other humans to reason things out in order to get to a logical conclusion.

Consider the problem of giving directions to someone to get to the nearest post office starting from point A. A possible set of instructions might be:

1. Proceed a mile or so down the road until you reach the roundabout.
2. Turn left at the roundabout and follow the road until you see a green house on the right-hand side.
3. The post office is about the 3rd or 4th building on the right after the green house. You’ll see the sign in front, you can’t miss it.
To the average person, the above instructions may appear to be clear and straightforward. However, once you start following the instructions, you may find that vital pieces of information have been omitted and that other bits of information are not as precise as they could have been. For example, one might discover (as is often the case in such situations) that there is a junction or a fork in the road before one reaches the roundabout. This vital piece of information was omitted, so when the person arrives at the junction he/she has to decide which way to proceed. Do I turn right, turn left or proceed straight ahead? Making the wrong decision could lead to all sorts of consequences.

Let us examine the instructions in some detail. They are imprecise and ambiguous in many respects.

Consider the first instruction, “Proceed a mile or so down the road”. The phrase “a mile or so” is imprecise. How far should the person have to walk or drive before arriving at the roundabout? One person might assume 1½ miles, another may assume 2 or 2½ miles. No one would expect it to be 5 or 6 miles. So what if there is no roundabout in sight after 4 or even 5 miles? The person will be left to make a decision based on intuition. Another ambiguity in instruction 1 is “down the road”. Which way is down? Is it to the person’s right or to the person’s left? Is it northwards or southwards?

In the second instruction, what if there is more than one green house on the right? The instruction should state precisely whether it is the first, second or third green house that is being referred to.

In the third instruction, it might not be difficult for a person to figure out whether the post office is the 3rd or the 4th building. However, it would be impossible for a computer to execute an instruction written in this form. Computers must be told exactly what they must do, in the correct sequence. We call the set of instructions an algorithm.
Algorithms have four very important attributes.

1. They must be precise.
2. They must be unambiguous.
3. They must be finite, that is, terminate after a finite number of steps.
4. The instructions must be in a logical sequence.

These four attributes must be underscored and the teacher should ensure that students have a good grasp of what these attributes mean. This can be achieved by allowing the students to engage in problem-solving exercises involving everyday problems that they can relate to. This is a good way to de-mystify the problem-solving process.

It must be emphasized that writing a program is simply a formal way of giving instructions to someone (in this case a computer) to perform a particular task. The only difficulty in writing a program is in knowing how to solve the problem. If you do not know how to get to the post office, you would not be able to give directions to someone. Likewise, a programmer must figure out a way to solve the problem before he/she proceeds to tell the computer what it should do.

Figuring out how to solve a problem takes practice and skill. The subsequent sections will present a formal approach to problem-solving. Students should be required to adhere to the rules/approach specified therein for all programming problems. It is only through constant practice and a disciplined approach that the phobia of computer programming can be overcome.

Problem-solving on the Computer

The design of any computer program involves two major phases:

1. The Problem-Solving Phase
2. The Implementation Phase

The problem-solving phase comprises the following steps:

Step 1: Define the problem
Step 2: Find a solution to the problem
Step 3: Evaluate alternative solutions
Step 4: Represent the most efficient solution as an algorithm
Step 5: Test the algorithm for correctness
The implementation phase comprises the following steps:

Step 1: Translate the algorithm into a specific programming language
Step 2: Execute the program on the computer
Step 3: Maintain the program

The details of what is done in each of the above steps will be outlined in subsequent chapters.

Class Exercise

Formulate problem statements for simple everyday tasks, then ask each student to write instructions/algorithms for each task. Ask two or three students to write their solutions on the board. Then ask the class to give a critical analysis of each solution, with respect to the four stated attributes of an algorithm. They must identify any ambiguous or imprecise statements, identify any omissions and comment on the logic of the solution. To make it more interesting, you can ask each student to follow his/her algorithm, as written, to see if the desired result is achieved. The results should be reported to the class at a later date. Some examples of everyday problems are given below:

1. Write a recipe for making a cheese omelette.
2. Write instructions to teach your mom how to retrieve voice messages from a generic cell phone.
3. Write instructions to give directions to a visitor to get to the nearest hospital, starting from the school premises.
4. Write instructions to tell a novice how to download music from the Internet.

Chapter 1
DEFINING THE PROBLEM
Problem-Solving Step 1

1.0. Understanding the Problem

Defining the problem is the first step towards solving a problem. It is one of the most important steps in problem-solving, as it leads to a clearer understanding of what is given and what is required. If the programmer does not fully understand what is required, he/she cannot produce the desired solution. Many students tend to overlook this stage of the problem-solving process and dive right into the algorithm or sometimes even the program code, much to their peril.
Much of the frustration experienced by junior programmers is due to a misunderstanding of the program requirements. Sometimes improperly specified problem statements can lead to such misunderstandings. It is therefore necessary that the teacher pay careful attention to how programming problems are worded. Here is an example of an improperly constructed problem statement:

Write a program that prints a list of all students in the class who will be celebrating their birthdays during the month of May.

The problem with this statement is that it is ambiguous and could lead to several different interpretations. For example, one student might assume that the program should determine all the students who were born in May, while another might assume that the program should also take into account those students whose birthdays fall in April but whose birthday parties or celebrations are held in May. A more precise way of stating the problem would be:

Write a program that prints a list of all students in the class who were born in the month of May.

One of the biggest challenges that beginner programmers face is that of understanding the problem they are asked to solve. Defining the problem is a way to help the programmer understand what he or she is required to do. It involves decomposing the problem into three key components:

1. what is given (that is, the inputs),
2. the expected results (that is, the output),
3. the tasks that must be performed (that is, processing).

1.1. The Defining Diagram

A formal approach to defining a problem is to construct a defining diagram. A defining diagram is a table with three columns, which represent the three components: input, output and processing.

The input is the source data provided. The input can be easily identified by the keyword that precedes it: given, read or accept.

The output is the end result required. Keywords that help identify the output are print, display, produce and output.
The processing column is a list of the actions that are to be performed to achieve the required output. This is usually the most challenging part of the problem definition. If it is done properly, writing the algorithm will be fairly straightforward. If the student is unsure of exactly what goes under the processing column, the student should ask himself the following question: “What do I have to do with the inputs in order to produce the desired output?” The answer to this question is essentially what should be listed in the processing section.

Let us now look at a few simple examples of defining a problem.

Problem 1

A program is required to read three (3) numbers, then calculate and print their total.

Defining Diagram:

INPUT                     PROCESSING                  OUTPUT
3 numbers,                1. Read/get 3 numbers       Total
say num1, num2, num3      2. Add numbers together
                          3. Print total

The first step is to identify the input (that is, the data that is given). The keyword, read, identifies the input as three numbers, that is, any three numbers. We need to find a way to refer to each of these numbers. We can call them A, B and C, or we could refer to them as num1, num2 and num3, as illustrated in the defining diagram. Any name will do as long as we are consistent when referring to the numbers by name.

The next step is to identify the output. The keyword, print, identifies the output as the total (or sum) of the three numbers.

The final step in defining this problem is to list the processing steps. Here, we list all the actions that must be performed in order to get the desired results. What do we have to do to the three numbers in order to print their total?

1. We must first get the numbers.
2. We must then calculate their sum.
3. We must then print the total.

It is very important to note that at this stage, we are not writing the algorithm, so we need not be concerned about the details of how each action is performed.
That is, in defining the problem, we do not need to worry about how the total is calculated; we just need to know that calculating the sum is an action that must be performed. The details of how the actions are performed are the subject of the algorithm development process. At this stage, the focus is on understanding the problem.

Note that:

1. In the defining diagram, the actions must be listed in a logical sequential order.
2. All the necessary actions must be explicitly stated. For example, the read action and the print action must not be assumed.
3. The processing section is NOT the solution to the problem. It is simply a list of the things that must be done in order to solve the problem. Later on we will proceed to write an algorithm to tell the computer how to solve the problem.

In some problems the input, output and processing parameters might not be stated as explicitly as in the one above. Let us look at an example of such a problem.

Problem 2

Given three integers representing the age of three boys respectively, write a program to find their average age and also determine the age of the oldest boy.

In this example, the input data is explicitly stated, but the required output is not; it is implicit in the clause, “find their average age”. The value(s) that you are asked to find should always be reported (displayed). The problem statement could have been stated more precisely to read:

Given three integers A, B, C, representing the age of three boys respectively, write a program to find and display their average age as well as the age of the oldest boy.

In this problem, there are two major tasks to be performed. Each task consists of multiple actions. The actions to be performed are not all explicitly stated in this problem. This is typical of many programming problems. The task of finding the average involves multiple actions. So does the task of finding the highest age.

Defining Diagram

INPUT                     PROCESSING                              OUTPUT
3 integers,               1. Read/accept/get 3 integers           Average-age
say age1, age2, age3      2. Find the average of the 3 integers   Highest-age
                          3. Find the highest age
                          4. Print average, highest age

Let’s look at a more complex problem.

Problem 3

The cost of a new car is the sum of the wholesale cost, the local sales tax and the dealer’s percentage mark-up. Assuming the dealer’s mark-up is 10 percent of the wholesale cost and the sales tax is 6 percent, design a program to read a car ID (an integer value) and the wholesale cost of the car (a real value) and print the car ID and the cost to the consumer.

Here, the input and output data are clearly stated. To arrive at the processing steps, look at what is given and what is required, then ask: “What should I do with the wholesale cost in order to find the cost to the consumer?”

One rule of thumb to remember is that ALL the information given in a problem statement should be taken into account when formulating a problem solution. Problem statements do not usually contain redundant information. It means, therefore, that in most cases, all the information given is necessary for solving the problem. To find the cost to the consumer we must therefore apply the wholesale cost, the 10 percent dealer’s mark-up and the 6 percent sales tax. The defining diagram would look something like this:

INPUT               PROCESSING                                  OUTPUT
Car-ID,             1. Read/get wholesale-cost                  Car ID
Wholesale-cost      2. Calculate the dealer’s mark-up           Consumer-cost
                    3. Calculate the sales tax
                    4. Find the sum of the wholesale-cost,
                       the dealer’s mark-up and the sales tax
                    5. Print results

1.2. The Problem with Problem Specifications

If problem statements are properly specified, defining (or understanding) the problem would be fairly straightforward. Unfortunately, many real-world programming problems are not precisely stated. Initial descriptions of such problems are often vague and sometimes ambiguous. This is because the person posing the problem often does not know how to solve it on a computer.
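Problem 3 itself gives a small taste of this. The statement says the sales tax is 6 percent but does not say of what. The sketch below, written in Python purely for illustration, computes the consumer cost under both readings. The function name `consumer_cost`, the `tax_on_markup` switch and the sample car ID and wholesale cost are all hypothetical, and neither reading is confirmed by the problem statement.

```python
MARKUP_RATE = 0.10      # dealer's mark-up: 10 percent of the wholesale cost
SALES_TAX_RATE = 0.06   # sales tax: 6 percent (the taxed amount is not stated)


def consumer_cost(wholesale_cost, tax_on_markup=False):
    """Return wholesale cost + dealer's mark-up + sales tax.

    Assumption: with tax_on_markup=False the tax is charged on the
    wholesale cost only; with True it is charged on wholesale cost
    plus mark-up. The problem statement does not say which applies.
    """
    markup = MARKUP_RATE * wholesale_cost
    taxable = wholesale_cost + markup if tax_on_markup else wholesale_cost
    sales_tax = SALES_TAX_RATE * taxable
    return wholesale_cost + markup + sales_tax


# Hypothetical sample input data
car_id = 1001
wholesale_cost = 20000.00

print(car_id, round(consumer_cost(wholesale_cost), 2))
print(car_id, round(consumer_cost(wholesale_cost, tax_on_markup=True), 2))
```

For this sample data the two readings give 23200.0 and 23320.0 respectively, so the choice matters and is worth clarifying before a solution is written.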
Those who pose such problems are often users who simply need a computerized solution. They do not care how it is done. It is the job of the programmer to seek the necessary clarifications before embarking upon a solution. The programmer should investigate the user’s need by asking the relevant questions and then fine-tuning the initial problem specifications to ensure that they are precise and clearly defined.

Students should therefore be encouraged to evaluate problem statements and ask questions if there are perceived ambiguities before they proceed. They should not make assumptions about what is required. The fact that a given problem is found in a textbook does not necessarily mean that it is properly specified. Remember also that some information may be implicit in the problem statement and such information must be taken into account when defining the problem.

A useful exercise is to give the class a series of problem statements, ranging from simple to complex, some precise, some ambiguous, some lacking pertinent information. Ask the students to evaluate each problem statement and determine if it is properly specified. If it is, they should proceed to define the problem. If it is not, they should suggest how it could be improved, whether by adding more information or rephrasing the statement to improve clarity. Here are a few examples of imprecise problem statements:

Problem 4

Write a program to print a list of all employees who have been in the company for over five years.

Problem 5

Write a program that reads a file that contains information on the height, weight and age of 100 children. The program should print the names of all the children who are overweight.

Chapter 2
FINDING A SOLUTION TO THE PROBLEM
Problem-Solving Step 2

2.0. Introduction

Now that we have defined the problem, we know what we need to do. We now have to figure out how to do it. A problem can have many different solutions. For beginner programmers, the most important thing is to arrive at a solution that works.
The initial solution might appear to be clumsy or long-winded; however, what is important at this stage is that it works. Once a solution is found, we can then review it to see how it can be optimized or made more efficient. We may have to go through a series of refinements in order to get the most efficient solution.

The first thing to do in deriving a solution is to do the problem by hand, noting each step as you proceed. Some problems might give sample input data, others might not. In cases where no sample input data is given, the student should create his/her own, based on the information given. For example, let us revisit the problem of finding the average of a set of numbers.

Problem 6

Find and print the average of three numbers.

1. Define the problem

INPUT                     PROCESSING                OUTPUT
3 numbers,                1. Read/get 3 numbers     Average
say num1, num2, num3      2. Find the average
                          3. Print average

2. Create sample input data

Sample input data: 5, 3, 25

3. Execute each processing step in the defining diagram manually.

Get the three numbers: 5, 3, 25

The next step says find the average. How do we find the average? Manually, we add the numbers together:

5 + 3 + 25 = 33

Then we divide the result by 3, that is:

33 ÷ 3 = 11

The next step in the processing section says, “print average”. This is where we display the result that we obtained above.

We have now completed a manual solution to the problem. The next step is to write your solution as a sequence of instructions.

Initial Solution

Get the first number, call it num1
Get the second number, call it num2
Get the third number, call it num3
Add num1 + num2 + num3
Divide result by 3
Print result
Stop

Intuitively, this solution works. However, when we write solutions for the computer, we must remember that computers do not have intuition, so we cannot assume anything. We have to state the obvious.
For example, in our initial solution, we gave an instruction to the computer to add num1, num2 and num3, but we have not told the computer where to put the result of the add operation. The result must be stored somewhere so that it can be accessed later (that is, in the next statement). This leads us to the concept of variables.

2.1. The Concept of Variables

Consider the following recipe. It’s grandma’s secret recipe for making fluffy, delicious, old-fashioned pancakes.

Ingredients:

4oz plain flour, sifted
pinch of salt
2 eggs
200ml milk mixed with 75ml water
2oz butter
caster sugar
juice of one lemon

Method

Sift the flour and salt into a large mixing bowl. Make a well in the centre of the flour and break the eggs into it. Then begin whisking the eggs with a wire whisk or fork. Next gradually add small quantities of the milk and water mixture, still whisking.

Melt the butter in a frying pan. Spoon 2 tbsp of it into the batter and whisk it in, then pour the rest into a bowl and use it to lubricate the frying pan. Now get the pan really hot, and then turn the heat down to medium. Spoon about 2 tbsp of the batter into the hot pan. As soon as the batter hits the hot pan, tip it around from side to side to get the base evenly coated with batter. It should take only half a minute or so to cook. Flip the pancake over with a spatula (the other side will need a few seconds only), then simply slide it out of the pan onto a plate.

To serve, sprinkle each pancake with freshly squeezed lemon juice and caster sugar.

Note the words highlighted in the recipe: mixing bowl, frying pan, bowl, plate. These all represent storage or containers for holding the batter, the butter and batter, the oil and the cooked pancakes, respectively. Just as we need containers to hold or store the ingredients of this recipe, when we perform computations we need something to store or hold the values that we manipulate. In the computer, values are stored in memory locations.
There is a large number of storage locations in memory, so in order to keep track of where our values are stored, we need to place an identifier or label on a particular memory location so that we will know what is stored therein. The label or identifier is called a variable.

Definition: A variable is a symbolic name assigned to a memory location that stores a particular value.

In our average problem above, when we say, “get a number, call it num1”, we are actually defining a variable or an identifier for each number, so that we can refer to (or access) it later. So, in fact, we have assigned variables (identifiers) to the three input values, namely, num1, num2 and num3, respectively. Num1 identifies the first number, 5. Num2 identifies the second number, 3, and num3 identifies the third number, 25.

There are other values in the program for which we need variables. For example, we computed the sum of num1, num2 and num3, but we did not tell the computer where exactly to store the result. We need a variable for this purpose. So the correct instruction is as follows:

Add num1 + num2 + num3, storing in sum

This tells the computer that the result of the computation should be stored in a memory location called sum. If we omit this vital piece of information, the computer will perform the computation but will not store the result in memory, resulting in an incorrect answer.

Likewise, in the last two statements in the initial solution, we said,

Divide result by 3
Print result

There are two problems here. Firstly, we did not tell the computer to store the result of the division. Secondly, there is ambiguity in the statement, “print result”. To which result are we referring? Is it the result of the addition operation, or is it the result of the division operation? We need to be more precise. Remember, the computer does not possess reasoning skills.

Computations are performed by the arithmetic logic unit (ALU), which is part of the CPU.
Results of computations are temporarily stored in registers within the CPU. Any intermediate or final values that we need to access later in the program must be stored in locations in memory. This means that such values must be identified by variable names.

The label for the memory location is called a variable for a good reason. The word variable is derived from the verb “to vary”. This means that the value stored in a particular location can change from time to time, although the label remains the same. In the recipe above, if we wanted to make another batch of pancakes (because the first set was consumed rather quickly), we would simply pour another set of the ingredients into the same mixing bowl (that is, the same storage location) and put the second batch of pancakes on the same plate. Likewise, in our average problem, if we wanted to find the average of another set of numbers within the same problem, the input statement would be:

Get num1, num2, num3

This time, the same variable names are used, but the variables store different numbers. For example, let’s say that the second set of numbers is 12, 9, 44. This time, num1 would identify the location in which 12 is stored, num2 would identify the location which stores 9, and num3 would identify the location which stores 44.

When new values are placed into previously assigned memory locations, the old values are replaced by the new ones. Therefore, if it is necessary to retain the existing value in a memory location, a different variable must be declared to hold the new data value.

It is always helpful to illustrate the concept of a variable by using diagrams to show the memory locations associated with each variable name. For example,

Variables     Memory Locations
num1          5
num2          3
num3          25

Let us revise our initial solution to include two new variables called sum and average.

Get three numbers, say num1, num2, num3
Add num1 + num2 + num3, storing in sum
Divide sum by 3, storing in average
Print average

2.2.
Choosing Variable Names

It is good practice to choose variable names that reflect the kind of data being stored. It helps the programmer as well as the reader to understand the solution better if the variable names reflect what they store. For example, the variable name sum indicates quite clearly that a total value is stored in the memory location called sum. If, instead, we used the variable name X to store the sum, the name would not clearly convey the contents to the reader of the solution, and it would make debugging and program maintenance more difficult.

Most programming languages have strict rules regarding variable names; for example, they must begin with an alphabetic character, must not exceed a certain length, and so on. While we are not concerned with the syntax of the programming language at this stage, it is necessary to note this, so that students will develop the habit of choosing appropriate and meaningful variable names during the problem-solving phase.

It is imperative that students grasp the concept of variables very early in the problem-solving phase. They must understand that variables are associated with memory locations and, as such, manipulation of variables will result in changes in the values stored in memory.

2.3. A More Complex Problem

The average problem was a fairly simple problem, the solution of which was straightforward; however, some refinement is needed to make the solution more efficient. This will be discussed in the next chapter. We will now look at a far more complex problem to see how we can apply the principles outlined above to arrive at a solution.

Problem 7
An architect’s fee is calculated as a percentage of the cost of a building. The fee is made up as follows: 8% of the first $5,000 of the cost of a building and 3% on the remainder, if the remainder is less than or equal to $80,000, OR 2.5% on the remainder, if the remainder is more than $80,000.
The architect has hired you to design a program that will accept the cost of a building and calculate and display the architect’s fee.

In this problem there are a number of items of information that must be taken into account. Remember, every bit of information in the problem statement is relevant to the solution. The first step is to define the problem to get a better understanding of what is required.

Step 1: Draw a defining diagram as follows:

INPUT            PROCESSING                       OUTPUT
Building cost    1. Get building cost             Architect’s fee
                 2. Calculate architect’s fee
                 3. Display architect’s fee

Step 2: Create sample input data

A look at the defining diagram tells us what we have and what we need to do. However, it doesn’t tell us how to do it. To figure out how to do it, we need to use all the information given in the problem and do a manual solution. Before we do a manual solution, we first need to create sample input data.

Sample input data: $100,000

Why choose $100,000 and not $100.00? Even if we have no idea of building costs, we can deduce from the information given that the cost is likely to be much greater than $5,000, since it is possible that the remainder could be in excess of $80,000. So any value in excess of $5,000 would be reasonable for this problem.

Step 3: Do a manual solution of each processing step

1. Get building cost: 100,000
2. Calculate architect’s fee: HOW? We must ask, “What do you have to do with the input, in order to get the output?” According to the information given, the architect’s fee comprises two components.
3. The first component is the value of 8% of 5,000. That is, 400.
4. We then need to subtract 5,000 from the input value to find the remainder. That is, 100,000 – 5,000 = 95,000
5. The next component is either 3% of the remainder or 2.5% of the remainder. We then check the remainder (that is, 95,000) to see whether or not it is less than or equal to 80,000. In this case, it is not.
6. Check to see if the remainder is greater than 80,000.
In this case, it is (95,000 is greater than 80,000), so according to the information given, we should calculate the next component of the architect’s fee by finding 2.5% of 95,000, which is 2,375.
7. Having found both components, we then add them together to find the total architect’s fee. That is, 2,375 + 400 = 2,775.

Having completed a manual solution to the problem, we are now in a position to formalize our solution. The next step is to write the solution as a sequence of instructions. Remember to use variables to store values that we obtain or results that we compute. We need to define the following variables:

building-cost, architects-fee, remainder, first-component, second-component

Initial Solution:
1. Get building-cost
2. Set first-component to (8/100) * 5,000
3. Set remainder to building-cost – 5,000
4. If remainder <= 80,000 then set second-component to (3/100) * remainder
5. If remainder > 80,000 then set second-component to (2.5/100) * remainder
6. Set architects-fee to first-component + second-component
7. Display architects-fee
8. Stop.

The solutions outlined above seem to be very wordy. They contain many English-like statements. In Chapter 4, we will look at the correct way to represent our solution. For now, the focus is on getting a working solution in a language that we are familiar with. Let us look at one more example before we move to the refinement stage.

Problem 8
Given a list of 20 integers, design a program to count the number of integers in the list that are larger than the first integer in the list. Display the first value and the number of values that are greater.

1. Define the problem:

INPUT      PROCESSING                                    OUTPUT
number     1. Get first-number                           First-number
           2. Compare each number with first-number      Total-greater
           3. Count numbers greater than first-number
           4. Print total-greater

There are three things to note in this problem:
(i) Certain steps must be repeated until the last number is processed.
(ii) We do not use a repetition or loop structure in the defining diagram. The defining diagram simply states what must be done and a note is made to indicate that some steps must be repeated.
(iii) The defining diagram is not the solution. It is a tool to assist in understanding the problem.

Repetition is implied in the diagram, that is, “compare each number”.

2. Create sample input data:

The input list contains 20 data values, but we do not need to write down all 20 integers for our manual solution. We could choose, say, about 5 integers. If the solution works for 5 integers, it should work for 20, or even 1000 integers. So let us assume that our input list is as follows: 7, 6, 9, 23, 5.

3. Do a manual solution:

Get first number: 7
Get next number: 6
Is 6 greater than first number (7)? No
Get next number: 9
Is 9 greater than first number? Yes, then count = 1
Get next number: 23
Is 23 greater than first number? Yes, then count = 2
Get next number: 5
Is 5 greater than first number? No

We have reached the end of the list. Now all we have to do is display the number 7, which is our first number, and the value of count, which is 2.

4. Outline solution as a sequence of instructions:

In problems such as this, which involve repetition of a sequence of instructions, we can simplify the steps by using a structure called a loop. If we look at the manual solution above, we will notice that there are several steps that are repeated:

Get next number
Is number greater than first number……

We could encapsulate these statements within a loop as follows:

Initial Solution:
1. Get first-number
2. Repeat the following 19 times:
   a. get next-number
   b. if next-number > first-number then increment count
   end-repeat
3. Print first-number, count
4. Stop.

2.4. Initialization of Variables

There is one seemingly small problem with the above solution. The variable count does not have an initial value. Why does it need an initial value?
In order to increment (that is, add one to) a variable, it must have had a value in the first place. If there is nothing in the storage area (nothing does not mean 0; 0 is an actual value), we cannot add one to nothing. We cannot increment the number of cookies in a cookie jar if it is empty. We can place a cookie in an empty jar, and then increment the number of cookies by adding cookies, one at a time. Likewise, we must place an initial value of 0 in the memory location called count and, as we count a number, we increment the value of count. So, we need to revise the solution above to include the following as the first statement:

count = 0

In computer programming terms, the variable count is referred to as a counter. Variables that are used as counters or used to store totals should always be assigned an initial value of 0 before they are incremented.

Initialization may appear to be a trivial step, but it is very important. Many programs have failed to work correctly simply because the programmer forgot to initialize a counting variable.

Rule of Thumb:
A simple rule of thumb with respect to initialization is: In the solution, if a variable appears on the right-hand side of an assignment statement before a value is assigned to it, then it must be assigned an initial value before the assignment statement is executed.

2.5. Summary

In this section, we developed a systematic approach for arriving at a solution to any given problem:
1. check the processing steps in the defining diagram to see what needs to be done, then
2. create sample input data
3. carry out a manual solution
4. write a sequence of instructions, based on the manual solution.

Often the initial solution might not be very efficient and may appear to be clumsy, but what is important is that the solution works. Once we arrive at a working solution, we can then proceed to optimize the solution through a series of refinements, to make it more efficient.
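As a hedged illustration of the approach summarized above, the revised solution to Problem 8 (with the counter initialized to 0) might be sketched in Python as follows. Python is an assumption here, since this guide itself works in pseudocode; the function name count_greater and the use of a list in place of keyboard input are likewise illustrative choices. The sample data is the five-integer list from the manual solution.

```python
def count_greater(numbers):
    """Count how many values after the first are greater than the first."""
    first_number = numbers[0]          # Get first-number
    count = 0                          # initialize the counter before incrementing
    for next_number in numbers[1:]:    # repeat for each remaining number
        if next_number > first_number:
            count = count + 1          # increment count
    return first_number, count


first, total_greater = count_greater([7, 6, 9, 23, 5])
print(first, total_greater)
```

With the sample list this prints 7 2, which agrees with the manual solution. Removing the line count = 0 makes Python stop with an error at the first increment, which is one concrete way of seeing why the counter needs an initial value.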
The next chapter illustrates how solutions can be refined.

Chapter 3
EVALUATE ALTERNATIVE SOLUTIONS
Problem-Solving Step 3

“There’s more than one way to skin a cat”.

3.0. Introduction

Having arrived at an initial solution to the problem, the programmer should not rest on his laurels, but instead should explore alternative solutions. The aim is to arrive at the most efficient solution. Usually, the initial solution is the first method that comes to mind, but it is not always the best solution. This is especially the case with more complex problems. For most simple problems, like the problem involving average in the previous section, the solution is trivial and there may be very few alternative ways of doing it. Nevertheless, alternative ways must always be explored so that an informed decision can be made about which solution is the most efficient.

Here are some points to consider when developing alternative solutions:

   Can you derive the result differently?
   Can you make the solution more general? For example, would it work if there were 100 integers, instead of 3?
   Can you use the solution or method for another problem? For example, average temperature or average grade?
   Can you reduce the number of steps and still maintain the logic?
   Can you make it more robust? Would it work properly if incorrect data were entered?

Let us revisit our problem involving average and see if we can find a better way of doing it. Recall our initial solution:

1. Get the first number, call it num1
2. Get the second number, call it num2
3. Get the third number, call it num3
4. Add num1, num2, num3, storing in sum
5. Divide sum by 3, storing in average
6. Print average
7. Stop

Can you reduce the number of steps and still maintain the logic?

A look at the first three statements shows that we are reading the input values one directly after the other.
Instead of using three separate statements, we could use one statement to read the three values as follows:

Get num1, num2, num3

This is a much more efficient way because, by reducing the number of statements, we are reducing the number of CPU cycles required, thereby resulting in faster execution of the program and better utilization of system resources. This should be the aim of every good programmer.

We can further reduce the number of statements by combining the two arithmetic statements into one. Let’s rewrite the arithmetic statements in a more concise form:

Add num1, num2, num3, storing in sum

can be written more succinctly as:

sum ← num1 + num2 + num3

which reads, “the variable sum is assigned the value of num1 plus num2 plus num3”. Likewise,

Divide sum by 3, storing in average

can be written as:

average ← sum ÷ 3

So instead of two statements:

sum ← num1 + num2 + num3
average ← sum ÷ 3

we can combine them as follows:

average ← (num1 + num2 + num3) ÷ 3

Here, we have not only reduced the number of statements, we have also eliminated one variable. This means that we are using less memory. Less memory implies greater efficiency. So let us rewrite our new and improved solution to the average problem:

Revised Solution:
1. Get num1, num2, num3
2. average ← (num1 + num2 + num3) ÷ 3
3. Print average
4. Stop.

Wow! We have reduced the number of statements from seven to four and have reduced the number of variables from five to four, without changing the logic of the solution. This is essentially what refinement entails.

Can you derive the result differently?

In refining our solution, we looked at an alternative way to write our input and arithmetic statements. We need also to look at alternative ways of finding the average. Is there another method that can be used to find the average of three numbers? Maybe not, in this particular case, but as a general rule, alternative methods should always be explored.

Can you make the solution more general?
For example, in the problem involving average, we were asked to find the average of just three integers. Suppose the requirements change and we are asked to find the average of 100 integers. Would the same algorithm work or would we have to write a different algorithm? The algorithm above was written specifically to read three integers, so we used three variables, num1, num2 and num3, one for each integer. Now we need to find the average of 100 integers. Clearly, the algorithm above would not work for 100 values. If we revise the algorithm by declaring 100 variables, num1, num2, ……., num100, and dividing the sum by 100, that would work. However, suppose next time we need to find the average of 50 integers; the algorithm would have to be revised again to declare 50 variables, and so on. There are three problems with this approach.

1. Using so many variables (100, 50) results in very poor utilization of memory.
2. It becomes very cumbersome to write and manipulate so many variables in one statement, for example, get num1, num2, num3, num4, num5, num6, num7, num8, num9, num10, num11, num12, ………, num100. Besides, we’ll run out of paper eventually! What if there were 1000 integers instead?
3. It is very difficult to maintain this algorithm because, each time the number of integers changes, we have to make major changes to the algorithm.

It is, therefore, better to design the algorithm in a general way so that it will respond correctly to any number of integers, whether positive or negative. The way we do this is to use one variable, number, to store the current integer and an entity called N to store the number of integers (say, 3 or 100). We then employ a loop structure that repeatedly reads a value into the variable number and then adds it to the sum. When all the values have been read and summed, we exit the loop and compute the average.
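For readers who would like to see the idea run, here is a minimal sketch in Python of the loop just described. Python is an assumption (this guide works in pseudocode), as are the function name average_of and the use of a list to stand in for values typed at the keyboard; N is set to 3 only so that the sample data from Problem 6 can be reused.

```python
N = 3  # number of integers to average; treated as a value that does not change


def average_of(values, n=N):
    """Add the first n values one at a time, then divide the sum by n."""
    total = 0                      # the running sum must start at 0
    for number in values[:n]:      # repeat n times: get the next number
        total = total + number     # add number to the sum, storing in the sum
    return total / n


print(average_of([5, 3, 25]))
```

Only one variable, number, holds the current value, however many values there are. The sketch assumes that at least n values are supplied; it does not check for that.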
This approach is illustrated below:

Set N to 100
Set sum to 0
Repeat N times
    Get number
    Add number to sum, storing in sum
End-repeat
Average = sum ÷ N
Display average
Stop.

Notice that in this version of the algorithm, we use an entity called N to store the number of integers that we want to manipulate. In this case the algorithm computes the average of 100 integers. If we wish to compute the average of 3 integers, all we have to do is make one simple change to the algorithm: that is, change 100 to 3.

Constants:
We refer to the entity N as a constant. Recall that a variable is an area of storage whose value can change during processing. A constant is an area of storage whose value never changes (that is, remains the same) during processing. Constants are very useful in making a program maintainable (that is, easily modified). In the average problem, by simply changing the value we assign to the constant N, we can make our program calculate the average of any given number of integers.

Can you make the program more robust?

A program is said to be robust if it survives various unexpected events, such as incorrect or invalid input data. One way to make a program robust is to validate the input data. This means that the program should always check to see if the correct data has been entered and should have a strategy to deal with unexpected data. One should not assume that the correct data will be entered at all times. Data entry operators, programmers and users in general are all human. We are all prone to making mistakes.

For example, the following program is designed to read students’ test scores and compute the letter grade for each student. It is understood that the maximum test score is 100 and the minimum is 0.

Start
Read ID, test-score
If (test-score >= 80) then
    Lettergrade = ‘A’
Else If (test-score < 80 and test-score >= 70) then
    Lettergrade = ‘B’
Else If (test-score < 70 and test-score >= 60) then
    Lettergrade = ‘C’
Else
    Lettergrade = ‘D’
Display ID, lettergrade
Stop.
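The grading steps above can be sketched in Python roughly as follows. This is only an illustration: Python is an assumption (the guide works in pseudocode), and the function name letter_grade is hypothetical. Like the pseudocode, the sketch deliberately performs no validation of the score. Because each elif is reached only when the earlier tests have failed, the “< 80” and “< 70” checks from the pseudocode are not repeated.

```python
def letter_grade(test_score):
    """Return the letter grade for a test score, with no input validation."""
    if test_score >= 80:
        return "A"
    elif test_score >= 70:
        return "B"
    elif test_score >= 60:
        return "C"
    else:
        return "D"


# The last score is outside the stated range of 0 to 100.
for score in (85, 72, 50, 150):
    print(score, letter_grade(score))
```

Running the sketch prints an A for the score of 150, because nothing in the logic rejects a value above the stated maximum of 100.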
What would happen if the user accidentally enters a test score of 150, instead of 50? What would happen is this: The program would print

2005346 A

It would not recognize the input error and would treat the 150 as a legitimate value for the test-score. The student with the above ID would get a very pleasant and perhaps unexpected surprise when the results are published.

There are two ways in which we could modify the above program to make it more robust.

1. We could insert a statement after the input statement to check if test-score is greater than 100 or less than 0. That is:

Read ID, test-score
If (test-score > 100 or test-score < 0) then
    Display “Error Message: Invalid Test-score”
else
    If (test-score >= 80) then
        Lettergrade = ‘A’.

2. Alternatively, we could insert a test for the upper limit of 100 in the first if statement as follows:

If (test-score >= 80 and test-score <= 100) then
    Lettergrade = ‘A’

Also at the end, instead of “else lettergrade = ‘D’”, we could insert a test for the lower limit of 0 as follows:

If (test-score < 60 and test-score >= 0) then …..

Either of the above adjustments would make the program more robust. Errors resulting from invalid or incorrect input data can have very serious consequences. Consider what would happen if the data entry personnel at the Water Authority entered 600 cubic metres instead of 60 for your water usage last month and you got a bill amounting to 10 times the usual amount; or what would happen if the interest on your credit card balance were computed at 215 percent instead of 21.5 percent. These are the sorts of real-world problems that can result when a program is not written to guard against invalid or incorrect data.

Another Refinement Example:

Let us look at another example to see how we can refine the original solution. The following is the initial solution for calculating the architect’s fee.
1. Get building-cost
2. First-component = (8/100) * 5,000
3. Remainder = building-cost – 5,000
4. If remainder <= 80,000 then
       Second-component = (3/100) * remainder
5. If remainder > 80,000 then
       Second-component = (2.5/100) * remainder
6. Architects-fee = first-component + second-component
7. Display architects-fee
8. Stop

In this example, we cannot combine multiple arithmetic statements as we did in the previous example without affecting the logic of the solution. However, there are other areas that can be refined. This example involves the use of selection structures (that is, the “If” statements). If we trace through the logic of the solution, we can see that both If statements will be executed regardless of the outcome of the first.

Let’s assume that the input data for this problem is 50,000. A trace of the solution would yield the following values:

Remainder = building-cost – 5,000 = 45,000

Step 4 would then be executed (remainder is less than 80,000), giving

Second-component = (3/100) * 45,000 = 1,350

Although we have computed a value for second-component, the solution requires us to execute Step 5 nevertheless. So we would still check to see if remainder is greater than 80,000, even though the previous check had rendered this check unnecessary. While this step does not compromise the logic of the solution, it creates additional, unnecessary work for the processor. Decision statements are taxing on the CPU and therefore must be used wisely. We could eliminate the second If statement and still maintain the correct logic. The following is a more efficient method:

If remainder <= 80,000 then
    Second-component = (3/100) * remainder
Else
    Second-component = (2.5/100) * remainder

Here, we have replaced two consecutive If statements with a single if-then-else statement.

Finally, let’s evaluate the solution for the looping example in Problem 8.
1. Set count to 0
2. Get first-number
3. Repeat the following 19 times:
       Get next-number
       If next-number > first-number then
           Increment count
   End-repeat
4. Print first-number, count
5. Stop.

Is there another way we could do this problem and arrive at the same result? This is left as an exercise for the reader.

3.1. Determine the most Efficient Solution

The next step in the problem-solving phase is to determine the most efficient solution. Having evaluated and refined our solutions, choosing the most efficient solution is fairly straightforward.

When we use the term “efficient” in this context, we do not mean merely which solution makes better use of memory, or which solution is shorter. The most efficient solution should have the following attributes:

1. should be maintainable;
2. should use memory efficiently;
3. should be robust.

Use the above criteria to determine the most efficient solutions for the problems we evaluated in this chapter.

Chapter 4
REPRESENT THE SOLUTION AS AN ALGORITHM
Problem-Solving Step 4

4.0 What is an Algorithm?

So far, we have been developing and evaluating solutions to various problems. We have not been referring to such solutions as algorithms, even though many of them are algorithms. Not every solution, though, is an algorithm. As mentioned in the introduction, algorithms have certain properties, so in order to qualify as an algorithm, a solution should exhibit certain specific attributes.

Definition: An algorithm is a sequence of precise instructions for solving a problem in a finite amount of time.

Properties: An algorithm must:
1. be precise;
2. be unambiguous;
3. give the correct solution in all cases;
4. eventually end.

Precision versus Ambiguity

A description is ambiguous if it is vague or has two or more meanings. A description is precise if it is strictly defined or definite. Consider the following descriptions: (a) a tall person (b) a tall building (c) a tall flower. Each is intelligible, though not very precise.
In these instances, the word tall is vague or ambiguous, although understandable. However, if the goal is precision, we have to make the descriptions clear and definite. For example, we could replace (a) with a 6-foot, 8-inch person and (b) with a 4-storey building.
Abernathy & Allen, “Exploring the Science of Computing”

4.1. Algorithmic Structure

Every algorithm should have the following sections, in the stated order:

Header:       Algorithm’s name or title.
Declaration:  A brief description of the algorithm and the variables used. That is, a statement of the purpose, as well as initialization of variables.
Body:         Sequence of steps.
Terminator:   An end statement.

It is very important that students get in the habit of writing properly structured algorithms at an early stage. If the structure outlined above is strictly adhered to, then it will be very easy to translate an algorithm into a programming language, as most programming languages have a similar structure.

4.2. Control Structures

A structure is a basic unit of programming logic. A structure can be a sequence, a selection or a loop (that is, repetition). The body of the algorithm is composed of various structures.

Sequential structures are:

1. Input statements, for example,
   a. Get num1, num2
   b. Read price, tax-rate
   c. Accept guess
2. Output statements, for example,
   a. Print total-cost
   b. Display average
3. Statements involving arithmetic operations, such as:
   a. Sum = num1 + num2
   b. Average = sum ÷ 2
4. Statements that assign values to variables, such as:
   a. Count = 0
   b. Maximum = 20

Selection structures are IF or IF-then-else statements. They allow a decision to be made based on a condition: the statements that follow are executed only if the condition evaluates to true. In the case of if-then-else, alternative statements are executed if the condition is false.

a. If (A > B) then
       Display A
b. If (age >= 50) then
       Print “Old”
   Else
       Print “Young”

Repetition or Loop structures allow statements to be repeated a fixed number of times or until some condition evaluates to false.
If the number of repetitions is known beforehand, the loop structure is called a counted loop. For example,

c. Repeat 10 times:
       Print “I am good-looking”
   End-repeat

If the exact number of repetitions is unknown beforehand and is based upon some condition, then the loop is called a conditional loop. For example,

d. While (price ≠ 0) do
       Read price
       Total = total + price
   End-while

Here, the idea is to read a set of prices and calculate the total price. The read and assignment statements are executed until a price of 0 is encountered.

Problem 9
Write an algorithm that prompts the user to enter his/her name, accepts the name and then displays a welcoming message on the screen, such as “Hello Pat! Have a nice day!”

This is a simple algorithm that uses only sequential statements. The solution would look like this:

Algorithm Welcome                                                       {Header}
This algorithm displays a welcome message to the user on the screen.    {Declaration}
Display “Enter first name:”
Accept first-name                                                       {Body}
Display “Hello”, first-name
Display “Have a nice day!”
Stop                                                                    {Terminator}

In this simple example, there is only one variable, that is, first-name. We do not need to initialize this variable because the accept statement takes care of that. It allows a value to be read from the keyboard and stored in the variable first-name.

4.3. The Algorithmic Language

During development of an algorithm, the language gradually progresses from English towards a notation that resembles that of a programming language. An intermediate notation called pseudocode is commonly used to express algorithms. Algorithms can also be expressed as flowcharts or as Nassi-Shneiderman diagrams. A flowchart is a pictorial representation of an algorithm.

4.3.1. How to Write Pseudocode

What exactly is pseudocode? There seems to be a general misunderstanding of what actually constitutes pseudocode. At one extreme are programmers and authors who use mostly English-like statements to express their algorithms.
For example,

Read the first number
Read the next number
Compare numbers
If the first number is greater than the second then print first number,
Otherwise, print the second number.

At the other extreme are those who write algorithms using undiluted programming language code (such as C or Pascal statements) and call it pseudocode. Here is one such example of “pseudocode”.

{
    int number1, number2;
    cout << “Please enter the first number:”;
    cin >> number1;
    cout << “Please enter the second number:”;
    cin >> number2;
    if (number1 > number2)
        cout << number1 << endl;
    else
        cout << number2 << endl;
} //end main

The above is written in undiluted C++ code. We could take the code as written, add a line or two in the pre-processor area and a function header, and execute it on any C++ compiler. To a novice, this is pure mumbo jumbo. The whole purpose of writing algorithms is to develop a solution that is easily understood by all and one that can be readily translated into any programming language. The constructs in an algorithm, therefore, should not be specific to any one programming language. The reader should not have to be familiar with a programming language in order to understand the solution.

For this reason, the universally accepted format for pseudocode is code that includes some English as well as common programming constructs that are well understood. For example,

Accept x, y
If (x > y) then
    Display result

OR

Read x, y
If (x > y) then
    Print result

Read and print are English words, but they are also common to many programming languages. The If statement is also easily understood in English and is a construct that is used in most programming languages for decision-making. Some words that are acceptable for input and output in pseudocode are stated below.

INPUT:    Get, Accept, Input, Read
OUTPUT:   Display, Print, Output

Always Use the Correct Terminology

The algorithm is not a different entity from the pseudocode or the flowchart.
Pseudocode and flowchart are simply ways of representing or expressing algorithms. So it is incorrect to say, write an algorithm, a flowchart and pseudocode to solve a given problem. A flowchart is an algorithm, and so is pseudocode. So instead, we should say, write an algorithm using (or in) pseudocode and/or flowchart to ...

Flowcharts versus Pseudocode:
A frequently asked question is when do we use pseudocode and when is it better to use flowcharts?

Answer: For experienced programmers, flowchart versus pseudocode is a matter of preference. Most prefer pseudocode. However, beginner programmers should be required to use both pseudocode and flowchart to represent their algorithms. While pseudocode is more concise and closely resembles the programming language, the flowchart gives a good view of the structure and flow of the logic. Beginners tend to find it easier to follow the logic in a flowchart than in pseudocode, because it is easier to visualize the connection between the statements. However, long, complex solutions are better represented as pseudocode rather than as a flowchart, which would be long and cumbersome to navigate.

To illustrate pseudocode and flowchart representations, let us represent the solution for the problem involving average, first in pseudocode, then in flowchart form. Recall that in Chapter 3, we revised the solution to make it more efficient. The revised solution, written in pseudocode, would look something like this:

Pseudocode version

Algorithm Average
This algorithm finds the average of three numbers
Start
    Read num1, num2, num3
    Average ← (num1 + num2 + num3) ÷ 3
    Print average
Stop.

Notice that the pseudocode version is quite similar to what we had before. It contains fewer English-like statements and a few more general programming constructs. Note also that we have retained the assignment symbol (←) in our pseudocode. This is important in terms of the logic.
In such statements, we are assigning values to variables; we are NOT equating the operation on the right with the variable on the left. Different programming languages use different symbols for assignment. In Pascal, the “:=” symbol is used, while in C the “=” symbol is used. In C, “=” means assignment and the symbol for testing equality is “==”.

4.3.2. Flowcharts

Flowcharts use special geometrical objects to designate the basic steps of a program, which are: input, processing and output. A parallelogram is used to represent the input operation, as well as the output operation, and a rectangle is used to represent a processing/assignment statement. A diamond is used to represent a decision (if-then-else) structure. An elliptical shape is used to represent the terminal indicators, START or STOP. Directional arrows are used to indicate the flow of the logic in the algorithm.

The four basic shapes are:

    Parallelogram    Input/Output
    Rectangle        Processing/Assignment
    Diamond          Decision
    Ellipse          Start/Stop

Below are examples of how the various control structures are depicted in a flowchart:

    Sequence:               box A followed by box B (do A, then do B).
    Selection (Decision):   decision C with a Yes branch to E and a No branch to D
                            (If C is true then do E, Else do D).
    Loop (Repetition):      decision F with a Yes branch to G that returns to F, and a No branch
                            that leaves the loop (While F is true do G).

Flowchart version of the Average Algorithm

    [Flowchart: Start → Read num1, num2, num3 → Average = (num1 + num2 + num3) / 3 → Print average → Stop]

The example above uses only the sequential structure. Let us look at an example that involves a decision statement. Both the pseudocode and flowchart versions of the solution will be presented.

Problem 10
Design an algorithm that accepts two values, the hourly rate and the number of hours worked by an employee. If the number of hours exceeds 40, then the excess hours should be paid at an overtime rate of twice the hourly rate. Calculate the wages (including overtime, if any) due to the employee.

The first step is to define the problem as follows:

    INPUT                  PROCESSING                             OUTPUT
    rate, hours-worked     1. Read rate, hours-worked             wages
                           2. Calculate overtime pay, if any
                           3. Calculate wages
                           4. Print wages

The next step is to create sample data and then do a manual solution.

Sample data:  10.00  45

The first data item is $10.00, which represents the hourly rate, and the second is 45, which represents the number of hours worked. The manual solution is left as an exercise for the student. A possible pseudocode version of the algorithm would look like this:

Algorithm Wages
This algorithm calculates the wages (including overtime) due to an employee based on the number of hours worked.
    Read rate, hours-worked
    Basic-Pay = rate * 40
    Overtime = hours-worked - 40
    If (overtime > 0) then
        Wages = Basic-Pay + (overtime * 2 * rate)
    Else
        Wages = rate * hours-worked
    End-If
    Print wages
Stop.

The flowchart version of the Wages problem would look like this:

    [Flowchart: START → Read rate, hours-worked → Basic-Pay = rate * 40 → Overtime = hours-worked - 40
     → Decision: Overtime > 0?
           Yes: Wages = Basic-Pay + (overtime * 2 * rate)
           No:  Wages = rate * hours-worked
     → Print Wages → STOP]

The next problem is an example of a situation where the flowchart representation can become quite complex. Sequential, looping and nested decision structures are used.

Problem 11
Design an algorithm that reads a list of students’ test scores and determines the letter grade that corresponds to each test score, according to the following table:

    Test Score Range     Letter Grade
    80 - 100             ‘A’
    70 - 79              ‘B’
    60 - 69              ‘C’
    Below 60             ‘D’

The input list is terminated by a value of -1, which is used as a sentinel. Print each test score and the corresponding letter grade.

The first step is to define the problem:

    INPUT          PROCESSING                                       OUTPUT
    test-score     1. Read test-score                               Test score
                   2. For each test score,                          Letter grade
                   3. determine the corresponding letter grade
                   4. Print test-score and letter grade

The next step is to create sample data. Note that the problem did not state the actual number of test scores in the list.
For this problem, we do not need to know how many scores there are. However, in another scenario, if we need to know the exact number of data items in the input list, we have to count them as each item is read. We read the data until the end of the list is reached. A value of -1 indicates the end of the list. The -1 is not valid data; it is merely an indicator to let us know that there is no more data in the input list. This value is called a sentinel. A sentinel can be any value that does not represent valid data in the context of the problem. We can choose any number of input values for our sample list. For example,

Sample input data:  75  43  56  87  72  91  35  -1

The manual solution will be left as an exercise for the reader. The following is the solution algorithm in pseudocode:

Algorithm Grades
This algorithm reads test scores and determines the letter grade for each score.
    Read test-score
    While (test-score ≠ -1) do
        If (test-score >= 80 and test-score <= 100) then        {Validate the input data}
            Lettergrade ← ‘A’
        Else
            If (test-score < 80 and test-score >= 70) then
                Lettergrade ← ‘B’
            Else
                If (test-score < 70 and test-score >= 60) then
                    Lettergrade ← ‘C’
                Else
                    If (test-score < 60 and test-score >= 0) then
                        Lettergrade ← ‘D’
                    End-If
                End-If
            End-If
        End-If
        Print test-score, lettergrade
        Read test-score
    End-while
Stop

Note the indentation, which highlights the structure of the logic and the flow of control of the if statements and the while loop. The structure will be more evident in the flowchart version of the algorithm.

Flowchart Version

    [Flowchart: Start → Read test-score → Decision: test-score = -1?
           Yes: Stop
           No:  Decision: test-score <= 100?
                    No:  Print Error Message
                    Yes: Decision: test-score >= 80?
                             Yes: Lettergrade = ‘A’
                             No:  Decision: test-score >= 70?
                                      Yes: Lettergrade = ‘B’
                                      No:  Decision: test-score >= 60?
                                               Yes: Lettergrade = ‘C’
                                               No:  Lettergrade = ‘D’
                → Print Lettergrade → control returns to the top of the LOOP]

It can be seen that the flowchart above is quite complex and the logic is somewhat difficult to navigate.
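For comparison, the Grades logic can also be sketched in a programming language. The following is a minimal Python sketch, not part of the original guide; the names `letter_grade`, `grade_scores` and `SENTINEL` are illustrative, and returning `None` for an out-of-range score is an assumption, since the pseudocode does not say what should happen to invalid data.

```python
SENTINEL = -1  # marks the end of the input list; not a valid score


def letter_grade(score):
    """Return the letter grade for a score in 0..100, or None if out of range."""
    if 80 <= score <= 100:
        return "A"
    if 70 <= score < 80:
        return "B"
    if 60 <= score < 70:
        return "C"
    if 0 <= score < 60:
        return "D"
    return None  # assumption: invalid scores get no grade


def grade_scores(scores):
    """Pair each score with its grade, stopping at the sentinel."""
    results = []
    for score in scores:
        if score == SENTINEL:  # conditional loop ends at the sentinel
            break
        results.append((score, letter_grade(score)))
    return results


sample = [75, 43, 56, 87, 72, 91, 35, -1]
for score, grade in grade_scores(sample):
    print(score, grade)
```

The sentinel is tested before any grade is computed, so the value -1 is never graded or printed.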
In such circumstances, the programmer may find it easier to represent the algorithm as pseudocode, using appropriate indentation to show the logic and to delineate the control structures.

In problems where the flowchart cannot fit on a single page, a special symbol called a connector is used to connect sections of the flowchart over multiple pages. The connector symbol is a small circle with a number or letter inscribed therein, for example:

    [Figure: Flowchart Section 1 is joined to Flowchart Section 2 by connector 1, and Flowchart Section 2 to Flowchart Section 3 by connector 2.]

4.4. The Structured Programming Concept

The main goal of structured programming is to structure the flow of control in such a way that the execution sequence is as close as possible to the reading sequence. The structure of a program is determined by the constructs used to direct the flow of control. This enforces a discipline on the programmer in terms of the control structures that can be used and the manner in which they are used [Tremblay & Bunt].

Structured programs are designed using three basic control structures:

    1. Sequence
    2. Selection
    3. Repetition

Algorithms that are carefully designed using only these control structures are much more readable and maintainable. The “go to” statement should be avoided at all costs when writing algorithms, as it tends to make the solution very unstructured. The resulting code or flowchart becomes very difficult to decipher and resembles spaghetti. Hence the term “spaghetti code”.

4.5. DO’s AND DON’TS WHEN WRITING PSEUDOCODE

Do
    1. Use the assignment symbol (←) in assignment statements instead of the equal sign.
    2. Use meaningful variable names.
    3. Use indentation to show the logic and scope of control structures.
    4. Insert comments to clarify the meaning of blocks of code.

Do Not
    1. Use language-specific constructs such as case, switch statements or for loops in the pseudocode. Constructs such as while, repeat and if-then-else are sufficiently general and can therefore be used in the pseudocode.
       Keywords such as readln, writeln, printf, scanf should not be used.
    2. Attempt to write Pascal code before writing the algorithm. That is, do not execute the program first and then try to write the algorithm afterwards, based on the program code. This is a very bad practice and is essentially a waste of time.

Chapter 5
TEST THE ALGORITHM FOR CORRECTNESS USING TRACE TABLES
Problem-Solving Step 5

5.0. Tracing the Algorithm

Checking the algorithm to see if it works is a very important step in the problem-solving process. This must always be done before the algorithm is translated into a programming language. This is a discipline that beginner programmers must learn and practice. Teachers should endeavour to enforce this practice by making it a requirement for students to show the trace of their algorithms before they get on the computers in the lab.

We can check to see if our algorithm produces the desired results by tracing through the logic of the algorithm, using some chosen test data.

Steps in Tracing an Algorithm:

    1. Choose simple input test cases (data sets) which are valid. Two or three test cases are usually sufficient.
    2. Establish what the expected result should be for each test case. That is, do a manual solution beforehand.
    3. Create a table of all the variables used in the algorithm.
    4. “Walk” the first test case through the algorithm, keeping a step-by-step record of the content of each variable in the table as the data passes through the logic until the algorithm reaches its logical end.
    5. Check that the expected result established in Step 2 matches the actual result developed in Step 4.
    6. Repeat the process using a different set of test data.

Example: Trace the following algorithm using two data sets:

Problem 12
Super Six is the name of a junior cricket team. The coach of Super Six has hired you to design a program to provide him with statistics after each match.
Design an algorithm that reads a list consisting of the number of runs scored by each batsman in the team. There are 6 batsmen in the team. Compute and print the average number of runs scored by the team.

Algorithm Average Score
This algorithm computes the average number of runs scored by a cricket team.
Start
    1. Total-runs ← 0
    2. Count ← 1        (used to keep track of the number of times the loop is executed)
    3. While (count <= 6)
           a. Input runs
           b. Total-runs ← Total-runs + runs
           c. Count ← count + 1
    4. End-While
    5. Average-Runs ← Total-runs ÷ 6
    6. Display Average-Runs
    7. Stop.

We have numbered the statements to make reference to them easier.

Step 1: Choose 2 or 3 sets of test data

    Test Data Set 1:   46   18   50   9   0   1
    Test Data Set 2:   67   102  13   0   0   0

Step 2: Establish what the expected result should be:
Manual solution gives an expected result of 20.67 runs for data set 1 and 30.33 runs for data set 2.

Step 3: Create a table of all the variables used in the algorithm
Keep track of the contents of each variable in the table as each statement in the algorithm is executed. The trace below uses Test Data Set 1.

    Instruction     Count    Runs    Total-Runs    Average-Runs    Output
    in Execution
    1                                0
    2               1
    3a                       46
    3b                               46
    3c              2
    3a                       18
    3b                               64
    3c              3
    3a                       50
    3b                               114
    3c              4
    3a                       9
    3b                               123
    3c              5
    3a                       0
    3b                               123
    3c              6
    3a                       1
    3b                               124
    3c              7
    5                                              20.67
    6                                                              20.67

The actual result of the trace matches the expected result from our manual solution for the first test case. A second trace of the algorithm, using test data set 2, is left as an exercise for the students. An accurate trace will result in a match between the expected and actual results for the second set of test data. We can then conclude that the algorithm yields the correct results.

5.1.
Choosing Appropriate Test Data

It is important that we test our algorithms using a wide range of data values (including extreme cases) that will test every segment of the code. Some algorithms can give the correct solution with one set of data and an incorrect solution with another set of data. This depends on the flow of control within the algorithm and which section the data actually traverses. As an example, let us look at the following pseudocode segment:

        :
    read A, B
    if (A > B) then
        increment N
    else
        decrement N
        :

If we use different sets of test data for this code in which the value of A is always greater than the value of B, then the else statement will never be executed. Since the else alternative is never tested, errors in that section of the code could go undetected for quite some time. So the appropriate set of test data for these kinds of problem should be one set which tests the if alternative and another set that tests the else alternative.

The following test data would be appropriate for the code segment above:

    Data set 1:   23   7
    Data set 2:   9    35

The following test data would not be appropriate because A would be greater than B in both cases and the alternative statement would never be tested.

    Data set 1:   23   7
    Data set 2:   35   9

Let’s trace another algorithm to see if we can detect any errors.

Algorithm Large_Count
Given a list of 20 positive integers, this algorithm counts the number of values that are smaller than the first value.
    Set count to 0
    Set No-of-Values to 0
    get first-number
    While (No-of-Values < 20) do the following:
        get next-number
        increment No-of-Values
        if next-number < first-number then
            increment count
    end-while
    print first-number, count
stop.

For the sake of brevity, we will revise the algorithm to read a maximum of 5 numbers instead of 20. The trace is shown below.
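Returning briefly to the if/else segment on A and B discussed earlier in this section, the idea of checking whether a collection of test data exercises both alternatives can be sketched in Python. This is an illustrative sketch only; the function name `branches_exercised` and the pair (35, 9) are assumptions introduced for the example.

```python
def branches_exercised(data_sets):
    """Report which alternatives of 'if (A > B)' the given (A, B) pairs reach."""
    taken = set()
    for a, b in data_sets:
        if a > b:
            taken.add("if")    # the 'increment N' alternative would run
        else:
            taken.add("else")  # the 'decrement N' alternative would run
    return taken


good = [(23, 7), (9, 35)]   # one pair with A > B, one with A < B
poor = [(23, 7), (35, 9)]   # hypothetical pairs: A > B both times

print(sorted(branches_exercised(good)))
print(sorted(branches_exercised(poor)))
```

A set of test data is adequate for this segment only when both "if" and "else" appear in the result.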
Trace of Algorithm Large_Count:

    Test data set 1:   5    13   4    27   1
    Test data set 2:   9    7    22   -3   16

The expected result for data set 1 is: 5 and 2

Algorithm Large_Count
Given a list of 5 positive integers, this algorithm counts the number of values that are smaller than the first value.
    1. Set count to 0
    2. Set No-of-Values to 0
    3. get first-number
    4. While (No-of-Values < 5) do the following:
           a. get next-number
           b. increment No-of-Values
           c. if (next-number < first-number) then increment count
       end-while
    5. print first-number, count
    6. stop.

    Instruction     Count    No-of-Values    First-number    Next-number
    in Execution
    1               0
    2                        0
    3                                        5
    4a                                                       13
    4b                       1
    4a                                                       4
    4b                       2
    4c              1
    4a                                                       27
    4b                       3
    4a                                                       1
    4b                       4
    4c              2
    4a              Error - No more data!

An error is detected when No-of-Values is 4. The algorithm attempts to execute the loop, only to find that there is no more data in the input list. The data set has the correct number of integers, that is, 5, so there must be an error in the logic with respect to the variable No-of-Values. A closer examination of the logic reveals that the error occurred because we initialized No-of-Values to 0 before entering the loop. The input list consists of 5 data items and we read the first number before we entered the loop. Therefore we should either increment No-of-Values before we enter the loop or simply initialize the variable to 1 to reflect the fact that the first number was read outside the loop. So we can revise the algorithm as follows:

Revised Algorithm Large_Count
Given a list of 20 positive integers, this algorithm counts the number of values that are smaller than the first value.
    Set count to 0
    get first-number
    Set No-of-Values to 1        {initialize the variable to 1 instead of 0}
    While (No-of-Values < 20) do the following:
        get next-number
        increment No-of-Values
        if next-number < first-number then
            increment count
    end-while
    print first-number, count
stop.
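The effect of the initial value of No-of-Values can also be checked mechanically. The following Python sketch is an illustration, not the guide's own code: the function `large_count` and its parameter `initial_no_of_values` are hypothetical names, and the data are the five values that appear in the trace (5, 13, 4, 27, 1).

```python
def large_count(values, size, initial_no_of_values):
    """Count how many values after the first are smaller than the first.

    initial_no_of_values is 0 for the original algorithm and 1 for the
    revised one; size is the number of items in the input list.
    """
    data = iter(values)
    count = 0
    no_of_values = initial_no_of_values
    first_number = next(data)              # the first value is read outside the loop
    while no_of_values < size:
        try:
            next_number = next(data)
        except StopIteration:
            # the loop asked for a value that is not in the input list
            raise ValueError("No more data!") from None
        no_of_values += 1
        if next_number < first_number:
            count += 1
    return first_number, count


data_set_1 = [5, 13, 4, 27, 1]

print(large_count(data_set_1, 5, 1))       # revised: No-of-Values starts at 1

try:
    large_count(data_set_1, 5, 0)          # original: No-of-Values starts at 0
except ValueError as error:
    print("original version:", error)
```

Starting the counter at 0 makes the loop request a sixth value, which is the same failure the manual trace uncovers.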
Alternatively, we could leave the initial value of No-of-Values as zero and revise as follows:

    Set No-of-Values to 0
    Get first-number
    Increment No-of-Values
    While (No-of-Values < 20) do the following:
        :

This method, however, would be less efficient. As an exercise, the reader should trace the algorithm using data set 2 and see if any more errors are detected. Hint: a second trace should reveal another error.

Chapter 6
LIST PROCESSING USING ARRAYS

6.0. What is an Array?

So far, we have been using what are called simple data types in our algorithms. That is, the data values we manipulated were unrelated and stored in separate variables. For example, in the average problem, we used three variables, num1, num2, num3. These variables are independent of each other in terms of where they are stored in memory and also in terms of what type of data they store. Num1 could store an integer value; num2 could store a real value. Num1 could be stored at location 101 in memory and num2 could be stored at location 1510. It doesn’t matter.

An array, however, is a more complex data type, in that it is a structure which contains not just one data value, but multiple data values, all of the same type. That is, the values are all integers or all floating point numbers or all characters. Also, the values are stored in contiguous (that is, adjacent) memory locations. So if the first value is stored at location 100, then the second value would be stored at location 101, the third value at location 102 and so on.

Definition: An array is a data structure that is used to store a fixed number of data items all of the same type.

The items (or elements) of the array are organized in sequence and can be accessed directly by specifying their positions in the sequence, using an index or subscript. If only one index is used, the array is called a one-dimensional array. If more than one index is used, the array is called a multi-dimensional array.
One-dimensional arrays are list structures (or vectors); two-dimensional arrays are table structures (or matrices). At this level, we are only concerned with one-dimensional arrays. Multi-dimensional arrays are outside the scope of this course and will be addressed in advanced programming courses.

A one-dimensional array can be represented as follows:

    ARRAY TEMP
    TEMP[1]     25
    TEMP[2]     28
    TEMP[3]     31
    TEMP[4]     29
    TEMP[5]     30
       :         :
    TEMP[50]
    (Data values stored in adjacent memory locations)

To refer to the 2nd temperature value, 28, we would specify its position in the array as a subscript of the array name. That is,

    Temp[2]

Temp is the name of the array; 2 is the subscript or index (that is, the position of the item in the array) and indicates the location of the value 28.

Arrays are sometimes called subscripted variables, because the values stored in the locations of an array are accessed via the subscript or index of the array. Arrays are typically used to store and process a list of items. That is why, in some texts, manipulation of arrays is referred to as list processing.

Example
A program is required to read a list of temperature values and find their mean.

Two Possible Solutions:
    (i)  If the list is short, say 3 temperature values, we may use 3 different variables, say temp1, temp2, temp3.
    (ii) If the list is long, say 120 temperature values, it would be cumbersome and awkward to use 120 different variables.

Arrays are used to solve these types of problems efficiently.

6.1. Accessing the Elements of an Array

The elements of an array can be accessed individually by specifying the name of the array, followed by the index or subscript, which identifies the position of the element in the sequence. Therefore, when manipulating arrays, a special variable must be declared as the index of the array. A single letter (such as i, j, or k) is commonly used as the array index.
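As an illustration of access by subscript, here is a small Python sketch. It is not part of the original guide: Python lists number their positions from 0, so the hypothetical helper `element_at` converts the 1-based position used in this guide, and only the first five temperature values from the diagram are included.

```python
temp = [25, 28, 31, 29, 30]  # the first five values of array TEMP


def element_at(array, position):
    """Return the element at a 1-based position, as in Temp[2]."""
    if not 1 <= position <= len(array):
        raise IndexError("position is outside the array bounds")
    return array[position - 1]  # Python itself counts from 0


i = 2                          # a single-letter index variable
print(element_at(temp, i))     # the 2nd temperature value
```

The bounds check mirrors the rule that a subscript must never fall outside the declared range of the array.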
Using the index, the array elements can be manipulated in the same way that we manipulate ordinary variables. For example, we can assign an initial value to an array element, we can read a value into an array location, we can display values stored in arrays, and we can perform arithmetic and logical operations on the elements in an array. Traversing the array (that is, accessing each element in a sequential manner) requires a loop structure. Examples of this will follow.

6.1.1. Initializing Arrays

Initial values can be assigned to array locations either by reading values directly into the locations or by use of an assignment statement. Let’s say, for example, that we want to set all 10 locations of the array list to 0 initially. We would write:

    Set i to 1
    Repeat 10 times
        list[i] ← 0
        increment i
    end-repeat

The first time the loop is executed, i = 1 and list[1] would be assigned 0. The second time through the loop, i = 2 and list[2] would be assigned 0. The third time through the loop, i = 3 and list[3] would be assigned 0, and so on, until the last location, that is list[10], is assigned 0.

When manipulating arrays, a special variable must be declared for use as the index of the array. It is better to use short variable names, such as single letters of the alphabet, for the index or subscript.

6.1.2. Reading Values into an Array

If the number of values to be read is known beforehand, we use a counted loop (that is, repeat n times). If the number of values is unknown, we use a conditional loop (that is, while some condition is true do).
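The initialization loop of 6.1.1 and the two reading strategies just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are illustrative, input is taken from a Python list rather than the keyboard, and the index `i` is kept 1-based to mirror the pseudocode.

```python
def initialize(size):
    """Set every location of a new list to 0 using a counted loop."""
    values = [None] * size
    i = 1
    for _ in range(size):        # Repeat size times
        values[i - 1] = 0        # list[i] <- 0 (Python positions start at 0)
        i += 1                   # increment i
    return values


def read_counted(source, n):
    """Known number of items: read exactly n values with a counted loop."""
    data = iter(source)
    return [next(data) for _ in range(n)]


def read_until_sentinel(source, sentinel):
    """Unknown number of items: read until the sentinel value appears."""
    values = []
    for item in source:
        if item == sentinel:     # the sentinel itself is not stored
            break
        values.append(item)
    return values


print(initialize(10))
print(read_counted([25, 28, 31, 29, 30], 3))
print(read_until_sentinel([25, 28, 31, -1, 30], -1))
```

The counted version needs the number of items in advance; the sentinel version discovers it while reading.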
The Case of the Known Number of Items:

To read 10 values into the array temp:

    Set i to 1
    Repeat 10 times
        read temp[i]        {get next value and store it in location temp[i]}
        increment i         {add 1 to index to get to the next array location}
    end-repeat

If the input data was, say,

    21  23  25  26  22  27  29  26  24  26

First time through the loop, i = 1, the value 21 is stored in temp[1]
Second time through the loop, i = 2, the value 23 is stored in temp[2]
Third time through the loop, i = 3, the value 25 is stored in temp[3]
Fourth time through the loop, i = 4, the value 26 is stored in temp[4]
and so on until the last location, temp[10], is filled with the value 26.

After the loop has been executed 10 times, the array would look like this:

    TEMP
    Index:    1    2    3    4    5    6    7    8    9    10
    Value:    21   23   25   26   22   27   29   26   24   26

The top numbers represent the index or subscript of the array; the bottom numbers are the data values stored in the various locations.

The Case of the Unknown Number of Items:

Consider the following problem:
Write an algorithm that reads a list of integers from the input stream and stores the values in an array called Positive_Num. The list is terminated by a sentinel value of -1.

Here, the number of items to be read is unknown, so we use a conditional loop as follows:

    Set i to 1
    Read number
    While (number is not equal to -1) do the following:
        Positive_Num[i] ← number        {store current number in array location}
        Increment i                     {add 1 to index to get to next location}
        Read number                     {read next number}
    End-while
    Num_of_Items ← i - 1

The loop will execute until number receives a value of -1. The -1 will not be stored in the array. The last statement stores the actual number of data items read in the variable Num_of_Items. This value will be needed later if we need to print the data values or traverse the array for any other purpose.

6.1.3.
Displaying Array Values

Writing or printing values stored in arrays is similar to reading values into arrays, except that, in the case of writing, we would already know the actual number of items stored in the array. Therefore, we can simply use a counted loop as in the example below:

    Set j to 1
    Repeat 10 times
        display temp[j]
        increment j
    end-repeat

Alternatively, if we stored the actual number of items read in a variable called Num_of_Items, we could display the items in array list as follows:

    Set j to 1
    While (j <= Num_of_Items) do the following:
        Display list[j]
        Increment j
    End-while

6.1.4. Traversing Arrays

Traversing an array simply means moving through the array in a sequential manner, visiting each element, in order to manipulate the elements in one way or another. For example, an array is traversed when we print the elements, or when we search the list for a particular item, or when we sort the list of items in a particular order. In traversing an array, we must establish a loop and increment the index within the loop in order to get to the next element.

A classic example of traversing an array is to perform a linear search. A linear search involves examining each element in the array, one by one, starting with the first element and comparing each element with the item/value being searched for. The search ends when a match is found or when the end of the array is reached. The following is an algorithm for performing a linear search of an array:

Algorithm Linear Search
This algorithm searches a list for the presence of an item called target. In this case, target is an integer value. A Boolean variable Found is set to true if the target value is found; otherwise, Found is set to false.
    Set Size to 50                {maximum no of elements in the array}
    Set Target to 27
    Set index to 1
    Set Found to false
    while ((Found is false) AND (index <= Size))
        if (list[index] = target) then
            set Found to true     {if value is found set the flag to true}
        else
            increment index       {otherwise, move on to the next element in the array}
        end-if
    end-while
    if (found = true) then
        display “Target found at location”, index
    else
        display “Target not found”
    end-if
stop.

An Array Processing Example:
Design an algorithm that reads a list of students’ test scores and determines the number of students who failed the test. A student is deemed to have failed if his/her score is less than the class average. The list is terminated by a value of 999. Print the class average as well as the number of students who failed.

Defining Diagram

    INPUT                   PROCESSING                                      OUTPUT
    List of test scores     1. Read test scores                             Average
                            2. Calculate average score                      Number-Failed
                            3. Count number of scores that are less
                               than the average
                            4. Print average, number-failed

Algorithm Test-Report
This algorithm reads a list of test scores and determines the number of scores that fall below the class average.

Variables Used: scores is the array which stores the test scores, i is the index to the array, sum stores the total of the scores, Number-Failed stores the number of students who failed the test.

    Set i to 1
    Set sum to 0
    Set Number-Failed to 0
    Read num
    While (num not equal to 999) do the following:
        Scores[i] ← num               {store each data value in the array}
        Sum ← sum + scores[i]         {compute the sum of the scores}
        Increment i                   {advance to the next location in the array}
        Read num
    End-while
    Set No-of-Items to i - 1          {i was advanced once after the last score was stored}
    average ← sum ÷ No-of-Items

    {Traverse array to count how many values are less than average}
    Set index to 1
    Repeat No-of-Items times          {counted loop}
        If (scores[index] < average) then
            Number-Failed ← Number-Failed + 1
        End-if
        Increment index
    End-repeat
    Display “The average test score is:”, average
    Display “The number of students who failed is:”, Number-Failed
Stop.

6.2.
Points to Note when Manipulating Arrays

1. It is illegal to refer to an element outside the array bounds. For example, if the size of an array temp is declared to be 50, then it would be illegal to have a statement such as: print temp[51]

2. Therefore, when looping through an array, the subscript should never go below 1 and should always be less than or equal to the total number of elements in the array. So temp[-1] would be an illegal reference, since the index is negative.

3. Some programming languages such as C require that the first element in an array be at subscript 0 and the last element at subscript SIZE-1. Other languages such as Pascal let the programmer choose the subscript range when the array is declared, so the first subscript is often 1 and the last is SIZE. For the purposes of an algorithm, which should be language-independent, it is recommended that array subscripts start at 1. This makes it easier for the beginner to comprehend.

4. All the elements in an array must be of the same type. That is, all integers, or all real numbers, or all characters.

5. The size of the array is fixed when the array is declared in a programming language. For example, an array may be declared to have 100 locations. This does not mean that there are 100 elements in the array. It means that 100 locations are reserved in memory for a particular array or list. Depending on the number of data items in the input list, all 100 locations may not be used. If there are 75 data items, then only 75 locations of the array will have data; if, on the other hand, there are 100 data items, then all 100 locations will be used. The algorithm should keep track of the number of data items stored in the array and manipulate the list accordingly. There must also be a means of determining which locations have values and which are empty.

Chapter 7
THE TOP-DOWN DESIGN METHODOLOGY

7.0. What is Top-Down Design?
The Top-Down Design Approach, or Modular Programming as it is sometimes called, involves breaking a problem into a set of smaller problems, called sub-problems or modules, followed by breaking each sub-problem into a set of tasks, then breaking each task into a set of actions. This is called a “divide and conquer” approach. When faced with a complex problem, it is easier to break the problem down into smaller, more manageable sections and tackle each section as a separate entity, rather than trying to solve the large problem in one go.

    [Figure: a tree with PROBLEM at the root, branching into Sub-problem-1, Sub-problem-2, ... Sub-problem-n; each sub-problem branches into Task1, Task2, ... Taskn; each task branches into Action1, Action2, ... Actionn.]

A sub-problem is a set of related tasks. A task is a set of related actions. An action is a basic instruction that needs no further refinement. For example, an action might be a simple instruction such as add two numbers.

The process of dividing the problem into sub-problems or modules and breaking them down into smaller units is called stepwise refinement. One advantage of modular programming is that when a problem has been decomposed into smaller sub-problems, each sub-problem can be solved as a single entity. However, the solution of each individual sub-problem does not necessarily solve the larger problem. There must be cohesion between the modules. That is, there must be a mechanism for communicating between the different sub-problems. This mechanism will be discussed in Section 7.2.

The concept of top-down design with stepwise refinement is best explained by looking at several examples.

Problem 1
Turn on a light in a room.

This is a fairly simple problem that involves two simple tasks. The problem can be decomposed as follows:

    Sub-problem 1: locate switch      (one task, one action)
    Sub-problem 2: depress switch     (one task, one action)

Both sub-problems comprise one task and each task constitutes one action.
Problems such as this are too simple to apply the top-down design method. The method is usually applied to more complex problems like the one below.

Problem 2
Mom asked Jan to prepare dinner for the family tonight. Jan planned the menu, which included baked chicken, rice and peas, and coleslaw. As mealtime approached, Jan began to feel overwhelmed at the task of preparing a large meal for the family. So she decided to adopt a “divide and conquer” approach. She solicited the help of her two sisters, Monica and Flo, and together they prepared the meal. She divided the problem of preparing the meal into three smaller problems:

    1. Prepare baked chicken
    2. Cook rice and peas
    3. Prepare coleslaw

Jan asked Monica to do the baked chicken and Flo to do the coleslaw. Jan decided to do the rice and peas herself. In order to avoid confusion in the kitchen, Jan directed the order in which each dish should be prepared. She first called on Monica to prepare the chicken and put it in the oven for baking. After Monica exited the kitchen, Jan started preparing the rice and peas. After the rice and peas was cooked, Jan called on Flo to begin preparing the coleslaw. When the coleslaw was completed, Flo returned control of the kitchen to Jan. Jan continued to direct the various tasks until the dinner was completed.

This is a classic example of the “divide and conquer” approach to problem-solving. The problem of preparing dinner comprises multiple tasks or sub-problems:

    Sub-problem 1: Prepare baked chicken
    Sub-problem 2: Cook rice and peas
    Sub-problem 3: Prepare coleslaw

Each sub-problem can be further sub-divided into a set of tasks. For example, a possible set of tasks associated with prepare baked chicken might be:

Tasks:
    1. clean chicken
    2. cut chicken into quarters
    3. season chicken
    4. bake chicken

Some of the above tasks can be further divided into a set of actions. For example, a possible set of actions for the task clean chicken might be:

Actions:
    1. remove giblets
    2. remove any excess feathers
    3. squeeze lime juice over chicken
    4. wash chicken in cold water.

The above steps need no further refinement, as they each represent a single instruction or a basic operation. Some tasks may comprise a single action. For example, the fourth task, bake chicken, is a basic operation that needs no further refinement. That is, the task itself constitutes a single action. Likewise, in less complex problems, some sub-problems may themselves constitute a single task, as we saw in Problem 1. The refinement of tasks 2 and 3 in the problem above is left as an exercise for the student.

Preparation of the complete meal requires coordination between the various modules, namely, Jan, Monica and Flo. Jan plays the role of the coordinator. She decides the order in which each dish is to be prepared and calls the appropriate module to take control of the kitchen at the appropriate time. She presents (displays) the meal (that is, the output) to the family when all the tasks have been completed.

In programming terms, we refer to Jan as the Main module and to Flo and Monica as sub-programs. The main module issues a call to the sub-program to perform a particular function. Upon completion of that function, the sub-program returns control to the module that called it. Any module can issue a call to any other module, depending on the logic of the program. In our dinner example, we could reorganize the logic such that Jan calls Monica to prepare the chicken and, upon completion, Monica calls Flo to do the coleslaw, before Monica ultimately returns control to Jan. The two scenarios can be illustrated as follows:

    [Figure, Scenario 1: Jan (Main) calls Monica (Prepare Baked Chicken) and Flo (Prepare Coleslaw) directly.]
    [Figure, Scenario 2: Jan (Main) calls Monica (Prepare Baked Chicken), who in turn calls Flo (Prepare Coleslaw).]

In the real-life scenario presented above, we defined several sub-problems, each comprising multiple tasks. In modular programming, however, this is not strictly allowed.
The general rule is that each module should perform a single task. Tasks which involve multiple actions should be treated as sub-programs or modules.

7.1. Hierarchy Charts

The diagrams above are called hierarchy charts, or structure charts. A hierarchy chart is a treelike structure that shows visually the relationships between the modules of a program. The root of the tree (the top box) represents the controlling or main module. The next level shows the modules that are called directly by the main module, while the next level shows those modules which are called by the ones above them, and so on.

The saying, “a picture is worth a thousand words”, holds true here. One look at a hierarchy chart for a given problem will reveal immediately which module is the main module and the position of all the other modules, as well as the flow of control between the modules. The hierarchy chart does NOT tell you what tasks are to be performed within a module; neither does it tell you the order in which the modules are executed. It simply tells you which modules exist and which module calls which other module.

It is a good practice to draw hierarchy charts of all modular problems before writing the algorithm.

Let us look at a programming example. We will demonstrate how a problem can be decomposed into smaller problems, how we represent the sub-problems in a hierarchy chart and then how we establish communication between the various modules.

Problem 3
Given a list of students' test scores, find the highest and lowest score as well as the average score.

Four sub-problems can be identified here:

    1. Sub-problem 1: read list of test scores
    2. Sub-problem 2: find_the_highest_score
    3. Sub-problem 3: find_the_lowest_score
    4. Sub-problem 4: find_the_average

The hierarchy chart for the modules above would look something like this:

    MAIN
        Read_test_scores
        Find_Highest_Score
        Find_Lowest_Score
        Find_Average

The main module will obtain the input data and pass it to each of the sub-programs. The sub-programs will in turn perform their tasks and return the results to the main module, where they will be printed.

Only modules or sub-programs are included in the hierarchy chart. The actions are not. The actions are shown in the algorithm.

General Rule: In modular programming, a module should be comprised of statements that contribute to a single, specific task.

7.2. How to Sub-Divide a Problem into Modules

When decomposing a large, complex problem, it is sometimes difficult to decide what exactly should comprise a module. This is where the defining diagram can be really helpful. Recall that the defining diagram lists all the major tasks that must be performed in order to solve the problem. Let us look at the defining diagram for Problem 3 above.

Defining Diagram:

    Input                  Processing                            Output
    List of test scores    1. Read test scores                   Highest
                           2. Find the highest score             Lowest
                           3. Find the lowest score              Average
                           4. Find the average score
                           5. Print highest, lowest, average

There are 5 main things that must be done in order to solve this problem. How do we decide which of the tasks above should be a module in our program? The easiest thing would be to make each task in the defining diagram a module. However, this approach might not be very efficient. In some problems, a task might be fairly simple and require a single action. For example,

    read num1, num2, num3

In such cases, we do not need to make the task a module, as the action can easily be done by the main module. Where the task involves multiple actions, such a task is a candidate for a module.
For example, the read test scores task involves reading the data into an array and validating each test score read. Therefore, it would be reasonable to declare the read test scores task as a module. Also, find the highest score is a task that involves several actions; it is a sub-problem in itself. Therefore, it should be treated as a module in the solution. The same applies to find the lowest score and find the average. Print highest, lowest, average is a simple task that involves a single action, so this task does not become a module, but instead can be an action in the main module.

If the task itself is fairly complex, it should be further refined and broken down into more than one module. Consider, for example, our dinner problem earlier. Recall that Monica’s job was to prepare the baked chicken and this module involved several tasks, some of which were fairly complex. One complex task was that of seasoning the chicken. This involves assembling a variety of herbs and spices and then applying them to the chicken in a special manner. So Monica decides to solicit the help of her grandmother to prepare the seasoning. When Monica is called on to prepare the chicken, she in turn calls Grandma (a subordinate module), who sets about preparing the herbs and spices. At Monica’s request, Grandma prepares the seasoning and then passes it on to Monica. Grandma then exits the kitchen. This scenario can be represented in a new hierarchy chart as follows:

    Jan (Main)
        Monica: Prepare Baked Chicken
            Grandma: Prepare Seasoning
        Flo: Prepare Coleslaw

Modules Calling other Modules:
The above scenario is an example of a subordinate module calling other modules. Just as the main module can call a module, any module can call another module.

There is an art in deciding whether to break down any particular module further into its own sub-problems. The general rule is that a module should be comprised of statements that contribute to a single, specific task.
For example, it would be okay to have a module that prompts the user to enter input data, reads the data and then checks the validity of the data, since all three actions contribute to the specific task of reading the input data. On the other hand, a module that sorts names in alphabetic order and then searches for the presence of the name “Brown” is performing two specific tasks. Such a module would be more complex and less cohesive. It would be better to separate the tasks and write one module to sort the names and another module to search for the name.

Terminology for Modules:
Different programming languages use different names for modules in a program. In some languages, such as Pascal, they are referred to as functions and procedures. In C, they are referred to as functions. Other languages may refer to them as subroutines or methods. When writing algorithms, modules are referred to as sub-algorithms.

7.3. Steps in Modularisation

    1. Define the problem.
    2. From the processing section, identify the tasks that will determine the modules that will make up the program. Each non-trivial task should constitute a module.
    3. Construct a hierarchy chart showing the modules and the relationship between them.
    4. Formulate the algorithm for the main module in either pseudocode or flowchart. The main algorithm should include the initialization of variables used in main, appropriate logic to call each module in the correct sequence to perform their various tasks, and the printing of results, if the printing is straightforward.
    5. Develop sub-algorithms for each module, including any parameters that may be passed to and from the modules.
    6. Test the algorithm for correctness. Trace the main algorithm and each module separately to check if the logic is correct.

7.4. Representing Modules in Pseudocode and Flowchart

Representing a modular solution in pseudocode is similar to what we did before when we were writing algorithms in Chapter 4.
The only difference is that instead of writing one algorithm, we write a main algorithm as well as an algorithm for each sub-problem or module. The algorithm for each sub-problem is called a sub-algorithm. Each module will issue a call to (invoke) the module subordinate to it when a particular task is to be performed. The last statement in the subordinate module is Return, indicating that the program is not ending, but instead, control is being returned to the calling module. The main module is the only one that terminates with a Stop or End statement. This will be illustrated in the pseudocode below.

The following is the pseudocode solution for the test-scores problem in Problem 3.

    Algorithm Test_Score_Analysis
    This algorithm reads test scores and finds the highest, the lowest and the average test score. This is the main algorithm.

        Call Read_Data
        Call Highest        {call module to find highest score}
        Call Lowest         {next, call module to find lowest score}
        Call Average        {call module to compute average}
        Print high, low, average
    Stop.

    Sub-Algorithm Read_Data
    This module reads and validates test scores and stores them in an array called scores.

        Set i to 1
        Read test-score
        While there is data do the following:
            If (test-score > 100 OR test-score < 0) then    {check if the data is valid}
                Display “Error Message: invalid data”
            else
                Scores[i] ← test-score    {store each valid data value in the array}
                Increment i               {advance to the next location in the array}
            End-if
            Read test-score
        End-while
        Size ← i - 1                      {i points one past the last value stored}
    Return

    Sub-Algorithm Highest
    This module searches a list to find the highest test score.

        Set index to 2
        Set high to scores[1]    {assume the first value in the array is the highest, initially}
        While (index <= Size) do
            If (scores[index] > high) then
                high ← scores[index]
            End-if
            index ← index + 1
        End-while
    Return high                  {return the result to the calling module}

    Sub-Algorithm Average
    This algorithm finds the average test score.

        Set sum to 0
        Set i to 1
        While (i <= Size) do the following:
            sum ← sum + scores[i]    {compute the sum of the scores}
            i ← i + 1
        End-while
        average ← sum ÷ Size
    Return average               {return average to the main program where it will be printed}

The pseudocode for the sub-algorithm Lowest is similar to that of Highest, except that we test for scores less than low.

In the flowchart representation of a modular solution, we draw a separate flowchart for each module. The flowcharting symbol for a sub-algorithm is a rectangle with a bar across the top.

[Symbol: a rectangle with a bar across the top]

As in the pseudocode representation, the sub-algorithms terminate with a return statement. The symbol is similar to the Start/Stop symbol. The flowchart representation for the problem above is shown below:

[Flowchart, MAIN MODULE: Start; Read_Data; Find_Highest; Find_Lowest; Find_Average; Stop.]

[Flowchart, READ_DATA MODULE: Start; Set i to 1; Read score; decision “score > 100 or score < 0?” (Yes: Print Error Message; No: Store score in scores[i], Increment i); decision “Is there more data?” (Yes: Read score and repeat; No: Set Size to i - 1); Return.]

Notice how straightforward the main algorithm is. The flowchart versions of the remaining modules are left as exercises for the reader.
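Although implementing modules in Pascal goes beyond what this course requires, a rough sketch may help to show how the modular pseudocode above could look as a program. It is only one possible rendering, made under several assumptions that are not part of the original algorithm: the modules share the array through global variables, a value of -1 marks the end of the data, at most 100 scores are stored, and the local names best and worst are invented for the illustration.

```pascal
program TestScoreAnalysis (input, output);
{ Sketch only. MaxScores and Sentinel are assumed values. }
const
  MaxScores = 100;   { assumed limit on the number of scores }
  Sentinel = -1;     { assumed end-of-data marker }
var
  scores: array [1..MaxScores] of integer;   { shared (global) data }
  size: integer;                             { number of valid scores stored }

procedure ReadData;
{ Reads scores until the sentinel, keeping only the valid ones. }
var
  score: integer;
begin
  size := 0;
  readln (score);
  while (score <> Sentinel) and (size < MaxScores) do
  begin
    if (score > 100) or (score < 0) then
      writeln ('Error: invalid data')
    else
    begin
      size := size + 1;
      scores[size] := score
    end;
    readln (score)
  end
end;

function Highest: integer;
var
  index, best: integer;
begin
  best := scores[1];   { assume the first value is the highest, initially }
  index := 2;
  while index <= size do
  begin
    if scores[index] > best then
      best := scores[index];
    index := index + 1
  end;
  Highest := best      { return the result to the caller }
end;

function Lowest: integer;
var
  index, worst: integer;
begin
  worst := scores[1];
  index := 2;
  while index <= size do
  begin
    if scores[index] < worst then
      worst := scores[index];
    index := index + 1
  end;
  Lowest := worst
end;

function Average: real;
var
  i, sum: integer;
begin
  sum := 0;
  i := 1;
  while i <= size do
  begin
    sum := sum + scores[i];
    i := i + 1
  end;
  Average := sum / size
end;

{ Main module: calls each sub-program in turn and prints the results. }
begin
  ReadData;
  if size > 0 then
  begin
    writeln ('Highest: ', Highest);
    writeln ('Lowest:  ', Lowest);
    writeln ('Average: ', Average:8:2)
  end
  else
    writeln ('No valid scores were entered')
end.
```

The main block stays as short as the main algorithm: it only calls the modules and prints what they return. The check on size before printing is an addition of this sketch; it keeps the program from dividing by zero when no valid score is entered.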
Can you imagine how complex the flowchart would be if we attempted to solve this problem without modularization?

7.5. Communication between Modules

Modules may be written as separate independent entities, meaning that no variables are shared between the main module and the sub-algorithm. However, there are many instances when it is necessary to share information among modules in a program. There are several mechanisms to facilitate information flow between modules. These include:

    • Global variables
    • The passing of parameters

Global variables constitute data that is declared in the main program and can be used or accessed by all the modules of the program. This is an easy way to share data among the modules. However, because every module in the program has access to global variables and can change them, undesirable consequences can result if they are not used properly.

Parameters are data items that are passed as input to the called module by the calling module. Values may also be transferred back to the calling module from the called module upon completion of its task.

NOTE: The use of global variables and parameter passing are outside the scope of this course. Students will therefore not be required to implement these concepts in their programming problems. All that is required at this stage is for them to be able to apply the top-down design approach to complex problems. That is, take a complex problem, decompose it into relevant modules, construct a hierarchy chart and develop the algorithm for the individual modules.

7.6. Advantages of the Top-Down Design Method

    1. It makes the problem solution more manageable. It is easier to comprehend the solution of a smaller and less complicated problem than to grasp the solution of a large and complex problem.
    2. It is easier to test segments of solutions, rather than the entire solution at once. This method allows one to test the solution of each sub-problem separately until the entire solution has been tested.
    3. It is often possible to simplify the logical steps of each sub-problem, so that when taken as a whole, the entire solution has less complex logic and hence is easier to develop.
    4. A simplified solution takes less time to develop and will be more readable.
    5. The program will be easier to maintain.

Chapter 8
FROM ALGORITHMS TO PASCAL PROGRAMS
A GUIDE TO PROGRAM IMPLEMENTATION

“The hardest part of programming is now over. This is where the fun begins.”

8.0. Introduction

The final phase in the development of a program is the Program Implementation phase. This phase involves:

    Step 1: Translate the algorithm into a specific programming language.
    Step 2: Execute the program on the computer.
    Step 3: Maintain the program.

Recap:
Let us refresh our memory on what we have done so far. Given a problem, we first tried to get an understanding of what was required. We did this by constructing a defining diagram. We then proceeded to outline a solution for the problem. Then, we looked at alternative ways of solving the problem. After refining and evaluating our solutions, we chose the most efficient solution and then proceeded to represent that solution in the form of pseudocode or flowchart. Having produced an algorithm, we tested it for correctness by tracing through the logic with different sets of test data. If any errors were found, we corrected the errors. Having done all that, we now have a working solution and are ready to execute it on the computer.

The hardest part of programming is now over. Now the fun begins. This is where we speak directly to the computer and tell it to execute our instructions and produce our expected results. In order to execute the solution on the computer, we first have to translate our algorithm into a language that is meaningful to the computer. The next section will illustrate how this is done.

8.1. Translate the Algorithm into a Specific Programming Language

Translating an algorithm into a programming language is like learning how to speak a foreign language. We know what we want to say (via the algorithm); we just have to find out how to say it in a different language. Once we learn how to communicate in a computer language, all we have to do is take our algorithm and sequentially translate each instruction into the programming language. This produces a computer program that we can then execute on the computer. Piece of cake, isn’t it!

The programming language chosen for this course is Pascal. Why Pascal? It’s simple, that’s why, but there’s more to it than that. Pascal was designed by Professor Niklaus Wirth in 1968. It was specifically designed as a tool to teach programming to beginners. Pascal is a really great language for beginner programmers because:

    (i) It is well-structured.
    (ii) It is easy to implement.
    (iii) The syntax (grammar) is easy to learn and follow.
    (iv) It encourages the programmer to adopt a disciplined approach to programming.

Some may argue that Pascal is not the language of choice in many real-world programming environments today. While this may be true, it cannot be used as a valid argument for teaching other languages such as C/C++ in this particular course. This is an introductory programming course. Upon completion of this course, students are not expected to become fully competent programmers who are ready to be employed as professional programmers.

Before we translate our algorithm into Pascal, we first have to learn the structure and syntax of the language. By syntax, we mean the rules of the language that govern grammatical issues such as the vocabulary, word placement and punctuation. Since Pascal is a language for communicating with a computer, the rules are somewhat different from those of human-interaction languages such as English or French.
For example, there is a limit on the number of characters that a Pascal word can have.

Just as most languages have certain variants, called dialects, so does the Pascal language. The version approved by the International Organization for Standardization (ISO) is referred to as Standard Pascal. Other variants include Turbo Pascal and Think Pascal. In this course, only Standard Pascal will be used. The syntax of the language must be strictly adhered to when writing program code.

8.2. Structure of a Pascal Program

A Pascal program has three distinct parts:

    1. the program heading
    2. the program block
    3. the program terminator (a period).

The program heading is a single statement beginning with the word program. The heading assigns a name to the program and lists the input and output streams in parentheses.

The program block is the body of the program. It consists of the Pascal statements for executing the algorithm. The block is divided into two distinct parts:

    1. the variable declaration section, where all the variables and data structures used by the program are defined.
    2. the statement section, where all the action statements of the program are specified.

The statement section is encapsulated within begin and end statements. Begin and end are examples of keywords used in Pascal. Keywords (or reserved words) are words that have special meaning in Pascal and can only be used in the predefined context. That is, they cannot be used as variable names or in any other context. Other keywords are: program, type, var and const. Words such as read, write, readln and writeln are predefined names rather than reserved words, but they should be treated with the same care and not reused for other purposes.

Refer to any Pascal textbook for a list of all Pascal reserved words.

Pascal Program Template

    program name (input, output);

    Definition/Variable declaration section:
        label declarations
        const definitions
        type definitions
        var declarations
        procedure/function declarations

    {Main Program}
    begin
        statement;
        statement;
        :
        statement;
    end.

The keywords in the template are program, label, const, type, var, procedure, function, begin and end. Not all the categories need appear in any given program.
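As a minimal sketch of how the template might be filled in, the hypothetical program below uses only a const definition, a var declaration and the statement section; the label, type and procedure/function parts are simply left out. The program name and the identifiers Sides, side and total are invented for the illustration.

```pascal
program Perimeter (input, output);
{ Computes the perimeter of a square from the length of one side. }
const
  Sides = 4;            { a square has four equal sides }
var
  side, total: integer;
begin
  readln (side);        { read the length of one side }
  total := Sides * side;
  writeln ('Perimeter: ', total)
end.
```

The three parts named earlier are all visible here: the heading on the first line, the block made up of the declarations and the begin/end section, and the period that terminates the program.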
For example, in this course we are not concerned with the implementation of modules, so the procedure and function declarations can be ignored for now.

8.3. Pascal Syntax in a Nutshell

Declaring Variables
Variables must always be declared in the variable declaration section prior to their use in the program. Variables are declared in Pascal by specifying the keyword var followed by a list of variables, a colon and the data type. Each declaration is terminated by a semi-colon. For example,

    var
        num1: integer;
        average: real;

Data Types
A data type is a collection of elements that are all formed and treated the same way. For example, integers, real numbers, characters, Boolean values and strings are all data types. Pascal uses the following keywords to identify the various data types:

    Data Type    Keyword
    Integer      integer
    Real         real
    Character    char
    String       string*
    Boolean      boolean

    * Non-standard

Pascal is referred to as a strongly typed language because it requires that a variable can only store values of one type. A variable must be declared with a single data type and the mixing of data types is not allowed. That is, only values of the type that was declared can be stored in the variable.

User identifiers are names created by the programmer. These include variable names, the program name, names of symbolic constants or names of sub-programs. The rules governing user identifiers in Standard Pascal are as follows:

    • An identifier must start with an alphabetic character (upper or lower case).
    • It can be composed of only alphabetic characters or a mixture of alphabetic and numeric characters, commonly called alphanumeric.
    • No special characters are allowed (like &, +, -). Some variants of Pascal allow the underscore character.
    • The identifier may be of any length, but some implementations of Pascal recognize only the first 8 characters to determine uniqueness.

Examples of valid user identifiers are: temp, value1, NumberOfItems, X, X1.

Reserved words or Pascal keywords cannot be used as identifiers.

Pascal is NOT case-sensitive.
Therefore, there is no distinction between the identifiers Number and number, for example. They are considered to be the same.

Punctuation: Every Pascal statement (except begin) is terminated by a semi-colon. The one place where a semi-colon must not appear is immediately before the word else in an if statement. The last statement in the program (that is, the end statement) is terminated by a period.

The Assignment Symbol: In Pascal the assignment symbol is “:=”. This corresponds to the “←” in our pseudocode.

Begin and end delimiters: The begin/end keyword pair is used to delimit the body of the program as well as logical blocks within the program. For example, when multiple statements are to be executed within a while loop, such statements are encapsulated within a begin/end pair. For every begin in a program, there must be a corresponding end statement.

Comments: Comments are used to document a program internally, to aid in the understanding of the program statements and segments. Comments in Pascal are enclosed within curly braces {...}. Comments are ignored by the Pascal compiler. That is, they do not affect the logic of the program or the syntax in any way. They are only there to allow the programmer to make notations about the code.

The best way to learn any programming language is to look at a simple program and note the common features. Let us take one of the algorithms we developed in previous chapters and see how it translates into Pascal.

8.4. Translating Pseudocode into Pascal Code

The first step in translating an algorithm into Pascal code is to make a list of all the variables used in the algorithm and determine their type. That is, the type of values that each variable is to store. In the algorithms, we were not too concerned about the types of variables we used, so we did not explicitly declare them in the algorithms. However, most programming languages require that variables be declared explicitly before they are used.

The next step is to translate each statement in the algorithm into its Pascal equivalent.

We will now illustrate how this is done.
We will use the algorithm for calculating the average of three numbers for our first illustration.

Translation of the Average Problem:
The first step is to list all the variables and their types. There are four variables in this algorithm: num1, num2 and num3, which are all of integer type, and average, which must be of real type because the division can produce a fractional result.

Pseudocode:

    Algorithm Average
    This algorithm finds the average of three numbers
    Start
        get num1, num2, num3
        Average ← (num1 + num2 + num3) ÷ 3
        Display “The average is:”, average
    Stop.

Pascal Code:

    program Average (input, output);
    {This algorithm finds the average of three numbers}
    var
        num1, num2, num3: integer;
        average: real;
    begin
        readln (num1, num2, num3);
        average := (num1 + num2 + num3) / 3;
        writeln ('The average is: ', average);
    end.

Note that some compilers may reject a variable that has the same name as the program; if yours does, give the program a different name.

Notice the close correspondence between the pseudocode version and that of the Pascal version. Both solutions have a header, a body and a terminator. The start/stop pair in the algorithm translates into begin/end in Pascal. The format of the statements in the algorithm has a direct relation with that of the Pascal code. This makes translation fairly easy. All we have to do is look at each statement in the pseudocode and ask, “how can I say this in Pascal?” We can illustrate this step by step as follows:

What is the format of the header in Pascal?
Answer: Every Pascal program begins with the keyword program, followed by the name of the program and the input/output streams in parentheses. Our algorithm is called Average, so we can use the same name for the program to get:

    program Average (input, output);

A glance at the Pascal template will show that the next step is to declare our variables. In our algorithm, we did not declare all the variables. However, we must do so in the Pascal program.

What is the equivalent of “start” in Pascal?
Answer: Begin. So we write begin in our program.

Next, how do we input (read) values in Pascal?
Answer: use read or readln. So we look up the format of the read/readln statement and write that in our program.
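As a brief, hedged illustration of the two input statements: read takes the values it needs and leaves the rest of the input line available, whereas readln takes its values and then skips to the start of the next line. The program name and the variables a, b and c below are invented for the example.

```pascal
program ReadDemo (input, output);
var
  a, b, c: integer;
begin
  read (a, b);      { takes two values; the rest of the line stays available }
  readln (c);       { takes one value, then discards the rest of the line }
  writeln (a + b + c)
end.
```

If the single line 1 2 3 99 is typed, a, b and c receive 1, 2 and 3, the 99 is discarded by readln, and the program prints 6.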
How do we perform arithmetic operations in Pascal?
Answer: The arithmetic operators in Pascal are pretty similar to those in mathematics, with a few exceptions: multiplication is the asterisk (*) and division is the slash (/).

What is the assignment symbol in Pascal?
Answer: The symbol is “:=”. So now all we have to do is rewrite the assignment statement, using the Pascal constructs. That is,

    average := (num1 + num2 + num3) / 3;

How do we display results/messages in Pascal?
Answer: we use the write or writeln statement. Just like we did for the read statement, we look up the format of the write/writeln statement.

Finally, how do we end a Pascal program?
Answer: by the keyword “end” followed by a period.

Just like that we have translated our algorithm into a Pascal program. Note that the pseudocode version of the algorithm was not designed to correlate with the Pascal programming language in particular. This algorithm is easily translatable into any structured programming language. That is why it is very important to write algorithms that are well-structured. This makes translation much easier.

Let us look at another example. To illustrate a few more Pascal constructs, we will use the algorithm developed in Chapter 2 for counting the number of integers that are larger than the first.

The variables used in the program are: count, No-of-Values, first-number, next-number. The variables are all of integer type. Standard Pascal does not allow special characters in variable names, so we will remove the hyphens from the names and replace them as follows: NoOfValues, firstNumber, nextNumber.

Pseudocode:

    Algorithm Large_Count
    Given a list of 20 integers, this algorithm counts the number of values that are larger than the first value.

        Set count to 0
        Set No-of-Values to 1
        get first-number
        While (No-of-Values < 20) do the following:
            get next-number
            increment No-of-Values
            if next-number > first-number then
                increment count
            end-if
        end-while
        print first-number, count
    stop.

Pascal Code:

    program LargeCount (input, output);
    var
        count, NoOfValues: integer;
        firstNumber, nextNumber: integer;
    begin
        count := 0;
        NoOfValues := 1;
        readln (firstNumber);
        while (NoOfValues < 20) do
        begin
            readln (nextNumber);
            NoOfValues := NoOfValues + 1;
            if (nextNumber > firstNumber) then
                count := count + 1;
        end;    {end of while loop}
        writeln (firstNumber, count);
    end.

Points to Note in this Example:

    • The name of the program does not include the underscore character, as this is illegal in Standard Pascal.
    • Multiple statements are to be executed within the while loop. This is referred to as a compound statement. Compound statements in Pascal are encapsulated within begin/end pairs.
    • The increment count statement in the algorithm translates into count := count + 1; in Pascal, likewise for increment No-of-Values.
    • The end statement, which marks the end of the while loop, is terminated by a semi-colon. The statement which ends the program is terminated by a period.

A More Complex Example:
The next algorithm exemplifies a complex control structure called the nested If structure. Nested control structures are outside the scope of this course. However, the algorithm demonstrates the importance of organizing the appearance of the statements to reflect the logic. Every else must be paired with its corresponding if, and the scope of the if statements must be clearly delineated.

    Algorithm Grades
    This algorithm reads test scores and determines the letter grade for each score.

        Read test-score
        While (test-score ≠ -1) do
            If (test-score > 100) OR (test-score < 0) then    {check if the data is valid}
                Print “Error: Invalid test score”
            else
                If (test-score >= 80 and test-score <= 100) then
                    Lettergrade ← ‘A’
                Else If (test-score < 80 and test-score >= 70) then
                    Lettergrade ← ‘B’
                Else If (test-score < 70 and test-score >= 60) then
                    Lettergrade ← ‘C’
                Else
                    Lettergrade ← ‘D’
                Print test-score, lettergrade
            End-If    {if test score is invalid}
            Read test-score
        End-while
    Stop.

Pascal Implementation:

    program Grades (input, output);
    var
        testscore: integer;
        lettergrade: char;
    begin
        readln (testscore);
        while (testscore <> -1) do
        begin    { Start of While loop }
            if (testscore > 100) or (testscore < 0) then
                writeln ('Error: Invalid test score')
            else
            begin
                if (testscore >= 80) and (testscore <= 100) then
                    lettergrade := 'A'
                else if (testscore < 80) and (testscore >= 70) then
                    lettergrade := 'B'
                else if (testscore < 70) and (testscore >= 60) then
                    lettergrade := 'C'
                else
                    lettergrade := 'D';
                writeln (testscore, lettergrade)
            end;    {if test score is invalid}
            readln (testscore);
        end;    {end of while loop}
    end.    { end program }

Points to Note in this Example:

    • In the nested if statements all the matching else’s are in line with the corresponding ifs.
    • A semi-colon is never placed immediately before else, since that would end the if statement too early.
    • The variable testscore is not hyphenated as in the algorithm.
    • Compound statements are enclosed within begin/end blocks.
    • The symbol for “not equal” is “<>”.
    • The format of the if and while statements closely resembles that of the algorithm. That’s okay. This format is used by many other languages, including English; it is not peculiar to Pascal.

8.5. Summary

The above treatment of the Pascal programming language is not intended to be viewed as “all one needs to know about the programming language”. This was a mere overview of the basic features of the language to facilitate an illustration of the algorithm translation process. There are many other Pascal constructs that the student needs to learn.
Examples include how to represent arrays in Pascal, as well as various other syntax rules. The reader is advised to refer to texts on the Pascal language to gain a comprehensive treatment of the syntax of the language. Several such texts are recommended in Appendix B.

Chapter 9
PROGRAM EXECUTION ON THE COMPUTER

9.0. Steps in Executing a Program on the Computer:

    1. create source code
    2. compile source program
    3. link the modules
    4. run (execute) program
    5. maintain program.

Creating the source code involves the translation of the algorithm into a programming language. This process should first be done manually on paper. The resulting Pascal program is then entered into the computer using a suitable text editor. Most language compilers provide their own editing features. The source code (as it is now called) is then stored in a file with the appropriate extension. Most Pascal compilers use the .pas extension. At the completion of this process, we should have a complete Pascal program, ready for compilation.

Student interaction with the computer begins at this stage. Up until now, there has been no need for the student to interact with the computer for the purpose of programming. The algorithm must first be developed and tested, then translated into a programming language on paper, before students go on the computer. There is nothing to be gained by introducing beginner programmers to the compiler on the computer before they learn how to develop and translate algorithms into a programming language. It is simply a waste of time and computer resources.

Compiling the source code is the process of translating the source code into object code. Object code is the machine language equivalent of the source code. During the compilation process the syntax of the source code is checked to ensure conformity with the rules of the language. If syntax errors are found, these are reported. Syntax errors result in incomplete compilation.
The errors must be corrected and the program re-compiled. This process is repeated until the code is free of syntax errors. A common misconception is that the sole purpose of the compiler is to check for errors. Error checking is a by-product: the primary purpose of the compiler is to translate the source code into object code, and most compilers perform error detection during the translation process.

Linking the modules: A compiled object program is not executable by itself. It needs to be combined with other system modules to form an executable image that can be loaded into memory. Linking is done by a link editor or link-loader. The resulting executable module is then loaded into memory, where it can be executed.

Program execution is the process whereby the program is dispatched to the CPU. The control unit interprets each instruction and passes it to the appropriate unit for execution. If any run-time errors are detected during execution, the program will terminate prematurely.

In many language compilers today, the processes of compiling, linking and executing are transparent to the user. On some menu-driven systems the user selects “compile” from the menu and the program is compiled and linked from this single option. On other systems, selecting the “run” option results in the program being linked and executed. It is important that the student is made aware of the steps that the program has to go through, regardless of the features of a particular compiler.

Maintaining the program involves making periodic modifications to the program when the requirements change. For example, if the grading scheme used by a teacher changes, she may require a change to the Lettergrade program to reflect those changes.
Similarly, if the hourly rate for an employee changes because of a pay increase, any program that uses that rate would have to be modified to reflect the change.

9.1. Types of Errors

During program development, various types of errors may be encountered. These can be classified as:

• logic errors
• syntax errors
• run-time errors.

Logic errors are mistakes in the program logic that result in an incorrect output or outcome. For example, suppose we wanted to print the names of all girls who are under the age of 18 and we wrote the following program segment:

        :
    if (gender = 'F') and (age <= 18) then
        writeln (name);

This code would print the names of girls who are 18 as well as those who are under 18. This is because, in the if statement, we used <= instead of <. This is a common error in logic that gives undesirable results.

Syntax errors are mistakes caused by failure to conform to the grammatical rules of the programming language. One example is omitting the semi-colon that separates an assignment statement from the statement that follows it. When another statement follows, the statement below would result in a syntax error during compilation:

    total := total + x

Another example is using a reserved word incorrectly, as in:

    var := a + b;

Syntax errors can be easily located and corrected, as the compiler usually issues messages which identify the location and cause of the errors.

Run-time errors are generated during execution of the program. They result when the processor encounters an instruction that violates the processing rules. For example, suppose we had instructions that attempt to divide by 0, such as:

    number := 1;
    number := number - 1;
    answer := total / number;    { illegal! }

The last statement would generate a run-time error, as an attempt is being made to divide by a value of 0. Another common cause of run-time errors is when a reference is made to an array location that is outside the legitimate bounds of the array.
For example, if we declared an array to have a maximum of 100 locations, as in:

    var list: array [1..100] of integer;

and then somewhere in the code we write:

    list [101] := num;

this means that we are attempting to access a location (element 101) that has not been assigned to the array list. This is a violation of the memory allocation rules. Run-time errors are usually fatal errors, in that they result in premature termination of the program.

9.2. Debugging

Logic errors and run-time errors are commonly referred to as bugs. They are more difficult to locate than syntax errors, and locating and correcting them can sometimes be very frustrating. Tracing the algorithm before coding is a very good way to detect logic errors. However, as we saw in Chapter 5, it is possible to miss some errors, depending on the test data chosen. The process of locating and correcting such errors is called debugging. Debugging can be done by manual tracing of the code. However, for large, complex programs, this can be time-consuming. A tool called a debugger aids in this process.

A debugger is an interactive program that can be used to facilitate error detection and correction. It can step through each statement in the program and show what results from the execution of each statement. It also keeps track of the contents of all the memory locations used by the program. A debugger is an invaluable tool for programmers. Most high-level language compilers today have built-in debugging tools. That is, the debugger is integrated into the compiler package.

Chapter 10
PROGRAMMING STYLE AND QUALITY

10.0. Program Quality

What are the characteristics of a good program?

• The program should produce the correct results at all times.
• The program is free of errors.
• The program is well-documented.
• The program is maintainable. A program can be made maintainable by:
  - dividing the program into modules
  - making sure that the program is well-documented
  - using symbolic constants
• The program is robust.
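Two of the characteristics listed above, the use of symbolic constants and robustness, can be illustrated with a short sketch. It is only an illustration, not a prescribed solution. It assumes that robustness is taken to mean that the program copes with an invalid value instead of terminating prematurely. The names SafeList, MaxSize, index and num are hypothetical, chosen for this example; the array list follows the declaration used in Section 9.1.

```pascal
program SafeList (output);
{ Illustrative sketch only: SafeList, MaxSize, index and num are
  hypothetical names chosen for this example. }
const
    MaxSize = 100;    { symbolic constant: the list size is stated in one place }
var
    list: array [1..MaxSize] of integer;
    index, num: integer;
begin
    index := MaxSize + 1;    { deliberately one past the end of the list }
    num := 25;

    { Check the subscript before using it, so that an invalid value
      produces a message instead of a run-time error. }
    if (index >= 1) and (index <= MaxSize) then
        list [index] := num
    else
        writeln ('Error: location ', index, ' is outside the list')
end.
```

If the size of the list later has to change, only the definition of MaxSize needs to be edited; the declaration and the check both follow it. Because the subscript is checked first, the program reports the invalid location rather than attempting the assignment.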
A computer program without documentation is like a major appliance without operating instructions. [Tremblay & Bunt]

10.1. Programming Style

Here are some guidelines to aid in the development of “good” programming style:

• Choose meaningful variable names.
• Use structured programming techniques.
• Use comments to clarify the purpose and function of a statement or program segment.
• Use indentation to delineate logical blocks of code. This enhances readability and helps to reveal the logical structure of a program or algorithm.
• Use white space effectively to avoid clutter and make the program more readable. Most programming languages allow blank lines and spaces to be embedded within the program code to improve its readability. Always separate blocks by one or more blank lines.

Good comments cannot do much to improve bad code, but bad comments can seriously detract from good code. Always make sure that comments and code agree. If you make a change to the code, be sure that a similar change is made to any comment relating to it. [Tremblay & Bunt]

Appendix A
PROGRAMMING EXERCISES

For each of the following problems:

• define the problem
• design a solution algorithm in pseudocode or flowchart
• trace the solution using two valid test cases.

Examples of Algorithms Using only Sequential Statements:

1. A program is required to read a VAT rate as a percentage and the prices of five items. The program should calculate the total price of the items before tax and then the VAT payable on those items. The VAT payable is computed by applying the VAT rate percentage to the total price. Both the total price before tax and the VAT payable are to be printed as output. (VAT = value-added tax).

2. A sales company pays its employees strictly on a commission from sales. The input data consists of the employee’s ID (a 4-digit number), followed by the amount of each sale that the employee made within the past week. There are a total of 60 employees. The commission rate is 3.45 percent.
Design a program that reads and adds up the sales for each employee and computes the commission for each employee.

Examples of Algorithms using Sequential and Selection Statements:

3. Design an algorithm that will receive the weight of a parcel in kilograms and determine the delivery charge for that parcel. Calculate the charges as follows:

    Parcel Weight (kg)      Cost Per Kilogram ($)
    <2.5 kg                 3.60 per kg
    2.5-5 kg                2.85 per kg
    over 5 kg               2.45 per kg

4. Construct an algorithm that accepts three integers and reports which of them are positive. Your output should be in the form of a single sentence. For example, “The first and third numbers are positive”.

Examples of Menu-Driven Programs:

5. Construct an algorithm that will read two integers and an alphabetic code (A, B, C or D) from the keyboard. If the code entered is ‘A’, compute the sum of the two numbers. If the code is ‘B’, compute the difference (first minus second) of the two numbers. If the code is ‘C’, compute the product of the two numbers. If the code is ‘D’, and the second number is non-zero, compute the quotient (first divided by second). The program is then to display the two numbers, the code and the computed result.

6. Write an algorithm that allows the user to convert measurements from feet to metres, from minutes to hours, or from degrees Fahrenheit to degrees Celsius. Design the algorithm so that it displays a menu of options and allows the user to select one of the options, depending on the conversion he/she requires. The available options are:

    1. Display this menu
    2. Convert minutes to hours
    3. Convert feet to metres
    4. Convert Fahrenheit to Celsius
    5. Quit.

Use the following conversion table:

    1 foot = 0.3048 metre
    Celsius = 5/9 (F – 32)
    60 minutes = 1 hour.

Examples of Algorithms using Sequential, Selection and Repetition Statements

7. A program is required to print all the even numbers between 1 and 50.

8. A program is required to read a list of students’ test scores and print the IDs of all students who failed. A student is deemed to have failed if his/her test score falls below fifty percent (50%). Each record in the input list consists of the student’s ID, followed by his/her test score. The last record in the list contains an ID of 0. This is to be used as the sentinel value.

Examples of Programs using Arrays

9. Write an algorithm that reads a list of numbers and reduces each value in the list by 3. Print the values in the modified list. A value of 999 marks the end of the list.

10. A program is required to read a list of names and ages of contestants in a beauty pageant, and print the names of all the eligible contestants. A contestant is deemed eligible if she is less than 25 years old. Print also the names of the contestants who are younger than 18 years. There are 50 contestants in the list. Hint: store the names and ages in separate arrays.

Examples of Modular Programming

Draw the hierarchy chart for each of the problems below:

11. Write a program that reads a list of integers, representing students’ test scores, ranging from 0 to 100. Your program should compute and print the range of test scores. The range is defined as the difference between the highest and the lowest test scores.

12. A local insurance agency has compiled data on traffic accidents over the past year. For each driver involved in an accident, a record has been prepared with the following information: year driver was born (integer), sex (‘M’ or ‘F’), registration code (1 for local drivers, 0 for other drivers). Design an algorithm to read the records until the end of the data is reached and print the following statistics on drivers involved in accidents:

    (i) The percentage of drivers under age 25.
    (ii) The percentage of drivers who are female.
    (iii) The percentage of drivers who are males between the ages of 18 and 25.
    (iv) The percentage of local drivers over 50.
Appendix B
SUGGESTED READING

Highly Recommended:

For Algorithms:

1. Robertson, L. A., “Simple Program Design - A Step by Step Approach”, 5th edition, Thomson Publishing, 2006.
2. Tremblay, J., Bunt, R., “Introduction to Computer Science – An Algorithmic Approach”, 2nd edition, McGraw-Hill, 1989.

For Pascal Programming:

1. Dale, N., Weems, C., “Introduction to Pascal and Structured Design”, 4th edition, Jones & Bartlett Publishers, 1997.
2. Nyhoff, L., Leestma, S., “Pascal Programming and Problem-Solving”, McMillan/McGraw Hill.

Recommended:

1. Hume, J.N.P., “Problem-Solving and Programming in Turbo Pascal”, Holt Software Associates Inc.
2. Farrell, J., “A Guide to Programming Logic and Design”, Course Technology (International Thomson Publishing).
3. Abernathy, K., Allen, J. Thomas Jr., “Exploring the Science of Computing – A Laboratory Approach with Pascal”, International Thomson Publishing.
4. Rhoads, S., Gearen, M., “Disciplined Programming using Pascal”, WC Brown Publishers.

Western Zone Office
09/06/2008