Program Style 1 COMP 14 Prasun Dewan1 4. Program Style In the last two chapters, we learnt how to implement calculators and spreadsheets. In this chapter, we focus on new ways of developing these two kinds of applications, rather than learn how to write a new class of applications. Thus, the concepts we learn here have more do with the style of the program, that is, how it is written, rather than what function it implements. We will continue with our spreadsheet example, and show an alternative way of implementing it. We will then introduce the notion of an interface as a way of uniting multiple solutions to a problem and separating the notion of specification from implementation. In English, the word interface means a boundary between two surfaces. In Java, this word also denotes a boundary, not between two surfaces but between a class and its users. We will also look at the notions of algorithms, named constants, and comments as other important tools for developing and documenting our solution. We will also learn how to reduce code duplication and control what privileges a piece of code has. Algorithm & Stepwise Refinement We can outline the approach we took in ABMISpreadsheet to implement the spreadsheet user-interface of Figure 1 (b), as follows: 1. Declare two instance variables, weight and height. 2. Define two setter methods that assign their parameters to these two variables. 3. Define two getter methods that return the values of these two variables. 4. Define another getter method that calculates the value of BMI from the values of these two variables, and returns the calculated value. Notice that the description above, like the Java code we saw before, explains the essence of our solution. Such a description is called an algorithm. Like Java code, an algorithm is description of the solution to a problem. The description can be encoded using any communication mechanism, graphical or textual. Figure 3 of the previous chapter was a graphical description of the algorithm. The textual formalism may be a natural or programming language. As we shall see later, often an algorithm is a cross between real code and natural language phrases, and is called pseudo code. An algorithm is a description of not just computer solutions to problems but any well-defined task. For instance, the following is a description of the tasks a professor may perform in a lecture: (1) (2) (3) (4) (5) (6) Enter Class Distribute handouts Set up laptop projection. Revise topics learnt in the last class. Teach today’s topics. Leave Class It is often useful to write an algorithm in a natural language or pseudo code before implementing the actual code to be executed. A process such as this that breaks a complex task into multiple simpler steps is called stepwise refinement, and is easier to follow and more likely to succeed than one that does not. Writing an algorithm before actual code is one form of stepwise refinement. We will see later other forms. 1 Copyright Prasun Dewan, 2000. Program Style 2 Algorithms can be described at various ``levels''. For instance, the graphical description of the spreadsheet given in the previous chapter shows which methods read or write a variable, but does not explain what they do with the value reads or what values they write. The textual algorithm gives us few more details, telling us, for instance, that the setter methods simply assign their parameters values to the variables they access. However, they leave out the detail of how the BMI is to be computed. We can think of a more detailed algorithm that identifies the formula. Similarly, the lecture algorithm leaves out certain important details such as how the topics should be taught or how one distributes the handouts or what color chalk should be used. Thus, in our stepwise refinement of a solution, we may create multiple levels of algorithms before writing real code, as shown below: Natual Language Graphical Algorithm • Declare weight height reads getBMI() two instance variables, weight and height. • Define a getter method that computes the value of BMI and returns it Programming Language public void getBMI() { return weight/(height*height); } Creating instance variables for Dependent Properties The following figure illustrates another approach to solve the BMI spreadsheet problem: Program Style 3 AnotherBMISpreadsheet Instance weight reads writes getWeight() setWeight() weight calls height reads bmi writes reads getHeight() setHeight() new Weight height calls calls getBMI() new Height ObjectEditor In this solution, we: 1. Declare three instance variables, weight and height as before, and an additional variable, bmi, that stores the BMI value. 2. As before, provide two setter methods, which assign new weight and height values to the variables weight and height, respectively, and in addition, compute and assign the new BMI value to the variable, bmi. 3. Provide three getter methods that return the values of the three instance variables. Thus, the difference between this and the previous solution is that we create an instance variable for the dependent property also. The getter method for this property now simply returns the value of this variable, instead of calculating its value from the two properties on which it depends. Whenever a setter method changes one of these properties, it calculates the new value of the dependent property, and stores it in the corresponding instance variable, bmi. The getter method for BMI, thus, is guaranteed to return the correct value of the property the next time it is called. The class, AnotherBMISpreadsheet, shown in Figure 1, is a coding of this algorithm Program Style 4 Figure 1 AnotherBMISpreadsheet As shown in the right window in the figure, we have put both ABMISpreadsheet and AnotherBMISpreadsheet in the same project, bmi, because they both compute a BMI value. Comments The code in the class is fairly self-explanatory. However, to clear up any possible confusion a reader of it may have, we have added comments to it. Returning to the theater analogy, think of comments as notes left by scriptwriters (in a language of their choice) for themselves (so that they can understand it later), other scriptwriters and reviewers, but not for the performers. In fact, the compiler removes them before creating object code - so the CPU never sees them and they do not occupy memory space. There are two kinds of comments in Java: single-line and arbitrary. A single-line comment begins after the comment marker // and spans the rest of the line, as shown below: double bmi; //computed by setWeight and setHeight An arbitrary comment is written between the comment markers /* and */ and can span either one line: /* recompute dependent properties */ bmi = weight / (height*height); or multiple lines: /* This version recalculates the bmi when weight or height change, not when getBMI is called */ Program Style 5 public class AnotherBMISpreadsheet {…} Javadoc Conventions While the above style of commenting for multiple lines is legal in Java, the following style illustrates the convention: /* This version recalculates the bmi * when weight or height change, not when * getBMI is called */ public class AnotherBMISpreadsheet {…} If we use this convention, a Java tool called javadoc will automatically extract the comments from the class, allowing a reader trying to understand it to focus on them while ignoring the coding details. In general, it is a good idea to insert the algorithm coded by a class as comments in the class. If we do so, javadoc will essentially extract the algorithm, allowing a reader to focus only on the algorithm. You should use this convention, even though it does require you to make the extra effort of ensuring that *’s are aligned. Commenting out Debugging Code Comments are useful not only for documentation, but also commenting out debugging code. To illustrate what this means, consider the following comment in AnotherBMISpreadsheet: /* System.out.println(newHeight); // debugging statement */ Here, we commented out the print statement once we finished debugging. By enclosing the code within comments, we ensure that it will not actually be seen by the CPU. Commenting out code is better than deleting it, because if we later need to debug it, all we have to do is remove the comment markers rather than retype the code. Single-line Vs Arbitrary Comments Why support both kinds of comments? Single-line comments save us from typing an extra end-comment marker for a short comment. Arbitrary comments, on the other hand, are necessary for long comments. Moreover, we need two kinds of comments to allow us to “comment out’’ code that has comments. If we had only one kind of comment, say the arbitrary comment, we might have tried to comment out the print statement as follows: /* System.out.println(newHeight); /*debugging statement */ */ Unfortunately, this does not work. When the compiler sees the beginning of a comment, it ignores everything until it finds the end of the comment. Therefore, in this example, it will ignore the start of the inner comment and assume that the end of the inner comment is, in fact, the end of the outer comment. It will then flag the end of the outer comment as an error. With two types of comments, this problem does not arise. What to Comment In general, we should comment any code fragment that needs explanation, such as a: Class declaration – the comment should explain characteristics of the whole class such as its purpose, design goals, and the top-level algorithm implemented by it. It can also give the author of the class and the date(s) it was modified. Variable declaration – the comment should explain the purpose of the variable, how its value is computed, and where it is used. Method declaration – the comment should explain the algorithm implemented by the method, and its parameters and return values. It should also give the author of the method and the date(s) it was modified, if more than one person has written the class in which the method is defined. Program Style 6 Complicated statement sequence within a method – the comment would explain what the statement sequence achieves. Comments that elaborate on a piece of code come after the code, while those that summarize it to tend to come before it. Comments about a variable declaration tend to elaborate on the declaration, and thus, come after the declaration. On the other hand, comments about a class, method, or some complicated statement sequence summarize it, and therefore, come before it. Long Identifiers Vs Comments What we write in comments depends on the identifiers we choose. For instance, if we use w as the name of the instance variable storing the weight, we would need a comment to explain the role of the variable: double w; // weight However, if we name it weight, we do not need any further explanation: double weight; In fact, we should never make redundant that add no information to the identifier name: double weight; // weight Such redundant comments serve only to clutter the code. Thus, with longer identifiers, we often reduce the need for comments about identifier declarations. We do not eliminate it, however, because not every aspect of an identifier can be explained by its name, as illustrated below: double bmi; //computed by setWeight() and setHeight() Javadoc Tags Javadoc defines special kinds of comments regarding an author, parameter or return value of a method. A comment identifying an author is for the form: @author <author name> and a comment about a parameter is of the form: @param <parameter name> <parameter description> as illustrated below: /* * @author Prasun Dewan * @param newWeight the new value of the property, weight. * sets new values of the variables, weight and bmi */ public void setWeight (double newWeight) { … } A comment about a return value is of the form: @return <return value description> as illustrated below: /* * @author Prasun Dewan * @return the value of the variable, weight */ public double getWeight () { … } Here, @param, @return and @author are javadoc tags, which can be thought of as “reserved words”. They are reserved, not by Java , but by Javadoc. They allow the tool to show only a certain kind of comments, such as only the comments about parameters and return values, thereby providing a useful form of filtering. In the examples above, the comments are probably redundant because the commented methods are simple and we have been careful in giving names to parameters and methods that are self-commenting. In more complex examples, Program Style 7 comments about a method can be very useful, allowing the reader to understand what the method does without looking at the code. Specifying Classes through Interfaces We have seen above two different implementations of a BMI spreadsheet whose functionality is identical. As shown in Figure 2, if we compare the edit windows ObjectEditor creates for instances of the two classes, we cannot tell the difference between the two. This despite the fact that second class contains an additional variable, bmi, and an additional method, calculateBMI(). Figure 2 Two Classes Implementing the Same Interface The reason the two classes look identical from the outside, is that that the same set of public methods can be invoked on instances of the two classes.2 This situation corresponds to two different factories producing the same kind of cars, that is, cars on which identical operations can be invoked. Two factories, of course, do not accidentally produce the same cars, they are designed to do so, by carefully defining a car specification and ensuring that the factories follow the specification. To follow a similar process when declaring classes, we would like a way of specifying the functionality implemented by them. 2 They also look the same from the outside because they implement the same properties. The set of methods is used to compare classes rather than the set of properties, for two reasons: First, properties is a Beans concept, not a Java concept. Second, we may wish to compare all methods of two classes, not just the getter and setter methods. Program Style 8 implements manufactures Accord Specification In Java such specification takes the form of an interface. The specification of the two BMI Spreadsheet classes shown in Figure 3 illustrates the nature of an interface: Figure 3 Interface Implemented by ABMISpreadsheet & AnotherBMISpreadsheet The keyword interface tells us that we are declaring an interface. An interface declaration documents the headers of public methods of the classes it specifies. It does not contain the bodies of these methods since those are considered implementation details that different classes can choose differently. It does not contain non-public methods such as calculateBMI since they are not visible to users of the class, and thus make no difference to them. For similar reasons it does not contain declarations of instance variables. It can, however, declare named constants, which are accessible in all implementations of it, and can ensure that these implementations use the same values for them. Once an interface has been defined, classes that intend to implement this specification can declare so in their headers: public class ABMISpreadsheet implements BMISpreadsheet { … } Program Style 9 public class AnotherBMISpreadsheet implements BMISpreadsheet { … } If a class header says that it implements an interface, then the Java compiler checks that the class actually implements each method whose header is specified in the interface. Thus, if we comment out the method getBMI() in ABMISpreadsheet, the Java compiler will flag an error. This corresponds to a factory inspector ensuring that the factory actually implements all car features specified in the car specification.On the other hand, if we comment out the getMethod() header from the interface, Java will not flag an error, even though the class implements this method, Thus corresponds to a factory not advertising all the features it provides, which is not a good idea. Therefore, it is our responsibility to make sure that the headers of all of the public methods of the class are in the interface(s) it implements.3 As in the case of a class, the name of the file in which an interface is saved is the interface name followed by the suffix .java. As shown in Figure 3, we have included the interface also in the bmi project, since it is related to the other two items in the project, specifying their methods. We have motivated an interface above as a mechanism for uniting two classes that implement the same set of public methods. In fact, it has other several other important uses. Someone trying to understand what a class does can just look at the interface it implements, rather than look at the complete class. It also breaks down the process of defining a class into two steps: first define the interface, and then class itself. Thus, it supports another important form of stepwise refinement. For these reasons, in the future, we will define the interface of each class we implement; and will insist that you do so also. If a class implements an interface, we will refer to an instance of the class as also an instance of the interface, because like the class, the interface describes the operations that can be performed on the instance. Thus, we will refer to instances of ABMISpreadsheet and AnotherBMISpreadsheet as also instances of BMISpreadsheet. This usage is consistent with labeling a car manufactured according to the Accord specification at the Ohio Accord factory as both an Ohio Factory Accord and also simply as an Accord. Naming an Interface We should relate the name of an interface with the names of the classes that implement it. In the two examples above, the name of a class that implements an <Interface> has the form: <Qualifier><Interface> We can distinguish between among implementations of an interface by using different qualifiers for the implementations such as: ABMISpreadsheet, AnotherBMISpreadsheet, AThreeVariableBMISpreadsheet, ATwoVariableBMISpreadsheet, and so on. A more popular convention for naming the class is: <Interface><Qualifier>Impl Examples of the use of this convention would be BMISpreadsheetImpl, BMISpreadsheetAnotherImpl, and BMISpreadsheetAThreeVariableImpl. The second approach requires more typing and is a bit awkward to read but has the advantage that in an alphabetical listing of classes and interfaces, the name of classes appear next to the interfaces they implement. In a small project such as the one above, this is not an issue. You should choose whichever convention seems more natural to you. In this book, we will continue to follow the former approach because we will be creating small projects. Some create an interface name based on a class that implements it. Usually the interface name has the form: <Class>Interface Thus, under this approach, the interface implemented by ABMISpreadsheet would be called: ABMISpreadsheetInterface 3 A class can implement more than one interface, just as a factory can produce more than one kind of product. Program Style 10 This approach, however, ties the interface name to a particular class. It does not tell us how to relate the name of another implementation of the interface to the name of the interface or the original class. It works as long as an interface has a single implementation. It is short-sighted, however, to expect an a single implementation of an interface. Thus, this naming scheme is not recommended. Space & Time Efficiency Why create multiple implementations of the same interface? Why create multiple factories that implement the same car specification? In the latter case, the answer often has to do with the fact there are limitations on how many cars can be produced by a factory, extra factories are needed because the total demand cannot be met by a particular factory. There is no analogue of this situation in the class case, however, since there are no practical limitations on the number of instances of instances we can create from a class.4 The answer to our question lies somewhat in the second reason for creating an additional factory, which has to do with cost. Sometimes we create a factory in some location because it is more efficiency to do so. Efficiency is an important reason also for creating multiple classes that implement the same interface. Let us compare the efficiency of the two implementations of BMISpreadsheet. The first implementation takes less space in memory in that it creates one less instance variable for each object. On the other hand, the second one often takes less time, that is, executes faster. The reason is that in the first case we calculate the value of BMI each time it is required, whereas in the second case, we calculate it only when it changes. Consider what happens when we ask ObjectEditor to do successive refreshes without changing any property. Recall that each refresh results in the invocation of all getter methods of the object displayed by ObjectEditor.. In the first case, the same BMI value will be calculated multiple times, on each invocation of getBMI() that results from a refresh. In the second case, on the other hand, the BMI value will be calculated only when the weight or height changes, not on all of the subsequent refreshes that happen afterwards. An implementation that takes less space is called more space efficient, and one that takes less time is called more time efficient. What we see here is an example of a classic tradeoff in programming, called the space-time tradeoff: more space efficiency often implies less time efficiency. For this reason, we often create multiple solutions to a problem that make this tradeoff differently, and allow the users to determine which solution suits their needs. 4 They must all fit in memory, of course. This is equivalent to finding parking space for all cars produced, which is independent of how many factories produce them. Program Style 11 Space Time Space Miser Time Miser The efficiency concerns here are fundamentally different from the ones in the factory case. In the latter, efficiency had to do with the manufacturing process, not with the cars produced. Here they have to with the instances created by the classes: we are interested in the space and time taken by an instance, not the class. This corresponds to a car produced by one factory being faster or bulkier than one produced by another, which would unacceptable if they are considered the same car. Such a situation is acceptable in our case because it is the only way to address the space-time tradeoff. Syntactic Vs Semantic Specification While the two example classes above offer identical functionality, not all implementations of an interface are guaranteed to do so. For instance, a class that “implements” this interface may provide the following erroneous implementation of getBMI(): public double getBMI() { return weight*weight/height; } In general, we need to give more information than method headers and constants when specifying a class. Ideally, we should be able to specify the exact semantics of each of the operations and Java should be able to check that an implementation conforms to these semantics. However, as it turns out, programs are so complex, that is impossible to do so in general.5 An interface only specifies the syntactic details about the operations. It cannot specify the semantics of these operations. For this reason, cannot ensure that the getBMI() method returns the correct body mass index, and not, say, the Bombay Market Index. We will rely on identifier names, comments, and a shared understanding of certain concepts (such as BMI) to specify the semantics of an operation. Java will ensure that all method headers defined in an interface have bodies in the implementation. There is nothing to prevent differences in the behavior of different implementations of these headers. The advantages of interfaces are based on the assumption that they are implemented as “expected”. Commenting an Interface Because an interface provides only syntactic information, it is crucial to comment it to provide the semantic details that are not obvious from the identifiers declared in it. For example, we might wish to clarify, in the declaration of 5 Java cannot even tell if a program will halt! Program Style 12 BMISpreadsheet, that the BMI stands for the Body Mass Index. Interface comments can include an overall comment and comments about individual method headers. The overall comment would be similar to the overall comment for a class except that it would not describe any algorithm, which is implementation-dependent. Similarly, the comments about method headers in an interface would be similar to comments about complete methods in a class except, again, that they would not give an algorithm. To illustrate, the difference between comments in an interface and those in a class, consider the comments we wrote for the setWeight() method: /* * @author Prasun Dewan * @param newWeight the new value of the property, weight. * sets new values of the variables, weight and bmi */ public void setWeight (double newWeight) { … } (Recall that the comment about the parameter is really redundant, and was given purely for illustration purposes.) Now consider commenting header of this method in the interface. The author comment is probably redundant, because it is likely that all method headers were put in by the same person(s). If this is not the case, then it would indeed be useful. The variables, weight and bmi are implementation-specific. Thus these comments do not apply to the header. The only comment does apply is the one about the parameter: /* * @param newWeight the new value of the property, weight. */ public void setWeight (double newWeight) ; Should this comment be repeated in both the interface and the class? Strictly speaking, no. While it useful to see this comment when looking at the method implementation, a tool can easily extract this comment from the interface and put it in the class. However, such a tool probably does not currently exist. Therefore, it would be useful to replicate the interface comment in the class until such a tool is developed. Encapsulation In each of the two classes implementing the BMI spreadsheet, the instance variables of the class could not be accessed directly by other classes such as ObjectEditor. They were accessed indirectly by other classes through the getter and setter methods of the class. In fact, we could have allowed direct access by other classes to these variables by declaring them as public: public class ABMISpreadsheetWithPublicVariables { public double height, weight, bmi; … } Like the properties of an object, the public instance variables and named constants of the object are displayed by ObjectEditor. Moreover, the displays of the public instance variables can be edited by the user to change the values of the variables, as shown in the figure below. . Program Style 13 However, instance variables of a class should not be declared as public, for two main reasons. Maintaining consistency constraints: Other classes can assign inconsistent values to public variables. For instance, as shown in the figure above, it is possible for other classes to assign to the bmi variable a value that is inconsistent with the values of the other two variables. By not making the variables public, the class declaring them can ensure that its consistency constraints are not violated. This is akin to not allowing drivers of a car to directly access all aspects of the engine, for fear that they may damage it. Ease of change: Suppose that we decide to change the implementation of a class by changing its instance variables. We may have first created an implementation of the BMI spreadsheet using three variables. Later, as we port the program to handheld computers with small memories, we may realize that we need to save on space by using only two variables. Rather than create a new class, as we did above, we may wish to change the current class, so that other classes that use it do not change. If the current class does not make any of its variables public, it would indeed be possible localize the changes only to it. If it does have public variables, then other classes might be depending on them. In this case, it would necessary to change all classes that access public instance variables that no longer exist. For instance, in our BMI example, it would be necessary to change all classes that access the bmi variable, which would not exist in the new implementation. By not declaring the variables of a class as public, it is possible to change the implementation of the class without affecting other classes as long as its public methods remains the same. This is akin to not publicizing the implementation of a car engine so that new manuals do not have to be written when the implementation changes. A class that does not make its instance variables public is said to encapsulate them. In all of the examples you create, you should encapsulate instance variables for the reasons given above. Interface Constants On the other hand, it is acceptible, and in fact usual, to declare named constants as public. The reason is that they cannot be changed by other classes (or, in fact, the class defining them). Thus, there is no danger that they can be made to hold inconsistent values. Moreover, they do not usually store implementation-specific information. When a named constant does not contain implementation-specific information, it should be declared in the interface implemented by a class rather than the class itself. This allows it to be used by all other classes implementing the interface. For example, if we declared the named constant CMS_IN_INCH in BMISpreadsheet: public interface BMISpreadsheet { public final double CMS_IN_INCH = 2.54; … } then this value can be accessed by both the implementation that creates three instance variables and the one that creates two. Thus, always declare implementation-independent named constants in interfaces rather than classes. Avoiding Code Repetition Consider the two setter methods in the class AnotherBMISpreadsheet. The code to calculate BMI: weight/(height*height) Program Style 14 is repeated twice, once in each of the two setter methods. We can avoid repeating code by putting it in a separate method and calling the method wherever we need the code executed. In this example, because the repeated code returns a value, we will put it in a separate function: double calculateBMI () { return weight/(height*height); } This is the first Now we replace the two occurrences of: bmi = weight/(height*height); with two occurrences of: bmi = calculateBMI(); Instead of repeating the body of the function, we now only repeat the call to it. Typically, the body of a function will be much larger than a call to it. Thus, we save on typing effort. However, reducing typing is not the main reason for avoiding duplication of code. In fact, another way to prevent retyping of code is to use an editor with copy and paste capabilities. There is another, more, subtle reason for avoiding code repetition, which has to do with the fact that code keeps changing (because of changes in the client’s needs and our understanding of the problem). If repeated code must be changed, then we have to locate all occurrences of the code. It is easy to miss a few occurrences, leading to an erroneous program. To illustrate, after using our example application, our clients may tell us that they would prefer to enter the weight and height in pounds and inches rather than kilograms and meters. In the previous version of AnotherBMISpreadsheet, we may change the setWeight() method but not the setHeight() method, leading to an erroneous program. On the other hand, in the new version, we would only have to change calculateBMI(): double calculateBMI() { return (weight/2.2)/(height * 2.54/100*height*2.54/100); } Calling a Method Unlike the other methods in this class, calculateBMI() is not a getter or setter method. As a result, it is not automatically invoked by ObjectEditor. It is invoked explicitly by our program when we assign a new value to bmi: bmi = calculateBMI(); After defining getter and setter methods, you may think that it is sufficient to just declare methods – Java somehow automatically calls them at the correct time. A method declaration simply tells Java that we have defined a new operation. Java does not know when it should execute this operation unless we tell it explicitly do so. This is akin to defining an accelerate operation in a car but not invoking it until a driver presses the accelerator. The reason why we did not have to explicitly call the getter and setter methods is that ObjectEditor (which is not part of Java) did that for us to support display and editing of the properties. As we will see in later examples, it is possible to call these methods explicitly in our code. A method such as calculateBMI() that is not a getter or setter method must always be called explicitly by us – ObjectEditor never calls it automatically. In fact, we have already seen the notion of calling methods explicitly. Recall that in Chapter 2, we called the two functions defined in that chapter ( square() and calculateBMI()) explicitly. We did so interactively, using ObjectEditor. The caculateBMI() function defined in this chapter, however, is called from our code. Thus, the call to its is like the call to println() we made from our code. Internal Vs External Call Notice that the syntax used for invoking calculateBMI() is different from that used for invoking println() we saw earlier. Program Style 15 System.out.println(“setWeight called”); Target Object Method Name External Call Actual Parameter Parameterless method bmi = calculateBMI() Internal Call Function Return Value In the call to calculateBMI(), the actual parameters are missing because the method defines no formal parameters. A more noticeable difference is that the target is also missing. In general, the target is needed in an external call but not in an internal call. An external call is is the invocation of a method in another class while an internal call is the invocation of a method in the same class. The method calculateBMI() is both defined in and and called from the same class. Therefore no target is needed when it is called. The method println(), on the other hand, is defined in the predefined class PrintStream, but was called from our BMI spreadsheet class. Therefore a target was necessary in its invocation. Function Vs Procedure Call Another difference in the calls to the two methods is that the invocation of calculateBMI() appears on the RHS of an assignment statement, while the invocation of println() does not. This is because the former is a function while the latter is a procedure. A function call is an expression whose value is the result returned by the call. It can appear anywhwhere a value is needed, such as the RHS of an assignment statement. Thus, assignment statement: bmi = calculateBMI(); assigns the result of calling calculateBMI() to the variable, bmi. A procedure call is not an expresison, and thus cannot be used in an assignment or any other kind of statement needing a value. It forms a statement by itself. Non-Public Methods The method calculateBMI() is the first method whose header does not contain the keyword public. We have not made it accessible to other classes because we see no need for them to call it. It has been defined solely to satisfy an internal need of the class, not to add functionality accessible to other classes such as ObjectEditor. In general, a class will contain both public methods, accessible from outside the class, and non-public methods6, accessible only to other methods in the same class. Only the public methods of a class can be invoked from an edit window created by ObjectEditor, because non-public methods are not visible to it. Moreover, only the headers of the public methods of a class appear in its interface. 6 In fact, non-public methods and variables can be given three kinds of accesses called private, protected, and default, which allow some but not all classes to access them. In this book, we will not distinguish between these three kinds of accesses and assume that only the declaring class can access a non-public method or variable. Program Style 16 Least Privilege/Need-to-Know/Information Hiding By not making calculateBMI() public, we are following the principle of least privilege, which says that a piece of code should be given the least rights or privileges it needs to do its job. ObjectEditor and other classes do not need to directly call calculateBMI(), therefore, why give them the right to do so? The main problem with unnecessary privileges is that they can accidentally hurt us, just as an accidentally loaded gun can. In this example, if the method were public, ObjectEditor would clutter the methods menu with an item for this method. Moreover, some class can accidentally call calculateBMI() when it really intends to call getBMI().7 Furthermore, a reader trying to understand how to use instances of AnotherBMISpreadsheet would be forced to unnecessarily look at calculateBMI(). Another problem with giving unnecessary privileges is that it becomes difficult to change code. Suppose, after writing calculateBMI(), we decided to make it a procedure that assigns the computed BMI value to the bmi variable: void calculateBMI() { bmi = height / (weight*weight); } After making this change, we would have to modify all calls to it, because it is now a procedure rather than a function. Thus instead a call of the form: bmi = caculateBMI(); we would make the call: calculateBMI(); If the method were public, we would have to make such a change to all classes that refer to it. By not making it public, we restrict the changes to the class that defines it. Recall that it was for the same reasons - to maintain consistency constraints and support ease of change - that we did not make instance variables public. We were following this principle then too, making sure the access to variables declared in a class was given only to methods declared in that class. This principle is also called the “need to know” principle because we ensure that code is not given more information than it needs to know. It is also called the information hiding principle, because we hide unnecessary information from code. These problems of giving unnecessary privileges are fairly minor in this example. However, in general, several subtle problems, including security breaks, can arise when this principle is not followed. Named Constants Vs Literals In the expression (weight/2.2)/(height * 2.54/100*height*2.54/100); the numbers 2.2 and 2.54 may be magic numbers to some readers of the code, that is, numbers whose role they do not understand. The expression would make more sense if we rewrite our code as: (weight/LBS_IN_KG) / (height*CMS_IN_INCH/100*height*CMS_IN_INCH/100) It is now clearer that we are converting the weight in pounds to the corresponding weight in kilograms and the height in inches to the corresponding height in meters; and then applying the original formula for calculating BMI. 7 These methods return the same value but the former is more inefficient, because unlike the latter, it calculates the value each time it is called, rather than returning the value of the bmi variable. Program Style 17 Here LBS_IN_KG and CMS_IN_INCH are named constants. Named constants are like variables in that they are names (identifiers) associated with values. While the association between a variable and its value can change, that is, a variable can be assigned different values, the association between a named constant and its value does not change. The following declarations show how named constants are defined: final double LBS_IN_KG = 2.2; final double CMS_IN_INCH = 2.54; Unlike other variable declarations we saw earlier, these are initializing declarations, that is, assign initial values to the variables. The keyword final indicates that the value of the “variables” are final, that is, never change. Java ensures that final variables cannot later be assigned new values. The convention is to capitalize all letters of a named constant to distinguish it from a regular variable and separate its various words with an _ (e.g. LBS_IN_KG) Both the number: 2.2 and the named constant: LBS_IN_KG are constants, that is, their values are fixed. The former is called a literal to distinguish it from the latter. A literal is a constant that literally indicates the value it represents. A named constant is not a literal because we must look up its declaration to determine its value. Literals are not restricted to numbers such as 2.2. As we will see later, they may be characters such as ‘h’, strings such as “hello”, or boolean values such as true. Named Constants Vs Variables Instead of using constants (literal or named), we could have used variables to compute the BMI:. return (weight/ lbsInKg) / (height*cmsInInch/100*height*cmsInInch/100); Like the corresponding named constants, the variables would, of course, have to be assigned the correct values before they are used in the expression. This expression is as clear as the one using named constants. However, nothing would prevent us from accidentally changing the values of these variables after they have been assigned their initial values, as shown below: lbsInKg = 2.2; // initial assignment … lbsInKg = 22; // reassignment return (weight/ lbsInKg) / (height*cmsInInch/100*height*cmsInInch/100); In more general terms, we are giving our code more privilege than it needs by allowing it to change the values of these variables, thereby violating the principle of least privilege. Even if we are careful in not reassigning the variable, a virus that infects our program might do so. In the case of a named constant, Java would prevent the reassignment. Furthermore, when readers of our code see a named constant, they know its value does not change. When they see a variable, they must deduce this fact by examining all parts of the code that can assign to the variable. In summary, we should always use named constants instead of magic number or variables to denote program values that do not change. Natural Vs Human-Created Constants What is magic may depend on the audience. To some readers of the BMI expression, the numbers 2.2 and 2.54 may not have been magic. To others, who do not know that 100 centimeters make a meter, the number 100 may be magic. The reason why there is ambiguity here is that these are fundamental numbers defined by nature. When a number is human invented, it is more likely to be a magic number. In the statement: return hoursWorked*hourlyWage + 50; the number 50 is probably a magic number that should be replaced by a named constant: return hoursWorked*hourlyWage + BONUS; Even if it was clear to everyone, from this context, that the literal 50 represents the bonus, there is still an important advantage in defining a named constant for it. The bonus might have been used in other parts of the program, such as a print statement: Program Style 18 System.out.println(“Bonus: “, 50); Like most human-created numbers, it may change. If we do not define a named constant for the bonus, we must locate and change all occurrences that use it. If we defined a named constant for it, all we have to do is change the place it is declared. In summary, it is best to define named constants for human-invented numbers because they are likely to change and be magic numbers to the readers of the program. Space Used by Variables, Literals, Named Constants You might think that by using literals you can make the program more space efficient, believing that variables and named constants use slots in memory to store the values associated with them while literals do not. Actually this is not the case, for two reasons: The compiler essentially replaces all occurrences of a named constant with the literal it represents. Thus, named constants are equivalent to literals when the program is running, taking no more space than literals. Except for some special, commonly occurring literals such as 0, literals are also stored in memory slots. This is because the machine language does not understand all possible literals. The compiler replaces all occurrences of literals with the addresses of memory slots storing their values. (In fact, some languages such as FORTRAN allowed the value stored in the memory slot to be change by the program, which would, for instance, allow the literal 2.2 to actually mean the value 3.3!) In summary, if you are asked to make your code as space efficient as possible, you should not try to use literals instead of named constants. Using Variables to Remove Code Repetition Notice in the modified version of calculateBMI(), the expression height*CMS_IN_INCH/100 is now repeated. We can, of course, write another function to calculate this expression. However, because the two expressions occur in a single method and yield the same value, we should use another approach, shown below: double calculateBMI() { double heightInMeters = height*CMS_IN_INCH/100; return (weight/LBS_IN_KG) / (heightInMeters * heightInMeters); } Here we compute the value once, and store it in the variable, heightInMeters. Now, wherever the value is needed in the method, we use the variable instead recalculating the value. Not only is the code to evaluate the expression not repeated, it is also not executed twice, thereby making our method faster. Local Method Variables heightInMeters is the first variable we have declared inside a method body. Other variables have been either declared as formal parameters in a method header or as instance variables outside all methods of a class. We did not make it a formal parameter because its value is not specified by an actual parameter. We did not make it an instance variable, because it does not contain state that must be maintained between multiple method invocations. We need its value only when the method is executing, not afterwards. Java would allow us to make it a global instance variable, but that would make it accessible to all methods of the class, not just calculateBMI(), thereby violating the principle of least privilege. By declaring the variable inside calculateBMI(), we ensure that it is a local variable of the method, accessible only within the method. Moreover, we can give private names to local variables without worrying they will interfere with the names of local variables of other methods. Recall that formal parameters are also local variables, accessible only within the body of the method. To distinguish variables declared inside a method body from the formal parameters declared within a method header, we will refer to Program Style 19 them as method variables. Often, the term local variables is used for method variables but not formal parameters, even though they are local to a method. Identifier Visibility and Scope The code in which an identifier (such as a variable or method) is visible, that is, accessible, is called the scope of the identifier. The scope of a formal parameter or method variable is the method in which it is declared and the scope of a (private) instance variable is the class in which it is declared, whereas the scope of a public variable includes other classes also. Similarly, the scope of a non-public method is the class in which it declared whereas the scope of a public method includes other classes also. Java does not let us make the scope of an identifier be some arbitrary code chosen by us. For example, it is not possible to declare a variable that is visible in its getter and setter methods but not in other methods of the class. The principle of least privilege essentially tells us to keep the scope of an identifier as small as possible. height Scope public class AnotherBMISpreadsheet implements BMISpreadsheet{ double height, weight, bmi; public double getHeight() { Invalid Scope return height; } public void setHeight(double newHeight) { height = newHeight; bmi = calculateBMI(); } public double getWeight() { return weight; } public void setWeight(double newWeight) { weight = newWeight; bmi = calculateBMI(); heightInMetres } public double getBMI() { Scope return bmi; } double calculateBMI() { double heightInMetres = height*CMS_IN_INCH/100; return (weight/LBS_IN_KG) / (heightInMetres*heightInMetres); } } Initializing Vs Un-Initializing Declarations The declaration double heightInMeters = height*CMS_IN_INCH/100; is like the declaration of a named constant in that it assigns an initial value to the variable. In Java, any variable, except a formal parameter, can be initialized when declaring it. An initializing declaration of a variable is equivalent to a regular declaration followed by an assignment statement: double heightInMeters; heightInMeters = height*CMS_IN_INCH/100; Combining declaration and assignment saves on typing, and more importantly, prevents creation of uninitialized variables, that is, variables that have not been assigned by our code. An uninitialized variable is assigned a default value by Java, which depends on the type of the variable, and thus, is not under our control. The problem with creating an uninitialized variable is that we may accidentally refer to its value before it has been initialized, which can lead to the program failing in very subtle ways. For instance, we may forget to initialize heightInMeters: double heightInMeters; return (weight/LBS_IN_KG) / (heightInMeters * heightInMeters); In this case, Java will use a default value for the variable (zero), not the value we wanted. Program Style 20 The moral is that we should always use an initialization declaration when we do not want to rely on the default value assigned by Java. For example, if we know good initial values for height and weight, we should initialize the three variables of AnotherBMISpreadsheet: double weight = 70, height = 1.77, bmi = calculateBMI(); As thus example shows, (a) we can define multiple variables in one declaration, and (b) we can use any expression in an initialization declaration, not just a literal. This declaration not only ensures that the initial values of these variables are determined by us, but also ensures that the initial BMI is consistent with the initial weight and height, which was not the case in our previous version of AnotherBMISpreadsheet. Using Procedures to Remove Code Duplication We have seen two ways to remove code duplication: writing a function to compute an expression and storing a computed value in a variable. In both cases, the code that was duplicated was an expression. Sometimes, the code that is duplicated does not yield a value. In this case, we should use a procedure to remove the duplication. Suppose that in the two-variable implementation of the BMI spreadsheet, ABMISpreadsheet, we wished to print the values of all three properties at the beginning of the execution of each setter method. Before we consider the code duplication problem, let us consider how we should print the three properties. Printing the weight and height properties is straightforward. These values are stored in variables, and we have seen how variables can be printed. The BMI property, on the other hand, is not stored in a variable (in the two-variable version of the spreadsheet). Therefore, you may be tempted to store its value in a local variable before printing it: double bmi = getBMI(); System.out.println(“BMI: “ + bmi); While this code works, it is more long-winded that necessary. Recall that a function call can be used anywhere a value is needed. Thus, it can be directly used in the print statement: System.out.println(“BMI: ” + getBMI()); We can now write the down the new version of setWeight(): public void setWeight(double newWeight) { System.out.println(“Weight: “ + weight); System.out.println(“Height: “ + height); System.out.println(“BMI: “ + getBMI()); weight = newWeight; } The new setHeight() is similar, duplicating the first three lines of the previous method: public void setHeight(double newHeight) { System.out.println(“Weight: “ + weight); System.out.println(“Height: “ + height); System.out.println(“BMI: “ + getBMI(); height = newHeight; } The duplicated code does not yield a value to the calling method. It simply prints values on the console. Instead of putting it in a separate function, therefore, we should put it in a separate procedure: public void printProperties() { System.out.println(“Weight: “ + weight); System.out.println(“Height: “ + height); System.out.println(“BMI: “ + getBMI(); } Now, setWeight() and setHeight() can call this procedure instead of the whole statement sequence: public void setWeight(double newWeight) { printProperties(); weight = newWeight; } public void setHeight(double newHeight) { Program Style 21 printProperties(); height = newHeight; } Methods for Independent Pieces of Code The new procedure not only removes code duplication but also makes the two setter methods clearer by removing the clutter of print statements from them. Therefore, in general, any independent piece of code should be in a separate function or procedure, even if it has not been duplicated (so far). However, if an unduplicated code is one statement or expression, it may not be worthwhile to write a separate method for it. For instance: public void setHeight(double newHeight) { System.out.println(“Height: “ + height); height = newHeight; } is no more cluttered than: public void setHeight(double newHeight) { printHeight(); height = newHeight; } But if the independent piece of code is any longer, then it is indeed worwthile to put it in a separate method. Moreover, it is not bad style to put even a one-line piece of code in a separate method. A benefit of doing this, of course, is that if we later realise that we must invoke the code somewhere else, we can simply call the method rather than duplicating the code. Thus, it is not bad style to define the printHeight() method above. If we later realize that we should print the height in setWeight() also, we can now simply call printHeight(). In summary, as long as the single-line code is not duplicated, it is acceptable to go either way, though defining the method is slightly preferable ( to support future reuse), if we are willing to put the effort to create it. Can we ever have bad style by defining too many methods? Consider the following methods: public double getBMI() { return calculateBMI(); } public void calculateBMI() { return weight/(height*height); } Here getBMI() does nothing other than call calculateBMI(). In other words, there is no difference at all between getBMI() and calculateBMI(). Thus, one of them is uncessary, and we have code duplication. Now consider: public double getBMI() { return calculateBMI(weight, height); } public double calculateBMI(double theWeight, double theHeight) { return theWeight/(theHeight*theHeight); } Here geBMI() and calculateBMI() are different methods. The former uses instance variables to compute the BMI, while the latter uses formal parameters. Both methods are useful, as we have seen in the previous chapters. This code is, in fact, a very good example of reuse as getBMI() uses calculateBMI() to do the computation, rather than repeating the code to do so. Program Style 22 More on Sequential Execution Recall that the statements in a method body are executed in the order in which they appear in the body, from first to last. In the examples we have seen so far, the statements of a method body were independent of each other. As a result, it did not matter in what order they were executed. The new version of setWeight() above is our first example of a method in which the order does matter. Suppose we reversed the order of the two statements in the method body: public void setWeight(double newWeight) { weight = newWeight; printProperties(); } The earlier and new version of the method prints the values of properties before and after, respectively, the weight is changed. For example, if the variables height and weight have the values, 1.77 and 77.0, and we make the call: setWeight(71.0); then the first version of the method will print: Weight: 77.0 Height: 1.77 BMI: 24.57 while the new version will print: Weight: 71.0 Height: 1.77 BMI: 22.62 Printing Properties to Debug At this point, some of you may be wondering what the purpose of printing the properties was. If we are not using ObjectEditor, it allows us to check if our class is calculating the correct value of BMI. Even if we are using ObjectEditor, it can be useful. ObjectEditor displays a single snapshot of our interaction with it, that is, it shows only the current values of the three properties. On the other hand, by printing the properties each time setWeight() is called, we see in the console window the values of the properties for each value of weight we tried. This allows us, for example, to distinguish between those values of weight for which the BMI value is calculated correctly and those of it for which the BMI value is calculated incorrectly. Finally, as you write and debug more complicated programs, there will be times when you will be convinced that the only reason your programs is failing is because Java does not seem to work as advertised. For example, you may be convinced that Java is not doing the assignment in setWeight() correctly – it may seem to you that weight is not being changed whereas height is, or that the getter methods are not returning the values you calculated.. By printing the value of properties, you will be able to check if Java is indeed working correctly. Accidental Recursion What if we called printProperties() in getBMI()? public int getBMI() { printProperties(); return weight / (height * height); } If you have tried to do this, you have seen that nothing gets printed out, and the program hangs. The reason is that when getBMI() is called, it calls, printProperties(), which in turn calls getBMI(), which again calls printProperties(), and so on, leading to an unterminating sequence of calls. A call that calls itself, directly or indirectly, is called a recursive method. As we have seen here, recursive methods, when called, may never return, calling themselves in an infinite sequence of calls. Therefore, we must be careful, for now, to not write such methods. We will later see how we can write useful recursive methods that do terminate. Summary Instead of directly developing code, it is often useful to first develop algorithms implemented by it. Document the algorithms in the code as comments using javadoc conventions. Program Style 23 Declare headers of all of the public methods of a class in an interface. An implementation that is more space efficient than another is often less time efficient. Avoid repeating code to save on typing and allow easy modification to it. Do not give a piece of code more privileges than it needs. Use named constants rather than magic numbers. Use initializing declarations whenever possible. Exercises 1. 2. 3. 4. 5. 6. 7. 8. Define algorithm, interface, and stepwise refinement. What are the reasons for using stepwise refinement? Distinguish between local and global variable. Distinguish named constants from variables and literals and explain when you should use them. Why does Java have both single-line and arbitrary comments? Why is it important to follow the least privilege principle? Explain the difference between public and non-public methods? What are the advantages and disadvantages of creating instance variables for both dependent and independent properties of an object? 9. Write an interface and javadoc comments for the temperature class of the previous chapter (Chapter 3, problem 4.). 10. Write another class that implements the same interface but offers better space or time efficiency. 11. What are the reasons for writing interfaces for classes? Explain why interfaces are not a form of semantics specification. Illustrate your answer using your solution to problem 10. 12. Consider the following code for representing a certain amount of money. It defines three methods, which give the amount in pounds, dollars, and cents. It also defines a setter method for setting the amount of the money in pounds It assumes that 1 pound is currently worth 1.8 dollars. Create a new version of this program that computes the same results but improves on its style. Justify the changes you make. public class C { public double v = 0; public void setP (double n) { v = n; } public double getP() { return v; } public double getD() { return v*1.8; } public double getC () { return v*1.8*100; } }