PART ONE

Class Design Principles

Encapsulation

We begin this chapter by considering the principles that result in a well-designed and general class implementation. Primary among these is the concept of encapsulation. The dictionary provides several definitions of the word capsule. For example, it can mean a sealed gelatin case that holds a dose of medication. Early spacecraft were called capsules because they were sealed containers that carried passengers through space. A capsule protects its contents from outside contaminants or harsh conditions and keeps its contents intact. To encapsulate something is to place it into a capsule.

Encapsulation: Designing a class so that its implementation is protected from the actions of external code except through the formal interface.

What is the formal interface? In terms of class design, a formal interface is a written description of all the ways that the class may interact with other classes. The collection of methods and fields that are not private defines the formal interface syntactically.

Formal Interface: The components of a class that are externally accessible, which consist of its nonprivate fields and methods.

What does encapsulation have to do with classes and object-oriented programming? One goal in designing a class is to protect its contents from being damaged by the actions of external code. If the contents of a class can be changed only through a well-designed interface, then we don't have to worry about bugs in the rest of the program affecting the class in unintended ways. As long as we design the class so that it can handle any data that are given to it that are consistent with the interface, we know that it is a reliable unit of software.

Reliable: A property of a unit of software in which it can be counted on to always operate consistently, according to the specification of its interface.

Here's an analogy that illustrates the difference between a class that is encapsulated and one that is not.
Suppose you are a builder, building a house in a new development. Other builders are working in the same development. If each builder keeps all of his or her equipment and materials within the property lines of the house that he or she is building, and enters and leaves the site only via its driveway, then construction proceeds smoothly. The property lines encapsulate the individual construction sites, and the driveway is the only interface by which people, equipment, and materials can enter or leave a site. Imagine the chaos that would occur if builders started putting their materials and equipment in other sites without telling one another and driving through other sites with heavy equipment to get to their own. The materials would get used in the wrong houses, tools would get lost or run over by stray vehicles, and the whole process would break down. Figure 7.1 illustrates the situation.

[Figure 7.1: two panels. Top: three building sites with marked property lines, with materials and tools kept well within each site. Bottom: the same sites without property lines, with materials and tools scattered among the houses and workers in conflict over them.]

Figure 7.1 Encapsulation Draws Clear Boundaries Around Classes. Failing to Encapsulate Classes Can Lead to Chaos.

Let's make this analogy concrete in Java by looking at two different interface designs for the same Date class, one of which is encapsulated, and the other not.
    // Encapsulated interface --
    // avoids errors due to misuse
    private int month;
    private int day;
    private int year;
    public void setDate (int newMonth, int newDay, int newYear)
    // Checks that the new date is valid,
    // otherwise it leaves the current
    // value unchanged

    // Unencapsulated interface --
    // potential source of bugs
    public int month;
    public int day;
    public int year;

The unencapsulated interface allows client code to directly change the fields of a Date. So, if the client code assigns the values 14, 206, and 83629 to these fields, you end up with a nonsense date of the 206th day of the 14th month, of the year 83,629. The encapsulated implementation makes these fields private. It then provides a public method that takes date values as arguments, and checks that the date is valid before changing the fields within the object.

This example illustrates the point that there is no special Java syntax for encapsulation. Rather, we achieve encapsulation by carefully designing the formal interface to ensure that the class and its objects have complete control over what information enters and leaves them.

Encapsulation greatly simplifies the work of the programmer, because each class can be developed separately, without worrying about how other classes are being implemented. In a large project that is being developed by a programming team, encapsulation permits each programmer to work independently on a different part of the project. As long as each class meets its specification, separate classes can interact safely.

What do we mean by specification? Given a formal interface to a class, the specification is additional written documentation that describes how a class will behave for each possible interaction through the interface. For example, the formal interface defines how we call a method, and the specification describes what the method will do. You can think of the formal interface as the syntax of a class, and the specification as its semantics.
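The distinction between formal interface and specification can be illustrated with a minimal sketch of the encapsulated Date class. The signature of setDate belongs to the formal interface, and the comment above it plays the role of the specification. The observer methods, the starting date of January 1, 2000, and the accepted year range of 1 through 9999 are assumptions made only for this illustration.

```java
// A sketch of the encapsulated Date design. Assumptions for illustration:
// the default date, the observers, and the year range 1..9999.
class Date {
    private int month = 1;
    private int day = 1;
    private int year = 2000;

    /**
     * Specification: if the three arguments form a valid calendar date,
     * this Date takes on that value; otherwise the current value is
     * left unchanged.
     */
    public void setDate(int newMonth, int newDay, int newYear) {
        if (isValid(newMonth, newDay, newYear)) {
            month = newMonth;
            day = newDay;
            year = newYear;
        }
    }

    // Observers let client code read the date without touching the fields.
    public int getMonth() { return month; }
    public int getDay()   { return day; }
    public int getYear()  { return year; }

    private static boolean isValid(int m, int d, int y) {
        if (m < 1 || m > 12 || d < 1 || y < 1 || y > 9999) {
            return false;
        }
        return d <= daysInMonth(m, y);
    }

    private static int daysInMonth(int m, int y) {
        switch (m) {
            case 2:
                return isLeapYear(y) ? 29 : 28;
            case 4: case 6: case 9: case 11:
                return 30;
            default:
                return 31;
        }
    }

    private static boolean isLeapYear(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }
}
```

With this design, a call such as setDate(14, 206, 83629) simply leaves the object as it was, so the nonsense date can never be stored.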
By definition, the specification includes the formal interface.

Specification: The written description of the behavior of a class with respect to its interface.

Abstraction

Encapsulation is the basis for abstraction in programming. Consider, for example, that abstraction lets us use a Scanner without having to know the details of its operation.

Abstraction: The separation of the logical properties (interface and specification) of an object from its implementation.

Abstraction is how we simplify the design of a large application. As long as the interface and specification of a class are complete, the programmer who implements the class doesn’t have to understand how the client code uses it. As long as the programmer correctly implements the interface and specification, the programmer who uses the class doesn’t have to think about how it is implemented.

Even when you are the programmer in both cases, abstraction simplifies your job because it allows you to focus on different parts of the implementation in isolation from each other. What seems like a huge programming problem at first becomes much more manageable when you break it into little pieces that you can solve separately (the divide-and-conquer strategy introduced in Chapter 1).

There are basically two types of abstraction: data abstraction and control abstraction. Data abstraction is the separation of the external representation of an object’s values from their internal implementation. For example, the external representation of a date might be integer values for the day and year, and a string that specifies the name of the month. But we might implement the date within the class using a standard value that calendar makers call the Julian day, which is the number of days since January 1, 4713 BC.
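As a rough sketch of the idea just described, the class below accepts and returns a date as a month name and two integers, but stores only a Julian day number. The class name JulianDate, its method names, and the use of the standard java.time library for the conversion are assumptions made for this illustration. The constant 2440588 is the Julian day number of January 1, 1970, the day from which LocalDate counts its epoch days.

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.format.TextStyle;
import java.util.Locale;

// Hypothetical class: external view is month name, day, and year;
// internal representation is a single Julian day number.
class JulianDate {
    // Julian day number of 1970-01-01, the origin of LocalDate's epoch days
    private static final long JULIAN_DAY_OF_1970_01_01 = 2440588L;

    private final long julianDay;

    public JulianDate(String monthName, int day, int year) {
        Month month = Month.valueOf(monthName.toUpperCase(Locale.ENGLISH));
        julianDay = LocalDate.of(year, month, day).toEpochDay()
                    + JULIAN_DAY_OF_1970_01_01;
    }

    // Observers convert back to the conventional representation.
    public String getMonth() {
        return toLocalDate().getMonth()
                            .getDisplayName(TextStyle.FULL, Locale.ENGLISH);
    }

    public int getDay()  { return toLocalDate().getDayOfMonth(); }
    public int getYear() { return toLocalDate().getYear(); }

    // Date arithmetic reduces to integer subtraction on the Julian day.
    public long daysUntil(JulianDate other) {
        return other.julianDay - julianDay;
    }

    private LocalDate toLocalDate() {
        return LocalDate.ofEpochDay(julianDay - JULIAN_DAY_OF_1970_01_01);
    }
}

class JulianDateDemo {
    public static void main(String[] args) {
        JulianDate start = new JulianDate("January", 1, 2000);
        JulianDate end = new JulianDate("January", 12, 2006);
        System.out.println(start.daysUntil(end));   // prints 2203
        System.out.println(end.getMonth() + " " + end.getDay()
                           + ", " + end.getYear()); // prints January 12, 2006
    }
}
```

Client code sees only the month name, day, and year, so the private field could be replaced with three separate fields without any change to JulianDateDemo.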
Data abstraction: The separation of the logical representation of an object’s range of values from their implementation.

The advantage of using the Julian day is that it simplifies arithmetic on dates, such as computing the number of days between dates. All of the complexity of dealing with leap years and the different number of days in the months is captured in formulas that convert between the conventional representation of a date and the Julian day. From the user’s perspective, however, the methods of a Date object receive and return a date as two integers and a string. Figure 7.2 shows the two implementations, having the same external abstraction.

[Figure 7.2: Both implementations of the Date class present the same external abstraction, January 12, 2006. One Date class uses a month, day, and year internal representation (private String month; private int day; private int year;). The other uses a Julian day internal representation (private int julianDay;).]

Figure 7.2 Data Abstraction Permits Different Internal Representations for the Same External Abstraction

In many cases, the external representation and the implementation of the values are identical. However, we won’t tell that to the user, in case we decide to change the implementation in the future. For example, we might initially develop a Date class using fields for month, day, and year. Later on, we may decide that a Julian day representation will be more efficient, and rewrite the entire implementation of the class. Because encapsulation has provided data abstraction, we can make the change without affecting client code. Recall that we changed the internal representation in the Time class in the last chapter without affecting the applications that used it.

Control abstraction is the separation of the specification of the behavior of a class from the implementation of that behavior. For example, suppose that the specification for the Date class says that it takes into account all of the special leap-year rules.
In the Julian day implementation, only the Julian day conversion formulas handle those rules; the other responsibilities merely perform integer arithmetic on the Julian day number. A user simply assumes that every Date responsibility separately accounts for leap years. Control abstraction lets us actually program a more efficient implementation and then hide that complexity from the user.

Control abstraction: The separation of an object's behavioral specification from the implementation of the specification.

Designing for Maintenance and Reuse

Applying the principles of abstraction to design and implementation has two additional benefits: modifiability and reuse.

Modifiability: The property of an encapsulated class definition that allows the implementation to be changed without having an effect on code that uses it (except in terms of speed or memory space).

Reuse: The ability to import a class into code that uses it, without additional modification to either the class or the user code; the ability to extend the definition of a class.

Encapsulation enables us to modify the implementation of a class after its initial development. Perhaps we are rushing to meet a deadline, so we create a simple but inefficient implementation. In the maintenance phase, we can replace the implementation with a more efficient version. The modification is undetectable by users of the class with the exception that their applications run faster and require less memory.

As we will see in Chapter 9, reuse also means that an encapsulated class can be easily extended to form new related classes. For example, suppose you work for a utility company and are developing software to manage its fleet of vehicles. As shown in Figure 7.3, an encapsulated class that describes a vehicle could be used in the applications that schedule its use and keep track of maintenance as well as the tax accounting application that computes its operating cost and depreciation.
Each of those applications could add extensions to the vehicle class to suit its particular requirements. Reuse is a way to save programming effort. It also ensures that objects have the same behavior every place that they are used. Consistent behavior helps us to avoid and detect programming errors.

Figure 7.3 Reuse

Of course, preparing a class that is suitable for wider reuse requires us to think beyond the immediate situation. The class should provide certain basic services that enable it to be used more generally. For example, it should have a full set of observers that enable client code to retrieve any necessary information from an object.

Not every class needs to be designed for general reuse. In some cases, we merely need a class that has specific properties for the problem at hand, and that won’t be used elsewhere. But if you are designing a class and think that there is some possibility that it will be used in other situations, then it is a good idea to make it more general.

Keep in mind that even though Java's class construct provides a mechanism to support encapsulation, it is up to the programmer to use it in a way that results in actual encapsulation. There is no keyword or construct that distinguishes a class as encapsulated. The programmer must draw the boundaries around the class in a manner that keeps other code out.

There are two types of boundaries that we can draw: physical and visual. We can physically keep a user from accessing fields in a class by using the appropriate access modifier such as private, and we can make the class implementation invisible to a user by the appropriate use of the package construct that we discuss later in this chapter. We return to the issue of encapsulation later in the chapter.

Mutable and Immutable Objects

In Chapter 2 we saw how to create constructor-observer classes, and noted that these have the property of being immutable.
That is, they cannot be changed after they are created. While there exist many objects that can be represented in this manner, some objects need to have the ability to change their internal contents after they are created. Such objects are said to be mutable. The name comes from the word mutate, and indicates that the object can change.

Mutability is a key distinguishing characteristic of the interface of an object. While immutable objects are naturally encapsulated, because they are immune to change, we must take special care to ensure that mutable objects remain encapsulated.

Mutable: A property of a class of objects that enables the internal contents of an object to be changed after it is created.

Let's look at an example of a mutable object. Suppose we are creating a database of birth records for a hospital. A Birth Record is an object that contains the following information:

Birth Record:
    Date of Birth
    Time of Birth
    Mother's Name
    Father's Name
    Baby's Name
    Baby's Weight
    Baby's Length
    Baby's Gender

A nurse enters all of this information into the database shortly after the baby is born. However, in some cases, the parents have not yet chosen a name for the baby. Rather than keep the nurse waiting for the parents to make up their minds, the database allows all of the other information to be entered and creates a Birth Record object with an empty string for the name of the baby. Later, when the name is decided, the nurse changes the name in the database.

There are two ways to change this database record. One would be to call a method that directly changes the value in the Baby's Name field. For example, we could write the method as follows:

    public void setBabyName (Name newName)
    {
      babysName = newName;
    }

And then, given an instance of the BirthRecord class, called newBaby, we can call this method with the following statement:

    newBaby.setBabyName(new Name(first, last, middle)); // Changes the baby name field

Such a method is called a transformer.
Having a transformer makes BirthRecord a mutable class. Note that there is no special Java syntax to denote that setBabyName is a transformer. A method is a transformer simply by virtue of what it does: It changes the information stored in an existing object.

Transformer: A method that changes the information contained in an object.

Wouldn't it be easier to just make the babysName field public and to assign a new value to it without calling a method? Yes, but that would destroy the encapsulation of the BirthRecord class. Making all changes through transformers preserves encapsulation because it permits us to employ data and control abstraction. For example, we could later enhance this transformer to check that the new name contains only alphabetic characters.

The second way to change the name assumes that BirthRecord is an immutable (constructor-observer) class. We create a new record (with a constructor), copy into it all of the information except the name from the old record (using observers), and insert the new name at that point. Then the old record is deleted. We might even provide a constructor that takes another BirthRecord object as an argument, and automatically does the copying. Such a constructor is called a copy constructor. Here's how it would look in Java:

    public BirthRecord (BirthRecord oldRecord,   // Copy constructor
                        Name newName)
    {
      dateOfBirth = oldRecord.getDateOfBirth();
      timeOfBirth = oldRecord.getTimeOfBirth();
      mothersName = oldRecord.getMothersName();
      fathersName = oldRecord.getFathersName();
      babysName = newName;                       // Change name to new name
      babysWeight = oldRecord.getBabysWeight();
      babysLength = oldRecord.getBabysLength();
      babysGender = oldRecord.getBabysGender();
    }

And we would make the change to the newBaby object as follows:

    newBaby = new BirthRecord(newBaby, new Name(first, last, middle));

Copy constructor: A constructor that creates a new object by copying some or all of the information contained in an existing object.
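The two approaches can be compared side by side in a compact sketch. To keep it short and self-contained, the records below hold only two fields and use String in place of the Name class; the class names MutableBirthRecord and ImmutableBirthRecord and the sample names are hypothetical and chosen only for this illustration.

```java
// Mutable version: a transformer changes the existing object.
class MutableBirthRecord {
    private final String mothersName;
    private String babysName;

    public MutableBirthRecord(String mothersName, String babysName) {
        this.mothersName = mothersName;
        this.babysName = babysName;
    }

    public String getMothersName() { return mothersName; }
    public String getBabysName()   { return babysName; }

    // Transformer
    public void setBabyName(String newName) {
        babysName = newName;
    }
}

// Immutable version: constructors and observers only.
class ImmutableBirthRecord {
    private final String mothersName;
    private final String babysName;

    public ImmutableBirthRecord(String mothersName, String babysName) {
        this.mothersName = mothersName;
        this.babysName = babysName;
    }

    // Copy constructor: copies everything except the name
    public ImmutableBirthRecord(ImmutableBirthRecord oldRecord, String newName) {
        mothersName = oldRecord.getMothersName();
        babysName = newName;
    }

    public String getMothersName() { return mothersName; }
    public String getBabysName()   { return babysName; }
}

class BirthRecordDemo {
    public static void main(String[] args) {
        MutableBirthRecord mutable = new MutableBirthRecord("Ana Diaz", "");
        mutable.setBabyName("Luz Diaz");              // same object, new contents
        System.out.println(mutable.getBabysName());   // prints Luz Diaz

        ImmutableBirthRecord original = new ImmutableBirthRecord("Ana Diaz", "");
        ImmutableBirthRecord renamed = new ImmutableBirthRecord(original, "Luz Diaz");
        System.out.println(renamed.getBabysName());            // prints Luz Diaz
        System.out.println(original.getBabysName().isEmpty()); // prints true
    }
}
```

Note that the original immutable record still has an empty name after the copy is made; only the new object carries the change.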
As you can see, using the transformer is simpler. It is also much faster for the computer to call a method that assigns a new value to a field than to create a whole new object and delete an old one.

Is there any reason that we shouldn't always use mutable objects? Yes, there is, and it has to do with how objects are used when they are passed as arguments to methods. Recall that an object variable contains the address (reference) where the object’s fields are stored in memory. This address is the value copied from the argument to the parameter inside the method. Only one copy of the object’s fields exists, which both the calling code and the method use. Figure 7.5 illustrates the difference between passing primitive and reference types to a method.

Figure 7.5 Passing Primitive and Reference Types

Changes to the primitive type parameters don’t affect the argument because the method works on a copy of the argument value, not the original. But in the situation pictured in Figure 7.5, wouldn’t the changes to a reference type parameter also change the original argument value? For mutable reference types the answer would be yes; for immutable types the answer is no. Immutable objects are immune to changes, because they have no transformers.

What would happen if a new object was instantiated and assigned to the reference parameter? Figure 7.6 illustrates what happens when we assign a new string to a parameter variable of type String. You can see that the argument is not changed.

Figure 7.6 The Effect of Assigning a New Value to a Reference Type Parameter

In contrast, with a mutable class, the method can change the original argument. For example, suppose we have a method that takes System.out as a parameter and uses System.out.print to display the string "Java". The method’s parameter receives the address where System.out is stored.
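The difference between assigning to a reference parameter and changing the object it refers to can be sketched with two standard library types: String, which is immutable, and StringBuilder, which is mutable. The class name ParameterDemo and its method names are hypothetical and serve only this illustration.

```java
class ParameterDemo {
    // Primitive parameter: the method works on a copy of the value.
    static void increment(int n) {
        n = n + 1;
    }

    // Assigning a new object to the parameter changes only the local copy
    // of the reference, not the caller's argument.
    static void reassign(String s) {
        s = "changed";
    }

    // The same holds for a mutable type when the parameter is reassigned.
    static void replace(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // Calling a transformer changes the one object that both the caller
    // and the method refer to.
    static void mutate(StringBuilder sb) {
        sb.append(" changed");
    }

    public static void main(String[] args) {
        int count = 5;
        increment(count);
        System.out.println(count);    // prints 5

        String text = "Java";
        reassign(text);
        System.out.println(text);     // prints Java

        StringBuilder builder = new StringBuilder("Java");
        replace(builder);
        System.out.println(builder);  // prints Java

        mutate(builder);
        System.out.println(builder);  // prints Java changed
    }
}
```

Only the last call alters what the caller sees, because only it uses a transformer on the shared object.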
System.out is mutable because method print directly changes the window to which the argument refers. Figure 7.7 shows this process, and you should carefully compare it to Figure 7.6 to be sure that you understand the difference. Assigning a whole new value (the address of a different object) to a reference type parameter does not alter the argument object. But if you change the fields of the parameter object by calling a transformer, those changes also affect the fields of the argument object.

Figure 7.7 The Effect of Changing a Mutable Object

Mutability is an abstract property of an object. When you design the interface to a new class, you should always consciously decide whether it is mutable or immutable. The danger occurs when a programmer uses a mutable class under the assumption that it is immutable, and passes an object of that class to a method that unexpectedly changes the argument.

PART TWO

Packages

As we noted previously, Java lets us group related classes together into a unit called a package. Classes within a package can access each other’s nonprivate members, which can save us programming effort and make it more efficient to access their data. The other advantage of packages is that they can be compiled separately and imported into our code. Packages provide additional support for implementing encapsulation because they allow us to distribute our classes as Bytecode files. The unreadable nature of Bytecode prevents users from seeing the implementation details, and thus provides visibility encapsulation.

Package Syntax

The syntax for a package is extremely simple. We’ve been writing our separate classes as unnamed packages all along, so we merely have to specify the package name at the start of the class. The first line of a package consists of the keyword package followed by an identifier and a semicolon.
By convention, Java programmers start a package identifier with a lowercase letter to distinguish it from class identifiers.

    package someName;

Class someClass declares that it is within the package as follows:

    package someName;
    // Class Documentation
    public class someClass
    {…}

Java calls the file containing this class, someClass.java, a compilation unit. The file may contain one or more nonpublic classes, but only one public class.

Compilation Unit: A file containing Java class declarations, at most one of which is public.

All of the classes declared within this compilation unit have access to each other’s nonprivate members. We say “nonprivate” because, in addition to using the keywords public or private with fields and methods, we can write member declarations without any modifiers. When we do so, then the field or method is neither public nor private, but rather it is something in between: it can be accessed by any member of the package.

When we use public, then a field or method can be used outside of the class and by any class that imports its package. When we use private, then the field or method can be accessed only within the class itself. When we use neither, the field or method can be used within the class and within other classes in the same package, but not by classes outside of the package. As an analogy, you can think of packages as being like a family. Some things are yours alone (private), some things you share with your family (package), and some things anyone can use (public).

Classes that are imported into the package can be used by any of the classes declared in the package, but the imported classes can only access the public members of the importing package. That is, imported classes are not members of the package. You can think of an imported package as a guest in your house.
Your guest may share some things (public) with your family, but the things that you share only with your family are not shared with the guest, and the things that the guest shares only with his or her family aren’t shared with you.

Although we can declare multiple classes in a compilation unit, only one class can be declared public. The others must have package-level access; that is, they are written without an access modifier. If a compilation unit can hold at most one public class, how do we create packages with multiple public classes? We use multiple compilation units, as we describe next.

Packages with Multiple Compilation Units

Each Java compilation unit is stored in its own file, which is named to match the one public class in the file. All of the compilation units of a package are stored in a single directory, which is named after the package itself.

Suppose we want to create a package that contains some of our existing classes, such as Name and Time. We would have to make the following changes in classes Name and Time.

    package utility;
    // Class Documentation
    public class Name
    {…}

    package utility;
    // Class Documentation
    public class Time
    {…}

Is that it? Well, that's the only change to the code. We also need to create a file directory called utility, and place the files into that directory:

    directory utility
        file Name.java
        file Time.java

The Java compiler uses the combination of the name of the class and the package that contains it to locate the source file within the appropriate directory on the disk. For example, suppose you have the following import statement in a class declaration:

    import utility.Name;

This tells the Java compiler to look in the utility directory for a file called Name that provides the Name class. It first looks for a file called Name.class that contains the Bytecode version of the class. In many development environments, if Name.class is not present, but Name.java is, then the source is compiled to produce Name.class.
Thus the Java philosophy is to use packages to enforce visibility encapsulation. The system assumes that the Bytecode version of a package is available, not the source version. Java restricts us to having a single public class in a file so that it can use file names to locate all public classes. That's why a package with multiple public classes must be implemented with multiple compilation units, each placed in a separate file having the same name as the class. Many programmers simply place every class in its own compilation unit. Others gather the nonpublic classes into one unit, separate from the public classes. How you organize your packages is up to you, but you should use a consistent approach to make it easy to find the members of the package among all of its files. Splitting a package among multiple files has one other benefit. Each compilation unit can have its own set of import declarations. Thus, if the classes in a package need to use different sets of imported classes, you can place them in separate compilation units, each with just the import declarations that are required.
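The three levels of member access described in this part of the chapter can be summarized in a small sketch. It is written in the unnamed package so that it compiles as a single file; in a named package such as utility, each file would begin with a package statement, and Name would normally be declared public in its own compilation unit, Name.java. The simplified two-field Name class, the NameSorter helper, and all method names are hypothetical and used only for this illustration.

```java
// In a named package, this file would begin with a line such as:
//     package utility;

class Name {
    private String first;   // private: usable only inside Name
    private String last;

    public Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    // public: usable by any class, including classes outside the package
    public String getFirst() { return first; }
    public String getLast()  { return last; }

    // no modifier: usable only by classes in the same package
    String sortKey() {
        return last + ", " + first;
    }
}

// A nonpublic helper class in the same package as Name.
class NameSorter {
    static boolean inOrder(Name a, Name b) {
        // sortKey() is accessible here because NameSorter shares Name's package.
        // Writing a.first here would not compile: first is private to Name.
        return a.sortKey().compareTo(b.sortKey()) <= 0;
    }
}

class AccessDemo {
    public static void main(String[] args) {
        Name one = new Name("Ada", "Byron");
        Name two = new Name("Alan", "Turing");
        System.out.println(NameSorter.inOrder(one, two));  // prints true
        System.out.println(NameSorter.inOrder(two, one));  // prints false
    }
}
```

A class in a different package that imports Name could call getFirst and getLast, but a call to sortKey would be rejected by the compiler.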