Notes on Encapsulation

PART ONE Class Design Principles
Encapsulation
We begin this chapter by considering the principles that result in a well-designed and general class
implementation. Primary among these is the concept of encapsulation. The dictionary provides several
definitions of the word capsule. For example, it can mean a sealed gelatin case that holds a dose of medication.
Early spacecraft were called capsules because they were sealed containers that carried passengers through space.
A capsule protects its contents from outside contaminants or harsh conditions and keeps its contents intact. To
encapsulate something is to place it into a capsule.
Encapsulation Designing a class so that its implementation is protected from the actions of external code except
through the formal interface
What is the formal interface? In terms of class design, a formal interface is a written description of all the
ways that the class may interact with other classes. The collection of methods and fields that are not private
defines the formal interface syntactically.
Formal Interface The components of a class that are externally accessible, which consist of its nonprivate fields
and methods.
What does encapsulation have to do with classes and object-oriented programming? One goal in designing
a class is to protect its contents from being damaged by the actions of external code. If the contents of a class can
be changed only through a well-designed interface, then we don't have to worry about bugs in the rest of the
program affecting the class in unintended ways. As long as we design the class so that it can handle any data that
are given to it that are consistent with the interface, we know that it is a reliable unit of software.
Reliable A property of a unit of software in which it can be counted on to always operate consistently, according
to the specification of its interface.
Here's an analogy that illustrates the difference between a class that is encapsulated and one that is not.
Suppose you are a builder, building a house in a new development. Other builders are working in the same
development. If each builder keeps all of his or her equipment and materials within the property lines of the
house that he or she is building, and enters and leaves the site only via its driveway, then construction proceeds
smoothly. The property lines encapsulate the individual construction sites, and the driveway is the only interface
by which people, equipment, and materials can enter or leave a site.
Imagine the chaos that would occur if builders started putting their materials and equipment in other sites
without telling one another and driving through other sites with heavy equipment to get to their own. The
materials would get used in the wrong houses, tools would get lost or run over by stray vehicles, and the whole
process would break down. Figure 7.1 illustrates the situation.
Figure 7.1 Encapsulation Draws Clear Boundaries Around Classes. Failing to Encapsulate Classes Can Lead to Chaos.
(Top panel: three building sites with marked property lines, each with its materials and tools inside its own
boundary. Bottom panel: the same sites without property lines, with materials and tools scattered among the houses
and workers arguing over siding and a ladder.)
Let's make this analogy concrete in Java by looking at two different interface designs for the same Date
class, one of which is encapsulated, and the other not.
// Encapsulated interface:
// avoids errors due to misuse
private int month;
private int day;
private int year;

public void setDate(int newMonth,
                    int newDay,
                    int newYear)
// Checks that the new date is valid,
// otherwise it leaves the current
// value unchanged

// Unencapsulated interface:
// potential source of bugs
public int month;
public int day;
public int year;
The second (unencapsulated) interface allows client code to directly change the fields of a Date. So, if the client code assigns
the values 14, 206, and 83629 to these fields, you end up with a nonsense date of the 206th day of the 14th month, of
the year 83,629. The first (encapsulated) interface makes these fields private. It then provides a public
method that takes date values as arguments, and checks that the date is valid before changing the fields within the
object.
This example illustrates the point that there is no special Java syntax for encapsulation. Rather, we achieve
encapsulation by carefully designing the formal interface to ensure that the class and its objects have complete control
over what information enters and leaves them.
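The encapsulated design can be filled out into a complete class. The following is only a minimal sketch under stated assumptions: the initial date of January 1, 2000, the observer methods, the helper names isValid and daysInMonth, and the use of the Gregorian leap-year rule are all choices made for illustration, not part of the original interface.

```java
// Hypothetical complete version of the encapsulated Date interface.
// Declared without "public" so the sketch compiles under any file name.
class Date {
    // Assumed starting value; the text does not specify one.
    private int month = 1;
    private int day = 1;
    private int year = 2000;

    // Changes the date only if the new values form a valid date;
    // otherwise it leaves the current value unchanged.
    public void setDate(int newMonth, int newDay, int newYear) {
        if (isValid(newMonth, newDay, newYear)) {
            month = newMonth;
            day = newDay;
            year = newYear;
        }
    }

    // Observers (assumed names)
    public int getMonth() { return month; }
    public int getDay()   { return day; }
    public int getYear()  { return year; }

    // Helpers are private, so client code cannot depend on them.
    private static boolean isValid(int m, int d, int y) {
        if (m < 1 || m > 12 || d < 1) {
            return false;
        }
        return d <= daysInMonth(m, y);
    }

    private static int daysInMonth(int m, int y) {
        boolean leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
        switch (m) {
            case 2:
                return leap ? 29 : 28;
            case 4: case 6: case 9: case 11:
                return 30;
            default:
                return 31;
        }
    }
}
```

With this sketch, a call such as setDate(14, 206, 83629) has no effect, so the object can never hold the nonsense date described above.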
Encapsulation greatly simplifies the work of the programmer, because each class can be developed separately,
without worrying about how other classes are being implemented. In a large project that is being developed by a
programming team, encapsulation permits each programmer to work independently on a different part of the project.
As long as each class meets its specification, separate classes can interact safely.
What do we mean by specification? Given a formal interface to a class, the specification is additional written
documentation that describes how a class will behave for each possible interaction through the interface. For example,
the formal interface defines how we call a method, and the specification describes what the method will do. You can
think of the formal interface as the syntax of a class, and the specification as its semantics. By definition, the
specification includes the formal interface.
Specification The written description of the behavior of a class with respect to its interface.
Abstraction
Encapsulation is the basis for abstraction in programming. Consider, for example, that abstraction lets us use a
Scanner without having to know the details of its operation.
Abstraction The separation of the logical properties (interface and specification) of an object from its implementation
Abstraction is how we simplify the design of a large application. As long as the interface and specification of a
class are complete, the programmer who implements the class doesn’t have to understand how the client code uses it.
As long as the programmer correctly implements the interface and specification, the programmer who uses the class
doesn’t have to think about how it is implemented.
Even when you are the programmer in both cases, abstraction simplifies your job because it allows you to focus on
different parts of the implementation in isolation from each other. What seems like a huge programming problem at
first becomes much more manageable when you break it into little pieces that you can solve separately (the divide-and-conquer strategy introduced in Chapter 1).
There are basically two types of abstraction: data abstraction and control abstraction. Data abstraction is the
separation of the external representation of an object’s values from their internal implementation. For example, the
external representation of a date might be integer values for the day and year, and a string that specifies the name of the
month. But we might implement the date within the class using a standard value that calendar makers call the Julian
day, which is the number of days since January 1, 4713 BC.
Data abstraction The separation of the logical representation of an object’s range of values from their implementation
The advantage of using the Julian day is that it simplifies arithmetic on dates, such as computing the number of
days between dates. All of the complexity of dealing with leap years and the different number of days in the months is
captured in formulas that convert between the conventional representation of a date and the Julian day. From the user’s
perspective, however, the methods of a Date object receive and return a date as two integers and a string. Figure 7.2
shows the two implementations, having the same external abstraction.
Figure 7.2 Data Abstraction Permits Different Internal Representations for the Same External Abstraction
Same external abstraction for both implementations of the Date class: January 12, 2006
Date class with month, day, and year internal representation:
private String month;
private int day;
private int year;
Date class with Julian day internal representation:
private int julianDay;
In many cases, the external representation and the implementation of the values are identical. However, we won’t
tell that to the user, in case we decide to change the implementation in the future. For example, we might initially
develop a Date class using fields for month, day, and year. Later on, we may decide that a Julian day representation
will be more efficient, and rewrite the entire implementation of the class. Because encapsulation has provided data
abstraction, we can make the change without affecting client code. Recall that we changed the internal representation
in the Time class in the last chapter without affecting the applications that used it.
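To make the Julian day representation concrete, here is a hedged sketch of a date class that stores a single integer yet still presents a month name, a day, and a year to its clients. The class name JulianDate, the method names, and the choice of conversion formulas (commonly used integer formulas for Gregorian calendar dates with positive years) are assumptions made for illustration; the chapter does not give its own formulas.

```java
// Hypothetical Date variant that stores only a Julian day number.
// Declared without "public" so the sketch compiles under any file name.
class JulianDate {
    private static final String[] MONTH_NAMES = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };

    private final int julianDay;   // the only stored value

    // External representation: month name, day, and year.
    public JulianDate(String monthName, int day, int year) {
        int month = monthNumber(monthName);
        int a = (14 - month) / 12;
        int y = year + 4800 - a;
        int m = month + 12 * a - 3;
        julianDay = day + (153 * m + 2) / 5 + 365 * y
                  + y / 4 - y / 100 + y / 400 - 32045;
    }

    public String getMonth() { return MONTH_NAMES[fields()[1] - 1]; }
    public int getDay()      { return fields()[2]; }
    public int getYear()     { return fields()[0]; }

    // Date arithmetic is plain integer subtraction; no leap-year logic here.
    public int daysUntil(JulianDate other) {
        return other.julianDay - julianDay;
    }

    // Converts the Julian day back to {year, month, day}.
    private int[] fields() {
        int a = julianDay + 32044;
        int b = (4 * a + 3) / 146097;
        int c = a - 146097 * b / 4;
        int d = (4 * c + 3) / 1461;
        int e = c - 1461 * d / 4;
        int m = (5 * e + 2) / 153;
        int day = e - (153 * m + 2) / 5 + 1;
        int month = m + 3 - 12 * (m / 10);
        int year = 100 * b + d - 4800 + m / 10;
        return new int[] { year, month, day };
    }

    private static int monthNumber(String monthName) {
        for (int i = 0; i < MONTH_NAMES.length; i++) {
            if (MONTH_NAMES[i].equals(monthName)) {
                return i + 1;
            }
        }
        throw new IllegalArgumentException("Unknown month: " + monthName);
    }
}
```

Client code that calls only the constructor, the observers, and daysUntil cannot tell whether the class stores three fields or one, which is the point of data abstraction. The leap-year rules appear only inside the two conversion routines.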
Control abstraction is the separation of the specification of the behavior of a class from the implementation of that
behavior. For example, suppose that the specification for the Date class says that it takes into account all of the special
leap-year rules. In the Julian day implementation, only the Julian day conversion formulas handle those rules; the other
responsibilities merely perform integer arithmetic on the Julian day number.
Control abstraction The separation of an object's behavioral specification from the implementation of the specification
A user simply assumes that every Date responsibility separately computes leap year. Control abstraction lets us
actually program a more efficient implementation and then hide that complexity from the user.
Designing for Maintenance and Reuse
Applying the principles of abstraction to the design and implementation has two additional benefits: modifiability
and reuse.
Modifiability The property of an encapsulated class definition that allows the implementation to be changed without
having an effect on code that uses it (except in terms of speed or memory space)
Reuse The ability to import a class into code that uses it, without additional modification to either the class or the user
code; the ability to extend the definition of a class
Encapsulation enables us to modify the implementation of a class after its initial development. Perhaps we are
rushing to meet a deadline, so we create a simple but inefficient implementation. In the maintenance phase, we can
replace the implementation with a more efficient version. The modification is undetectable by users of the class with
the exception that their applications run faster and require less memory.
As we will see in Chapter 9, reuse also means that an encapsulated class can be easily extended to form new
related classes. For example, suppose you work for a utility company and are developing software to manage its fleet of
vehicles. As shown in Figure 7.3, an encapsulated class that describes a vehicle could be used in the applications that
schedule its use and keep track of maintenance as well as the tax accounting application that computes its operating
cost and depreciation. Each of those applications could add extensions to the vehicle class to suit its particular
requirements. Reuse is a way to save programming effort. It also ensures that objects have the same behavior every
place that they are used. Consistent behavior helps us to avoid and detect programming errors.
Figure 7.3 Reuse
Of course, preparing a class that is suitable for wider reuse requires us to think beyond the immediate situation.
The class should provide certain basic services that enable it to be used more generally. For example, it should have a
full set of observers that enable client code to retrieve any necessary information from an object. Not every class needs
to be designed for general reuse. In some cases, we merely need a class that has specific properties for the problem at
hand, and that won’t be used elsewhere. But if you are designing a class and think that there is some possibility that it
will be used in other situations, then it is a good idea to make it more general.
Keep in mind that even though Java's class construct provides a mechanism to support encapsulation, it is up to
the programmer to use it in a way that results in actual encapsulation. There is no keyword or construct that
distinguishes a class as encapsulated. The programmer must draw the boundaries around the class in a manner that
keeps other code out.
There are two types of boundaries that we can draw: physical and visual. We can physically keep a user from
accessing fields in a class by using the appropriate access modifier such as private, and we can make the class
implementation invisible to a user by the appropriate use of the package construct that we discuss later in this
chapter. We return to the issue of encapsulation later in the chapter.
Mutable and Immutable Objects
In Chapter 2 we saw how to create constructor-observer classes, and noted that these have the property of being
immutable. That is, they cannot be changed after they are created. While there exist many objects that can be
represented in this manner, some objects need to have the ability to change their internal contents after they are created.
Such objects are said to be mutable. The name comes from the word mutate, and indicates that the object can change.
Mutability is a key distinguishing characteristic of the interface of an object. While immutable objects are naturally
encapsulated, because they are immune to change, we must take special care to ensure that mutable objects remain
encapsulated.
Mutable A property of a class of objects that enables the internal contents of an object to be changed after it is created
Let's look at an example of a mutable object. Suppose we are creating a database of birth records for a hospital. A
Birth Record is an object that contains the following information:
Birth Record:
Date of Birth
Time of Birth
Mother's Name
Father's Name
Baby's Name
Baby's Weight
Baby's Length
Baby's Gender
A nurse enters all of this information into the database shortly after the baby is born. However, in some cases, the
parents have not yet chosen a name for the baby. Rather than keep the nurse waiting for the parents to make up their
minds, the database allows all of the other information to be entered and creates a Birth Record object with an empty
string for the name of the baby. Later, when the name is decided, the nurse changes the name in the database.
There are two ways to change this database record. One would be to call a method that directly changes the value
in the Baby's Name field. For example, we could write the method as follows:
public void setBabyName (Name newName)
{
babysName = newName;
}
And then, given an instance of the BirthRecord class, called newBaby, we can call this method with the
following statement:
newBaby.setBabyName(new Name(first, last, middle));
// Changes the baby name field
Such a method is called a transformer. Having a transformer makes BirthRecord a mutable class. Note that
there is no special Java syntax to denote that setBabyName is a transformer. A method is a transformer simply by
virtue of what it does: It changes the information stored in an existing object.
Transformer A method that changes the information contained in an object.
Wouldn't it be easier to just make the babysName field public and to assign a new value to it without calling a
method? Yes, but that would destroy the encapsulation of the BirthRecord class. Making all changes through
transformers preserves encapsulation because it permits us to employ data and control abstraction. For example, we
could later enhance this transformer to check that the new name contains only alphabetic characters.
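The enhanced transformer might look like the following sketch. It rests on several assumptions: the Name class shown here is a hypothetical stand-in with three String parts, the other BirthRecord fields are omitted, a rejected name simply leaves the record unchanged, and the helper isAlphabetic is an invented name.

```java
// Hypothetical stand-in for the Name class used in the text.
// Classes are declared without "public" so the sketch compiles under any file name.
class Name {
    private final String first;
    private final String last;
    private final String middle;

    public Name(String first, String last, String middle) {
        this.first = first;
        this.last = last;
        this.middle = middle;
    }

    public String getFirst()  { return first; }
    public String getLast()   { return last; }
    public String getMiddle() { return middle; }
}

// Only the baby's name is shown; the other fields are omitted for brevity.
class BirthRecord {
    private Name babysName = new Name("", "", "");

    // Transformer with a validity check: the name is changed only if
    // every part contains nothing but letters (assumed policy).
    public void setBabyName(Name newName) {
        if (isAlphabetic(newName.getFirst())
                && isAlphabetic(newName.getLast())
                && isAlphabetic(newName.getMiddle())) {
            babysName = newName;
        }
    }

    public Name getBabyName() { return babysName; }

    // An empty string passes, because it contains no non-letter characters.
    private static boolean isAlphabetic(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isLetter(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because client code already goes through setBabyName, adding the check requires no change to any caller. Had babysName been public, there would be no single place to put it.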
The second way to change the name assumes that BirthRecord is an immutable (constructor-observer) class.
We create a new record (with a constructor), copy into it all of the information except the name from the old record
(using observers), and insert the new name at that point. Then the old record is deleted. We might even provide a
constructor that takes another BirthRecord object as an argument, and automatically does the copying. Such a
constructor is called a copy constructor. Here's how it would look in Java:
public BirthRecord (BirthRecord oldRecord,   // Copy constructor
                    Name newName)
{
  dateOfBirth = oldRecord.getDateOfBirth();
  timeOfBirth = oldRecord.getTimeOfBirth();
  mothersName = oldRecord.getMothersName();
  fathersName = oldRecord.getFathersName();
  babysName = newName;                       // Change name to new name
  babysWeight = oldRecord.getBabysWeight();
  babysLength = oldRecord.getBabysLength();
  babysGender = oldRecord.getBabysGender();
}
And we would make the change to the newBaby object as follows:
newBaby = new BirthRecord(newBaby, new Name(first, last, middle));
Copy constructor A constructor that creates a new object by copying some or all of the information contained in an
existing object.
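A trimmed-down, runnable sketch may help show what the immutable approach guarantees. It is illustrative only: the class name ImmutableBirthRecord is invented, and two String fields stand in for the full set of fields and for the Name class.

```java
// Hypothetical immutable (constructor-observer) record with two fields.
// Declared without "public" so the sketch compiles under any file name.
final class ImmutableBirthRecord {
    private final String mothersName;
    private final String babysName;

    public ImmutableBirthRecord(String mothersName, String babysName) {
        this.mothersName = mothersName;
        this.babysName = babysName;
    }

    // Copy constructor: copies everything except the baby's name.
    public ImmutableBirthRecord(ImmutableBirthRecord oldRecord, String newName) {
        this.mothersName = oldRecord.getMothersName();
        this.babysName = newName;
    }

    // Observers only; there are no transformers.
    public String getMothersName() { return mothersName; }
    public String getBabysName()   { return babysName; }
}
```

After the statement record = new ImmutableBirthRecord(record, "Grace"), the variable refers to a new object. Any other code that still holds a reference to the old record continues to see the old, unchanged values.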
As you can see, using the transformer is simpler. It is also much faster for the computer to call a method that
assigns a new value to a field than to create a whole new object and delete an old one. Is there any reason that we
shouldn't always use mutable objects? Yes, there is, and it has to do with how objects are used when they are passed as
arguments to methods.
Recall that an object variable contains the address (reference) where the object’s fields are stored in memory. This
address is the value copied from the argument to the parameter inside the method. Only one copy of the object’s fields
exists, which both the calling code and the method use. Figure 7.5 illustrates the difference between passing primitive
and reference types to a method.
Figure 7.5 Passing Primitive and Reference Types
Changes to the primitive type parameters don’t affect the argument because the method works on a copy of the
argument value, not the original. But in the situation pictured in Figure 7.5, wouldn’t the changes to a reference type
parameter also change the original argument value? For mutable reference types the answer would be yes; for
immutable types the answer is no. Immutable objects are immune to changes, because they have no transformers. What
would happen if a new object was instantiated and assigned to the reference parameter? Figure 7.6 illustrates what
happens when we assign a new string to a parameter variable of type String. You can see that the argument is not
changed.
Figure 7.6 The Effect of Assigning a New Value to a Reference Type Parameter
In contrast, with a mutable class, the method can change the original argument. For example, suppose we have a
method that takes System.out as a parameter and uses System.out.print to display the string "Java". The
method’s parameter receives the address where System.out is stored. System.out is mutable because method
print directly changes the window to which the argument refers. Figure 7.7 shows this process, and you should
carefully compare it to Figure 7.6 to be sure that you understand the difference. Assigning a whole new value (the
address of a different object) to a reference type parameter does not alter the argument object. But if you change the
fields of the parameter object by calling a transformer, those changes also affect the fields of the argument object.
Figure 7.7 The Effect of Changing a Mutable Object
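The two cases can be compared side by side in a short program. This sketch substitutes StringBuilder for the System.out example in the text, because the state of a StringBuilder is easy to inspect afterward; the class and method names are invented for illustration.

```java
// Declared without "public" so the sketch compiles under any file name.
class ParameterDemo {
    // Assigning a new object to the parameter changes only the method's
    // copy of the reference; the caller's argument is untouched.
    static void reassign(String s) {
        s = "Java";
    }

    // Calling a transformer (append) through the parameter changes the one
    // object that both the caller and the method refer to.
    static void mutate(StringBuilder sb) {
        sb.append("Java");
    }

    public static void main(String[] args) {
        String text = "C++";
        reassign(text);
        System.out.println(text);      // prints C++

        StringBuilder buffer = new StringBuilder("C++ and ");
        mutate(buffer);
        System.out.println(buffer);    // prints C++ and Java
    }
}
```

String has no transformers, so no method can alter the caller's string. StringBuilder is mutable, so the caller must expect that a method it passes the object to may change it.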
Mutability is an abstract property of an object. When you design the interface to a new class, you should always
consciously decide whether it is mutable or immutable. The danger occurs when a programmer uses a mutable class
under the assumption that it is immutable, and passes an object of that class to a method that unexpectedly changes the
argument.
PART TWO Packages
As we noted previously, Java lets us group related classes together into a unit called a package. Classes within a
package can access each other’s nonprivate members, which can save us programming effort and make it more efficient
to access their data. The other advantage of packages is that they can be compiled separately and imported into our
code. Packages provide additional support for implementing encapsulation because they allow us to distribute our
classes as Bytecode files. Because Bytecode is not written to be read by people, users do not see the implementation
details, which provides visibility encapsulation.
Package Syntax
The syntax for a package is extremely simple. We’ve been writing our separate classes as unnamed packages all along,
so we merely have to specify the package name at the start of the class. The first line of a package consists of the
keyword package followed by an identifier and a semicolon. By convention, Java programmers start a package
identifier with a lowercase letter to distinguish it from class identifiers.
package someName;
Class someClass declares that it is within the package as follows:
package someName;
// Class Documentation
public class someClass
{…}
Java calls the file containing this class, someClass.java, a compilation unit. The file may contain one or more nonpublic classes, but only one public class.
Compilation Unit A source file containing Java class declarations, at most one of which is public.
All of the classes declared within this compilation unit have access to each other’s nonprivate members. We say
“nonprivate” because, in addition to using the keywords public or private with fields and methods, we can write
member declarations without any modifiers. When we do so, then the field or method is neither public nor
private, but rather it is something in between—it can be accessed by any member of the package.
When we use public, then a field or method can be used outside of the class and by any class that imports its
package. When we use private, then the field or method can be accessed only within the class itself. When we use
neither, the field or method can be used within the class and within other classes in the same package, but not by
classes outside of the package. As an analogy, you can think of packages as being like a family. Some things are yours
alone (private), some things you share with your family (package), and some things anyone can use (public).
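The three access levels can be seen together in a small sketch that follows the family analogy. The class and method names are invented, and both classes sit in the same unnamed package so that the example fits in one file; in a real package, Household would normally be declared public in its own compilation unit.

```java
// Both classes belong to the same (here unnamed) package.
class Household {
    // private: usable only inside Household
    private String diary() { return "diary"; }

    // no modifier: usable by any class in the same package
    String houseKey() { return "house key"; }

    // public: usable by any class that can see Household
    public String greeting() { return "hello"; }

    // A class may always use its own private members.
    public String readOwnDiary() { return diary(); }
}

class Sibling {
    static String borrow(Household h) {
        // h.diary() would not compile here, because diary() is private.
        return h.houseKey() + ", " + h.greeting();
    }
}
```

A class in a different package could call greeting() and readOwnDiary() on a public version of Household, but a call to houseKey() from there would be rejected by the compiler.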
Classes that are imported into the package can be used by any of the classes declared in the package, but the
imported classes can only access the public members of the importing package. That is, imported classes are not
members of the package. You can think of an imported package as a guest in your house. Your guest may share some
things (public) with your family, but the things that you share only with your family are not shared with the guest,
and the things that the guest shares only with his or her family aren’t shared with you.
Although we can declare multiple classes in a compilation unit, only one class can be declared public. The
others must have package-level access; that is, they are written without an access modifier. If a compilation unit can
hold at most one public class, how do we create packages with multiple public classes? We use multiple
compilation units, as we describe next.
Packages with Multiple Compilation Units
Each Java compilation unit is stored in its own file, which is named to match the one public class in the file. All of
the compilation units of a package are stored in a single directory, which is named after the package itself. Suppose we
want to create a package that contains some of our existing classes, such as Name and Time. We would have to make
the following changes in classes Name and Time.
package utility;
// Class Documentation
public class Name
{…}
package utility;
// Class Documentation
public class Time
{…}
Is that it? Well, that's the only change to the code. We also need to create a file directory called utility, and
place the files into that directory:
directory utility
file Name.java
file Time.java
The Java compiler uses the combination of the name of the class and the package that contains it to locate the
source file within the appropriate directory on the disk. For example, suppose you have the following import
statement in a class declaration:
import utility.Name;
This tells the Java compiler to look in the utility directory for a file called Name that provides the Name class.
It first looks for a file called Name.class that contains the Bytecode version of the class. In many development
environments, if Name.class is not present, but Name.java is, then the source is compiled to produce
Name.class. Thus the Java philosophy is to use packages to enforce visibility encapsulation. The system assumes
that the Bytecode version of a package is available, not the source version.
Java restricts us to having a single public class in a file so that it can use file names to locate all public
classes. That's why a package with multiple public classes must be implemented with multiple compilation units,
each placed in a separate file having the same name as the class. Many programmers simply place every class in its
own compilation unit. Others gather the nonpublic classes into one unit, separate from the public classes. How you
organize your packages is up to you, but you should use a consistent approach to make it easy to find the members of
the package among all of its files.
Splitting a package among multiple files has one other benefit. Each compilation unit can have its own set of
import declarations. Thus, if the classes in a package need to use different sets of imported classes, you can place
them in separate compilation units, each with just the import declarations that are required.