Based on Concepts in Programming Languages
By John C. Mitchell
• James Gosling at Sun, 1990
• Originally called Oak
• Designed to run on a device called the “set-top box” – (TV controller)
• Programs would be downloaded to the box
• Internet programming language was needed
• Oak started as a reimplementation of C++
• C++ wasn’t reliable enough
• Portability – easy to transmit programs over a network
• Reliability – Program crashes avoided as much as possible
• Safety – Receiving environment protected from programming errors and malicious code
• Dynamic Linking – programs distributed in parts, loaded by JRE
• Multithreaded Execution – support for concurrent programming
• Simplicity and Familiarity – appealing to C++ and Web programmers
• Efficiency – important but secondary goal
• Interpreted – Java bytecode executed on a Java Virtual Machine (Portability, Safety)
• Type Safety – Three levels:
– Compile time checking of source
– Type checking of bytecode before execution
– Run-time checking (array bounds checking)
• Objects and References – Not everything is an object (Compromise between simplicity and efficiency)
• Garbage Collection
– Necessary for complete type safety
– Simplifies programming. Avoids most memory leaks
– Uses concurrency. GC runs as a background thread
• Dynamic Linking - Classes can be loaded incrementally as needed. Shorter wait times for transmitted programs.
• Concurrency Support – Model based on Threads
– Standard concurrency primitives built into language. Doesn’t rely on OS specific concurrency mechanisms
• Simplicity – smaller and simpler than most production quality languages. C++ features not included in Java:
– Structures and Unions (Classes take their place)
– Functions (Java uses static methods)
– Multiple Inheritance (What was Stroustrup thinking?)
– GoTo (Mama mia! It’s spaghetti!)
– Operator overloading (small bang for the buck)
– Automatic coercions – complex and unsafe
– Pointers – Reference variables are conceptually easier and less prone to programming errors
• All Java objects are explicit heap-dynamic variables (nameless)
• Dog d = new Dog("Fido"); // inside a method
• Fido object is explicit heap-dynamic
• Variable d is stack-dynamic variable
• No destructors – objects are garbage collected when no references are made to them (when GC wakes up)
• Initialization – Constructors called to create every object
– All fields are given initial values (defaults if none are specified); local variables must be assigned before use
• Instance variable – one for each object
• Static variable – one for entire class
• Static fields initialized once with initialization expressions or static initialization block inside class
• public static int x = 3;
• public class Dog
{ …
static { /* code executed once when class is loaded */ }
}
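The difference between instance and static variables can be sketched as follows. This is a minimal illustration, not code from the slides: the fields count, SPECIES, and name are hypothetical.

```java
// Hypothetical sketch: class, field names, and values are illustrative assumptions.
class StaticInitDemo {
    static class Dog {
        static int count;             // static variable: one copy for the entire class
        static final String SPECIES;  // assigned once in the static block below
        final String name;            // instance variable: one copy per object

        // Static initialization block: executed once, when the class is initialized.
        static {
            SPECIES = "Canis familiaris";
            count = 0;
        }

        Dog(String name) {
            this.name = name;
            count++;                  // every constructor call updates the shared counter
        }
    }

    public static void main(String[] args) {
        Dog a = new Dog("Fido");
        Dog b = new Dog("Rex");
        System.out.println(a.name + " and " + b.name + ": " + Dog.count + " dogs");
    }
}
```

Each Dog has its own name, while count and SPECIES exist once no matter how many objects are created.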
• Overloading of methods – based on signature of method (method name, number of parms, and parm types)
• Two methods with the same name and different signatures are overloaded
• Overloading resolved at compile time
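A small sketch of compile-time overload resolution, assuming a hypothetical describe method with three overloads. The point it tries to show is that the declared (static) type of the argument selects the overload, not the run-time class of the object.

```java
// Hypothetical sketch: describe() and its overloads are illustrative assumptions.
class OverloadDemo {
    static String describe(Object o) { return "Object"; }
    static String describe(String s) { return "String"; }
    static String describe(int i)    { return "int"; }

    public static void main(String[] args) {
        Object o = "hello";                      // declared type Object, run-time class String
        System.out.println(describe(o));         // the Object overload is chosen at compile time
        System.out.println(describe("hello"));   // the String overload
        System.out.println(describe(42));        // the int overload
    }
}
```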
• Garbage Collection – Don’t have to explicitly free objects. No dangling references
• finalize() – Method called by the GC just before space is reclaimed. May also be called by the virtual machine when the virtual machine exits, but this is not guaranteed
• Method main is invoked with the name of the class
• public static void main(String [] args)
• toString() – invoked when a string representation of the object is needed
(much easier than C++ operator overloading of <<)
• Four visibility distinctions for methods and fields
– public – accessible anywhere the class is visible
– protected – accessible to methods of the class and any subclasses, as well as to other classes in the same package
– private – accessible only in the class itself
– package – accessible only to code in the same package. Members declared without an access modifier have package visibility
[Figure: package diagram with classes A, B, and C. Class A declares: public int w; protected int x; private int y; int z;]
• Packages are used to organize classes into logical and physical units
• A package is a set of classes in a shared name space
• Package names match directory structures
• Package names combine with CLASSPATH names to define paths to classes
public class Rectangle
{
    private int length, width;

    public Rectangle(int length, int width)
    { this.length = length; this.width = width; }

    public void setLength(int l) { length = l; }
    public int getLength() { return length; }
}

public class Box extends Rectangle
{
    private int height;

    public Box(int length, int width, int height)
    {
        super(length, width);   // Rectangle takes length and width
        this.height = height;
    }

    public void setHeight(int h) { height = h; }
    public int getHeight() { return height; }
}
• Class B extends A
• B is the Subclass
• A is the Superclass
• Class B inherits all fields and methods of A
• If A and B have a method with the same name and signature, the B method overrides the A method
• If A and B have duplicate field names, the name in B hides the name in A
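The difference between overriding a method and hiding a field can be sketched as follows. The members who and name are hypothetical and chosen only for illustration.

```java
// Hypothetical sketch: the members who() and name are illustrative assumptions.
class OverrideDemo {
    static class A {
        String name = "A.name";
        String who() { return "A.who"; }
    }

    static class B extends A {
        String name = "B.name";                     // hides the field A.name
        @Override String who() { return "B.who"; }  // overrides the method A.who
    }

    public static void main(String[] args) {
        A ref = new B();
        System.out.println(ref.who());        // B.who: method chosen by the run-time class
        System.out.println(ref.name);         // A.name: field chosen by the declared type
        System.out.println(((B) ref).name);   // B.name: visible through a B-typed reference
    }
}
```

Methods are selected dynamically, while field access is resolved from the declared type of the reference, which is one reason duplicate field names are usually avoided.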
• Constructors are called to create objects
• Box b = new Box(3,4,5);
• For derived classes, superclass constructors are called at the beginning
• Rectangle(3,4)
• Default constructors pass no parms
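A minimal sketch of the order in which constructors run for a derived class. The classes Shape, Rect, and Cuboid and the trace list are hypothetical; they are not the Rectangle and Box classes from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Shape, Rect, Cuboid and the trace list are illustrative assumptions.
class ConstructorOrderDemo {
    static class Shape {
        final List<String> trace = new ArrayList<>();   // records which constructors ran
        Shape() { trace.add("Shape()"); }
    }

    static class Rect extends Shape {
        final int length, width;
        Rect(int length, int width) {
            // No explicit super(...) call: the compiler inserts super() here,
            // so the default (no-argument) Shape constructor runs first.
            this.length = length;
            this.width = width;
            trace.add("Rect(int,int)");
        }
    }

    static class Cuboid extends Rect {
        final int height;
        Cuboid(int length, int width, int height) {
            super(length, width);   // must be the first statement
            this.height = height;
            trace.add("Cuboid(int,int,int)");
        }
    }

    public static void main(String[] args) {
        System.out.println(new Cuboid(3, 4, 5).trace);
    }
}
```

The trace lists the constructors from the top of the hierarchy downward, since each constructor body runs only after its superclass constructor has finished.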
• Methods or classes can be declared final
• public final void myMethod( )
• public final class A …
• Final methods can’t be overridden
• Final classes can’t be subclassed
• All classes in Java are subclasses of Object
• Subclassing Object occurs by default
• Object methods:
– getClass()
– toString()
– equals()
– hashCode()
– clone()
– wait(), notify(), notifyAll()
– finalize()
• An abstract class is a class that does not implement all of its methods
• Can’t be used to instantiate objects
abstract class Shape
{
    abstract int getSize();
    abstract void doNothing();
}
• A Java interface is a “pure abstract” class
• All interface members must be constants or abstract methods (default and static methods were added in Java 8)
• No direct implementation
• Classes “implement” the interface by agreeing to code every method in the interface
• public interface Speakable
{
    public void speak();
}
• public class Dog implements Speakable …
• Classes can implement several interfaces. This takes the place of multiple inheritance used in other languages
• Interfaces can be used as parameter types for methods
public void makeSpeak(Speakable s) …
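A sketch of how interfaces stand in for multiple inheritance. It assumes a variant of Speakable whose speak method returns a String (the slide version returns void), and the Named interface and Robot class are hypothetical additions.

```java
// Hypothetical sketch: speak() returns a String here (an assumption made so the
// result can be inspected); Named and Robot are illustrative additions.
class SpeakableDemo {
    interface Speakable { String speak(); }
    interface Named { String name(); }

    // One class may implement several interfaces.
    static class Dog implements Speakable, Named {
        private final String name;
        Dog(String name) { this.name = name; }
        public String speak() { return "Woof"; }
        public String name() { return name; }
    }

    // An unrelated class can support the same behavior without sharing implementation.
    static class Robot implements Speakable {
        public String speak() { return "Beep"; }
    }

    // The interface is the parameter type: any implementing class is accepted.
    static String makeSpeak(Speakable s) { return s.speak(); }

    public static void main(String[] args) {
        System.out.println(makeSpeak(new Dog("Fido")));
        System.out.println(makeSpeak(new Robot()));
    }
}
```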
• Java types fall into two categories:
– Primitive – values: true, false, and numbers
• boolean
• byte, short, int, long, char, float, double
– Reference – values refer to objects
• Class
• Interface
• Array
• There are no explicit pointer types in Java
• Java does have implicit pointers
• Every reference variable is a pointer that can refer to an object
• Dog d = new Dog("Fido");
[Diagram: d refers to the Fido object]
• Dog d = new Dog("Fido");
Dog e;
e = d;
Object obj = d;
[Diagram: obj, d, and e all refer to the same Fido object]
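The aliasing shown above can be checked with a short sketch. The mutable name field and the sameObject helper are hypothetical, added only to make the sharing observable.

```java
// Hypothetical sketch: the name field and sameObject() helper are illustrative assumptions.
class AliasDemo {
    static class Dog {
        String name;
        Dog(String name) { this.name = name; }
    }

    // == on references compares identity: do both refer to the same object?
    static boolean sameObject(Object a, Object b) { return a == b; }

    public static void main(String[] args) {
        Dog d = new Dog("Fido");
        Dog e = d;            // copies the reference, not the object
        Object obj = d;       // a third reference, with a supertype

        e.name = "Rex";       // change made through e ...
        System.out.println(d.name);               // ... is visible through d
        System.out.println(sameObject(obj, e));   // all three refer to one object
    }
}
```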
• If class B extends class A, then the type of B objects is a subtype of the type of A objects
• A class can implement one or more interfaces
• (Multiple) interface subtyping allows objects to support (multiple) common behaviors without sharing a common implementation
• Type rule in Java: If B is a subclass of A, B[ ] is a subtype of A[ ]
• B <: A implies B[ ] <: A[ ]
• This causes a problem called the array covariance problem
class A { … }
class B extends A { … }
B[ ] bArray = new B[6];
A[ ] aArray = bArray; // OK since B[ ] <: A[ ]
aArray[0] = new A( ); // allowed by the compiler but causes a
// run-time error - ArrayStoreException
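A runnable version of the array covariance problem, as a minimal sketch. The storeFails helper is hypothetical and wraps the assignment so that the run-time check can be observed.

```java
// Hypothetical sketch: storeFails() is an illustrative helper, not from the slides.
class CovarianceDemo {
    static class A { }
    static class B extends A { }

    // Returns true if storing an A into a B[] (viewed as A[]) is rejected at run time.
    static boolean storeFails() {
        B[] bArray = new B[6];
        A[] aArray = bArray;        // accepted by the compiler: B[] <: A[]
        try {
            aArray[0] = new A();    // compiles, but the array object is really a B[]
            return false;
        } catch (ArrayStoreException ex) {
            return true;            // the JVM checks every store into a reference array
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected: " + storeFails());
    }
}
```

The run-time store check is the price Java pays for covariant arrays: without it, bArray[0] would hold an object that is not a B.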
• Exceptions are objects
– can indicate errors
– can indicate unusual events that deserve special attention
• Four categories of exceptions
– Code or data errors – bad array index
– Standard method exception – substring() can generate StringIndexOutOfBoundsException
– Programmer generated exceptions – build your own
– Java errors – JVM can generate exceptions
• Java forces programmers to deal with certain errors
• Certain exceptions don’t need to be caught – nothing to be done
• Provide a structured form of jump for exiting a block or function
• Data can be passed when the exit occurs
• Return is made to a point in the program that was set up to continue the computation
• Two mechanisms for supporting exceptions:
– throw – a statement or expression for raising (throwing) an exception – aborts current computation and causes a jump
– try-catch - a handler that allows some code to respond to an exception (catching)
• In Java, exceptions are represented as objects of some subclass of class Throwable
• The exception object carries information from the point the exception was thrown to the handler that catches it
• Designed to work well in multithreaded programs
• Exceptions are thrown and caught inside a try-catch block
try { … some statements that might cause an exception … }
catch(excp1 e) { … response statements }
catch(excp2 e) { … response statements }
finally { … statements }
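The try-catch-finally shape above can be sketched with two standard exception classes. The parse method and its two-element array are hypothetical, chosen only to trigger each handler.

```java
// Hypothetical sketch: parse() and the two-element array are illustrative assumptions.
class TryCatchDemo {
    static String parse(String text) {
        StringBuilder trace = new StringBuilder();
        try {
            int n = Integer.parseInt(text);   // may throw NumberFormatException
            int[] cells = new int[2];
            cells[n] = 1;                     // may throw ArrayIndexOutOfBoundsException
            trace.append("ok");
        } catch (NumberFormatException e) {
            trace.append("bad number");
        } catch (ArrayIndexOutOfBoundsException e) {
            trace.append("bad index");
        } finally {
            trace.append(", finally");        // runs whether or not an exception occurred
        }
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("1"));   // ok, finally
        System.out.println(parse("x"));   // bad number, finally
        System.out.println(parse("5"));   // bad index, finally
    }
}
```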
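[Figure: exception class hierarchy. Throwable has subclasses Error and Exception; RuntimeException is a subclass of Exception. Error and RuntimeException are unchecked exceptions; other Exception subclasses, including user-defined exception classes, are checked exceptions.]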
• Compiler checks that a handler exists for each checked exception
• Checked exception that might occur must be named in a throws clause
public void foo() throws IOException
• Errors and RuntimeExceptions are usually thrown by the run-time system (the JVM) and are exempt from being listed in the throws clause
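A minimal sketch contrasting a checked and an unchecked exception. The fail flag, the divide method, and the run helper are hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch: the fail flag, divide(), and run() are illustrative assumptions.
class CheckedDemo {
    // Checked: IOException must appear in the throws clause, or be caught here.
    static void foo(boolean fail) throws IOException {
        if (fail) throw new IOException("simulated I/O failure");
    }

    // Unchecked: ArithmeticException is a RuntimeException, so no throws clause is needed.
    static int divide(int a, int b) {
        return a / b;   // the JVM throws ArithmeticException when b == 0
    }

    // The compiler rejects this method unless the IOException from foo is handled.
    static String run() {
        try {
            foo(true);
            return "no exception";
        } catch (IOException e) {
            return "caught: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
        System.out.println(divide(6, 3));
    }
}
```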
• With Java subtyping, if any method m will accept any argument of Type A, then m will accept any argument from any subtype of A
• (Every subtype of A “is-an” A object)
• Subtype polymorphism provides a means for writing generic programs
Stack myStack = new Stack();
Dog d1 = new Dog("Fido");
myStack.push(d1);
…
Dog d2 = (Dog) myStack.pop();
• Java supports Generic types
Stack<Dog> myStack = new Stack<Dog>( );
Dog d1 = new Dog("Fido");
myStack.push(d1);
Dog d2 = myStack.pop();
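What the generic version buys can be sketched by comparing the two styles with java.util.Stack. The helper methods castFailsAtRunTime and genericPop are hypothetical.

```java
import java.util.Stack;

// Hypothetical sketch: the helper methods and the Dog class are illustrative assumptions.
class GenericStackDemo {
    static class Dog {
        final String name;
        Dog(String name) { this.name = name; }
    }

    // Subtype polymorphism only: anything can be pushed, so the cast may fail at run time.
    static boolean castFailsAtRunTime() {
        Stack<Object> stack = new Stack<>();
        stack.push("not a dog");
        try {
            Dog d = (Dog) stack.pop();   // checked cast
            return false;
        } catch (ClassCastException ex) {
            return true;
        }
    }

    // Generic type: the element type is checked by the compiler and no cast is needed.
    static String genericPop() {
        Stack<Dog> stack = new Stack<>();
        stack.push(new Dog("Fido"));
        // stack.push("not a dog");      // would be rejected at compile time
        Dog d = stack.pop();
        return d.name;
    }

    public static void main(String[] args) {
        System.out.println(castFailsAtRunTime());
        System.out.println(genericPop());
    }
}
```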
• Java compiler produces “bytecode” in a .class file
• Class file contains bytecode and symbol table – constant pool
• Class loader reads the class file and arranges the bytecode in memory
• Class Verifier checks that the bytecode is type correct
• Linker resolves interfile references
• Bytecode interpreter “executes” the bytecode
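[Figure: JVM architecture. The Java compiler translates A.java into A.class. Inside the JVM, A.class and B.class (received over the network) pass through the loader, verifier, and linker to the bytecode interpreter.]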
• Classes are loaded incrementally when needed
• Classes are objects
• Customized ClassLoader objects can be defined
• The verifier makes sure of the following:
– Every instruction has a valid op-code
– Every branch instruction branches to the start of an instruction
– Every method has a structurally correct signature
– Every instruction obeys the Java type discipline
• The bytecode interpreter executes Java bytecode
• Performs run-time tests like index checking on arrays
• Run-time architecture includes program counter, instruction area, stack and heap
• Stack contains activation records containing local variables, parms, return values, and intermediate calculations for method invocations
• JVM has no registers. Intermediate values left on stack
• Objects stored on the heap
• All threads running on the same JVM share the same heap
• New threads are given a program counter and their own stack
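A sketch of threads sharing one heap object while keeping their own stacks. The Counter class and runWorkers method are hypothetical; synchronized methods are used so that concurrent updates to the shared object are not lost.

```java
// Hypothetical sketch: Counter and runWorkers() are illustrative assumptions.
class SharedHeapDemo {
    static class Counter {
        private int value;
        synchronized void increment() { value++; }   // one thread at a time
        synchronized int get() { return value; }
    }

    static int runWorkers(int threads, int perThread) {
        Counter shared = new Counter();       // a single heap object seen by every thread
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                // The loop variable i is a local: it lives on this thread's own stack.
                for (int i = 0; i < perThread; i++) {
                    shared.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();                     // wait for each worker to finish
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000));
    }
}
```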
• Activation records have three parts
– Local variable area – for local method variables
– Operand stack (within a stack) – for intermediate calculations and passing parms to other methods. Instructions are shorter since they implicitly reference the stack
– Data area – constant pool resolution, normal method return, exception dispatch
• Bytecode contains a data structure called the “Constant Pool”
• Symbolic names – fields, classes, methods
• Each entry is numbered
• Bytecode instructions reference constant pool numbers
• The bytecode interpreter performs run-time tests:
– All casts are checked to make sure they are type safe: Dog d = (Dog) e;
– All array references are checked to ensure the index is within bounds: x[i] = x[j] + x[k];
– References are checked to make sure they are not null before a method is invoked: d.toString( );
– Garbage collection and absence of pointer arithmetic contribute to type-safe execution
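The three run-time tests listed above can be observed with a short sketch. The check helper and the three trigger methods are hypothetical; each one reports the name of the exception the JVM raises.

```java
// Hypothetical sketch: check() and the three trigger methods are illustrative assumptions.
class RuntimeCheckDemo {
    // Runs an action and reports which run-time exception, if any, the JVM raised.
    static String check(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (RuntimeException ex) {
            return ex.getClass().getSimpleName();
        }
    }

    static String badCast() {
        Object e = "not an Integer";
        return check(() -> { Integer n = (Integer) e; });   // checked cast
    }

    static String badIndex() {
        int[] x = new int[3];
        return check(() -> x[3] = 0);                       // index bounds check
    }

    static String nullCall() {
        String s = null;
        return check(() -> s.toString());                   // null check before invocation
    }

    public static void main(String[] args) {
        System.out.println(badCast());
        System.out.println(badIndex());
        System.out.println(nullCall());
    }
}
```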
• Bytecode references to a field or method cause table lookups for addresses that can be a bottleneck for concurrent programs
getfield #5 <Field Obj var>
• Bytecode references are modified dynamically during execution with instructions that have direct addresses
getfield … quick 6
• Reuse of the instruction is more efficient
• Two types of methods
– Instance
• Require an instance of the class before they can be invoked and use dynamic (late) binding
• Person myself = new Person("David"); myself.speak();
– Class
• Do not require an instance of the class and use static (early) binding
• Integer.parseInt(x);
• JVM selects the Class method to invoke based on the type of Object reference. This is known at compile time (static)
• JVM selects the Object method to invoke based on the actual class of the object at run time (dynamic)
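Static and dynamic binding can be contrasted in a short sketch. The Student subclass and the kind method are hypothetical; calling a static method through a reference, as done here, is legal but discouraged style and is used only to make the binding rule visible.

```java
// Hypothetical sketch: Student, kind(), and the helper methods are illustrative assumptions.
class BindingDemo {
    static class Person {
        static String kind() { return "Person"; }   // class method
        String speak() { return "Person speaks"; }  // instance method
    }

    static class Student extends Person {
        static String kind() { return "Student"; }  // hides Person.kind, does not override it
        @Override String speak() { return "Student speaks"; }
    }

    // Instance method: selected from the actual class of the object at run time.
    static String viaInstanceMethod(Person p) { return p.speak(); }

    // Class method: selected from the declared type of the reference at compile time.
    static String viaClassMethod(Person p) { return p.kind(); }

    public static void main(String[] args) {
        Person p = new Student();
        System.out.println(viaInstanceMethod(p));   // Student speaks
        System.out.println(viaClassMethod(p));      // Person
    }
}
```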
• There are four bytecodes for invoking methods
– invokevirtual – used when the superclass of an object is known at compile time
– invokeinterface – used when only the interface of the object is known at compile time
– invokestatic – used to invoke static methods
– invokespecial – special cases
• Assume Person is a class that overrides toString() in Object
• Person p = new Person("David");
Object obj = p;
obj.toString(); // invokes toString() in Person
• invokevirtual is used to call the toString() method in Person
• Invokevirtual causes a method to be selected from subclass method tables based on the runtime type of the object
• This requires a lookup in the constant pool
• After the first lookup, the bytecode is modified to avoid table lookup by inserting an offset to the method in the bytecode
• invokeinterface is similar to invokevirtual, except that the methods being invoked on an object declared with an interface name may be in different classes and at different positions within the class
• Information that was determined during table look-up is preserved, but will not be correct if the next invocation occurs on an object of a different class
• Java was designed to support mobile code
• Two main mechanisms for dealing with mobile code risks:
– Sandboxing –running the code in a restricted execution environment
– Code signing – verifying that the digital signature of the file producer is trusted
• Buffer overflow attacks: attackers send messages that cause a program to read more data into a buffer than it can hold
• Memory beyond the buffer is overwritten, leaving different return addresses or completely new code on the machine
• The sandbox consists of four Java mechanisms: class loader, verifier, runtime checks of the JVM, and the security manager
• Loader separates trusted class libraries from untrusted packages by using different class loaders
• Class loader places code into categories that let the security manager restrict the actions the code will be allowed to take
• Separate name spaces for classes loaded by different loaders
• The security manager is a single Java object
• Keeps track of which code can do which dangerous operations
• Each JVM has only one security manager at a time
• Security manager can’t be uninstalled
• Security manager answers questions about access permissions
• When Java makes an API call, the associated API code asks the security manager whether the operation is allowed
• Security manager uses the code signer and URL to determine if the operation is valid
• Security manager throws a SecurityException if the operation is not allowed
• By enforcing type safety, Java helps maintain security
• Security problems arise if the same storage area can be associated with two different types