JAVA: write once, run anywhere, on anything, anytime, safely

By Elisabeth Mazur-Rzesos (Bibliothèque Royale de Belgique)

What is Java?

Java is just a small, simple, safe, object-oriented, interpreted or dynamically optimized, byte-coded, architecture-neutral, garbage-collected, multithreaded programming language with a strongly typed exception-handling mechanism for writing distributed, dynamically extensible programs.
Bill Joy, cofounder Sun Microsystems, 1995

In the first place, Java is a compact object-oriented programming language. More importantly, Java is also a software platform for network-based cross-platform computing.

The Java language

Java is probably best known for its use in creating interactive, dynamic web pages ("active content"). But Java extends further than this. It is a serious and powerful programming language capable of tackling the most sophisticated applications. Never in the history of computing has a new language attracted so much support from toolmakers, software developers and OS vendors, in both the academic and the commercial world, in such a short time. (Since its launch in 1995, Java has been adopted by more than 1.000.000 programmers world-wide.)

Some features which make Java such an attractive language:

object-oriented - Java utilises an object-oriented methodology similar to that of C++: a programming method that pairs functions and data into reusable chunks known as objects, whose characteristics are defined by classes. The key concepts behind the object-oriented methodology are:
    inheritance: Java classes can be extended by subclasses, making it possible to reuse code without rewriting it.
    encapsulation: internal object functionality is hidden from the rest of the application, which makes object-oriented applications easier to maintain.
    polymorphism: a subclass can redefine (override) the methods it inherits, and Java also allows constructor and method overloading (but no operator overloading).
simple and familiar - the Java syntax is very similar to that of C++, although the most complex (and bug-prone) parts of C++ have been left out (there are no pointers, memory is reclaimed by automatic garbage collection, etc.).
portable - the same application can run not only on your PC but also on a mainframe, a cellular phone, a pager or even a coffee machine.
network-centric - popular TCP/IP-based protocols are supported through simple network protocol classes (which are designed to be secure). This makes Java an ideal client-server platform. Java clients can be embedded in web pages as applets, and as soon as the user visits such a page the applet can connect to (and only to) its originating server. This eliminates the need to distribute platform-dependent client software. For this reason Java applets have become very popular as database front-ends.
robust - the absence of pointer arithmetic, together with automatic garbage collection, reduces the risk of run-time errors and memory leaks.
secure - applets downloaded from the web are subject to strict security restrictions: they cannot read or write to the local disk, and they can only connect to the server they originate from. The applet literally runs in a sandbox.
multithreaded - the Java model offers multithreading, which allows several execution flows to run simultaneously within a single application.
dynamic - new classes can be loaded on the fly, making it possible to extend applications dynamically by loading classes only when they are needed.

The Java development kit

The JDK is Sun Microsystems' free tool for developing Java applets and full-scale applications. It contains all the basic tools that developers need to get started: the Java compiler, a simple debugger, and a simple testing tool. The JDK itself is platform-dependent, and versions are available for most native OSes: Windows 9x/NT, Linux, Solaris, Macintosh, etc.
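
As an illustration of the kind of source file the JDK's compiler accepts, the sketch below exercises the object-oriented features listed earlier: encapsulation, inheritance, overriding and overloading. The class names (CatalogDemo, Item, Book) and the sample data are invented for this example and do not come from any real library system.

```java
// Hypothetical classes for illustration only; none of these names
// come from the article or from a real library system.
public class CatalogDemo {

    // Encapsulation: the fields are private and reachable only through methods.
    static class Item {
        private final String title;
        private int copies;

        // Constructor overloading: one copy by default...
        Item(String title) {
            this(title, 1);
        }

        // ...or an explicit number of copies.
        Item(String title, int copies) {
            this.title = title;
            this.copies = copies;
        }

        String getTitle() { return title; }
        int getCopies() { return copies; }

        // Method overloading: same name, different parameter lists.
        void addCopies() { addCopies(1); }
        void addCopies(int n) { copies += n; }

        String describe() { return "Item: " + title; }
    }

    // Inheritance: Book reuses everything defined in Item.
    static class Book extends Item {
        private final String author;

        Book(String title, String author) {
            super(title);
            this.author = author;
        }

        // Overriding: the subclass replaces the inherited behaviour.
        @Override
        String describe() { return "Book: " + getTitle() + " by " + author; }
    }

    public static void main(String[] args) {
        // The variable has type Item, but Book's describe() is the one that runs.
        Item item = new Book("Sample Title", "A. Author");
        item.addCopies();   // adds 1
        item.addCopies(3);  // adds 3
        System.out.println(item.describe());  // prints "Book: Sample Title by A. Author"
        System.out.println(item.getCopies()); // prints 5
    }
}
```

With the JDK installed, a file like this would typically be compiled with javac CatalogDemo.java and started with java CatalogDemo; the static main() method is the entry point of a stand-alone application.
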
Although it is possible to develop complete applications with nothing but the JDK, many IT vendors have created graphical development tools for RAD (rapid application development) with Java (like Symantec's Café or Borland's JBuilder). These provide far more productive programming environments than the JDK's command-line interface.

How does Java work?

After the source code for an application has been written, it must be compiled. In the classic sense, compilation translates the source code into the native machine code of the computer. The Java compiler instead translates the source code into a condensed format known as bytecode, which is compact and platform-independent. To execute bytecode, Java uses a so-called Java Virtual Machine (JVM). This can best be described as a virtual computer that runs on top of your current OS, and whose machine code is Java bytecode. It contains a bytecode interpreter, which translates the bytecode on the fly into the native code of the underlying microprocessor.

Java bytecode is analogous to an executable binary, except that it isn't specific to any microprocessor architecture. That is why Java applets can run on any computer that has a Java virtual machine. Unlike compiled binaries, Java applets aren't translated into native machine code until the moment of execution. The technical drawback to this approach, of course, is that on-the-fly interpreting takes time and hurts performance.

Sun's Java chips could eliminate the need for run-time interpreting because they execute the bytecode directly. In effect, Java bytecode is the native instruction set of the Java microprocessors. These microprocessors are designed to run Java programs much faster than a software-based Java engine (the Java Virtual Machine) on a general-purpose microprocessor.

The Java platform

The consequence of this architecture is that Java is far more than just a programming language. It is a complete software platform.
Could this become a threat to current software platforms like Windows or Macintosh? Definitely not! It's important to realise how Java can supersede current platforms without killing them. Java is a platform that propagates entirely in software and coexists peacefully with the native OS, and even relies on it totally. As Java has no way of communicating directly with a machine's hardware, it uses the underlying OS to manage all hardware interactions. In fact Java just adds an abstraction layer to current platforms.

Java is a platform implemented in software that runs on practically any machine, and software spreads much faster than hardware. If you have installed a Java-compatible Web browser (HotJava, Netscape, MS Internet Explorer, ...), you already have a Java Virtual Machine on your computer. You can also download the JDK for free off the Web to make your system a Java platform. Java development tools come with a JVM, too. Apple, IBM, Microsoft, Novell, Silicon Graphics, Sun etc. are integrating the JVM into their OSes, accelerating the propagation of Java. IBM is porting Java to the AS/400 and MVS. These two operating systems manage an estimated 70 percent of the world's corporate data.

It's not crazy to predict that by the turn of the century, there will be more copies of the Java VM in the world than of any of the OSes that host it, and in the next few years Java may well become the number one software platform.

Java issues: performance and security

Performance

Since the beginning, Java's main weakness has been its poor performance. This was no big issue for little applets used to spice up some web page, but now that Java is being used for full-scale applications it has become a major problem. The cause of this problem lies at the heart of the Java architecture: the interpreter. Java is an interpreted language, and thus has to translate bytecode on the fly into native code, which is a time-consuming task.
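
To see where the interpretation overhead comes from, consider the toy stack machine sketched below. It is only an illustration: the four opcodes and the class name ToyVm are invented for this example, and real JVM bytecode and real interpreters are far more elaborate. What it does show is that every instruction has to be fetched, decoded and dispatched by software before any useful work is done.

```java
// Toy stack-machine interpreter. The instruction set is invented for this
// sketch and is NOT real JVM bytecode.
public class ToyVm {
    static final int PUSH = 0; // push the next code word onto the stack
    static final int ADD  = 1; // pop two values, push their sum
    static final int MUL  = 2; // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[code.length]; // a PUSH uses two code words, so this is large enough
        int sp = 0; // stack pointer: index of the next free slot
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int op = code[pc++];
            // Fetch, decode and dispatch are repeated for every instruction;
            // this per-instruction work is the interpretation overhead.
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // prints 20
    }
}
```

A natively compiled program would perform the addition and the multiplication directly, without the loop and the switch statement wrapped around them.
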
However, there are many reasons why this drawback shouldn't scare us:

Just-in-time compilation (JIT) is a technology currently implemented in most Java environments. Interpreters that include a JIT compiler not only convert bytecode into native code on the fly, but also cache the converted code in memory while the program runs. This way, subsequent calls to the same piece of code are executed at the speed of native code, which improves the overall performance of the application. JIT compilers can be bundled with browsers or integrated with the JRE (Java Run-time Environment). They are completely transparent to the user.
The latest version of the JVM (Java 2) is far more optimised than previous versions, so environments based on Java 2 provide better performance.
Static compilation. Static compilers translate bytecode completely into native code, which gives an enormous performance increase, but statically compiled programs lose all the benefits of Java (platform-independence, WWW-distribution, etc.).
Java programs can call native code. This allows the developer to write performance-critical code in the native language and keep all the rest in Java. As with static compilation, though, all platform-independence is lost. To steer programmers away from this possibility, Sun Microsystems launched the 100% Pure Java Initiative, which encourages software vendors to write their software in pure Java.
Finally, over the next years, processing power will continue to increase, and processors will soon be powerful enough to execute Java bytecode just as fast as today's native code.

In the long run, none of today's technical problems is likely to pose an insurmountable obstacle for Java. The Java language and the Java platform evolve very fast. We can speculate on Java's course because it's consistent with historical trends in computing. The most important trend is toward a higher level of software abstraction above the hardware.
Programmers get more performance by writing to the metal, but the code is hard to maintain and even harder to port. And code lives much longer than anyone plans. The best example: the computer industry is spending billions of dollars rewriting ancient code that can't handle the year 2000. Now the possibility exists to write code that may live for 10 or 20 years.

Java carries software abstraction to the next level because it abstracts everything below the VM - the OS and CPU become interchangeable parts that can be replaced without breaking applications. Java can run on just about any OS or CPU. UNIX and NT offer some hardware abstraction, but they are multiplatform rather than cross-platform: software still needs to be recompiled or replaced when switching CPUs.

Even if Java fails to conquer the world as a platform, we will end up with code that runs on every platform. For developers, the risks are minimal. For users, Java could bring a new freedom to change OSes and CPUs without breaking software.

Java security

Since day one, Java has been developed with security in mind. It is clear that most WWW users are not very keen to see their hard drive formatted or their data scrambled just by visiting a web page that contains a Java applet. Therefore Java has a very strict security policy concerning applets (not applications), and imposes the following restrictions on them:

Applets can't read or write to the reader's file system.
Applets can't communicate with any network server other than the one that originally stored the applet; this prevents the applet from attacking another system from the reader's system.
Applets can't run any programs on the reader's system.
Applets can't print (until Java 2).

These security restrictions do not apply to Java applications.

Like every security system, Java's security has its flaws. Princeton University's Secure Internet Programming Team (sip) at http://www.cs.princeton.edu/sip/java has discovered many of Java's known security holes.
These holes allow devious developers to create hostile applets. We can distinguish two types of hostile applets, which exploit flaws in Java's security:

attack applets - serious, destructive system-modification attacks (seen only in labs so far);
malicious applets - annoying but not destructive: invasion of privacy (mail forging), denial of service, antagonism.

Java provides a strong defence against the first type of applet, although it is very weak against the second.

Java security versus ActiveX security

Java and ActiveX are two systems that let people attach computer programs to Web pages. People like these systems because they allow Web pages to be much more dynamic and interactive than they could be otherwise. However, Java and ActiveX do introduce some security risk, because they can cause potentially hostile programs to be automatically downloaded and run on your computer, just because you visited some Web page. The downloaded program could try to access or damage the data on your machine, for example to insert a virus. Both Java and ActiveX take measures to protect you from this risk.

There has been a lot of public debate over which system offers better security. This section gives our opinion on this debate. Java and ActiveX take fundamentally different approaches to security. We will concentrate on comparing the approaches, rather than critiquing the details of the two systems. After all, details can be fixed.

ActiveX security relies entirely on human judgement. ActiveX programs come with digital signatures from the author of the program and anybody else who chooses to endorse the program. Think of a digital signature as being like a person's signature on paper. Your browser can look at a digital signature and see whether it is genuine, so you can know for sure who signed a program. (That's the theory, at least. Things don't always work out so neatly in practice.)
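
The signature check described above can be illustrated with the standard java.security classes. This is only a sketch of the general sign-and-verify idea, not the mechanism that ActiveX or any particular browser actually uses: the key pair is generated on the spot instead of coming from a certificate, and the class name SignatureDemo and the sample bytes are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Illustrative only: a generic sign-and-verify round trip,
// not the scheme used by ActiveX or by any browser.
public class SignatureDemo {

    // The author signs the program bytes with a private key.
    static byte[] sign(byte[] program, PrivateKey key) throws Exception {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(program);
        return signer.sign();
    }

    // The receiver checks the signature against the author's public key.
    static boolean verify(byte[] program, byte[] signature, PublicKey key) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(program);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a freshly generated key pair stands in for the author's real keys.
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair author = generator.generateKeyPair();

        byte[] program = "pretend these bytes are a downloaded program"
                .getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(program, author.getPrivate());

        // Untouched bytes: the signature is accepted.
        System.out.println(verify(program, signature, author.getPublic()));  // prints true

        // One flipped bit: the same signature no longer matches.
        byte[] tampered = program.clone();
        tampered[0] ^= 1;
        System.out.println(verify(tampered, signature, author.getPublic())); // prints false
    }
}
```

A valid signature only shows who signed the bytes and that they have not been altered since; whether the signer deserves to be trusted remains a human decision.
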
Once your browser has verified the signatures, it tells you who signed the program and asks you whether or not to run it. You have two choices: either accept the program and let it do whatever it wants on your machine, or reject it completely. ActiveX security relies on you to make correct decisions about which programs to accept. If you accept a malicious program, you are in big trouble.

Java security relies entirely on software technology. Java accepts all downloaded programs and runs them within a security "sandbox". Think of the sandbox as a security fence that surrounds the program and keeps it away from your private data. As long as there are no holes in the fence, you are safe. Java security relies on the software implementing the sandbox to work correctly.

One problem with the original version of Java is that the "sandbox" can be too restrictive. For example, downloaded Java programs are not allowed to access files, so there's no way to write a text editor as an applet. (What good is editing if you can't save your work?) Java-enabled products are now starting to use digital signatures to work around this problem. The idea is like ActiveX: programs are digitally signed and you can decide, based on the signature, to give a program more power than it would otherwise have. This lets you run a text editor program if you decide that you trust its author.

The downside of this scheme is that it introduces some of the ActiveX problems. If you make the wrong decision about whom to trust, you could be very sorry. There's no known way to get around this dilemma. Some kinds of programs must be given power in order to be useful, and there's no ironclad guarantee that those programs will be well behaved.

Still, Java with signed applets does offer some advantages over ActiveX. With Java you can place only partial trust in a program, whereas ActiveX requires either full trust or no trust at all.
And a Java-enabled browser could keep a record of which dangerous operations are carried out by each trusted program, so it would be easier to reconstruct what happened if anything went wrong. (Current browsers don't do this record keeping, but we wish they would.) Finally, Java offers better protection against accidental damage caused by buggy programs.

The good news is that there have been few incidents of people being harmed by hostile Java or ActiveX programs. The reason is simply that the people with the skills to create malicious programs have chosen not to do so.

What can Java do for libraries?

Today's library technology requirements:

network-centric client-server environments - librarians can split their application logic between the desktop and servers. The libraries' databases should be distributed across the whole network, and should be accessible from anywhere, anytime.
scalability - hardware and software should be scalable as bandwidth and data requirements increase.
portability - corporate IT environments are heterogeneous; the equipment should have a relatively long lifecycle.
multimedia and objects - library vendors have developed proprietary database solutions customised for text retrieval rather than object searching. Object technology and compliance with object standards will assure extensibility and maintainability.
ease of maintenance - library networks keep growing while remaining heterogeneous. The consequence is a network of totally different platforms that is very hard to maintain, and continuous updating of all software versions is almost impossible.

Java benefits to libraries:

desktop platform independence: libraries are extremely heterogeneous environments (PCs, Macs, terminals, Unix workstations, NCs, …). The fact that the same software can run on all these platforms has huge consequences: decrease of maintenance cost, extension of the systems' lifecycles, ease of administration.
In such heterogeneous environments it is almost impossible to keep all the systems up to date with the latest version of software. This problem is considerably reduced with Java, as only one copy of an application has to be kept up to date for the whole network.
information management, control and delivery - lower cost of information delivery, and lower cost and fewer problems with physical desktop upgrades: you update your network server once for all Java-enabled desktops.
better response to user needs and expanded user services - provide text with live content on the web, support a broader user base with more diverse content, just-in-time information (the customer receives the latest update), customised integration of learning technologies. The user is becoming more demanding and technically astute; he is more adept at searching, retrieving and manipulating data, which used to be roles of the traditional librarian.
object-oriented and distributed approach to libraries - making use of technologies like CORBA and JDBC (Java Database Connectivity), library data can be distributed across the whole network, and can be made accessible from anywhere. This way, data can be totally decentralised. The object-oriented approach also offers considerable scalability advantages. The same systems can be used to manipulate a variety of data types (media objects, text, …) and are also guaranteed to support all future data types.
connectivity to legacy applications: many libraries still rely on old legacy systems, which can only be accessed via text interfaces. With Java, cross-platform WWW clients can be developed for these applications, which can then be accessed from anywhere across the network, thus prolonging the life of these systems.
inter-library communication: a consistent and flexible object-oriented approach can also lead to easy data exchange between libraries world-wide over the Internet.
Via CORBA, objects can be distributed over the whole Internet, and can be accessed from any library worldwide where they are needed.

Some library projects involving Java

CIMI - Consortium for the Computer Interchange of Museum Information - several Java-based applications
    http://www.cimi.org
    http://www.uk.adlibsoft.com/news/cimi.html
Berkeley Digital Library Project
    http://elib.cs.berkeley.edu/ib/about.html
Java Willow - Washington Information Looker-upper Layered over the Web (formerly: over Windows) - a full-featured Z39.50 client
    http://www.washington.edu/bibsys/jwillow - Java Willow applet
    http://www.washington.edu/bibsys/jwillow-app - Java Willow standalone application
    http://www.washington.edu/willow/java.html - White Paper on Java Willow
DLI2 - several projects are Java based:
    http://www.dli2.nsf.gov/projects.html - Digital Libraries Initiative Phase-2 (list of projects)
    http://sifter.indiana.edu/about.html - A distributed information filtering system for digital libraries
    http://www-diglib.stanford.edu and http://www-diglib.stanford.edu/diglib/WP/PUBLIC/DOC117.html - Stanford Digital Library Technologies
ELISE I and II
    http://tasi.ac.uk/building/eliseprint.html

Java-related technologies

JavaOS

JavaOS is a version of the Java VM that can be ported to target systems without an operating system. Previous versions of Java may have relied on the windowing system or the networking drivers supplied by, say, Solaris or Windows 95. JavaOS provides its own implementations of, for example, the networking and windowing libraries. JavaOS is not a traditional OS but rather an OS that runs only Java main programs and Java applets. JavaOS is ideal for companies and individuals interested in porting Java to new platforms without carrying all the baggage of a traditional OS.

JavaScript

JavaScript has nothing to do with Java - except its confusing name.
JavaScript, originally LiveScript, is Netscape's scripting language, which allows executable content to be built into HTML documents. JavaScript code is interpreted by the browser when a document containing a script is loaded. Unlike Java applets, JavaScript source code is not compiled. It is an object-based language but has no classes or built-in inheritance mechanisms. JavaScript programs can be designed and tested very quickly. The language is very useful for form input validation or help functions, but very limited.

Typical uses of JavaScript:

generate HTML on the fly
validate and respond to form input
mediate interaction with hyperlinks and image maps
script interactions with plugins and applets

Java applications require more programming effort, since you must recreate the browser environment in the program (for example by creating a window). They send their output outside the browser environment. Applications are run directly from the command line; they always have a static main() method, which serves as the single entry point of the program. Applets are called from an HTML file using the <applet> tag. They can have an init() or start() method; sometimes they also have a main() method, which is then not the entry point when the code runs as an applet. You can also run an applet using the appletviewer from the JDK.

Java and tcl/tk

Jacl (Java command language) and Tcl Blend were released on February 19, 1998. Tcl Blend allows you to load and interact with the Java VM from tclsh or wish; Jacl is a new 100% Java implementation of Tcl 8.0.

Java and CORBA

The Java language environment, the WWW and CORBA (Common Object Request Broker Architecture) are complementary software technologies. When used together they provide a powerful set of tools for developing and deploying user-oriented distributed applications.
Using Java Applets and CORBA for Distributed Application Development - Eric Evans, Daniel Rogers
    http://www.arlut.utexas.edu/~itgwww/ird/JavaCORBA.html

Bibliography

http://wwwwseast2.usec.sun.com/edu/libraries/libtechdirection.html - Java in Libraries by Art Pasquinelli
http://www-personal.umich.edu/~victorr/javaz.html - Java and Z39.50: A new solution for Web access to library catalogs by Victor Rosenberg
http://www.biblio-tech.com/biblio/html/java__ncs_and_netpcs.html - Java, NCs and NetPcs