1.0 Overview

Microsoft has generated a great deal of interest with its newly developed “.NET” products. With the .NET initiative, Microsoft aims to give developers the ability to capture the power of the Internet. .NET is a general-purpose platform that allows users to design, develop, deploy and use web-based applications in a uniform and efficient manner. According to Microsoft, .NET will “transcend device boundaries and fully harness the connectivity of the Internet”. The obvious competitor to .NET is Sun Microsystems’ J2EE development platform. The two platforms have some obvious similarities as well as substantial differences in some areas. Our objective here is to research and analyze a few functionalities and modules of .NET and J2EE. The areas we specifically discuss are programming language capabilities and interactions, data access abilities, security issues, and middleware connectivity and interoperability.

2.0 Introduction

.NET has an interesting history:

1993: In the beginning, there was COM.
1996: Along came Java.
1997: COM continued to evolve: MTS, COM+.
1998: XML enters the picture.
2000: .NET, the latest evolution of COM.

Microsoft’s .NET initiative brings so many changes that software developers, architects and managers are all experiencing a fundamental shift. It is difficult to adapt to in the short run, but it aims to enable Windows developers to build more powerful, more useful software more efficiently in the long run. .NET brings a language-agnostic model, a managed execution environment, JIT compilation, and rich library support. Microsoft’s .NET can be summarized as a “brand applied to a wide range of technologies”. These technologies include:

The .NET Framework: For developers, this framework is the most important part of .NET.
Though it can be described as a successor to Windows DNA (Distributed interNet Applications Architecture), Windows development is greatly simplified by the presence of the Common Language Runtime (CLR) and the .NET Framework class library. The CLR defines a common set of semantics that is used by multiple languages. Visual Studio.NET, Microsoft’s primary tool for building .NET Framework applications, supports four CLR-based languages: Visual Basic.NET, C#, Managed C++ and JScript.NET. Third parties have also demonstrated support for other languages built on the CLR, such as Perl, Python and COBOL. The CLR also provides other common services such as garbage collection, a standard format for metadata (managed code is compiled into MSIL, i.e. Microsoft Intermediate Language), and assemblies (conceptually similar to JAR files) for organizing compiled code (DLLs or EXEs). The .NET Framework class library provides standard code for common functions. Code written in any .NET Framework language can use the library and access system services in the same way. The most important technologies that this framework provides include ASP.NET (the key feature for building applications that use web services), ADO.NET (next-generation ActiveX Data Objects for accessing data stored in relational DBMSs and other formats), Windows Forms (a standard set of classes for building Windows GUIs in any .NET Framework programming language), and Enterprise Services (standard classes for accessing COM+ services such as transactions and object pooling).

Web Services: The .NET Framework is the programming model underlying .NET for developing, deploying, and running XML Web services and applications.
XML Web services are units of code that allow programs written in different languages and running on different platforms to communicate through standard Internet protocols such as XML, WSDL (Web Services Description Language, to define the interfaces to web services), SOAP (Simple Object Access Protocol, to invoke the operations in those interfaces) and UDDI (Universal Description, Discovery and Integration, to let clients find compatible servers).

[Figure: diagram labeled UDDI Registration, SOAP, Internet, UDDI Registry, and Application (three instances)]

3.0 .NET Languages and Java

J2EE offers Java as the only language for the platform, while the .NET platform offers a number of languages for programmers to write code in, namely C#, VB.NET, JScript and Managed C++. In Microsoft’s words, it is a Language-Agnostic Model. Though the environments take radically different views of languages in general, they converge nicely when it comes to C# and Java. C# is an object-oriented language that lets programmers quickly build a wide range of applications for the Microsoft .NET platform. The goal of C# and the .NET platform is to shorten development time by freeing the developer from worrying about low-level plumbing issues such as memory management, type safety, building low-level libraries, and array bounds checking, thus allowing developers to spend their time and energy on their application and business logic instead. To a Java developer, the previous sentence could serve as "a short description of the Java language and platform" if the words C# and the .NET platform were replaced with Java and the Java platform.

Background

In June 2000, Microsoft announced a new programming language called C# along with the .NET platform. C# is a strongly typed object-oriented language designed to give the optimum blend of simplicity, expressiveness, and performance.
As stated earlier, the .NET platform is centered on a Common Language Runtime (similar to the JVM) and a set of libraries used by a wide variety of languages, which are able to work together because they all compile to an intermediate language (IL). The C# language was built with the hindsight of many languages, most notably Java and C++. It was co-authored by Anders Hejlsberg (who is famous for the design of the Delphi language) and Scott Wiltamuth. The most popular tool for creating C# code is Microsoft’s Visual Studio.NET, though Microsoft also provides a command-line compiler with the .NET Framework called csc.exe, and there is also another C# compiler developed by the open source world.

3.1 Differences and Similarities (C# vs. Java)

A major part of our effort was concentrated on understanding the C# language and its various features, and on comparing it to Java with the drawbacks and benefits of each. Java and C# are two big languages, and though programmers from both camps find the other easy to understand, there are some subtle differences between the two. As far as the languages are concerned, the differences can be informally divided into four groups.

Some features are essentially the same in both languages: runtime environments, garbage collection, interfaces, strings and unextendable classes.

Some features differ only in small ways or in syntax: inheritance, access modifiers, reflection.

Some features have major conceptual differences: nested classes, threads, operator overloading, serialization, and documentation generation.

Some features have no counterpart in the other language. C# features with no counterparts in Java are deterministic object cleanup, delegates, boxing, pointers and unsafe code, and pass by reference. Java features with no look-alikes in C# are extensions, checked exceptions, dynamic class loading, anonymous inner classes and, above all, cross-platform portability.
We have looked into a few of the areas cited above, analyzed how C# shapes up in those areas, compared the approaches of both languages where they both provide a feature, and analyzed the reasons why certain features do not have counterparts.

3.1.1 Class hierarchy

Both Java and C# have a single rooted class hierarchy: all classes in C# are subclasses of System.Object and all Java classes are subclasses of java.lang.Object. The methods of the two Object classes share some similarities (e.g. System.Object's ToString() and java.lang.Object's toString()) and some differences (System.Object does not have analogs to wait(), notify() or notifyAll() in java.lang.Object).

3.1.2 Execution environment

Just as Java is typically compiled to Java byte code, which then runs in a managed execution environment (the Java Virtual Machine), C# code is compiled to an Intermediate Language (IL), which then runs in the Common Language Runtime (CLR). Both platforms support native compilation via Just In Time compilers. However, one of the key differences between C#’s IL design and Java byte code is that C# code is never interpreted; it is always natively compiled, unlike Java, where code is interpreted first and then natively compiled. This was a design decision made to make the translation of IL into native code easier. It changes which instructions are included, what type information is included, and how it is conveyed. Thus the two ILs are different: the IL in C# is more type-neutral. There is no information in the instructions that specifies the type of the arguments. Rather, that is inferred from what has been pushed on the stack. This approach makes the IL more compact because a JIT compiler has that information anyway, so there is no reason to carry it along in the instructions.

3.1.3 Object creation

In Java objects are created on the heap using the new keyword.
Most objects in C# are also created on the heap by using the new keyword. Also, just as the JVM manages the destruction of objects, so does the CLR, via a Mark and Compact garbage collection algorithm. Additionally, C# supports stack-based classes called value types. In Java, only the primitives are created on the stack. Thus, objects that are used similarly to primitives in a program hang around and wait for garbage collection, adding overhead to the program, especially if the objects were used briefly and in a single location. To avoid the problem of allocating heap space for such classes and then having to garbage collect them, C# has a mechanism that allows one to specify that objects of a certain class should be stack based (in fact, C#'s built-in types such as int are actually implemented as structs in the runtime library). Unlike classes, value types are always passed by value and are not garbage collected. Arrays of value types contain the actual value type objects, not references to dynamically allocated objects, which yields savings of both memory and time.

3.1.4 Namespaces

A C# namespace is a way to group a collection of classes, similar to Java's package construct. Though the two are conceptually the same, they are implemented differently. In Java, the package names dictate the directory structure of source files in an application, whereas in C# namespaces do not dictate the physical layout of source files in directories, only their logical structure. This complete separation between physical packaging and logical naming gives more flexibility to package things together in physical distribution units without forcing the use of a bunch of directories. In the language itself, there are clearly some differences. In Java, the packaging is also the physical structure, and because of this a Java source file has to be in the right directory and can only contain one public type or one public class.
In C#, source files can be given any names, can contribute to multiple namespaces and can contain multiple public classes. Further, one can write all of the sources in one big file, or can spread them across smaller files. Conceptually, what happens with C# at compilation is that one gives the compiler all of the source files that make up one’s project and the compiler works out what to do.

3.1.5 Access modifiers

The table below shows the mapping between C# access modifiers and those of Java. C#’s protected keyword has the same semantics as the C++ version, which differ from the semantics of the protected keyword in Java. This means that a protected member can only be accessed by member methods in that class or member methods in derived classes, and is inaccessible to any other classes. The internal modifier means that the member can be accessed from other classes in the same assembly as the class. The internal protected modifier means that a member can be accessed from classes that are in the same assembly or from derived classes. The default accessibility of a C# field or method when no access modifier is specified is private, while in Java it is package-private, which resembles protected except that derived classes outside the package cannot access the member.

C# access modifier      Java access modifier
private                 private
public                  public
internal                protected
protected               N/A
internal protected      N/A

3.1.6 Reflection

The ability to discover the methods and fields in a class, as well as invoke methods in a class at runtime, is typically called reflection and is a feature of both Java and C#. The primary difference is that reflection in C# is done at the assembly level while reflection in Java is done at the class level. Since assemblies are typically stored in DLLs, in C# one needs the DLL containing the targeted class to be available, while in Java one needs to be able to load the class file for the targeted class.
The examples below, which enumerate the methods in a specified class, show the difference between reflection in C# and Java.

C#

    using System;
    using System.Xml;
    using System.Reflection;
    using System.IO;

    class ReflectionSample {

        public static void Main( string[] args){

            Assembly assembly=null;
            Type type=null;
            XmlDocument doc=null;

            try{
                // Load the requested assembly and get the requested type
                assembly=Assembly.LoadFrom("C:\\WINNT\\Microsoft.NET\\Framework\\v1.0.2914\\System.XML.dll");
                type = assembly.GetType("System.Xml.XmlDocument", true);

                //Unfortunately one cannot dynamically instantiate types via the Type object in C#.
                doc = Activator.CreateInstance("System.Xml","System.Xml.XmlDocument").Unwrap() as XmlDocument;

                if(doc != null)
                    Console.WriteLine(doc.GetType() + " was created at runtime");
                else
                    Console.WriteLine("Could not dynamically create object at runtime");

            }catch(FileNotFoundException){
                Console.WriteLine("Could not load Assembly: system.xml.dll");
                return;
            }catch(TypeLoadException){
                Console.WriteLine("Could not load Type: System.Xml.XmlDocument from assembly: system.xml.dll");
                return;
            }catch(MissingMethodException){
                Console.WriteLine("Cannot find default constructor of " + type);
            }catch(MemberAccessException){
                Console.WriteLine("Could not create new XmlDocument instance");
            }

            // Get the methods from the type
            MethodInfo[] methods = type.GetMethods();

            //print the method signatures and parameters
            for(int i=0; i < methods.Length; i++){
                Console.WriteLine ("{0}", methods[i]);
                ParameterInfo[] parameters = methods[i].GetParameters();
                for(int j=0; j < parameters.Length; j++){
                    Console.WriteLine("Parameter:{0}{1}",parameters[j].ParameterType, parameters[j].Name);
                }
            }//for (int i...)
        }
    }

Java

    import java.lang.reflect.*;
    import org.w3c.dom.*;
    import javax.xml.parsers.*;

    class ReflectionTest {

        public static void main(String[] args) {

            Class c=null;
            Document d;

            try{
                c=DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument().getClass();
                d = (Document) c.newInstance();
                System.out.println(d + " was created at runtime from its Class object");

            }catch(ParserConfigurationException pce){
                System.out.println("No document builder exists that can satisfy the requested configuration");
            }catch(InstantiationException ie){
                System.out.println("Could not create new Document instance");
            }catch(IllegalAccessException iae){
                System.out.println("Cannot access default constructor of " + c);
            }

            // Get the methods from the class
            Method[] methods = c.getMethods();

            //print the method signatures and parameters
            for (int i = 0; i < methods.length; i++) {
                System.out.println( methods[i]);
                Class[] parameters = methods[i].getParameterTypes();
                for (int j = 0; j < parameters.length; j++) {
                    System.out.println("Parameters: " + parameters[j].getName());
                }
            }
        }
    }

The above code samples demonstrate that there is slightly more granularity in the C# Reflection API than in the Java Reflection API: C# has a ParameterInfo class which contains metadata about the parameters of a method, while Java uses Class objects for that purpose, which lose some information such as the name of the parameter.

Sometimes there is a need to obtain the metadata of a specific class encapsulated as an object. This object is the java.lang.Class object in Java and the System.Type object in C#. To retrieve this metadata class from an instance of the target class, the getClass() method is used in Java while the GetType() method is used in C#. If the name of the class is known at compile time, then one can avoid creating an instance of the class just to obtain the metadata class by doing the following.
C#

    Type t = typeof(ArrayList);

Java

    Class c = java.util.ArrayList.class; /* Must append ".class" to fullname of class */

3.1.7 Serialization and documentation

Object persistence, also known as serialization, is the ability to read and write objects via a stream such as a file or network socket. Object persistence is useful in situations where the state of an object must be retained across invocations of a program. Usually in such cases simply storing data in a flat file is insufficient and using a Database Management System (DBMS) is overkill. Serialization is also useful as a means of transferring the representation of a class in an automatic and fairly seamless manner.

Serializable objects in C# are annotated with the Serializable attribute. The NonSerialized attribute is used to annotate members of a C# class that should not be serialized by the runtime. Such fields are usually calculated or temporary values that have no meaning when saved. C# provides two formats for serializing classes, XML and binary; the former is more readable for users and applications while the latter is more efficient. If the standard ways are insufficient, one can also define custom ways for an object to be serialized by implementing the ISerializable interface.

In Java, serializable objects are those that implement the Serializable interface, while the transient keyword is used to mark members of a Java class as ones not to be serialized. By default Java supports serializing objects to a binary format but does provide a way of overriding the standard serialization process.
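The override mechanism for Java can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the class Point, its fields, and the helper roundTrip are hypothetical names introduced here. The sketch uses the private readObject and writeObject hooks of java.io to write a transient field by hand, so that it survives a round trip that default serialization would drop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomSerializationSketch {

    // Hypothetical class; the names Point, x and y are illustrative only.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient int x; // skipped by default serialization
        private int y;

        Point(int x, int y) { this.x = x; this.y = y; }

        // Called by ObjectOutputStream in place of the default mechanism.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient field y
            out.writeInt(x);          // writes the transient field by hand
        }

        // Called by ObjectInputStream; must read in the same order as written.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            x = in.readInt();
        }

        public String toString() { return "{x=" + x + ", y=" + y + "}"; }
    }

    // Serializes an object to memory and reads it back (hypothetical helper).
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Point(66, 61))); // prints {x=66, y=61}
    }
}
```

Without the two private methods, x would come back as 0 because it is transient; with them, the class decides field by field what goes on the stream.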
Objects that plan to override default serialization can implement methods with the following signatures:

    private void readObject(java.io.ObjectInputStream stream) throws IOException, ClassNotFoundException;
    private void writeObject(java.io.ObjectOutputStream stream) throws IOException

Since the above methods are private, there is no interface that can be implemented to indicate that a Java class supports custom serialization using readObject and writeObject. For classes that need publicly accessible methods for custom serialization there is the java.io.Externalizable interface, which specifies the readExternal() and writeExternal() methods for customizing how an object is read from and written to a stream.

C#

    using System;
    using System.IO;
    using System.Xml;   // needed for XmlDocument below
    using System.Reflection;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Formatters.Binary;
    using System.Runtime.Serialization.Formatters.Soap;

    [Serializable]
    class SerializeTest{

        [NonSerialized]
        private int x;

        private int y;

        public SerializeTest(int a, int b){
            x = a;
            y = b;
        }

        public override String ToString(){
            return "{x=" + x + ", y=" + y + "}";
        }

        public static void Main(String[] args){

            SerializeTest st = new SerializeTest(66, 61);
            Console.WriteLine("Before Binary Write := " + st);

            Console.WriteLine("\n Writing SerializeTest object to disk");
            Stream output = File.Create("serialized.bin");
            BinaryFormatter bwrite = new BinaryFormatter();
            bwrite.Serialize(output, st);
            output.Close();

            Console.WriteLine("\n Reading SerializeTest object from disk\n");
            Stream input = File.OpenRead("serialized.bin");
            BinaryFormatter bread = new BinaryFormatter();
            SerializeTest fromdisk = (SerializeTest)bread.Deserialize(input);
            input.Close();

            /* x will be 0 because it won't be read from disk since non-serialized */
            Console.WriteLine("After Binary Read := " + fromdisk);

            st = new SerializeTest(19, 99);
            Console.WriteLine("\n\nBefore SOAP(XML) Serialization := " + st);

            Console.WriteLine("\n Writing SerializeTest object to disk");
            output = File.Create("serialized.xml");
            SoapFormatter swrite = new SoapFormatter();
            swrite.Serialize(output, st);
            output.Close();

            Console.WriteLine("\n Reading SerializeTest object from disk\n");
            input = File.OpenRead("serialized.xml");
            SoapFormatter sread = new SoapFormatter();
            fromdisk = (SerializeTest)sread.Deserialize(input);
            input.Close();

            /* x will be 0 because it won't be read from disk since non-serialized */
            Console.WriteLine("After SOAP(XML) Serialization := " + fromdisk);

            Console.WriteLine("\n\nPrinting XML Representation of Object");
            XmlDocument doc = new XmlDocument();
            doc.Load("serialized.xml");
            Console.WriteLine(doc.OuterXml);
        }
    }

Java

    import java.io.*;

    class SerializeTest implements Serializable{

        transient int x;

        private int y;

        public SerializeTest(int a, int b){
            x = a;
            y = b;
        }

        public String toString(){
            return "{x=" + x + ", y=" + y + "}";
        }

        public static void main(String[] args) throws Exception{

            SerializeTest st = new SerializeTest(66, 61);
            System.out.println("Before Write := " + st);

            System.out.println("\n Writing SerializeTest object to disk");
            FileOutputStream out = new FileOutputStream("serialized.txt");
            ObjectOutputStream so = new ObjectOutputStream(out);
            so.writeObject(st);
            so.flush();

            System.out.println("\n Reading SerializeTest object from disk\n");
            FileInputStream in = new FileInputStream("serialized.txt");
            ObjectInputStream si = new ObjectInputStream(in);
            SerializeTest fromdisk = (SerializeTest)si.readObject();

            /* x will be 0 because it won't be read from disk since transient */
            System.out.println("After Read := " + fromdisk);
        }
    }

Both C# and Java provide a mechanism for extracting specially formatted comments from source code and placing them in an alternate document. These comments are typically API specifications and are a very useful way to provide API documentation to the users of a library. The generated documentation is also useful for sharing the specifications of an API between designers, developers and QA.
Javadoc is the tool used by Java to extract API documentation from source code. Javadoc generates HTML documentation from the source code comments; an example is the Java 2 Platform, Standard Edition API Documentation [25], which was all generated using Javadoc. Javadoc can be used to describe information at the package, class, member and method levels. Descriptions of classes and member variables can be provided, with the option to add references to other classes, class members and methods.

With Javadoc one can document the following: a description of the method, exceptions thrown by the method, parameters the method accepts, the return type of the method, associated methods and members, whether the API has been deprecated, and the version of the API in which the method was first added. The deprecated information is also used by the compiler, which issues a warning if a call to a method marked with the deprecated tag is encountered during compilation.

Javadoc also provides the following information automatically: inherited API, list of derived classes, list of implementing classes for interfaces, serialized form of the class, alphabetical class listing, and package hierarchy in a tree format.

Since Javadoc generates HTML documentation, it is valid to use HTML in Javadoc comments. There is support for linking the generated documentation with other generated documentation available over the web. Such linking is useful when one wants readers of the documentation to be able to read the API documentation of related sources. Below is an example of how Javadoc comments are used.

Java

    /**
     * Calculates the square of a number.
     * @param num the number to calculate.
     * @return the square of the number.
     * @exception NumberTooBigException this occurs if the square of the number
     * is too big to be stored in an int.
     */
    public static int square(int num) throws NumberTooBigException{}

C# uses XML as the format for the documentation.
The generated documentation is an XML file that contains the metadata specified by the user, with very little additional information generated automatically. All of the C# XML documentation tags have analogous Javadoc constructs, while the reverse is not true. For instance, the default C# XML documentation does not have analogs to Javadoc's @author, @version, or @deprecated tags, although such metadata can be generated by reflecting on the assembly, as Microsoft's documentation build process does. One could also create custom tags that are analogous to the Javadoc tags and more, but they would be ignored by the standard tools for handling C# XML documentation, including Visual Studio.NET. Also of note is that C#'s generated XML documentation does not contain metadata about the class such as listings of inherited API, derived classes or implementing interfaces.

The primary benefit of an XML format is that the documentation specification can be used in many different ways. XSLT stylesheets can be used to convert the generated documentation to ASCII text, HTML, or PostScript files. The generated documentation can also be fed to tools that use it for spec verification or similar tasks. Below is an example of how C# XML documentation is used.

C#

    ///<summary>Calculates the square of a number.</summary>
    ///<param name="num">The number to calculate.</param>
    ///<returns>The square of the number.</returns>
    ///<exception cref="NumberTooBigException">This occurs if the square of the number is too big to be stored in an int.</exception>
    public static int square(int num){}

3.1.8 Deterministic object cleanup

To provide total control over releasing the resources used by classes, C# provides the System.IDisposable interface, which contains the Dispose() method that users of the class can call to release resources (like database or file handles) on completion of whatever task is at hand.
Classes that manage resources such as database or file handles benefit from being disposable. Being disposable provides a deterministic way to release these resources when the class is no longer in use, which is not the case with finalizers in Java or C#. It is typical to call the SuppressFinalize method of the GC class in the implementation of the Dispose method, since finalization by the runtime is unlikely to be needed once cleanup has been performed explicitly via Dispose. C# also provides the using keyword, which makes the release of resources via the Dispose method more deterministic. If a class is disposable, it is best to make the Dispose() method idempotent (i.e. multiple calls to Dispose() have no ill effects), which can be done by providing a flag that is checked within the Dispose() method to see whether the class has already been disposed. The example below shows a program where a class keeps a file open until the Dispose() method is called, which indicates that the file no longer needs to be open.

C#

    using System;
    using System.IO;

    public class MyClass : IDisposable {

        bool disposed = false;
        FileStream f;
        StreamWriter sw;
        private String name;
        private int numShowNameCalls = 0;

        MyClass(string name){
            f = new FileStream("logfile.txt", FileMode.OpenOrCreate);
            sw = new StreamWriter(f);
            this.name = name;
            Console.WriteLine("Created " + name);
        }

        ~MyClass(){
            Dispose(false);
        }

        public void Dispose(){
            if(!disposed){
                Dispose(true);
            }
        }

        private void Dispose(bool disposing){
            lock(this){ /* prevents multiple threads from disposing simultaneously */

                /* disposing variable is used to indicate if this method was called from a
                 * Dispose() call or during finalization. Since finalization order is not
                 * deterministic, the StreamWriter may be finalized before this object in
                 * which case, calling Close() on it would be inappropriate so we try to
                 * avoid that.
                 */
                if(disposing){
                    Console.WriteLine("Finalizing " + name);
                    sw.Close(); /* close file since object is done with */
                    GC.SuppressFinalize(this);
                    disposed = true;
                }
            }
        }

        public string ShowName(){
            if(disposed)
                throw new ObjectDisposedException("MyClass");
            numShowNameCalls++;
            sw.Write("ShowName() Call #" + numShowNameCalls.ToString() + "\n");
            return "I am " + name;
        }

        public static void Main(string[] args){
            using (MyClass mc = new MyClass("A MyClass Object")){
                for(int i = 0; i < 10; i++){
                    Console.WriteLine(mc.ShowName());
                } //for
            }/* runtime calls Dispose on MyClass object once "using" code block is exited, even if exception thrown */
        }//Main
    }

The above idiom is practically the same as having C++ style destructors without the worry of having to deal with memory allocation woes, making it the best of both worlds. The non-deterministic nature of finalization has long been a hindrance for Java developers. Calling the Dispose() method does not request that the object be garbage collected, although it does speed up collection by eliminating the need for finalization.

3.1.9 Delegates

Delegates are a mechanism for providing callback functions. Delegates are similar to function pointers in C or functors in C++ and are useful in the same kinds of situations. One use of delegates is passing operations to a generic algorithm based on the types being used in the algorithm. Another use of delegates is as a means to register handlers for a particular event (i.e. the publish-subscribe model). To get the same functionality as C# delegates in Java, one can create interfaces that specify the signature of the callback method, as is done with the Comparable interface, although this has the drawback of forcing the method to be an instance method when it most likely should be static.
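The interface-based approach in Java can be sketched as follows. This is a minimal illustration under stated assumptions: the names CallbackSketch, Callback, invoke and execCallback are hypothetical and chosen here for illustration only. An interface fixes the callback signature, and an anonymous inner class supplies the implementation that is passed to the method expecting the callback.

```java
public class CallbackSketch {

    // Hypothetical interface playing the role of a C# delegate type:
    // it fixes the parameter list and return type of the callback.
    interface Callback {
        boolean invoke(String a, int b);
    }

    // Method that accepts the callback and runs it.
    static boolean execCallback(Callback callback, String x, int y) {
        if (callback == null) {
            throw new IllegalArgumentException("Callback can't be null!");
        }
        return callback.invoke(x, y);
    }

    public static void main(String[] args) {
        // The implementation must be an instance method of some object,
        // which is the drawback noted in the text.
        Callback foo = new Callback() {
            public boolean invoke(String a, int b) {
                System.out.println("Foo: " + a + " " + b);
                return true;
            }
        };
        boolean result = execCallback(foo, "Twenty", 20); // prints Foo: Twenty 20
        System.out.println(result);                       // prints true
    }
}
```

The caller only depends on the interface, so any object implementing it can be registered, at the cost of one wrapper object per callback.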
To use delegates, one first declares a delegate that has the same return type and accepts the same parameters as the methods one will want to invoke as callback functions. Secondly, one needs to define a method that accepts an instance of the delegate as a parameter. Once this is done, a method that has the same signature as the delegate (i.e. accepts the same parameters and returns the same type) can be created and used to initialize an instance of the delegate, which can then be passed to the method that accepts that delegate as a parameter. Note that the same delegate can refer to static and instance methods, even at the same time, since delegates are multicast. The example below shows the process of creating and using a delegate instance.

C#

    using System;

    //delegate base
    public class HasDelegates {

        // delegate declaration, similar to a function pointer declaration
        public delegate bool CallbackFunction(string a, int b);

        //method that uses the delegate
        public bool execCallback(CallbackFunction doCallback, string x, int y) {
            Console.WriteLine("Executing Callback function...");
            if (doCallback == null)
                throw new ArgumentException("Callback can't be null!");
            return doCallback(x, y);
        }
    }

    public class FunctionDelegates {

        public static bool FunctionFoo(string a, int b) {
            Console.WriteLine("Foo: {0} {1}", a, b);
            return true;
        }
    }

    public class DelegateTest {

        public static void Main(string[] args){

            HasDelegates MyDel = new HasDelegates();

            //create delegate
            HasDelegates.CallbackFunction myCallback = new HasDelegates.CallbackFunction(FunctionDelegates.FunctionFoo);

            //pass delegate to delegate function
            MyDel.execCallback(myCallback, "Twenty", 20);
        }
    } // DelegateTest

In the example above, the use of a static delegate shields client programmers from having to know how to instantiate the delegate object.

3.1.10 Boxing

C# contains some interesting innovations that make component development easier, such as its notions of boxing and unboxing.
In situations where value types need to be treated as objects, the .NET runtime automatically converts value types to objects by wrapping them within a heap-allocated reference type in a process called boxing, while unboxing allows the value of an object to be converted back to a simple value type.

3.1.11 Pointers and unsafe code

Although core C# is like Java in that there is no access to a pointer type analogous to the pointer types in C and C++, it is possible to have pointer types if the C# code is executing in an unsafe context. In such a context much of the runtime checking is disabled, which means that the program must have been granted full trust on the machine it is running on. This is necessary particularly in situations like interfacing with the underlying operating system, during interactions with COM objects that take structures that contain pointers, when accessing a memory-mapped device, or in situations where performance is critical. The syntax and semantics for writing unsafe code are similar to the syntax and semantics for using pointers in C and C++. To write unsafe code, the unsafe keyword must be used to specify the code block as unsafe, and the program must be compiled with the /unsafe compiler switch.

While writing unsafe code in C#, one has the ability to do things that aren't typesafe, like operating with pointers. The code, of course, gets marked unsafe, and will not execute in an untrusted environment. To get it to execute, one has to grant it trust, and if one doesn't, the code just won't run. In that respect, it's no different from other kinds of native code. The real difference is that it's still running within the managed space. The methods one writes still have descriptive tables that tell the runtime which objects are live, so one doesn't have to go across a marshalling boundary whenever one goes into this code.
Otherwise, when one goes out to undescriptive, unmanaged code (through the Java Native Interface, for example), one has to set a watermark or erect a barrier on the stack. One has to remarshall all the arguments out of the box. Also, when using objects, extra care has to be taken about which ones one touches, because the GC (Garbage Collector) is still running on a different thread. It might move the object if one hasn't pinned it down correctly by using some obscure method to lock the object. C# has taken a different approach: it has integrated this into the language. Since garbage collection may relocate managed (i.e. safe) variables during the execution of a program, the fixed keyword is provided so that the address of a managed variable is pinned during the execution of the parts of the program within the fixed block. Without the fixed keyword there would be little purpose in being able to assign a pointer to the address of a managed variable, since the runtime may move the variable from that address as part of the mark and compact garbage collection process.

3.2 C# and Interoperability

Apart from the language differences between C# and Java, we also looked into the issues of interoperability in C#. We have classified these into three separate sections, namely Language Interoperability, Platform Interoperability and Standards Interoperability. Since all the languages on the .NET platform have a striking resemblance, we have given Language Interoperability its own separate section where we compare .NET with Java. Part of our study indicates that while C# fares better in terms of language interoperability, and Java in platform interoperability, both fall short in terms of standards interoperability.

3.2.1 Platform Interoperability

Java runs on any platform that has a Java VM installed on it. Java code runs as Java Virtual Machine (JVM) byte codes that are either interpreted in the VM or JIT compiled, or can be compiled entirely into native code.
J2EE works on any platform that has a compliant set of required platform services (EJB container, JMS service, etc.). All of the specifications that define the J2EE platform are published and reviewed publicly, and numerous vendors offer compliant products and development environments. But J2EE is a single-language platform. Calls from/to objects in other languages are possible through CORBA, but CORBA support is not a ubiquitous part of the platform.

C# presently runs only on the Windows platform. C# is implicitly tied into the IL common language runtime, and is run as just-in-time (JIT) compiled byte codes or compiled entirely into native code. The .NET core works on Windows only, but theoretically supports development in many languages (once sub-/supersets of these languages have been defined and IL compilers have been created for them). .NET applications, whose code (with its references to the base class library and other libraries) is loaded at runtime into the .NET execution engine, are mostly JIT compiled by the .NET JIT compiler. This compiles the platform independent intermediate language opcodes to native code used by the processor where the JIT compiler is running. Using intermediate code and JIT compilation means that Microsoft has built into the framework the facility to allow .NET applications to be runnable on any operating system that supports the .NET runtime. Moreover, the remoting architecture of .NET assumes that by default objects are passed by value; that is, when an object is passed to another machine, the actual data in the object is passed and a copy of the object is initialized on the remote machine and run there. If the assembly, i.e. the package that contains the object's code, is not present on the remote machine, that machine will download it. This passing around of assemblies works because the code in each assembly is platform independent.
A major selling point of Java technologies is that applications written in Java are portable across a number of operating systems and platforms. Sun officially supports Linux, Windows and Solaris, but other vendors have implemented Java on a large range of platforms including OS/2, AIX and MacOS. Binary compatibility across platforms using similar Java versions is common except for situations involving bugs in various VM implementations. In contrast, C# is presently available only on Windows. Some of the .NET libraries, particularly the WinForms library, depend on the minute details of the Windows API and so don't run on other platforms. Efforts are being made to port .NET to other platforms, including Linux and FreeBSD. Linux porting is being done as part of the Mono project developed by Ximian, while the FreeBSD implementation is a Microsoft project codenamed Rotor.

A majority of the world's desktop computers, and a significant proportion of the world's mobile devices, will, however, support the .NET runtime because these computers and devices run Windows of one flavor or another. Microsoft is committed to providing .NET for all of its 32-bit operating systems as well as its 64-bit operating systems of the future. This reduces the problems that were experienced going from 16-bit Windows to 32-bit Windows, or developing CE Win32 applications when one is used to developing NT Win32 applications. However, portability between mobile devices and computers, both running the .NET architecture, is still limited. A developer still has to pay attention to the facilities of the target device (e.g. the limited screen real estate on Windows CE devices makes MDI (multiple document interface) applications unusable), but the .NET namespace for windowing, System.Windows.Forms, makes developing such applications simple.
As a part of the solution to this problem, .NET makes componentization a requirement: separating business logic into objects separate from the UI 'presentation' code.

However, Microsoft hasn't ignored platform interoperability. The .NET libraries provide extensive capabilities to write HTML/DHTML solutions. For solutions that can be implemented with an HTML/DHTML client, C#/.NET is a good choice. For cross-platform projects that require a more complex client interface, Java is a good choice.

.NET is platform independent to a limited extent, but the platform independence of .NET is different from that of Java. Java programmers who have developed enterprise applications for open distribution have had to use some C/C++ code. Distribution of the programs is one of the major problems that one faces. Java's claim of "write once, run everywhere" is true, but the question of ease of use is ignored. Since Java does not produce PE (Portable Executable) files, one has to use the command-line JITer (java.exe urclass.class) to start up the program. So there is the option of either creating ugly batch files or incorporating the start-up code into C/C++ to make PE files. The latter has two impacts: firstly, one still has to learn C/C++, and secondly, using C/C++ makes the code platform dependent.

On the .NET platform, files are compiled into PE files (i.e. DLLs and EXEs). These are the formats that one is already used to using. Using .NET code is no different from using normal machine-specific code, and it doesn't require learning any new language to perform certain operations. This makes the files truly platform independent, at least on the Windows platforms.
3.2.2 Standards interoperability

By standards interoperability we mean all the standards that the language can access, such as database systems, graphics libraries, Internet protocols, and object communication standards like COM and CORBA. C# and Java both have restricted interoperability in this respect. Because of Microsoft's business motivations and its own role in defining many of these standards, it supports some and provides less support for others that compete with its own; for instance, CORBA competes with COM and OpenGL competes with DirectX. Similarly, Sun's Java doesn't provide as good support for Microsoft standards as it could.

C# objects, since they are implemented as .NET objects, are automatically exposed as COM objects. C# thus has the ability to expose COM objects as well as to use COM objects. This will allow the huge base of COM code to be integrated with C# projects, since .NET is a framework which can eventually replace COM.

However, Microsoft has submitted C# to ECMA with the objective of presenting C# to the industry as a possible standard and hence gaining support in ECMA for a process that will lead to a commonly designed language with a common language infrastructure. By a common infrastructure it means "the core set of class libraries that this specification entails, such that if other companies using other platforms implement it, they could reasonably expect to find those classes available to their programs". When and if ECMA actually arrives at a standard for C# and a common language infrastructure, the result will be available under ECMA's copyright and licensing policies, which are truly open. Any customer will be able to license the ECMA C# standard, subset it, or superset it without paying royalties. They'll be able to take it and implement it on any platform or any device.
That is something fundamentally different from Sun and other competitors, who approached the standards bodies with an intention to monopolize their proprietary languages. Thus the .NET platform is sort of an open platform; any vendor can create a compiler for it. All languages can be ported to .NET (someday someone might even write a Java(tm) compiler for .NET). In Java, by contrast, one has to depend on Sun to provide the compilers. This limits the chance of open competition between third-party developers. Just as on the native platform, where there are compilers for C/C++ from many vendors like MS and Borland, the choice of compiler will be in the hands of the developer, as the .NET platform also invites vendors to develop their own compilers. Since third-party vendors can create compilers, they are also free to develop third-party tools, so the developer will get a better choice of tools to suit his needs.

4.0 J2EE and .NET database connectivity and support

Database connectivity has been a prime area of competition between the two major developers of integrated development environments. Sun Microsystems and Microsoft have both spent a great deal of resources researching the ideal way to let developers connect to databases. Today, almost all applications have a database backend. These backends are not necessarily compatible with the platform that the application is developed for. It is important for users to get applications that work across various platforms in a unified and flawless manner. In order to achieve this kind of connectivity, Java came up with JDBC and Microsoft has developed ODBC and ADO/ADO.NET. Both JDBC and ODBC/ADO have a great and long (in terms of software years) history.

Java does not have one extensive system to perform all data related tasks. It performs different tasks using different modules.
Relational database connectivity is done using JDBC, offline data access and persistent objects are provided by EJB Entity Beans, and there is no support provided for hierarchical data access. On the other hand, Microsoft has included all of the above usage in one single API called ADO.NET.

Microsoft introduced ODBC as its first database connectivity API in 1992. ADO (ActiveX Data Objects) was introduced later on. Microsoft also includes interfaces such as RDO and DAO. ADO was created as a more abstract layer over the OLE DB API. It included connection interfaces, commands and recordsets. The new .NET framework empowers ADO with the ability to connect to a data source that is in relational, XML or tabular form. The new ADO API is called ADO.NET. ADO.NET works in a "disconnected" fashion.

Since ADO.NET is just an interface, it needs to have an implementation. There are two implementations available.

SQL Server .NET data provider: This implementation is specific to .NET and only interacts with SQL Server 7.0 and higher. This implementation provides very efficient data access because of its close tie to the operating system and framework.

OLE DB .NET data provider: This implementation builds on OLE DB and thus can use drivers to connect to almost any database. For databases that are less well supported, this implementation often shows low performance.

JDBC was introduced by Sun Microsystems in 1996. Java was designed to work on various operating systems, and JDBC was likewise designed to work with many different relational databases. JDBC is an interface that needs to have a specific implementation for each database.

1) JDBC uses existing ODBC drivers for connecting to certain databases by using a JDBC-ODBC Bridge. This allows Java to use ODBC's powerful grip on client-server applications, but at the same time defeats the goal of having 100% Java applications.
2) There is a similar implementation of JDBC that only provides a wrapper around a C/C++ driver provided by the database. In other words, this implementation provides a Java class that interacts with the existing C/C++ code and uses all of its C++ methods. This system provides performance close to that of a C/C++ program and at the same time provides interoperability among various databases. For a 3-tiered implementation, this means that only the second tier server has to be properly configured to interact with the C/C++ driver.

3) JDBC also provides a 100% Java implementation. This implementation is easy to use, but lacks the high performance of a C/C++ driver.

The later version of JDBC also focuses on "disconnected" database activities. It provides Rowsets, which work similarly to ADO.NET's DataSets. It has also changed the way it executes SQL queries. Instead of sending each SQL statement separately, it can send a block of SQL code to the server, which analyzes it and returns several sets of tuples.

4.1 Differences and Similarities

There are many differences and similarities between database access using J2EE and .NET. Our focus in this project is on three major issues relating to database access. Architecture issues are mostly related to the original design of Java vs. Microsoft's ability to tie the .NET platform closely to its operating systems. Optimization is something that the SQL code writer has to focus on more than ADO.NET and JDBC users. In promoting .NET, Microsoft has given a lot of focus to its new offline database utility called DataSets. DataSets are smaller versions of a relational database that reside on the developer's local machine and can be easily manipulated. Java does not provide a similar functionality through JDBC. This point is going to be beneficial to Microsoft and is going to appeal to a lot of developers.
4.1.1 Architecture

ADO.NET follows the architecture shown below:

[Figure 4.1 ADO.NET Architecture. Presentation Tier: Windows Forms (MyApp.Exe) and Web Forms (IE), each holding a DataSet. Business Tier: Data Object (Class) holding a DataSet, connected to the Presentation Tier over Internet/Intranet using XML. Data Tier: Data Adapters. Business to Business (BizTalk, for example).]

ADO.NET has streamlined database connectivity issues by putting a middle tier of XML to interact with Web, Windows Apps or B2B applications. The XML layer can in turn easily interact with DataSets and Data Adapters to gather data from the actual database. This architecture removes the requirement of using a different implementation for each database, because XML can interact with any data tier.

JDBC follows the architecture shown below:

[Figure 4.2 JDBC Architecture]

The JDBC architecture is a little more complicated than that of ADO.NET. It works on the same fundamental design paradigm. It has an application layer which contains separate ResultSets and interacts with the Data tier through a middle level of Data Management. The JDBC architecture does not deal with XML for its tier connections. For this reason, special Connection and DriverManager objects have to be involved in making a connection. The DriverManager module chooses which driver to use depending on the underlying database. DriverManager will choose an Oracle driver if the database that the application wants to connect to is an Oracle database. If ODBC has an existing driver for a particular database, DriverManager uses that driver by passing the query through the JDBC-ODBC Bridge. It is important to note that JDBC does not provide the same amount of functionality as ADO.NET provides. In order to achieve similar functionality, we have to also analyze and include a separate architecture for EJB with JDBC.

4.1.2 Optimization

JDBC lets you use "prepared statements" to optimize the code. Prepared statements are regular SQL statements that accept input parameters.
In other words, prepared statements avoid having to recreate and recompile each SQL statement every time it is run. A prepared statement follows the basic template of code shown below:

    // Only the execution of the request changes.
    PreparedStatement stmt = cnx.prepareStatement(
        "SELECT name, first_name FROM gurus WHERE age < ?");
    stmt.setInt(1, 26); // JDBC parameter indexes start at 1
    ResultSet rs = stmt.executeQuery();

As the above code fragment shows, prepared statements make it easier for programmers to reuse queries. ADO.NET has a similar object. This functionality is achieved by the Command object. The Command object is an OLE DB command and it plays the same role that the prepared statement plays for JDBC. The parameter passing in the SQL statement is also similar. ADO.NET code is shown below in C# syntax:

    // Only the execution of the request changes.
    OleDbCommand cmd = new OleDbCommand(
        "SELECT Name, First_Name FROM Gurus WHERE age < ?", cnx);
    cmd.Parameters.Add(new OleDbParameter("age", 26));
    OleDbDataReader reader = cmd.ExecuteReader();

As far as optimization is concerned, ADO.NET and JDBC have equal capacity. Optimization depends a lot more on network bandwidth, SQL query optimization and the reuse factor involved in each query. For example, a query that returns a large amount of data is going to slow down the application no matter whether it is using ADO.NET or JDBC. Optimizing the actual SQL statements will benefit JDBC and ADO.NET alike.

4.1.3 Offline database access

A ResultSet is a Java object that contains the results of executing an SQL query. In other words, it contains the rows that satisfy the conditions of the query. ResultSets can be viewed as small locally stored databases. ResultSets allow various operations to be performed upon them. They provide methods to get the current row, the next row, or the nth row of the ResultSet. ResultSets are a tabular collection of data.
A ResultSet only contains values from a database table that satisfy the conditions of a query. In previous versions of JDBC, the datatypes that ResultSets could use had to be atomic. The newer versions of JDBC drivers allow users to have non-atomic datatypes in ResultSets. There are different levels of functionality in a ResultSet. A scrollable ResultSet can be moved forward and backward. Certain ResultSets can reflect the changes made to them. Each ResultSet has some overhead attached to it depending on its functionality.

Since a ResultSet acts as a local copy of a database query, it has to be able to perform certain database related tasks. ResultSets allow developers to insert, delete and update rows. These rows are subsequently updated in the underlying database at appropriate times. The new updater methods make it possible to update a row in the ResultSet without using SQL commands. A Java program can update the contents of a ResultSet by passing new values as simple parameters and calling Java methods. The following code fragment shows how the updateInt method works:

    rs.updateInt(3, 88); // The value in column 3 of rs is set to 88.

A column can also be updated by using the column name in the following manner:

    rs.updateInt("SCORES", 88);

The updater methods do not update the underlying database directly. It is the updateRow method that actually does a database update. One of the limitations is that, if an update in a ResultSet has to be stored back in the database, updateRow has to be called immediately. If the cursor that points to a location in the ResultSet moves, all corresponding updates will be lost. In other words, a ResultSet cannot effectively store all changes to its data and finally update the database at the end.

A ResultSet also allows a row to be deleted from the ResultSet and from the database. When the deleteRow method is called, it immediately deletes the current row from the ResultSet and also from the underlying database.
Not having to use SQL to delete makes it easier for developers to work with ResultSets.

New rows can be inserted into a result set table and into the underlying database table using new methods in the JDBC 2.0 API. Inserting a row is a two step process. Step one only adds the row to a staging area of the ResultSet. It basically creates a new space in the ResultSet and updates the new row with the new information. Once the row is filled in, the second step is to call the insertRow method. This method will insert the row into the ResultSet as well as the underlying database. Depending on the type of ResultSet, some of them have to be closed explicitly at the end of their usage.

Microsoft wanted to find a creative solution for the limitation of having small bandwidth for data transfer. They were working under the philosophy that a data transport vehicle should only behave as a vehicle and should not have to hold modified values. For this reason, they came up with the idea of a localized database that can hold all modifications made by an application and update the database in one instance. The ADO.NET DataSet is a data construct that can contain several relational Rowsets, the relationships that link those Rowsets and the metadata for each Rowset. The DataSet also tracks any changes made to any of the Rowsets and saves the original and new values for each change. A DataSet can be exported to XML or created from an XML document, thus enabling increased interoperability between applications.

In a DataSet, XML is the main medium of communication. In other words, a DataSet is generated using an XML document that ADO.NET retrieves from the database. Also, it places the data back into the database using XML. ADO.NET uses the DataAdapter module to execute SQL commands at the data source to both load the DataSet with data, and reconcile changes made to the data in the DataSet back to the data source.
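The change-tracking idea described above (keeping both the original and the new value of each modified field until the changes are reconciled with the data source) can be illustrated with a small Java sketch. This is only a conceptual model under stated assumptions, not how ADO.NET is implemented: TrackedRow and its methods are hypothetical names, and the sample values are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical class: a disconnected row that remembers the values it was loaded with,
// so that pending changes can be written back to the data source in one go later.
class TrackedRow {
    private final Map<String, Object> original = new LinkedHashMap<>();
    private final Map<String, Object> current = new LinkedHashMap<>();

    TrackedRow(Map<String, Object> loaded) {
        original.putAll(loaded);
        current.putAll(loaded);
    }

    void set(String column, Object value) { current.put(column, value); }

    Object get(String column) { return current.get(column); }

    Object getOriginal(String column) { return original.get(column); }

    // Columns whose current value differs from the value originally loaded.
    Map<String, Object> pendingChanges() {
        Map<String, Object> changes = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : current.entrySet()) {
            if (!Objects.equals(e.getValue(), original.get(e.getKey()))) {
                changes.put(e.getKey(), e.getValue());
            }
        }
        return changes;
    }

    // To be called once the pending changes have been written to the data source.
    void acceptChanges() {
        original.clear();
        original.putAll(current);
    }
}

class TrackedRowDemo {
    public static void main(String[] args) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("CustomerID", "C1");   // illustrative sample data
        loaded.put("City", "Berlin");
        TrackedRow row = new TrackedRow(loaded);

        row.set("City", "Hamburg");       // modified while disconnected
        System.out.println("Pending changes: " + row.pendingChanges());
        System.out.println("Original City: " + row.getOriginal("City"));

        row.acceptChanges();              // after reconciling with the data source
        System.out.println("Pending after accept: " + row.pendingChanges());
    }
}
```

Because the original values are retained, all modifications can be sent to the database in a single reconciliation step instead of one round trip per change.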
Just like a ResultSet, a DataSet allows Insert, Update and Delete commands to modify the data while being offline. A DataSet is actually one abstraction level higher than a ResultSet. A DataSet can contain many ResultSets. A single DataSet can store values from various tables, views or queries. A DataSet can be set up to behave like a relational database.

At code level, a DataSet is as easy to use as a ResultSet. For example, in order to create a DataSet, a Visual Basic programmer only needs one line of code as shown below:

    Dim custDS As DataSet = New DataSet("CustomerOrders")

Adding a new DataTable to the DataSet also takes only a few lines of code. In that sense, DataSets are much more useful than JDBC ResultSets. They truly provide relational database capabilities even while being offline. The XML support provided by the DataSet greatly increases its interoperability. A DataSet also allows another DataSet to be created based upon its schema. For all these reasons, the DataSet has become the selling point for the .NET platform.

4.2 Example and Code discussion

The biggest advantage of using ADO.NET versus older versions of ADO or Java is the availability of DataSets. As explained earlier, DataSets are temporary databases, which hold values from a particular query. The DataSet designer in Visual Studio.NET is a very intuitive and easy to use interface. As shown below, it shows all of the tables and their columns for a particular DataSet. It also shows queries and relationships between tables.

[Figure 4.4 (MSDN Magazine, November 2000)]

Loading a DataSet can be done in many different ways. A DataSet can be loaded using the object model directly. A DataSet can also be populated using the DataSetCommand object of a managed provider, or by loading XML in directly. The third approach provides a great amount of interoperability for the DataSet. The following code shows how to load a DataSet using DataSetCommand.
    Dim myDS As DataSet
    Dim myDSCommand As SQLDataSetCommand
    Dim myCustomer As DataRow

    myDS = New DataSet
    myDSCommand = New SQLDataSetCommand("SELECT * FROM customers", _
        "server=localhost;uid=sa;pwd=;database=northwind")
    myDSCommand.FillDataSet(myDS, "customers")

    For Each myCustomer In myDS.Tables(0).Rows
        Console.WriteLine(myCustomer("CustomerID").ToString())
    Next

In the above example, a connection to the Northwind database is made on behalf of the DataSet. After retrieving the selected data, the DataSet has no knowledge of the data source. After loading the data and schema into the DataSet, the connection is terminated. The DataSet gets its disconnected property because the connection is kept open only until the query has run and all resulting data has been loaded into the DataSet.

Since DataSets work as databases, relationships have to be established between different tables. The following code adds a single DataRelation to the DataSet object's Relations collection. The first argument specifies the name of the relation. The second and third arguments are the DataColumn objects that link the two tables.

    ' In Visual Basic
    Dim ds As DataSet = New DataSet("CustomerOrders")
    ds.Relations.Add("CustOrders", _
        ds.Tables("Customers").Columns("CustomerID"), _
        ds.Tables("Orders").Columns("CustomerID"))

A DataSet allows navigating between different tables easily. This allows you to retrieve all related rows of data from related tables if you have one row of data from a particular table. For example, if you have information from the Customers table, you can retrieve all of the orders for a particular customer from the Orders table. The following code shows this effect:

    Dim cust As DataRow = ds.Tables("Customers").Rows(0)
    Dim orders() As DataRow = cust.GetChildRows(ds.Relations("CustOrders"))
    Console.WriteLine("Total records for custOrders = " & orders.Length.ToString())

4.3 Interoperability using XML with datasets

One of the most important design goals of ADO.NET was powerful XML support.
XML has become a standard output method for many systems. Because of the use of XML in web based technologies and database systems, the developers of the .NET framework focused on extending XML support to all database related activities. The database side of .NET, SQL Server 2000, offers close XML support. It is possible to query a SQL Server 2000 database and retrieve the results in XML format. .NET comes with an XML Framework parser, which parses and interprets incoming or outgoing XML strings.

The DataSet has methods that can both read and write XML. For reading XML in, an XmlReader object is used along with the ReadXml method. For writing XML out, the XML Framework's XmlWriter is employed. Regardless of where the data originated, a DataSet can save out its contents as XML. The schema is encoded as XSD, and the data is encoded as XML.

When loading XML into the DataSet, some rules are followed regardless of how the schema was created for a DataSet. Elements with a certain name are mapped into the DataSet table of the same name. Attributes and scalar-valued sub-elements are mapped into columns of that table. The schema of the table is expanded as appropriate if the columns aren't already in the DataSet, or if the DataSet doesn't already contain a table by the same name. Because of the many built in functions that do most of the work, the code to actually load data into a DataSet from XML is very simple:

    Dim r As StreamReader = New StreamReader("foo.xml")
    Dim ds As DataSet = New DataSet
    ds.ReadXml(r)

XML has the inherent flaw of presenting data in a single dimension. In other words, XML cannot preserve the hierarchy or document characteristics of a database query. When it is loaded into the DataSet, it is loaded with all those things taken off. The process of loading XML into DataSet tables is known as shredding [Gazzit00]. Shredding prevents DataSets from being filled up with data that is not appropriately ordered or structured hierarchically in XML.
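The mapping rules described above (elements become rows of a table with the same name, while attributes and scalar-valued sub-elements become columns) can be illustrated with a short Java sketch built on the JDK's DOM parser. This is an illustration of the rules as stated in the text, not of ADO.NET's actual implementation. XmlShredder, shred and the sample document are hypothetical, and only the direct children of the root element are treated as rows.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: shreds an XML document into named tables of rows.
class XmlShredder {
    static Map<String, List<Map<String, String>>> shred(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        Map<String, List<Map<String, String>>> tables = new LinkedHashMap<>();
        NodeList rows = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < rows.getLength(); i++) {
            if (!(rows.item(i) instanceof Element)) continue; // skip text nodes
            Element rowElem = (Element) rows.item(i);
            Map<String, String> row = new LinkedHashMap<>();
            // Rule: attributes are mapped into columns.
            NamedNodeMap attrs = rowElem.getAttributes();
            for (int a = 0; a < attrs.getLength(); a++) {
                Node attr = attrs.item(a);
                row.put(attr.getNodeName(), attr.getNodeValue());
            }
            // Rule: scalar-valued sub-elements are mapped into columns.
            NodeList children = rowElem.getChildNodes();
            for (int c = 0; c < children.getLength(); c++) {
                Node child = children.item(c);
                if (child instanceof Element && isScalar((Element) child)) {
                    row.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            // Rule: the element name selects (or creates) the table of the same name.
            tables.computeIfAbsent(rowElem.getTagName(), k -> new ArrayList<>()).add(row);
        }
        return tables;
    }

    // An element is scalar-valued here if it has no element children.
    private static boolean isScalar(Element e) {
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) return false;
        }
        return true;
    }
}

class XmlShredderDemo {
    public static void main(String[] args) {
        String xml = "<CustomerOrders>"
                + "<Customers CustomerID=\"C1\"><Name>Ann</Name></Customers>"
                + "<Orders OrderID=\"10\"><CustomerID>C1</CustomerID></Orders>"
                + "</CustomerOrders>";
        System.out.println(XmlShredder.shred(xml));
    }
}
```

In this sketch the table and its columns are created on first use, which mirrors the rule that the schema is expanded when a table or column is not already present.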
When changes are made to the data using DataSet functions, the newly generated document has the same structure and design as the original, except that the data values have changed. XmlDataDocuments provide this functionality in ADO.NET. When an XML document is loaded, the relational subset can be obtained using the DataSet property on the XmlDataDocument. The following code shows how to use the DataSet property on the XmlDataDocument:

    Imports System.NewXml

    Dim r As XmlTextReader
    Dim doc As XmlDataDocument
    Dim ds As DataSet

    r = New XmlTextReader("foo.xml")
    doc = New XmlDataDocument
    doc.LoadDataSetMapping("foo.xad")
    doc.Load(r)
    ds = doc.DataSet

XML support is crucial in any database related work, since it is the ultimate way to make interoperable applications. Different databases are often in use at large corporations, and the scripting languages are quite different for them. XML provides a common format for data to be transferred.

In 1998, with the introduction of Oracle 8i, Oracle introduced XML support for its databases. Since then, all subsequent versions of Oracle's databases have provided growing support for XML. Oracle contains its own XML parser written in Java [Walter98]. Oracle can take data coming into the database in the form of an XML document, or it can output the result of a query in XML format. Oracle's ConText engine allows searches to be restricted to a part of a stored XML file. It is possible to form a query based on XML tags [Walter98]. The following code shows how simple it is to form a very accurate query using XML tags:

    SELECT SUM (Amount)
    FROM Claim_Header ch, Claim_Settlements cs, Claim_Settlement_Payments csp
    WHERE csp.Approver = 'JCOX'
      AND CONTAINS (DamageReport, 'Arson WITHIN motive') > 0
      AND CONTAINS (DamageReport, 'Fire WITHIN Cause') > 0
      AND . . . /* Join Clauses */
    [Walter98]

Oracle is planning to come out with a level of XML support that will allow XSL sheets to be formed on demand based on user inputs and the resulting XML.
Since XML has become a part of Oracle’s database management systems, it makes a Visual Basic or Visual C# programmer’s life much easier and allows him or her to use .NET as a complete development environment. Until now, many Visual Studio developers felt they always had to use SQL Server 2000 as their database backend because of the ease of integration, but with the XML support provided by .NET’s development environment and by Oracle, it is possible to use multiple database backends. The following figure shows how interoperability works across multiple database systems:

[Fig 4.5 Database Interoperability: a .NET-developed application exchanging XML with a SQL DB and an Oracle XML DB]

As seen in the figure above, an application can work with both SQL and Oracle DBs equally well. An application developed in .NET can interact with a SQL DB, retrieve information in XML format, process it, and save that information in an Oracle DB by converting it to XML format. The added support for XML in ADO.NET’s DataSet object makes it easier to move information from a web page into a DataSet and on to an Oracle or SQL DB.

5.0 Security issues concerning .NET and Java

This section discusses the security issues of .NET and J2EE, with detailed discussion of how each platform supports and maintains secure development and execution environments. The main components of these platforms that keep code, data, and systems safe from inadvertent or malicious errors are code based access control, role based access control, code verification and execution, secure communication, and code and data integrity.

Code based access control means giving code permission to access resources based on the application or protection domain the code is assigned to and on the evidence the code carries. Role based access control means giving a user permission to access resources based on the user’s role in the system. Code verification and execution means checking the semantics, analyzing the bytecode, and keeping code execution within a domain.
Secure communication is being able to pass data and messages locally or remotely in a secure manner, so as to avoid data modification and similar attacks. Code and data integrity is making sure code hasn’t been modified without authorization, by using cryptographic solutions and signed distribution files.

5.1 Differences and Similarities

The differences and similarities of the security aspects outlined below are the focus of this section. Both .NET and J2EE have ways of dealing with these issues; sometimes they are implemented the same way, and sometimes they have totally different architectures. For example, .NET uses the Windows SSPI and IIS for secure communications, while Java provides JSSE for a more flexible solution.

5.1.1 Code Based Access Control

Code access refers to the resources that a piece of code can access. Code access control is the security mechanism that allows or prevents a piece of code from accessing resources. What a piece of code is allowed to do is decided by the origins and intentions of the code.

A main component of the CLR’s security ability is code access control. Code access control is broken down into evidence based security, permissions, and a security policy. The CLR reviews the evidence of an assembly, determines an identity for it, and then looks up and grants permissions based on the security policy for that assembly identity.

Evidence based security is when the CLR examines the assemblies to determine their origins. At runtime the CLR looks at the metadata of the assemblies and finds out where the code originated, the creator of the assembly, and the URL and zone the assembly came from. A zone is a concept in .NET that represents the domain the assembly comes from, such as the Internet, a LAN, or the local machine. The association between the metadata and its corresponding assembly is also verified by the CLR.

Permissions are the result of code access control in the CLR.
A permission assigned to a piece of code is the allowance to execute a certain method or access a certain resource. An assembly will request permissions to execute, and these requests are answered at runtime by the CLR. The CLR will throw a security exception if an assembly’s request is denied. Sometimes a request may not be entirely denied, and the CLR will give the assembly lower level permissions than requested. Because there are so many different permissions that can be requested, they are grouped into sets in which every permission has the same level of security and trust. An assembly originating from the Internet zone may be granted an Internet permission set that pertains to untrusted code.

Just as permissions are placed into sets based on similar allowances, code is grouped based on similar requests and zones. The security policy is responsible for code grouping. There are three security policies in .NET: one for the total enterprise, one for the machine executing the code, and one for the requesting user. Any policy file may limit the permissions of another policy file, but can’t entirely restrict all permissions. Each security policy groups code into hierarchical categories based on the identity that the CLR determines from the evidence in the metadata. Once the code is grouped and categorized, the security policy can determine permissions for the assembly. The permission decisions are made by the policy that an administrator sets for assemblies and domains. The .NET configuration tool or the Code Access Security Tool (Caspol.exe) can be used to do this.

Code based security in Java is implemented through the JVM, the class loader, and the Security Manager and Access Controller. The JVM is a secure runtime environment. It manages memory by dynamically allocating different areas for use, isolates executing code, and performs array bounds checking, to name a few of the things that make the JVM secure.
The class loader is actually a hierarchy made up of many class loader objects that load non-essential classes. The root of the hierarchy is the primordial class loader, which loads the base classes. The hierarchy is meant to prevent unauthorized and untrusted code from replacing any code in the base classes.

The Security Manager and Access Controller examine and implement the security policy. Permissions are determined by the security policy at runtime. As with .NET, permissions are granted by the security policy based on evidence. Java looks for a publisher signature and a location origin. Permissions are also grouped into protection domains and associated with groups of classes in Java, much the way they are grouped into permission sets and associated with code groups in .NET. Classes are grouped by their origin the same way that code is categorized in .NET by the assembly’s zone.

J2EE doesn’t automatically perform code access control the way .NET does. By default, the Security Manager isn’t used, so it is up to the user to supply an implementation of the Java API classes that perform resource access control. Java has only two security policy levels, one for the executing machine and one for the user. Each level can expand or restrict all the permissions of another level, and there can be multiple policy files at each level.

Although it’s easier to create custom permissions in Java, .NET provides many standard permission sets, all of which would have to be custom made in Java. The evidence that .NET looks at to establish an identity and grant permissions is much more informative than the evidence that Java uses, so the user must have stronger credentials to be granted permissions on the .NET Framework. The tight integration of .NET and Windows allows for the larger permission and evidence sets.
J2EE does allow more configurable security policy levels, but without all the permission sets that .NET provides, it is like having one talented player when it takes the entire team to win the game.

5.1.2 Role Based Access Control

The policies and permissions of code based access control also apply to role based access control, with the difference that the permissions of a policy are now applied to a user or role. A role is a logical grouping of users, such as administrators or guests. Each role has a different set of privileges for performing certain operations.

.NET uses role based security to establish an identity, also known as authenticating a user, and to give that identity access to resources, known as authorization. .NET applies the term ‘principal’ to role membership, and permissions of role based security are managed by the PrincipalPermission object.

.NET uses many plug-in authentication modules. The standard modules are Windows authentication, Passport authentication, Forms authentication, IIS, and impersonation.

Windows authentication refers to the authentication mechanisms of the Windows OS, which applications use through the Security Support Provider Interface.

Passport authentication is an authentication service owned and provided by Microsoft. Passport is a centralized service that requires only a single logon for members of sites that use it. It’s similar to Forms authentication, but already implemented by Microsoft.

Forms authentication works over HTTP and relies on cookies. Unauthenticated requests from clients on the web are redirected to an HTML logon form. The form requests user logon credentials, which are sent to the application server once submitted. If the request is authenticated, ASP.NET issues the client a key, in the form of a cookie, with which the client’s identity can be reacquired.

The IIS server contains built-in authentication mechanisms such as Basic and Digest authentication and X.509 certificates with SSL.
These mechanisms are used to authenticate identities to applications hosted on the IIS server.

Impersonation authentication is supported by .NET at the OS level. Impersonation allows an application to access another application using a different identity while maintaining responsibility to the original user.

J2EE uses the Java Authentication and Authorization Service (JAAS) for role based security. JAAS is an integrated package that implements a Java version of the Pluggable Authentication Module (PAM) framework. Using JAAS, developers can modify and then plug in authentication modules. JAAS currently supports authentication methods including Unix, JNDI, and Kerberos. JAAS can provide only limited impersonation authentication because the user identity is different at the application and OS levels. Java servlets also support authentication through all the HTTP methods (Basic, Digest, and form).

.NET offers more specific control of role based security by supporting role permission checking both declaratively and imperatively. Java servlets provide user access checking declaratively at the servlet level, EJBs provide user access checking declaratively down to the method level, and JAAS provides user access checking imperatively within methods. The different J2EE technologies do the same job as .NET, but they are not as fine grained as .NET because of the range of environments that the authentication mechanisms are expected to run on.

.NET is expected to run on a Windows platform, and IIS is the only supported server of the .NET Framework. This expectation limits the ability of .NET to maintain such fine granularity of role based access control on other platforms. IIS also doesn’t support application level authentication because authentication is done at the OS level, again limiting flexibility. Passport authentication and authorization requires that users be members of the Microsoft Passport service to access some .NET applications.
When Passport has to be shut down because of a bug or security breach, there is no alternative for authentication, and Microsoft holds a copyright that prevents any non-Microsoft vendor from producing one.

5.1.3 Secure Code Verification and Execution

Verifying code and executing it within a safe environment is the best way to prevent system weaknesses from being exposed by an application error, malicious or not. .NET and Java both perform security checks during the code execution process. Stack integrity, bytecode structure, buffer overflows, method parameter types, semantics, and accessibility policies are all checked and verified against the executing code. These checks are implemented differently because of the differences in the executing environments. .NET compiles source code into IL and inserts the code checks at certain locations when the IL is JIT-compiled. Java has a bytecode verifier traverse the bytecodes before they go to the JIT compiler or the interpreter.

.NET and Java both use the concept of a ‘sandbox’ for code execution. The analogy is that of a child who can only play with the objects in the sandbox unless given permission to use objects outside it. The sandboxes of .NET and Java are called the Application Domain and the Protection Domain, respectively.

The Application Domain in .NET applies static boundaries to its execution environment. An application domain contains all the loaded classes of an assembly, and multiple assemblies may be loaded into the same application domain. An object cannot directly reference an object in a different application domain; such access must be done remotely.

The Protection Domain in Java applies dynamic boundaries to its sandbox. The source of the code and the security policy determine which Protection Domain a class is loaded into. Because of the hierarchical class loader structure, an object can access another object in another domain as long as they were both loaded by the same class loader.
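The relationship between classes, class loaders and protection domains can be inspected from ordinary Java code. The following is a small sketch under stated assumptions: the class name DomainSketch and its helper methods are invented for illustration, and the exact loader names and code source locations printed depend on the JDK and on how the program is launched.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the paper's examples.
public class DomainSketch {

    // Walks from the class's own loader up through its parents. The primordial
    // (bootstrap) loader is represented by null, recorded here as "bootstrap".
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap");
        return chain;
    }

    // Reports the code source recorded in the class's protection domain,
    // i.e. the origin evidence that a security policy would be matched against.
    static String codeOrigin(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = domain.getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source reported)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("String loaders:       " + loaderChain(String.class));
        System.out.println("String origin:        " + codeOrigin(String.class));
        System.out.println("DomainSketch loaders: " + loaderChain(DomainSketch.class));
        System.out.println("DomainSketch origin:  " + codeOrigin(DomainSketch.class));
    }
}
```

Base classes such as String resolve directly to the bootstrap loader, while application classes sit further down the hierarchy, which is what keeps untrusted code from replacing the base classes.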
The Protection Domain is more flexible in what executing code can do, but this flexibility depends on the developer being sophisticated enough to avoid a security breach. The Application Domain supplies a fixed solution for the executing environment, taking this privilege and burden away from the developer.

Another privilege that is also a burden for developers is unmanaged code. All the security checks that verify code are done on managed code in a managed environment, the CLR or the JVM. .NET and Java allow unmanaged code to bypass the CLR and JVM, respectively. .NET provides a method to access legacy applications and code outside the CLR, and Java provides the JNI for the same purpose.

5.1.4 Secure Communication

Sensitive data sent to the system over remote connections needs to be secure. Secure communications in .NET and J2EE are done at the application level. Both .NET and J2EE support Secure Sockets Layer (SSL) and Transport Layer Security (TLS). These protocols determine which cryptographic algorithm and session keys are to be used.

.NET applications can use the Windows SSPI, but only as unmanaged code, which, as discussed above, is a big potential problem. Microsoft wants its users to use IIS. IIS does support SSL and TLS, but it uses files to transfer messages, which isn’t the most efficient way. Java provides the Java Secure Socket Extension (JSSE) for implementing secure communications. JSSE uses SSL and TLS to create a secure connection using sockets (SSLSocketFactory), and can use this connection for remote method invocations (RMI).

JSSE is more configurable and flexible than any .NET solution for creating secure communications. .NET developers have a choice of either running unmanaged code to use the Windows SSPI or using IIS to create secure communications, and based on the history of attacks on IIS servers, that isn’t too safe either.
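As a rough illustration of the configurability attributed to JSSE above, the following Java sketch lists the protocols the default SSLContext supports and creates an unconnected socket restricted to a chosen protocol. The class name JsseSketch and its methods are assumptions made for this example, and the choice of "TLSv1.2" assumes a JDK in which that protocol is available; no network connection is made.

```java
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical sketch for illustration; it does not connect to any server.
public class JsseSketch {

    // Protocol versions that the default JSSE context is able to negotiate.
    static List<String> supportedProtocols() throws Exception {
        SSLContext context = SSLContext.getDefault();
        return Arrays.asList(context.getSupportedSSLParameters().getProtocols());
    }

    // Creates an unconnected socket from the default factory and limits it
    // to the given protocol versions before any handshake takes place.
    static SSLSocket newRestrictedSocket(String... protocols) throws Exception {
        SSLSocketFactory factory = SSLContext.getDefault().getSocketFactory();
        SSLSocket socket = (SSLSocket) factory.createSocket();
        socket.setEnabledProtocols(protocols);
        return socket;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Supported: " + supportedProtocols());
        SSLSocket socket = newRestrictedSocket("TLSv1.2"); // assumed to be supported
        System.out.println("Enabled:   " + Arrays.toString(socket.getEnabledProtocols()));
        socket.close();
    }
}
```

In a real client the socket would then be connected to a host and port; the point here is only that protocol selection is an ordinary API call made by the application.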
5.1.5 Secure Code and Data Protection

Systems must uphold code and data reliability and validity to be secure. Code loaded by the system must supply evidence of its source, a version signature, and proof that there hasn’t been any unauthorized modification; the same applies to data. .NET and Java both provide ways of maintaining code and data integrity. .NET uses and extends the Windows CryptoAPI to make cryptographic solutions very configurable. Java provides the Java Cryptography Extension (JCE) and the Java Cryptography Architecture (JCA) for cryptographic functionality. JCE supports key exchange, provides message authentication code (MAC) algorithms, and makes it easier for developers to implement different encryption algorithms.

Signed distribution files are necessary to verify code and data sources. Certificate management for signing is provided by both .NET and Java. .NET uses strong-named assemblies that include the assembly name and version information. The assembly is signed with an RSA key pair, so that unauthorized modification can be detected. Including the version information helps avoid DLL conflicts at execution time. Java’s JAR files are sealed, and all the class files that a JAR contains are individually signed. An unsigned class may be added to a JAR file, but not to a package within a JAR file, so private data and methods of a package can’t be accessed. JAR manifest files don’t require version information as .NET assemblies do, which leaves room for version conflicts.

Java’s cryptographic capabilities are more flexible than .NET’s, primarily because .NET’s encryption functionality is tied to the Windows API, which is a problem if the developer is using another OS. Both .NET and Java handle signed distribution files well. The criticism of code and data protection in both .NET and Java is that it isn’t difficult to reverse engineer the crypto algorithms and security protocols, because both are based on published portable executable standards, i.e. bytecode.
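The idea behind signed distribution files, signing content with an RSA key pair so that any later modification is detected, can be sketched with the standard JCA classes. This is a minimal illustration rather than the mechanism used by strong-named assemblies or signed JARs: the class name SigningSketch, the helper methods, the sample bytes and the choice of the "SHA256withRSA" algorithm are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch for illustration only.
public class SigningSketch {

    // Produces a signature over the content with the publisher's private key.
    static byte[] sign(byte[] content, PrivateKey key) throws Exception {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(content);
        return signer.sign();
    }

    // Checks the signature against the content with the publisher's public key.
    static boolean verify(byte[] content, byte[] signature, PublicKey key) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(content);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] original = "compiled code bytes".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(original, pair.getPrivate());

        // Simulate an unauthorized modification by changing one byte.
        byte[] tampered = original.clone();
        tampered[0] ^= 1;

        System.out.println("original verifies: " + verify(original, signature, pair.getPublic()));
        System.out.println("tampered verifies: " + verify(tampered, signature, pair.getPublic()));
    }
}
```

Signing does not hide the code, which is why it leaves the reverse engineering concern raised in this section untouched.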
It isn’t hard to reverse engineer any bytecode, and third party vendors offer solutions to this problem, but none of them is a standard for now.

5.2 Example and code discussion

Role based security allows developers to provide resource access to specific users and groups at a fine grained level. The two most important kinds of object for implementing role based security are Identity objects and Principal objects. An identity object signifies a single user and contains the user name and an authenticating security provider. An example of an identity object could be a user logged into a domain and authenticated by Windows, Kerberos, or another security provider. A principal object represents the combination of an identity and the roles that identity can assume, such as administrator or guest.

The identity object contains a name and an authentication type. The name can be a user name or a Windows account name. The authentication type can be any supported logon protocol, such as Kerberos, or some custom value. There are different types of identity objects to allow greater control over users logging into different domain types. The GenericIdentity object is used for custom logons, and the WindowsIdentity object is specialized for Windows authentication. .NET uses other standard authentication modules as well, so there are also FormsIdentity and PassportIdentity objects.

Access to the name and authentication type attributes of identity objects is provided through the IIdentity interface, which is implemented by all Identity classes. An application that grants privileges to users who aren’t in a Windows user group will use identity objects created with the GenericIdentity class or another specially written class that implements the IIdentity interface. A common use for role based security is to check whether a Windows user has permission to access a certain resource.
The WindowsIdentity class is provided for this situation; it contains the information about the authenticated Windows user. The following code example demonstrates how user attributes are obtained.

    Imports System
    Imports System.Security.Principal

    Module WindowsIdentityExample
        Public Sub Main()
            Dim CurIdentity As WindowsIdentity
            CurIdentity = WindowsIdentity.GetCurrent()
            ' Print each attribute of the current Windows identity.
            Console.WriteLine("Name: " & CurIdentity.Name)
            Console.WriteLine("Authenticated?: " & CurIdentity.IsAuthenticated)
            Console.WriteLine("Authentication: " & CurIdentity.AuthenticationType)
            Console.WriteLine("Anonymous?: " & CurIdentity.IsAnonymous)
            Console.WriteLine("Guest?: " & CurIdentity.IsGuest)
            Console.WriteLine("System?: " & CurIdentity.IsSystem)
        End Sub
    End Module

The output of this code when executed is similar to this:

    Name: FACULTY\Lester Lipsky
    Authenticated?: True
    Authentication: Kerberos
    Anonymous?: False
    Guest?: False
    System?: False

The principal object represents what a user is able to do, in terms of which roles can access which resources. It associates users with roles. Access to resources is granted based on the role or roles within a principal object. As with identity objects, .NET provides GenericPrincipal and WindowsPrincipal objects. The IPrincipal interface is used for accessing properties of principal objects, such as the associated Identity object, and for determining whether a user is a member of a certain role. All principal classes implement the IPrincipal interface. For instance, the WindowsPrincipal class implements the IPrincipal interface along with extra methods that map Windows NT/2000 group memberships to roles. A principal object is associated with an application domain through a call context object, and there is a call context object for every thread in each process.

Security checks can be done against principal objects imperatively, declaratively, or by direct access. Through imperative or declarative security checks, managed code can determine whether a certain principal object is a member of a recognized role, has an accepted identity, or is impersonating an identity by acting in a role.
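For comparison with the J2EE side of this discussion, the identity/principal/role idea can be sketched in plain Java. This is not the .NET API and not JAAS: RoleSketch, RolePrincipal, isInRole and hello are hypothetical names chosen to mirror the concepts above, and the role name "DBUser" is only sample data.

```java
import java.security.Principal;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical analogue of a principal object; for illustration only.
public class RoleSketch {

    // Combines an identity (the name) with the roles that identity can assume.
    static final class RolePrincipal implements Principal {
        private final String name;
        private final Set<String> roles;

        RolePrincipal(String name, String... roles) {
            this.name = name;
            this.roles = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(roles)));
        }

        @Override
        public String getName() {
            return name;
        }

        boolean isInRole(String role) {
            return roles.contains(role);
        }
    }

    // Privileged operation guarded by an imperative role check.
    static String hello(RolePrincipal caller, String text) {
        if (!caller.isInRole("DBUser")) {
            throw new SecurityException("Unauthorized user: " + caller.getName());
        }
        return text + " was run under permitted user: " + caller.getName();
    }

    public static void main(String[] args) {
        RolePrincipal tester = new RolePrincipal("tester", "DBUser");
        RolePrincipal guest = new RolePrincipal("visitor", "Guest");
        System.out.println(hello(tester, "Security"));
        try {
            hello(guest, "Security");
        } catch (SecurityException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A declarative check would move the role requirement out of the method body and into metadata on the method, which is what the PrincipalPermission attribute does in .NET.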
Values of principal objects can be accessed directly without a PrincipalPermission object. This is done by reading the values of the current thread’s principal or by using the IsInRole method.

The following code example uses a P/Invoke call into the Win32 API LogonUser to switch the identity of the currently executing thread. The identity switch allows the thread to call into some .NET managed code. Declarative or imperative Identity role checking is done to prevent unauthorized callers from executing privileged code. When the privileged method call has completed and the impersonation is finished, the thread is switched back to its regular identity (the equivalent of the Win32 RevertToSelf() call; the code below does this with the Undo() method of the impersonation context). This code example is courtesy of Peter Bromberg at Eggheadcafe.com.

The first class, Impersonation, takes care of the calls to the APIs. This class has one major function: to accept a machine name, an "authorized user" name to check against, and a new username and password to impersonate, assuming the passed-in username is correct. In the test harness that accompanies the application, the authorized Machine\UserName or DOMAIN\UserName is set for it to check against in the line "If mWI1.Name <> strCurrentUserName Then Return 1". Since this function returns a Windows Identity token, it simply sends back a "1" to show that the call failed the authorized user check, so that the failure can be handled in the calling code. For convenience, the ImpersonationContext object is stored in the AppDomain (application domain) cache under the token integer converted to a string, so that it can be accessed at any time.
    Imports System
    Imports System.Diagnostics
    Imports System.Runtime.InteropServices
    Imports System.Security.Principal
    Imports System.Security.Permissions
    Imports PAB

    Public Class Impersonation

        <DllImport("advapi32.dll")> _
        Public Shared Function LogonUser(ByVal lpszUsername As String, _
                ByVal lpszDomain As String, ByVal lpszPassword As String, _
                ByVal dwLogonType As Integer, ByVal dwLogonProvider As Integer, _
                ByRef phToken As Integer) As Boolean
        End Function

        <DllImport("kernel32.dll")> _
        Public Shared Function GetLastError() As Integer
        End Function

        Public Function ImpersonateNewUser(ByVal strMachineName As String, _
                ByVal strCurrentUserName As String, ByVal strNewUserName As String, _
                ByVal strNewPassword As String) As Integer
            ' If a DOMAIN account, strMachineName should be the DOMAIN.
            ' The Windows NT user token.
            Dim token1 As Integer
            ' Get the user token for the specified user, machine, and password
            ' using the unmanaged LogonUser method.
            ' The parameters for LogonUser are the user name, computer name, password,
            ' logon type (3 = LOGON32_LOGON_NETWORK), logon provider
            ' (0 = LOGON32_PROVIDER_DEFAULT), and user token.
            Dim loggedOn As Boolean = LogonUser(strNewUserName, strMachineName, _
                strNewPassword, 3, 0, token1)
            Debug.WriteLine("LogonUser called")
            Debug.WriteLine("LogonUser Success? " + loggedOn.ToString())
            Debug.WriteLine("NT Token Value: " + token1.ToString())

            ' First get the credentials of the actual user that's already in context:
            Debug.WriteLine("Before impersonation:")
            Dim mWI1 As WindowsIdentity = WindowsIdentity.GetCurrent()
            Debug.WriteLine(mWI1.Name)
            Debug.WriteLine(mWI1.Token)
            If mWI1.Name <> strCurrentUserName Then Return 1

            ' Now set our IntPtr (token) to the new user whose token we want to impersonate:
            Dim token2 As IntPtr = New IntPtr(token1)
            Debug.WriteLine("New identity created:")
            ' Now instantiate the new WindowsIdentity object:
            Dim mWI2 As WindowsIdentity = New WindowsIdentity(token2)
            Debug.WriteLine(mWI2.Name)
            Debug.WriteLine(mWI2.Token)

            ' Now go ahead and impersonate the new user, and cache the
            ' impersonation context under the token value.
            Dim mWIC As WindowsImpersonationContext = mWI2.Impersonate()
            Dim sCurrentImp As String = mWI2.Token.ToString()
            System.AppDomain.CurrentDomain.SetData(sCurrentImp, mWIC)
            Debug.WriteLine("After impersonation:")
            Return mWI2.Token.ToInt32()
        End Function

        Public Function Revert(ByVal sToken As String) As Integer
            ' Look up the cached impersonation context and undo the impersonation.
            Dim mWI3 As WindowsImpersonationContext = _
                CType(System.AppDomain.CurrentDomain.GetData(sToken), WindowsImpersonationContext)
            mWI3.Undo()
            Debug.WriteLine("After impersonation is reverted:")
            Dim mWIR As WindowsIdentity
            mWIR = WindowsIdentity.GetCurrent()
            Debug.WriteLine(mWIR.Name)
            Debug.WriteLine(mWIR.Token)
            Dim retval As Integer = mWIR.Token.ToInt32()
            Return retval
        End Function

    End Class

The next class acts as privileged code. A function called “Hello” returns the name of the Identity that successfully accessed it. The Identity is checked declaratively using the “<PrincipalPermissionAttribute(SecurityAction.Demand, Role:="PETER\DBUser")>” attribute.
The code to imperatively check the Identity is commented out and looks like this:

    'If Not (MyPrincipal.IsInRole("PETER\DBUser")) Then
    '    Throw New System.Security.SecurityException("Unauthorized User")
    'End If
    'If Not (MyPrincipal.Identity.Name.Equals("PETER\tester")) Then
    '    Throw New System.Security.SecurityException("Unauthorized User")
    'End If

The Principal Policy of the current application domain is also set to Windows Principal.

    Imports System.Diagnostics
    Imports System.Security.Principal
    Imports System.Security.Permissions
    Imports System.Threading

    Public Class Testme

        <PrincipalPermissionAttribute(SecurityAction.Demand, Role:="PETER\DBUser")> _
        Public Function Hello(ByVal strText As String) As String
            Dim mWI1 As WindowsIdentity = WindowsIdentity.GetCurrent()
            Debug.Write(mWI1.Name)
            Return strText + " was run under permitted user:" & mWI1.Name & vbCrLf
        End Function

        Public Sub New()
            ' Get the current principal and put it into a principal object.
            AppDomain.CurrentDomain.SetPrincipalPolicy(PrincipalPolicy.WindowsPrincipal)
            Dim MyPrincipal As WindowsPrincipal = CType(Thread.CurrentPrincipal, WindowsPrincipal)
            ' NOTE: uncomment the following lines and comment out the
            ' <PrincipalPermissionAttribute(SecurityAction.Demand, Name:="PETER\tester", Role:="PETER\DBUser")> _
            ' before the "Hello" function to do checks imperatively
            'If Not (MyPrincipal.IsInRole("PETER\DBUser")) Then
            '    Throw New System.Security.SecurityException("Unauthorized User")
            'End If
            'If Not (MyPrincipal.Identity.Name.Equals("PETER\tester")) Then
            '    Throw New System.Security.SecurityException("Unauthorized User")
            'End If
        End Sub

    End Class

This class is the test harness that puts everything together and generates a form to test out the other classes.
    Imports System.Security.Principal
    Imports System.Security.Permissions
    Imports System.Threading

    Public Class Form1
        Inherits System.Windows.Forms.Form

        Dim retval2 As Integer
        Dim retval As Integer

        Private Sub Button1_Click(ByVal sender As System.Object, _
                ByVal e As System.EventArgs) Handles Button1.Click
            Dim mWI2 As WindowsIdentity = WindowsIdentity.GetCurrent()
            ' Show original identity:
            TextBox1.Text += "Original Identity: " & mWI2.Name & vbCrLf
            ' Impersonate the required user:
            Dim ip As PAB.Impersonation = New PAB.Impersonation()
            retval = ip.ImpersonateNewUser(System.Environment.MachineName, _
                TextBox2.Text, TextBox3.Text, TextBox4.Text)
            If retval = 1 Then
                MessageBox.Show("UH-OH, Wrong user running app!")
                Exit Sub
            End If
            ' Show stored impersonation token
            TextBox1.Text += "Impersonation Token: " & retval.ToString()
            ' Make the privileged call...
            Dim tD As Testme = New Testme()
            TextBox1.Text += vbCrLf & tD.Hello("Security")
            tD = Nothing
            ' Revert to original user
            Dim retval2 As Integer = ip.Revert(retval.ToString())
            TextBox1.Text += vbCrLf & "Reverted: " & retval2.ToString() & vbCrLf
            ' Show current identity
            TextBox1.Text += "Final Identity: " & mWI2.Name
            ip = Nothing
        End Sub

        Private Sub Button2_Click(ByVal sender As System.Object, _
                ByVal e As System.EventArgs) Handles Button2.Click
            TextBox1.Clear()
            Dim mWI1 As WindowsIdentity = WindowsIdentity.GetCurrent()
            TextBox1.Text += "Current Identity: " & mWI1.Name
            Dim tD As Testme = New Testme()
            TextBox1.Text += vbCrLf & tD.Hello("Security")
            tD = Nothing
            mWI1 = WindowsIdentity.GetCurrent()
            TextBox1.Text += "Identity: " & mWI1.Name
        End Sub

    End Class

The user information in the example below is native to the machine the code was run on.

Clicking the NO Security button causes the code to attempt to instantiate the TestMe class and call its Hello method.
This method call will fail, because the security checks will raise a Security Exception.

After an authorized user, a DBUser and a password are filled in and the Security button is clicked, the code calls the impersonation API: “ip.ImpersonateNewUser(System.Environment.MachineName, TextBox2.Text, TextBox3.Text, TextBox4.Text)”. Since the call to the Hello method is now being run under the required user / role identity, the call will succeed.

6.0 Interoperability

In this section we discuss the interoperability of the .NET architecture as a whole. The two areas we have researched are interoperability of languages and middleware interoperability.

6.1 Interoperability of languages

By interoperability of languages we mean the level and ease of integration with other languages. Both the Java Virtual Machine and the Common Language Runtime allow you to write code in many different languages, so long as they compile to byte code or IL code respectively. However, the .NET platform has done much more than just allow other languages to be compiled to IL code.

One of the advantages of the .NET platform is language independence. Since all the code on .NET ultimately gets converted to IL, one can incorporate components written in any language supported by the CLR. An example of this is that the Framework classes (System namespaces) that come with the .NET SDK have been written in C#, but you can use these classes from any language, such as VB.NET or Managed C++, without ever writing any extra code. .NET allows multiple languages to freely share and extend each other’s libraries to a great extent. For instance, an Eiffel or Visual Basic programmer could import a C# class and override a virtual method of that class, and the C# object would now use the Visual Basic method (polymorphism). VB.NET has been massively
Languages written for .NET will generally plug into the Visual Studio.NET environment and use the same RAD frameworks if needed, thus overcoming the secondary effect of using another language. C# provides P/Invoke, which is a much simpler (no-dlls) way to interact with C code than Java's JNI. This feature is very similar to J/Direct, which is a feature of Microsoft Visual J++. It does not matter if you use VB.NET, C#, Managed C++, COBOL.NET, Perl.NET or any other language on the .NET they all produce the same code. So now on the .NET all languages have the same efficiency and power. It is just the developers’ choice to choose the language whose syntax he is most comfortable with. Cross language interoperability is the ability to access constructs written in one programming language from another. There are a number of ways cross language interoperability works in Java. First of all, there is the Java Native Interface (JNI), which is a mechanism that allows Java programs call native methods written in C, C++ or assembly language. The C, C++ or assembly methods have to be written in a specific way to be called from Java. Native methods can use JNI to access Java features such as calling Java language methods, instantiating and modifying Java classes, throwing and catching exceptions, performing runtime type checking, and loading Java classes dynamically. Java also has the ability to interact with distributed objects that use the common object request broker architecture (CORBA) via Java IDL. CORBA is a technology that allows developers to make procedure calls on objects in a location and language agnostic manner. CORBA has a language-agnostic interface definition language (IDL), which has various mappings for languages that support CORBA. Java IDL supports the mappings 59 from Java objects to CORBA IDL objects. Various ORBs support CORBA language bindings for a number of languages including C, C++, Java, Python, Lisp, Perl, and Scheme. 
The most seamless form of cross-language interoperability in Java is achieved when a language is compiled directly to Java byte code. Objects written in that language are then available to Java programs, and Java objects are available to programs written in that language. A good example is Jython, a version of the Python programming language that is integrated with the Java platform. A number of projects, in various degrees of completion, aim to provide a similar degree of cross-language interoperability within the confines of the Java Virtual Machine. A list of languages retargeted for the Java Virtual Machine is available on the web page of Dr. Robert Tolksdorf [37]. Currently, Sun Microsystems (creator of the Java language and platform) seems to be uninterested in this level of cross-language interoperability and has left it to independent developers and researchers.

With seamless cross-language interoperability, objects can inherit implementation from other types, instantiate and invoke methods defined on other types, and otherwise interact with objects regardless of the language in which the types were originally implemented. Tools such as class browsers, debuggers, and profilers need to understand only one format (be it Java byte code or .NET intermediate language) yet can support a multitude of languages, as long as those languages target the appropriate runtime. Error handling across languages via exceptions is also possible.

C# and the .NET runtime were created with seamless cross-language interoperability as a design goal. A language targeting the .NET common language runtime (CLR) is able to interact with other languages that conform to the common type system and that include certain metadata when compiled.
The common type system defines how types are declared, used, and managed in the .NET runtime, creating a framework that provides type safety and ensures that objects written in various languages can share type information. Metadata is binary information describing the assemblies, types and attributes defined in an application; it is stored either in a CLR portable executable (PE) file or in memory once the assembly has been loaded. Languages currently being developed to target the .NET runtime include APL, C#, C++, COBOL, Component Pascal, Eiffel, Haskell/Mondrian, Java, Mercury, Oberon, Perl, Python, Scheme, Smalltalk, Standard ML, and Visual Basic.

Since certain features of one language may well have no analog in another, the .NET Framework provides the Common Language Specification (CLS), which describes a fundamental set of language features and defines rules for how those features are used. The CLS rules are a subset of the common type system, aimed at ensuring cross-language interoperability by defining a set of features common to most programming languages. The C# compiler is CLS compliant, meaning that it can be used to generate code that complies with the CLS. It can also check for CLS compliance and issues an error when code uses functionality that the CLS does not support. To have the C# compiler check a piece of code for CLS compliance, mark it with the CLSCompliantAttribute(true) attribute.

C# programs can also call almost any function in any DLL by combining the extern keyword with the DllImport attribute on the method declaration. A major advantage is that the method being called does not have to be written specifically to be called from C#, nor is any "wrapper" necessary, so calling existing code in DLLs is relatively simple.
6.2 Middleware Interoperability

As the final part of our project we researched the middleware interoperability of .NET components: in particular, how .NET components interact with COM components, how Remoting in .NET compares with its Java counterparts (RMI/RPC), and, to a certain extent, how .NET components interact with CORBA/JINI.

6.2.1 COM

Microsoft has relied on COM over the past few years, so it was a necessity for .NET to interoperate with COM. .NET can access COM as a client, just as COM can access .NET as a client. A .NET client accesses a COM server by way of a runtime callable wrapper (RCW). The RCW wraps the COM object and acts as the mediator between the COM object and the .NET CLR. This makes the COM object appear as a native .NET object to the .NET client, and makes the .NET client look like a COM client to the COM object.

[Figure: Client Access through RCW]

An RCW can be created in two ways: through the GUI forms in Visual Studio, or with a command-line tool called TlbImp.exe. Both use a .NET runtime class called System.Runtime.InteropServices.TypeLibConverter to read the type library and generate the RCW code.

A COM client can also access a .NET object. A COM Callable Wrapper (CCW) wraps the .NET object and mediates between the COM client and the managed object. The .NET component’s assembly has to be signed with a strong name so that the CLR can identify it when it operates with the CCW.

[Figure: COM Callable Wrapper (CCW)]

Just as permissions are set on all other code that runs inside the .NET framework, there is a way to limit which methods, interfaces, and classes a COM client can call. A metadata attribute called System.Runtime.InteropServices.ComVisible can be applied to any assembly, class, interface, or method to allow or deny requests from COM.

6.2.2 .NET Remoting vs. RMI

The distribution model in .NET is deliberately skeletal, so that Remoting can be configured to use different protocols.
Microsoft calls this concept channels. Because there are currently so many different message formats and protocols, Microsoft chose to ship HTTP and TCP channels with .NET rather than continually adapting its middleware to constantly changing formats and protocols. .NET does allow custom channels to be created using a developer's choice of protocol and message format, for example IIOP over TCP. RMI connects distributed objects over sockets, using either IIOP or the Java Remote Method Protocol (JRMP), Sun’s own protocol built on TCP sockets.

A client and server in .NET connect directly using a specific port. No naming service is used for services to register their proxies, because the proxies are hidden from the client. RMI supplies a naming service, the RmiRegistry, in which servers register their remote objects. The RmiRegistry must be launched before any communication can occur between a client and server.

The following is an example of an object, a server, and a client, each implemented in C# and Java, to demonstrate how Remoting and RMI differ. The first two listings show the object written in C# and Java respectively. Remoting makes it possible to implement a server directly as a class, whereas RMI requires the creation of distributed interfaces.
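Before turning to those listings, the registry-first ordering that RMI imposes can be sketched in a single Java process. This is an illustrative sketch under stated assumptions, not the code used in this project: the class RmiRegistrySketch and the helper roundTrip are hypothetical names, the registry is created in-process on an arbitrary free port instead of being launched as a separate RmiRegistry program, and the host name is pinned to the loopback address so that the example does not depend on name resolution.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical single-process walk through the RMI steps: registry, bind, lookup, call. */
final class RmiRegistrySketch {

    /** Distributed interface: it extends Remote and every method declares RemoteException. */
    public interface Hello extends Remote {
        String helloMethod(String name) throws RemoteException;
    }

    /** Plain implementation; it is exported explicitly below instead of subclassing. */
    static final class HelloImpl implements Hello {
        @Override
        public String helloMethod(String name) {
            return "Hello " + name;
        }
    }

    /** Starts a registry, binds the object, recovers the proxy by name and calls it. */
    static String roundTrip(String name) {
        // Assumption: keep all RMI traffic on the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = null;
        HelloImpl impl = new HelloImpl();
        try {
            // Step 1: the registry must exist before anything can be bound (0 = any free port).
            registry = LocateRegistry.createRegistry(0);
            // Step 2: export the object to obtain its stub, then publish the stub under a name.
            Hello stub = (Hello) UnicastRemoteObject.exportObject(impl, 0);
            registry.rebind("SayHello", stub);
            // Step 3: the client side recovers the proxy by name and calls through it.
            Hello proxy = (Hello) registry.lookup("SayHello");
            return proxy.helloMethod(name);
        } catch (Exception e) {
            throw new IllegalStateException("RMI round trip failed", e);
        } finally {
            // Unexport both objects so that RMI threads do not keep the JVM alive.
            quietUnexport(impl);
            quietUnexport(registry);
        }
    }

    private static void quietUnexport(Remote object) {
        if (object == null) {
            return;
        }
        try {
            UnicastRemoteObject.unexportObject(object, true);
        } catch (RemoteException ignored) {
            // The object was never exported; nothing to clean up.
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("World"));
    }
}
```

A client in a separate process would obtain the registry with LocateRegistry.getRegistry or java.rmi.Naming.lookup instead of reusing the Registry object directly. The ordering of the three steps is the point of the sketch; Remoting has no counterpart to the first one.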
C#

    using System;
    using System.Runtime.Remoting;
    using System.Runtime.Remoting.Channels;
    using System.Runtime.Remoting.Channels.Tcp;

    namespace NETRemotingSamples
    {
        /// <summary>
        /// Distributed object
        /// </summary>
        public class HelloServer : MarshalByRefObject
        {
            public HelloServer()
            {
                // activated with each call in "SingleCall" mode
                // activated once in "Singleton" mode (shared state)
                Console.WriteLine("Activate distributed Object!");
            }

            public string HelloMethod(string name)
            {
                Console.WriteLine("HelloMethod: {0}", name);
                return "Hello " + name;
            }
        }
    }

Java (Hello.java)

    package hello;

    /** Interface defining the remote service sayHello */
    public interface Hello extends java.rmi.Remote {
        /** Note the declared exception: RemoteException */
        public String helloMethod(String name) throws java.rmi.RemoteException;
    }

Implementation

    package hello;

    import java.rmi.server.*;

    // Class that implements the RMI interface Hello.
    public class HelloImpl extends UnicastRemoteObject implements Hello {

        // The superclass constructor exports the object and may throw RemoteException
        public HelloImpl() throws java.rmi.RemoteException {
            super();
        }

        // Implement the method(s) declared in the corresponding RMI interface:
        public String helloMethod(String name) throws java.rmi.RemoteException {
            System.out.println("Hello.helloMethod called");
            return "Hello " + name;
        }
    }

In Remoting, the server object has public and private methods like any other object; it does not need to implement any particular interface, so no special typing is required. Any distributed interface in RMI must be typed, which is done by extending the interface java.rmi.Remote. The next two code snippets are only the Main() methods of the servers written in C# and Java respectively.
C#

    public static int Main(string[] args)
    {
        // Declare the channel
        TcpChannel chan = new TcpChannel(8085);
        ChannelServices.RegisterChannel(chan);

        // Register the server object; the type name is namespace-qualified,
        // and "helloserver" is the assembly name used in this example
        RemotingConfiguration.RegisterWellKnownServiceType(
            Type.GetType("NETRemotingSamples.HelloServer, helloserver"),
            "SayHello",
            WellKnownObjectMode.Singleton);

        System.Console.WriteLine("Hit <enter> to exit");
        System.Console.ReadLine();
        return 0;
    }

Java

    public static void main(String args[]) throws Exception {
        // Create the instance
        HelloImpl obj = new HelloImpl();

        // Register the object (port is assumed to be defined elsewhere in the server class)
        java.rmi.Naming.rebind("rmi://localhost:" + port + "/SayHello", obj);
        System.out.println("Bound RMI object in registry");
    }

Conceptually, the .NET client and the Java client are alike. Both need to recover the proxy of the server so that they can call a method on the distributed object. RMI uses the proxy generated by RMIC (a stub), and Remoting uses a transparent proxy to keep it hidden from the client. The following two pieces of code represent the client for this application, written in C# and Java respectively.

C#

    using System;
    using System.Runtime.Remoting;
    using System.Runtime.Remoting.Channels.Tcp;
    using NETRemotingSamples;

    public class HelloClient
    {
        public static void Main(String[] args)
        {
            // pass the server URL and its type;
            // the Activator returns a proxy
            HelloServer helloProxy = (HelloServer) Activator.GetObject(
                typeof(HelloServer),
                "tcp://localhost:8085/SayHello");

            // the method call is carried out through the proxy
            helloProxy.HelloMethod("salut bilou!");
        }
    }

Java

    package hello;

    import java.rmi.*;

    public class HelloClient {
        public static void main(String args[]) throws Exception {
            // host and port are assumed to be defined elsewhere in the client class
            Hello helloProxy = (Hello) Naming.lookup(
                "//" + host + ":" + port + "/SayHello");
            String message = helloProxy.helloMethod("hello");
        }
    }

Summary Table

                          .NET Remoting                                RMI
    Proxy                 Dynamic                                      Static (rmic) or dynamic
    Skeletons             Integrated into the Framework                Integrated into the Framework
    Distributed object    Classes                                      Remote interfaces
    Configuration         XML file                                     System property
    Distributed directory No (internal system tables of ObjRef)        RmiRegistry
    Addition of protocols Channels                                     SocketFactoryImpl
    Addition of formats   Formatters                                   Serialization
    Activation            SingleCall, Singleton or client-activated    Server-side API, Activatable objects
    Custom proxy          Custom RealProxy                             Dynamic Proxy
    Existing protocols    HTTP, TCP, SOAP                              JRMP, IIOP, ORMI (Orion), T3 (WebLogic)...
    Error management      Remote exceptions                            Remote exceptions

6.2.3 CORBA

CORBA is one of the most widely used middleware technologies today. Some of this growth is due to the fact that Java adopted IIOP as the main protocol for EJBs. Currently Microsoft does not support the ability for .NET to interoperate with a Java/CORBA object using IIOP. If .NET could interoperate with objects through IIOP, interoperability between the two worlds would be considerably more advanced. IIOP could also be a good alternative to TCP remoting.

There are some complex ways to integrate CORBA and .NET. They are complex because CORBA has its own interface definition language, IDL, and relies on a stub compiler to generate stubs. The first solution is to have .NET contain a Java ORB.
This means that Microsoft’s J# would have to be able to use the classes of the JDK version used by the Java object it connects to. Any such .NET object would therefore need to carry a special J# compiler and an IDL compiler, and the ORB being used would need to be compatible with that JDK version too.

Another solution is to create a new channel for .NET Remoting. A new channel using TCP/IIOP could be integrated easily into the .NET architecture. Because .NET Remoting supports custom channels, this is an easy way to integrate CORBA while still using the other services supplied by Remoting. As easy as this sounds, no such channel has been implemented yet.

A third solution is to use wrappers. For a .NET object to interoperate with a Java application server, an RCW can be used to call a COM object, which can then work with the Java server. In this situation the server only needs to communicate with the COM object and does not have to carry anything special to do so. If instead a .NET server is to be accessed by a Java client, a CCW is needed on the .NET server. As mentioned above, the .NET server must meet particular requirements before it can be used through a CCW.

7.0 Conclusions and future work

The languages of the .NET and J2EE platforms, particularly C# and Java, show a striking resemblance. Our studies indicate that the two languages are similar enough that they could be made to mirror each other without significant effort by either user base, if so required. Our research showed that trying to decide which one is better overall is a futile exercise. In certain respects, such as cross-language portability, the .NET languages are far better placed than Java. Conversely, Java has the advantage in cross-platform interoperability, an area where C# and all .NET languages fail miserably. Both languages have weaknesses in terms of standards interoperability.
In terms of language features, most developers, especially those with a background in C or C++, would find that features like operator overloading, pointers, preprocessor directives, delegates and deterministic object cleanup make C# more expressive than Java in a number of cases. Similarly, Java developers who learn C# will be pleasantly surprised by features such as boxing, enumerations and pass by reference, whose absence from Java becomes apparent once one has used them in C#. On the other hand, the lack of checked exceptions and inner classes, the lack of cross-platform portability, and the fact that a class is not the smallest unit of code distribution make the choice of C# over Java less satisfactory.

However, the true worth of these languages will be seen in how quickly they evolve to adapt to the changing technological landscape and in the kind of community that surrounds each language. .NET is still in its initial stages. Sun has done a great job with Java, although the lack of versioning support and the absence of a framework built into the platform for extending the language make drastic evolution difficult. C#, with its support for versioning via the .NET framework and its attributes, which can be used to extend the features of the language, looks likely to be the more adaptable language in the long run.

Microsoft has made the choice of database largely a separate concern for developers by integrating XML with ADO.NET. Regardless of the database used, the code for querying it can remain the same. As XML becomes more popular and widely used in the industry, use of the .NET interfaces to develop applications will naturally increase. XML also provides broad interoperability, because it is supported by all major database manufacturers. DataSet is the major selling point for .NET.
DataSet objects make database connectivity inexpensive by holding a large amount of the necessary data in local memory, so that a continuous connection to the database is not needed.

The security of the .NET framework is good in the right environment. A .NET distributed network running on top of Windows platforms is the best environment for .NET because of its dependencies on, and integration with, Windows. Code-based access control is a strong point of .NET. Many standard permission sets and security policy files are implemented by default. .NET also requires much stronger evidence about an assembly than Java does. The role-based access control of .NET is decent when used with a Windows OS and not too good when used with another OS. Both imperative and declarative checking can be done very easily in .NET, again only if Windows is the platform. Code verification and safe execution are well done in .NET. By enforcing static domain boundaries, Microsoft has removed both the privilege and the burden of letting objects directly access objects outside their own domain. However, like Java, .NET does allow unmanaged code to be executed, and this could be troublesome in the hands of an unsophisticated developer. Communication can be secured if the developer uses SSPI well or keeps IIS continuously patched. Windows supplies a crypto API that .NET uses to protect code and data, which is good, but .NET does not have this API available on another OS. On a Windows platform .NET deals well with the most common security issues, but it needs improvement to deal with these issues on another OS.

This paper is only an initial coverage of a few topics in .NET. C#, ADO.NET, and security issues were all researched and analyzed in detail. There are many more topics within .NET for which the same could be done, ASP.NET and web services for instance.
Sun created the Java Pet Store, a sample application that exercises a broad range of the Java platform, and Microsoft implemented the same application on the .NET framework and called it the .NET Pet Shop. Another angle of future work could be to break down the Java Pet Store and the .NET Pet Shop and analyze them using the information in this paper, together with other topics such as how reusable the code is in both applications.

Research can also be done on the CLR and Java. Every .NET language is compiled to IL, which the CLR then JIT-compiles and executes. Java is not among the languages supported by Visual Studio.NET, and the CLR does not understand Java byte code. A prototype extension or plug-in for the CLR that compiles Java into IL could be designed.

Perhaps the most complex and potentially rewarding future work arising from this paper is to advance middleware interoperability. As far as Microsoft is concerned, .NET’s middleware interoperability is good because .NET can interact with COM and customization is supported in certain areas. Interoperability would be far more advanced if Microsoft provided an IIOP channel as standard. Unfortunately it does not, but a prototype of a custom channel using TCP/IIOP to connect a .NET component with an EJB/CORBA component could be designed and implemented as future work. No such prototypes have been designed yet, and research is continuing on other ways to connect .NET and CORBA.

Appendix A. Topic-wise Breakdown by Student

A.1 Jaladhi Mehta

Jaladhi was responsible for the research on C#. There were many topics worthy of comparison between C# and Java, but we limited him to what the group felt was important. The focus was on understanding C# as a language, comparing its features to Java, and analyzing how C# supports cross-platform and standards interoperability. Jaladhi was also responsible for interoperability of languages in .NET. Because his original research area was the language, this topic became his when it was added to the project.
A.2 Hardik Dave

Hardik researched database connectivity, primarily because he had experience with ADO.NET prior to this project. ADO.NET and JDBC offered a wide range of topics on database connectivity. The focus of this paper was on DataSets as offline database access and on XML as a medium of interoperability. Hardik also worked on .NET Remoting and RMI to help Keith with the middleware interoperability of .NET.

A.3 Keith Bessette

Keith was responsible for researching security issues within .NET. There are many security issues within any platform, but the focus was kept on the main security issues that both .NET and J2EE cover. Keith also researched the middleware interoperability of .NET and found that this is one of the least researched areas of .NET.

1. References
2. [ADO.NET] http://msdn.microsoft.com/vstudio/techinfo/articles/upgrade/adoplus.asp
3. [ADO to XML] http://msdn.microsoft.com/msdnmag/issues/01/08/data/print.asp
4. [ADO.NET DataSet for Multitiered Applications] http://msdn.microsoft.com/msdnmag/issues/02/01/data/print.asp
5. [ADO.NET VS. JDBC] http://www.dotnetguru.org/article.php?sid=51
6. [C# ehhco] http://www.ehhco.com/csharp/cintro.htm
7. [C# vs. Java] http://www.thetestplace.com/talks/csharpVsjava.ppt
8. [CLR Security] http://msdn.microsoft.com/library/default.asp?url=/library/enus/dnmag01/html/CAS.asp
9. [Comparing Microsoft.NET and the Java Environment] http://www-wi.fhreutlingen.de/dbtech/lectures/dotNET_Chappell.pdf
10. [Dotnetguru] http://www.dotnetguru.org
11. [Gazzit00] Gazitt, Omri, http://msdn.microsoft.com/msdnmag/issues/1100/default.asp
12. [Genamics developer] http://genamics.com/developer/csharp_comparative.htm
13. [Gotdotnet] http://gotdotnet.com
14. [Gunnerson01] Gunnerson, Eric. A Programmer's Introduction To C#. Apress, 2001.
15. [J2EE vs. The .NET Platform] www.objectwatch.com/FinalJ2EEandDotNet.doc
16. [J2EE vs. Microsoft.NET] www.theserverside.com/resources/pdf/J2EE-vsDotNET.pdf
17. [J2EE/.NET] http://www.machrotech.com/DotNet/FinalJ2EEandDotNet.htm
18. [J2EE and .NET security] http://www.owasp.org/downloads/J2EEandDotNetsecurityByGerMulcahy.pdf
19. [Jdance] www.jdance.com
20. [Java 2, Enterprise Edition and .Net: Can We Live Together?] http://java.sun.com/features/2002/04/j2eenms.html
21. [JAVA/C#] http://www.csharp-station.com/Articles/JavaVsCSharp.aspx
22. [JAVA/C#] www.thetestplace.com/accu2002/slides
23. [Java access control mechanisms] research.sun.com/techrep/2002/smli_tr-2002108.pdf
24. [JDBC Architecture] http://octopus.cdut.edu.cn/~yf17/javaent/jenut/ch02_01.htm
25. [JDBC Specifications] http://java.sun.com/j2se/1.4/docs/guide/jdbc
26. [Microsoft Developer’s Network] http://msdn.microsoft.com/
27. [Middleware Interoperability: COM] http://msdn.microsoft.com/msdnmag/issues/01/08/Interop/default.aspx
28. [Middleware Interoperability: CORBA] http://216.239.35.120/translate_c?hl=en&u=http://www.dotnetguru.org/articles/Reflexion/corbadotnet/CorbaDotNet.htm
29. [Mono Project] http://www.go-mono.com/index.html
30. [.NET Framework SDK Documentation] http://msdn.microsoft.com/library/default.asp?url=/library/enus/nfstart/html/sdkstart.asp
31. [.NET Application Domain FAQ] http://www.gotdotnet.com/team/clr/AppdomainFAQ.aspx
32. [.NET architecture and security] http://home.att.net/~s-prasad/MS.net.PPT
33. [.NET Remoting vs. RMI] http://translate.google.com/translate?hl=en&sl=fr&u=http://www.dotnetguru.org/articles/Reflexion/RemotingDrawbacks/RemotingDrawbacks.html&prev=/search%3Fq%3Dremoting%2B.NET%2BRMI%26hl%3Den%26lr%3D%26ie%3DUTF-8
34. [Security in the .NET framework] http://www.microsoft.com/technet/treeview/default.asp?url=/technet/itsolutions/net/evaluate/fsnetsec.asp
35. [Security in Java] http://java.sun.com/docs/white/langenv/Security.doc.html
36. [Software Engineers Put .NET and Enterprise Java Security to the Test] http://www.devx.com/enterprise/articles/dotnetvsjava/GK0202-1.asp
37. [Tolksdorf] Tolksdorf, Robert http://grunge.cs.tu-berlin.de/~tolk/vmlanguages.html
38. [Walter98] Walter, Mark, http://www.xml.com/pub/a/SeyboldReport/ipx981102.html
39. [XML Interoperability] http://www.devx.com/upload/free/Features/vbpj/2002/04apr02/da0402p.asp