Advanced Virtual Interconnect GRID - AviGRID

Avinash Shankar
School of Informatics and Computer Science
University of Wollongong
asn68@uow.edu.au

Abstract

GRID computing is revolutionizing the way end-user computing is simplified and mass computing resources are utilized to execute programs faster, with or without the knowledge of the end user. Owing to network traffic and limited bandwidth, GRID computing still has a long way to go before it achieves global acceptance and becomes an everyday commodity like the Internet. The major objectives of any GRID system are to minimize the idleness of systems across networks and to migrate processes across a variety of network topologies. The proposed architecture is an extension of my previous work, CBReM [2]. An extensible and scalable middleware layer is built around operating systems such as UNIX and Linux to make the architecture largely platform independent.

Keywords: AviGRID, GRID, Clusters, Process, CBReM, Wireless, Internet.

1. Introduction

Over the past decade GRID computing has gained acceptance and is fast becoming a medium for processing resource-intensive applications such as rendering and animation. Advanced Virtual Interconnect GRID, or AviGRID, is an extensible and scalable architecture that uses Beowulf clusters as a test bed for migrating different types of processes and for developing a semi-transparent end-user session. It provides middleware support for existing operating systems and uses a load sharing scheme [1] to effectively utilize and share common resources, such as file storage, and to execute processes in parallel. A number of clusters are positioned at strategic locations inside the campus and are connected by high-speed Gigabit networks (wired, connection-based links). Each cluster server has a wireless hub that connects local mobile devices such as laptops and PDAs, which are allocated IP addresses dynamically using either DHCP or ZeroConf type II configuration.

2. Overview – Current Technologies

Communication over different types of networks is effective only when the overall latency of data transmission is low. If the transmission time is high, process migration is not worthwhile and the process is better suited to local execution. Most current GRID infrastructures face a similar bottleneck, which makes it cumbersome to build a global GRID infrastructure that can migrate processes between different parts of the globe for effective distributed computing and parallel execution of program threads. AviGRID uses a combination of wired and wireless commodities to offer service components to the students of UOW. Instead of pursuing a theoretical global infrastructure, AviGRID takes a local approach that is more cost effective and would drastically reduce the idle time of systems operating at UOW.

Another problem faced by most researchers is the narrow scope of existing GRID systems. Most serve a specific purpose, such as a scientific or research project that uses resources for number crunching, or a distributed file system with parallel execution of programs over a distributed shared memory scheme. The goal of any such system is to provide one hundred percent uptime and to execute processes in the shortest possible time. Furthermore, the complexity of writing parallel programs adds to the burden of the programmer, who has to write applications specific to the GRID platform in order to utilize its resources. The next section discusses the objectives that must be addressed when designing such a GRID system.
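Before turning to those objectives, the following minimal C sketch makes the latency trade-off above concrete: migration pays off only when the cost of moving a process and running it remotely undercuts local execution. The cost formula and all figures (bandwidth, latency, run times) are illustrative assumptions, not measurements from AviGRID.

#include <stdio.h>

/* Estimated cost (seconds) of running a process remotely: round-trip
 * setup latency, transfer of its image and data, plus remote execution. */
static double remote_cost(double image_bytes, double bandwidth_bps,
                          double latency_s, double remote_exec_s)
{
    return 2.0 * latency_s + (image_bytes * 8.0) / bandwidth_bps + remote_exec_s;
}

int main(void)
{
    double local_exec_s  = 40.0;   /* estimated run time on the busy local node */
    double remote_exec_s = 10.0;   /* faster on an idle remote node             */
    double image_bytes   = 8e6;    /* process image plus working data           */
    double bandwidth_bps = 100e6;  /* 100 Mbit/s campus link                    */
    double latency_s     = 0.002;  /* round-trip latency                        */

    double remote = remote_cost(image_bytes, bandwidth_bps, latency_s, remote_exec_s);
    if (remote < local_exec_s)
        printf("migrate (remote cost %.2f s < local %.2f s)\n", remote, local_exec_s);
    else
        printf("run locally (remote cost %.2f s)\n", remote);
    return 0;
}

With these figures the transfer adds well under a second, so the forty-second job is worth migrating; shrink the job or the bandwidth and the verdict flips, which is exactly why slow links push work back to local processing.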
3. Objectives of AviGRID

The following design objectives should guide a system designer who wants to develop an effective GRID system such as AviGRID.

1) Better throughput: One of the primary reasons behind any new approach is that it should improve performance, and the improvement should be large enough to justify moving away from the current system. In a distributed system the challenge lies in utilizing all the idle resources of the machines on the network so effectively that all tasks get done in less time. A potential stumbling block in implementing a distributed system is the latency introduced by the network. Maintaining current state information about the system is another problem that designers must address: too many updates could flood the network with unwanted state information, while too few could leave obsolete data that makes the system unstable.

2) Resource sharing: A resource here can mean anything from a hardware device to memory, CPU cycles, data processing or storage. An effective distributed system will try to utilize these resources as effectively as possible. The task, again, is to keep the state information as current as possible. AviGRID maintains a global state table that is common to all systems, so state information is always up to date, and the scheduler CoED [5] migrates processes based on it (a sketch of such a table appears at the end of this section).

3) Scalability: This is a very important parameter describing how well a system can scale up or down, which effectively means that it can adapt to widely varying conditions. Nodes should be able to join and leave the system without any barriers, and at the same time the performance of the system should not suffer as a result of this dynamism.

4) Heterogeneity: As networks have become widespread, many kinds of special-purpose systems have appeared on them. The idea behind AviGRID is to utilize resources from a variety of platforms and maintain a common resource pool. In a modern context, any distributed system should be able to handle the complexity that arises from dealing with different architectures, platforms and applications.

5) Fault tolerance: Fault tolerance in a system translates into increased availability, which implies that the system does not depend on the functioning of any single machine. A typical problem area is the storage of files on a remote server: whenever a user session is in progress, the requested files have to be available to the user, and the vulnerability is the loss of stored information. To deal with this we are looking at the concept of smart storage, in which two or more systems keep track of each other so that data can still be served after a failure. The system as a whole can keep functioning even after some machines have failed, which translates into increased reliability.

These are the basic factors that go into the design of AviGRID. They are only general guidelines, however, and apply to many kinds of systems. For instance, a designer who knows that the number of machines in the system will never exceed a certain small number can overlook the scalability feature and optimize the features that matter.
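As a rough illustration of objective 2 (and of the update-staleness tension raised in objective 1), the following C sketch shows one possible shape for a global state table with a freshness check. The field names, the five-second staleness threshold and the least-loaded selection rule are assumptions made for illustration; they are not AviGRID's actual data structures or policy.

#include <stdio.h>
#include <time.h>

#define MAX_NODES     64
#define STALE_AFTER_S 5.0   /* entries older than this are treated as obsolete */

struct node_state {
    int    node_id;
    double load_avg;        /* last reported load average   */
    long   free_mem_kb;     /* last reported free memory    */
    time_t last_update;     /* when the report was received */
};

static struct node_state table[MAX_NODES];
static int node_count;

/* Record (or refresh) a node's entry; called when a status event arrives. */
static void update_state(int id, double load, long mem)
{
    for (int i = 0; i < node_count; i++) {
        if (table[i].node_id == id) {
            table[i].load_avg    = load;
            table[i].free_mem_kb = mem;
            table[i].last_update = time(NULL);
            return;
        }
    }
    if (node_count < MAX_NODES)
        table[node_count++] = (struct node_state){ id, load, mem, time(NULL) };
}

/* Pick the least-loaded node whose entry is still fresh; -1 if none. */
static int pick_target(void)
{
    int best = -1;
    time_t now = time(NULL);
    for (int i = 0; i < node_count; i++) {
        if (difftime(now, table[i].last_update) > STALE_AFTER_S)
            continue;       /* obsolete data would mislead the scheduler */
        if (best < 0 || table[i].load_avg < table[best].load_avg)
            best = i;
    }
    return best < 0 ? -1 : table[best].node_id;
}

int main(void)
{
    update_state(1, 0.20, 512000);   /* nearly idle node */
    update_state(2, 1.75, 256000);   /* heavily loaded   */
    printf("migrate to node %d\n", pick_target());   /* prints node 1 */
    return 0;
}

The staleness check is what keeps the trade-off of objective 1 honest: an entry that is no longer current is simply ignored rather than allowed to trigger a bad migration.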
4. Working of AviGRID

AviGRID acts as a middleware layer that adds distributed and GRID components to a standalone multitasking operating system such as Linux or Windows. The following components have been designed and implemented for the effective working of the AviGRID system.

4.1 Networking: A variety of network topologies can be found at UOW, and data transactions take both wired and wireless forms. AviGRID draws its resources from wired nodes and helps manage resources across both wired and wireless devices. The main drawback of wireless devices is that they are mobile, which makes it virtually impossible to track them and utilize their resources reliably. Furthermore, these devices tend to run on standby power sources such as batteries, which leaves them vulnerable to power loss, and the risk of failure grows further when a process is migrated to a wireless device. Hence a combination of both is used, and the wired devices serve as service components for the wireless ones.

4.2 Load sharing or Scheduler - CoED: We studied previous strategies such as Michael Mitzenmacher's scheme [13] and then designed a new scheme named CoED [5]. We developed comparative simulation programs that demonstrated the viability of our load sharing model, and we have implemented it in a distributed network. Our approach is a collective, event-based, decentralized strategy (CoED) that balances the loads of the individual nodes of a system according to their status and the occurrence of certain events. The simulation results and the implementation support our view that CoED is a feasible and efficient scheme for load sharing in distributed computing systems. Processes are migrated only when there is a genuine need for migration, and the turnaround time for sending and receiving the data is calculated before a process is migrated.

4.3 Process Allocation: A process analyser assesses whether a process is worth migrating. A list of items is checked against the local table before the decision to migrate is made; a typical consideration is the communication between the two nodes involved and the overhead it incurs, and the volume of data traffic is also taken into account (a sketch of such a check appears at the end of this section).

4.4 End-User perspective: The idea, as stated before, is to make the infrastructure semi-transparent. If the user wants to use a resource, say printing, it is no help to print remotely and leave the user hunting for the printout. Some form of user feedback is required, and the user keeps control of his or her data. The kind of resources being utilized also plays a very important role here. A brokerage scheme such as that of Gridbus [6] or Globus [ ] could easily be incorporated into the AviGRID infrastructure.
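The following C sketch illustrates the kind of checklist the process analyser might walk through before approving a migration. The fields, thresholds and the worth_migrating rule are illustrative assumptions; the actual criteria used by CoED are described in [5].

#include <stdbool.h>
#include <stdio.h>

struct proc_info {
    double est_cpu_s;        /* estimated remaining CPU time           */
    double comm_bytes_per_s; /* traffic exchanged with local peers     */
    bool   uses_local_dev;   /* bound to local hardware, e.g. display  */
};

struct link_info {
    double bandwidth_bps;    /* link to the candidate node             */
    double rtt_s;            /* round-trip time to the candidate node  */
};

/* Migrate only when the process is long-running, not tied to local
 * hardware, and its communication would not swamp the link. */
static bool worth_migrating(const struct proc_info *p, const struct link_info *l)
{
    if (p->uses_local_dev) return false;
    if (p->est_cpu_s < 1.0) return false;   /* too short to pay off  */
    if (l->rtt_s > 0.05)    return false;   /* link latency too high */
    if (p->comm_bytes_per_s * 8.0 > 0.5 * l->bandwidth_bps)
        return false;                       /* would saturate the link */
    return true;
}

int main(void)
{
    struct proc_info p = { 30.0, 1e5, false };  /* long job, light traffic */
    struct link_info l = { 1e9, 0.002 };        /* gigabit campus link     */
    printf("%s\n", worth_migrating(&p, &l) ? "migrate" : "keep local");
    return 0;
}

Each early return corresponds to one entry on the analyser's list, so adding or tuning a criterion is a local change rather than a rewrite of the scheduler.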
5. AviGRID Infrastructure

The figure above shows a geographical map of the UOW campus, divided into Clusters A, B, C and D. These clusters communicate with each other over high-speed Ethernet land lines laid throughout the campus, and the network is arranged so that the local nodes in each cluster are utilized optimally for resource sharing. The red lines indicate the connectivity between the clusters.

Since existing wireless networks are not nearly as fast as wired technologies, the high-speed wired network acts as a go-between in processing requests from wireless mobile devices such as cell phones and laptops inside the campus. Each cluster keeps track of the mobile devices in its zone; when a device moves out of a cluster's range, a global search request is sent to find out where the device is, and control is then transferred to the cluster that has the device in its wireless zone. To keep the infrastructure cost effective, low-range wireless transmitters are used throughout the campus, which keeps the GRID economical and helps extend service components to the local clusters.

6. GRID Components

6.1 The AviShell: At the moment the architecture is being implemented on Linux-only nodes, and a shell called AviShell adds the middleware support to the Linux operating system. Unlike the bash shell, AviShell becomes the basis of I/O operations for user sessions: it takes user input and then hands control to the process analyser. If needed, process migration takes place and distributed processing is carried out transparently (a toy version of this loop is sketched below).
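The following is a toy read-run loop in C in the spirit of AviShell: the shell reads a command and consults the process analyser before execution. The analyse() stub and its verdicts are placeholders invented for illustration; the real AviShell interfaces with the AviGRID middleware rather than printing a message.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

enum verdict { RUN_LOCAL, MIGRATE };

/* Stub: the real analyser would consult the state table and CoED. */
static enum verdict analyse(const char *cmd)
{
    (void)cmd;
    return RUN_LOCAL;
}

int main(void)
{
    char line[1024];

    for (;;) {
        printf("avishell> ");
        fflush(stdout);
        if (!fgets(line, sizeof line, stdin))
            break;                      /* EOF: leave the shell */
        line[strcspn(line, "\n")] = '\0';
        if (line[0] == '\0')
            continue;
        if (strcmp(line, "exit") == 0)
            break;

        if (analyse(line) == MIGRATE) {
            printf("[avishell] would migrate: %s\n", line);
            continue;                   /* hand off to the middleware */
        }
        pid_t pid = fork();
        if (pid == 0) {                 /* child: run the command locally */
            execlp("/bin/sh", "sh", "-c", line, (char *)NULL);
            _exit(127);
        }
        if (pid > 0)
            waitpid(pid, NULL, 0);
    }
    return 0;
}

Because every command passes through analyse() before it runs, the migration decision stays invisible to the user, which is what makes the session semi-transparent.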
6.2 Services: A variety of services can be offered to the users of AviGRID. Typical tasks and services that can be incorporated include:

- Building Beowulf clusters for incorporating AviGRID
- Porting code from C/C++ sockets to Java
- Adding user service components for both wired and wireless devices
- Platform-independent middleware support
- Building wireless nodes and networks for testing the usability of the infrastructure
- Simulation of virtual nodes and process testing
- Scalability testing
- Testing on heterogeneous clusters
- Actual resource utilization on Beowulf clusters
- Utilization of existing technologies such as Gridbus [6]
- Internet and intranet services
- Print and spooling services
- Process migration and distributed processing
- Local-area Voice over IP services

Conclusion

Feasibility and simulation studies provide sufficient evidence that a middleware system such as AviGRID is a feasible infrastructure for implementation at the University of Wollongong (UOW). Most universities worldwide are now joining hands to build wireless campuses. We go one step further by building a GRID infrastructure that harnesses the various idle resources already available inside the campus. By building AviGRID we open up a whole new world of local resource utilization and will be able to offer a number of real-world GRID services to the staff and students of UOW.

References

[1] Alok Shriram, Anuraag Sarangi, Avinash Shankar, "ICHU Model for Processor Allocation in Distributed Operating Systems", ACM SIGOPS Operating Systems Review (OSR), Vol. 35, No. 3, July 2001, pp. 16-21.

[2] Alok Shriram, Anuraag Sarangi, Avinash Shankar, "CBReM Model to Manage Resources over the Grid", in Proceedings of the International Conference on Information Technology (CIT 2001), Gopalpur, India, December 2001.

[3] Anuraag Sarangi and Alok Shriram, "Process Allocation Using ICHU Model", poster presented at the International Conference on High Performance Computing (HiPC '00), Bangalore, India, December 2000.

[4] Anuraag Sarangi, Alok Shriram, Avinash Shankar, "A Scheduling Model for Grid Systems", in Proceedings of the IEEE/ACM International Workshop on Grid Computing (GRID 2001), Lecture Notes in Computer Science, Vol. 2242, Springer-Verlag, Denver, USA, November 2001.

[5] Anuraag Sarangi, Alok Shriram, Avinash Shankar, "Collective Load Sharing in Homogeneous Distributed Systems", in Proceedings of the International Conference on Advanced Computing and Communications (ADCOM 2001), Bhubaneswar, India, December 2001.

[6] Rajkumar Buyya, "Grid Bus Architecture", PhD thesis, University of Melbourne, Australia, 2002.

[7] Rajkumar Buyya, Steve Chapin, David DiNucci, "Architectural Models for Resource Management in the Grid", in Proceedings of the First IEEE/ACM International Workshop on Grid Computing (GRID 2000), December 2000, pp. 18-3.

[8] Eager et al., "Adaptive Load Sharing in Homogeneous Distributed Systems", IEEE Transactions on Software Engineering, Vol. 12, May 1986, pp. 662-675.

[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel programs onto arbitrary target architectures", Journal of Parallel and Distributed Computing, Vol. 9, No. 2, June 1990, pp. 138-153.

[10] A. Gerasoulis and T. Yang, "A comparison of clustering heuristics for scheduling directed acyclic graphs onto multiprocessors", Journal of Parallel and Distributed Computing, Vol. 16, No. 4, December 1992, pp. 276-291.

[11] Gerard LeLann, "Motivations, objectives and characterizations of distributed systems", in Distributed Systems - Architecture and Implementation, Lecture Notes in Computer Science, Vol. 105, Springer-Verlag, 1981.

[12] R. Lüling, B. Monien, F. Ramme, "Load Balancing in Large Networks: A Comparative Study", 3rd IEEE Symposium on Parallel and Distributed Processing, 1991.

[13] Michael Mitzenmacher, "How Useful Is Old Information?", IEEE Transactions on Parallel and Distributed Systems, Vol. 11, No. 1, January 2000, pp. 6-20.

[14] J. Mullender, "Process Management in a Distributed Operating System", in J. Nehmer (ed.), Experiences with Distributed Systems, Lecture Notes in Computer Science, Vol. 309, International Workshop, Kaiserslautern, September 1987.

[15] L. M. Ni, C. W. Xu, T. B. Gendreau, "Drafting Algorithm - A Dynamic Process Migration Protocol for Distributed Systems", IEEE 5th International Conference on Distributed Computing Systems, 1985, pp. 539-546.

[16] Andrew S. Tanenbaum, Modern Operating Systems, Prentice-Hall, New Jersey, USA, 1992.

[17] Amjad Umar, Distributed Computing and Client-Server Systems, PTR Prentice-Hall Inc., New Jersey, pp. 345-374.

[18] K. M. Baumgartner and B. W. Wah, "Computer Scheduling Algorithms: Past, Present, and Future", Information Sciences, Vol. 57-58, 1991, pp. 319-345.