Advanced Virtual Interconnect GRID - AviGRID
Avinash Shankar
School of Informatics and Computer Science
University of Wollongong
asn68@uow.edu.au
Abstract
GRID computing is revolutionizing the way in which end-user computing is simplified and mass computing resources are utilized to execute programs faster, with or without the knowledge of the end user. Because of network traffic and limited bandwidth, GRID computing still has a long way to go before it achieves global acceptance and becomes a regular commodity such as the Internet. The major objective of any GRID system is to minimize the idleness of systems across networks and to migrate processes across a variety of network topologies. The proposed architecture is an extension of my previous work, CBReM [2]. An extensible and scalable middleware or software layer is built around operating systems such as UNIX and Linux to make the architecture somewhat platform independent.
Keywords: AviGRID, GRID, Clusters, Process,
CBReM, Wireless, Internet.
1. Introduction
GRID computing has, over the past decade, gained broad acceptance and is fast becoming a medium for processing resource-intensive applications such as rendering and animation. Advanced Virtual Interconnect GRID, or AviGRID, is an extensible and scalable architecture that uses Beowulf clusters as a test bed to migrate different types of processes and to develop a semi-transparent end-user session. It provides middleware support for existing operating systems and uses a load sharing scheme [1] to effectively utilize and share common resources such as file storage and the parallel execution of processes. A number of clusters are positioned at strategic locations inside the campus and are connected using high-speed Gigabit networks (wired, connection-based transactions). Each cluster server has a wireless hub that connects local mobile devices such as laptops and PDAs, which are allocated IP addresses dynamically using either DHCP or ZeroConf type II configuration.
2. Overview – Current Technologies
Communication over different types of networks is effective only when the overall latency of the data transmissions is acceptable. If the transmission or transfer time is high, process migration is not effective and the process is better suited to local processing. Most current GRID infrastructures face a similar bottleneck. Hence it is cumbersome to build a global GRID infrastructure that can effectively migrate processes from different parts of the globe for distributed computing and parallel execution of program threads.
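To make this trade-off concrete, the C++ sketch below is our own illustration (MigrationCosts, transfer_time and should_migrate are hypothetical names, and the numbers are invented): it compares the cost of shipping a process and its data against the expected saving in compute time, and migration pays off only when remote execution plus the transfer overhead finishes before local execution would.

#include <iostream>

struct MigrationCosts {
    double data_mb;         // data that must be shipped with the process (MB)
    double bandwidth_mbps;  // effective link bandwidth (megabits per second)
    double latency_s;       // round-trip latency of the link (seconds)
    double local_exec_s;    // estimated execution time on the local node
    double remote_exec_s;   // estimated execution time on the remote node
};

// Transfer time = payload / bandwidth + fixed round-trip latency.
double transfer_time(const MigrationCosts& c) {
    return (c.data_mb * 8.0) / c.bandwidth_mbps + c.latency_s;
}

// Migrate only when remote execution plus the transfer overhead beats local execution.
bool should_migrate(const MigrationCosts& c) {
    return c.remote_exec_s + transfer_time(c) < c.local_exec_s;
}

int main() {
    // 500 MB of process state, a 10 s local run that a faster node could finish in 4 s.
    MigrationCosts wired    {500.0, 1000.0, 0.002, 10.0, 4.0};  // gigabit link
    MigrationCosts wireless {500.0,  100.0, 0.020, 10.0, 4.0};  // 100 Mb/s wireless link
    std::cout << "wired link:    " << (should_migrate(wired)    ? "migrate" : "run locally") << '\n';
    std::cout << "wireless link: " << (should_migrate(wireless) ? "migrate" : "run locally") << '\n';
}

On the gigabit link the 4 s transfer still leaves a saving, so the process migrates; on the slower wireless link the 40 s transfer swamps the saving and the process stays local, which is exactly the bottleneck described above.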
AviGRID uses a combination of wired and wireless commodities to offer service components to the students of UOW. Instead of pursuing a theoretical global infrastructure, AviGRID takes a local approach that is more cost effective and would drastically reduce the idle time of systems operating at UOW. Another problem faced by most researchers is the narrow scope of existing GRID systems. Most GRID systems serve a specific purpose, such as a scientific or research project that uses resources for number crunching, or a distributed file system with parallel execution of programs over a distributed shared memory scheme. The goal of any such system is to provide an uptime of one hundred percent and to execute processes in the shortest possible time. Furthermore, the complexity of writing parallel programs adds to the burden of the programmer, who has to write applications specific to the GRID platform in order to utilize its resources. In the next section we discuss the various problems faced in designing a GRID system.
3. Objectives of AviGRID
These are some of the design objectives that a system designer should follow in order to develop an effective GRID system such as AviGRID.
1) Better Throughput: One of the primary reasons behind any new approach is that it should improve performance, and the improvement should be large enough to justify shifting from the current system. In a distributed system the challenge lies in utilizing all the idle resources of the machines on the network effectively, so that all tasks complete in less time. The main stumbling block in implementing a distributed system is the latency introduced by the network. Maintaining current state information about the system is another problem that designers have to address: too many updates could flood the network with unwanted state information, while too few updates could leave the data obsolete and make the system unstable.
2) Resource Sharing: The term resource here can mean anything from a hardware device to memory, CPU cycles, data processing or storage. An effective distributed system will try to utilize these resources in as effective a manner as possible. The task, again, is to keep all state information as current as possible. AviGRID maintains a global state table that is common to all systems, so state information is always up to date, and based on it the CoED scheduler [5] migrates processes accordingly (a minimal sketch of such a table is given at the end of this section).
3) Scalability: How well a system can scale up or down is a very important parameter. This effectively means that the system should be able to adapt to changing conditions: nodes should be able to join and leave the system without any barriers, and the performance of the system should not suffer as a result of this dynamism.
4) Heterogeneity: As networks have become widespread, many kinds of special-purpose systems are attached to them. The idea behind AviGRID is to utilize resources from a variety of platforms and maintain a common resource pool. In a modern context any distributed system should be able to handle the complexity that arises from dealing with different architectures, platforms and applications.
5) Fault Tolerance: Implementing fault tolerance increases the availability of the system and implies that the system does not depend on the functioning of any one node. A typical problem area is the storage of files on a remote server: whenever a user session is in progress, the requested files have to be available to the user, and the main vulnerability is the loss of stored information. To deal with this we are looking at the concept of smart storage, in which two or more systems keep track of each other so that data can still be served effectively. The system as a whole can then keep functioning even after some nodes have failed, which increases its reliability.
These are the basic design factors that go into AviGRID. They are, however, only general guidelines and apply to distributed systems in general. For instance, if a designer knows that the number of nodes in a system will never exceed a certain small number, the scalability requirement can be relaxed in favour of the more relevant features.
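The layout of the global state table mentioned in objective 2 is not specified in this paper, so the sketch below is our own assumption (NodeState, StateTable and least_loaded are hypothetical names): each node publishes a small record of its load, memory and freshness, and a scheduler such as CoED [5] can pick the least-loaded wired node whose entry is still recent enough to trust.

#include <chrono>
#include <iostream>
#include <map>
#include <optional>
#include <string>

// Hypothetical per-node entry in the global state table.
struct NodeState {
    double cpu_load;        // e.g. 1-minute load average
    double free_memory_mb;  // memory available for migrated work
    bool   wireless;        // wireless nodes only consume services, they never host work
    std::chrono::steady_clock::time_point last_update;  // freshness of this entry
};

// Table keyed by node identifier, shared (in some fashion) by every node.
using StateTable = std::map<std::string, NodeState>;

// Pick the least-loaded wired node whose entry is still fresh enough to trust.
std::optional<std::string> least_loaded(const StateTable& table,
                                        std::chrono::seconds max_age) {
    const auto now = std::chrono::steady_clock::now();
    std::optional<std::string> best;
    double best_load = 1e18;
    for (const auto& [name, state] : table) {
        if (state.wireless) continue;                     // never migrate onto mobile devices
        if (now - state.last_update > max_age) continue;  // entry too stale to act on
        if (state.cpu_load < best_load) {
            best_load = state.cpu_load;
            best = name;
        }
    }
    return best;
}

int main() {
    const auto now = std::chrono::steady_clock::now();
    StateTable table = {
        {"clusterA-node1", {0.35, 2048.0, false, now}},
        {"clusterB-node2", {0.10,  512.0, false, now}},
        {"pda-17",         {0.05,   64.0, true,  now}},
    };
    if (auto target = least_loaded(table, std::chrono::seconds(30)))
        std::cout << "migration target: " << *target << '\n';   // clusterB-node2
}

The staleness check mirrors objective 1: an entry that has not been refreshed recently is simply ignored rather than trusted.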
4. Working of AviGRID
AviGRID acts as a middleware layer that adds distributed and GRID components to a standalone multitasking operating system such as Linux or Windows. The following components have been designed and implemented for the effective working of the AviGRID system.
4.1 Networking: A variety of network topologies can be found at UOW, and data transactions take both wired and wireless forms. AviGRID draws its computing resources from wired nodes and helps to manage resources for both wired and wireless devices. The main drawback of wireless devices is that they are mobile, which makes it virtually impossible to track them and utilize their resources effectively. Furthermore, these devices tend to run on standby power sources such as batteries, which makes them vulnerable to power outages or outright failure; the risk of failure becomes even greater when a process is migrated to a wireless device. Hence a combination of both is used, and the wired devices serve as service components for the wireless devices.
4.2 Load sharing or scheduler – CoED: We studied previous strategies such as the Michael Mitzenmacher scheme [13] and then designed a new scheme named CoED [5]. We developed comparative simulation programs that demonstrated the viability of our load sharing model, and we have implemented it in a distributed network. In our approach we consider a collective, event-based, decentralized strategy (CoED) that balances the loads of the individual nodes of a system according to their status and the occurrence of certain events. The simulation results and the implementation support our view that CoED is a feasible and efficient scheme for load sharing in distributed computing systems. Processes are migrated only when there is a need for migration, and the turnaround time of sending and receiving the data is calculated before a process is migrated.
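CoED is only described at a high level here, so the skeleton below is our own reading of the event-based idea (the Event set, the CoEDNode class and the 0.2 threshold are invented for illustration): a node re-evaluates its status only when an event occurs, and it broadcasts an update only when its load has changed enough to matter, which addresses the too-many/too-few update problem raised in Section 3.

#include <cmath>
#include <iostream>

// Hypothetical set of events that can change a node's load.
enum class Event { ProcessArrived, ProcessFinished, NodeJoined, NodeLeft };

class CoEDNode {
public:
    explicit CoEDNode(double report_delta) : report_delta_(report_delta) {}

    // Called whenever a local event occurs. Returns true if the change in load
    // is significant enough to broadcast a fresh status to the other nodes,
    // keeping state information current without flooding the network.
    bool on_event(Event /*e*/, double new_load) {
        bool significant = std::fabs(new_load - last_reported_) >= report_delta_;
        if (significant) last_reported_ = new_load;  // a real node would broadcast here
        return significant;
    }

private:
    double report_delta_;        // minimum change in load worth reporting
    double last_reported_ = 0.0;
};

int main() {
    CoEDNode node(0.2);  // only report load changes of at least 0.2
    std::cout << node.on_event(Event::ProcessArrived, 0.10) << '\n';   // 0: change too small
    std::cout << node.on_event(Event::ProcessArrived, 0.55) << '\n';   // 1: broadcast status
    std::cout << node.on_event(Event::ProcessFinished, 0.50) << '\n';  // 0: change too small
}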
4.3 Process Allocation: A process analyser examines whether a process is worth migrating. A list of items is checked against the local table before the decision to migrate is made. A typical consideration is the communication between two nodes and the communication overhead between them, and the current data traffic is also taken into account.
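The list of items the analyser checks is not spelled out in the paper; the function below is our own illustration (all thresholds and names such as ProcessInfo and worth_migrating are invented) of a worthiness check that weighs the expected run time against the data that must move, the process's communication with local peers, and the current traffic on the link.

#include <iostream>

// Hypothetical entry the process analyser reads from the local table.
struct ProcessInfo {
    double expected_runtime_s;  // expected run time, e.g. from past history
    double state_size_mb;       // data that must travel with the process
    int    local_peers;         // local processes it communicates with heavily
};

// Hypothetical snapshot of the link to the candidate remote node.
struct LinkInfo {
    double bandwidth_mbps;  // nominal link bandwidth
    double utilisation;     // current data traffic on the link, 0.0 - 1.0
};

// A process is worth migrating only if it runs long enough to repay the move,
// does not chat heavily with local peers, and the link is not already saturated.
bool worth_migrating(const ProcessInfo& p, const LinkInfo& link) {
    if (p.expected_runtime_s < 5.0) return false;     // too short-lived to repay the move
    if (p.local_peers > 2) return false;              // communication overhead too high
    if (link.utilisation >= 0.9) return false;        // link already carrying heavy traffic
    double effective_mbps = link.bandwidth_mbps * (1.0 - link.utilisation);
    double transfer_s = (p.state_size_mb * 8.0) / effective_mbps;
    return transfer_s < 0.25 * p.expected_runtime_s;  // transfer must stay a small fraction
}

int main() {
    ProcessInfo render_job{600.0, 400.0, 0};           // long render, little local chatter
    LinkInfo quiet_link{1000.0, 0.10};
    std::cout << (worth_migrating(render_job, quiet_link) ? "migrate" : "keep local") << '\n';
}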
4.4 End-user perspective: The idea, as stated before, is to make the infrastructure semi-transparent. If the user wants to use a resource such as printing, it would not be useful to print remotely and then leave the user to find his or her printouts! Some form of user feedback is required, and the user is given control over his or her data. The kind of resources utilized also plays a very important role here. A brokerage scheme such as that of Grid Bus [6] or Globus [ ] could easily be incorporated into the AviGRID infrastructure.
5. AviGRID Infrastructure
The figure above shows a geographical map of the UOW campus. The campus is divided into Clusters A, B, C and D, which communicate with each other over high-speed Ethernet land lines running throughout the campus. The networking is arranged so that the local nodes in each cluster are utilized optimally for resource sharing; the red lines indicate the connectivity between the clusters. Since existing wireless networks are not as fast as wired technologies, the high-speed wired network acts as a go-between in processing requests from wireless mobile devices such as cell phones and laptops inside the campus. Each cluster keeps track of the mobile devices in its zone, and when a device moves out of range a global search request is sent to find out where the device is; control is then transferred to the cluster that has the device in its wireless zone. To keep the infrastructure cost effective, low-range wireless transmitters are used throughout the campus. This keeps the GRID economical and helps extend service components to the local clusters.
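The handover procedure is described only in prose, so the sketch below is our own illustration (GlobalDirectory, its methods and the device names are hypothetical): each cluster records the devices in its wireless zone, and when a device drops out of range the old cluster searches the other zones and transfers control to whichever cluster now sees it.

#include <iostream>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical directory mapping each cluster to the devices in its wireless zone.
class GlobalDirectory {
public:
    void device_seen(const std::string& cluster, const std::string& device) {
        zones_[cluster].insert(device);
        owner_[device] = cluster;
    }

    // Called by a cluster when a device drops out of its wireless range:
    // search all other zones and transfer control to whichever cluster sees it.
    std::optional<std::string> handover(const std::string& old_cluster,
                                        const std::string& device) {
        zones_[old_cluster].erase(device);
        for (const auto& [cluster, devices] : zones_) {
            if (cluster != old_cluster && devices.count(device)) {
                owner_[device] = cluster;
                return cluster;              // control transferred to this cluster
            }
        }
        owner_.erase(device);                // device not visible anywhere on campus
        return std::nullopt;
    }

private:
    std::map<std::string, std::set<std::string>> zones_;
    std::map<std::string, std::string> owner_;
};

int main() {
    GlobalDirectory dir;
    dir.device_seen("ClusterA", "laptop-17");
    dir.device_seen("ClusterB", "laptop-17");   // Cluster B's hub now also sees the device
    auto target = dir.handover("ClusterA", "laptop-17");
    std::cout << (target ? *target : std::string("not found")) << '\n';  // ClusterB
}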
6. GRID Components
6.1 The AviShell: At the moment the architecture is being implemented on Linux-only nodes, and a shell called AviShell is used to add middleware support to the Linux operating system. Unlike the bash shell, this shell becomes the basis of the I/O operations of a user session: it takes user input and then hands control to the process analyser. If needed, process migration takes place and distributed processing is carried out transparently.
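AviShell is described only at a high level; the loop below is our own minimal sketch (analyse_and_run and the "grid" prefix convention are placeholders for the real process analyser and the CoED scheduler): the shell reads a command line, hands it to the analyser, and either runs it locally or flags it as a migration candidate.

#include <cstdlib>
#include <iostream>
#include <string>

// Placeholder for the process analyser: decide whether the command should be
// executed locally or handed to the scheduler for migration to another node.
bool analyse_and_run(const std::string& cmd) {
    // Assumption for this sketch only: commands prefixed "grid " are candidates
    // for migration; everything else runs locally through the native shell.
    if (cmd.rfind("grid ", 0) == 0) {
        std::cout << "[avishell] candidate for migration: " << cmd.substr(5) << '\n';
        return true;
    }
    return std::system(cmd.c_str()) == 0;   // ordinary local execution
}

int main() {
    std::string line;
    std::cout << "avishell> " << std::flush;
    while (std::getline(std::cin, line)) {
        if (line == "exit") break;
        if (!line.empty()) analyse_and_run(line);
        std::cout << "avishell> " << std::flush;
    }
}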
6.2 Services: A variety of services can be offered to the users of AviGRID. Some typical examples of services, and of the work needed to incorporate them, are listed below:

• Build Beowulf clusters for incorporating AviGRID
• Port the code from C/C++ sockets to Java
• Add user service components for both wired and wireless devices
• Platform-independent middleware support
• Build wireless nodes and networks for testing the usability of the infrastructure
• Simulation of virtual nodes and process testing
• Scalability testing
• Testing on heterogeneous clusters
• Actual resource utilization on Beowulf clusters
• Utilization of existing technologies such as Grid Bus [6]
• Internet or intranet services
• Print and spooling services
• Process migration and distributed processing
• Local-area Voice over IP services
7. Conclusion
Feasibility and simulation studies give sufficient proof that a middleware system such as AviGRID is a feasible infrastructure for implementation at the University of Wollongong (UOW). Most universities worldwide are now joining hands in building wireless campuses. We are going one step further by building a GRID infrastructure and utilizing the various idle resources that are already available inside the campus. By building AviGRID we are opening up a whole new world of local resource utilization and will be able to offer a number of real-world GRID services to the staff and students at UOW.
References
[1] Alok Shriram, Anuraag Sarangi, Avinash Shankar, "ICHU Model for Processor Allocation in Distributed Operating Systems", submitted to ACM SIGOPS Operating Systems Review (OSR), Vol. 35, No. 3, pp. 16-21, July 2001.
[2] Alok Shriram, Anuraag Sarangi, Avinash Shankar, "CBReM Model to Manage Resources over the Grid", published in the proceedings of the International Conference on Information Technology (CIT 2001), Gopalpur, India, December 2001.
[3] Anuraag Sarangi and Alok Shriram, "Process Allocation Using ICHU Model", presented as a poster at the International Conference on High Performance Computing (HiPC'00), Bangalore, India, December 2000.
[4] Anuraag Sarangi, Alok Shriram, Avinash
Shankar. “A Scheduling Model for Grid Systems,”
published in the proceedings of the IEEE/ACM
International Workshop on Grid Computing (GRID
– 2001) by Springer-Verlag in the Lecture Notes in
Computer Science (LNCS) series (Vol. 2242),
Denver, USA, November 2001.
[5] Anuraag Sarangi, Alok Shriram, Avinash
Shankar. “Collective Load Sharing in Homogeneous
Distributed Systems,” published in the proceedings
of the International Conference on Advanced
Computing and Communications (ADCOM 2001),
Bhubaneswar, India, December 2001.
[6] Rajkumar Buyya, "Grid Bus Architecture", PhD thesis, Melbourne University, Australia, 2002.
[7] Rajkumar Buyya, Steve Chapin, David DiNucci, "Architectural Models for Resource Management in the Grid", in Proceedings of the First IEEE/ACM International Workshop on Grid Computing (GRID 2000), December 2000, pp. 18-35.
[8] Eager et al., "Adaptive Load Sharing in Homogeneous Distributed Systems", IEEE Transactions on Software Engineering, Vol. 12, May 1986, pp. 662-675.
[9] H. El-Rewini and T. G. Lewis, "Scheduling parallel programs onto arbitrary target architectures", Journal of Parallel and Distributed Computing, Vol. 9, No. 2, June 1990, pp. 138-153.
[10] A. Gerasoulis and T. Yang, "A comparison of clustering heuristics for scheduling directed acyclic graphs onto multiprocessors", Journal of Parallel and Distributed Computing, Vol. 16, No. 4, December 1992, pp. 276-291.
[11] Gerard LeLann, "Motivations, objectives and characterizations of distributed systems", in Distributed Systems – Architecture and Implementation, Lecture Notes in Computer Science, Vol. 105, Springer-Verlag, 1981.
[12] R. Lüling, B. Monien, F. Ramme, "Load Balancing in Large Networks: A Comparative Study", 3rd IEEE Symposium on Parallel and Distributed Processing, 1991.
[13] Michael Mitzenmacher, "How Useful is Old Information?", IEEE Transactions on Parallel and Distributed Systems, Vol. 11, No. 1, January 2000, pp. 6-20.
[14] J. Mullender, "Process Management in a Distributed Operating System", in J. Nehmer (ed.), Experiences with Distributed Systems, Lecture Notes in Computer Science, Vol. 309, International Workshop, Kaiserslautern, September 1987.
[15] L. M. Ni, C. W. Xu, T. B. Gendreau, "Drafting Algorithm – A Dynamic Process Migration Protocol for Distributed Systems", 5th IEEE International Conference on Distributed Computing Systems, 1985, pp. 539-546.
[16] Andrew S. Tanenbaum, Modern Operating Systems, Prentice-Hall, N.J., USA, 1992.
[17] Amjad Umar, Distributed Computing and Client-Server Systems, PTR Prentice-Hall Inc., New Jersey, pp. 345-374.
[18] K. M. Baumgartner and B. W. Wah, "Computer Scheduling Algorithms: Past, Present, and Future", Information Sciences 57-58, 1991, pp. 319-345.