ADVANCE COMPUTING TECHNOLOGY (170704)
PRACTICAL-1
AIM:- Introduction to Advance Computing Technology.
Advance Computing Technology is a one-stop reference, providing detailed information about computing technologies such as cluster computing, grid computing, and cloud computing. It covers the key concepts associated with all three.
1. Cloud computing:-
The introduction of cloud computing has provided high scalability and elasticity to companies.
Cloud computing is a computing paradigm in which a large pool of systems is connected in
private or public networks to provide dynamically scalable infrastructure for application, data and
file storage. With the advent of this technology, the cost of computation, application hosting,
content storage and delivery is reduced significantly.
Cloud computing is a practical approach to experience direct cost benefits and it has the
potential to transform a data center from a capital-intensive set-up to a variable-priced
environment.
The idea of cloud computing is based on a very fundamental principle of 'reusability of IT
capabilities'. The difference that cloud computing brings compared to traditional concepts of "grid
computing", "distributed computing", "utility computing", or "autonomic computing" is to broaden
horizons across organizational boundaries.
Forrester defines cloud computing as:
“A pool of abstracted, highly scalable, and managed compute infrastructure capable of
hosting end customer applications and billed by consumption.”
Architecture of cloud computing:-
Cloud computing architecture comprises many cloud components, which are loosely coupled.
We can broadly divide the cloud architecture into two parts:
● Front End
● Back End
The front end refers to the client part of the cloud computing system. It consists of the interfaces
and applications required to access cloud computing platforms, for example a web
browser.
The back end refers to the cloud itself. It consists of all the resources required to provide
cloud computing services. It comprises huge data storage, virtual machines, security mechanisms,
services, deployment models, servers, etc.
VIMAT/BE/CE/120940107018
Each of the ends is connected through a network, usually the Internet. The following diagram shows the
graphical view of cloud computing architecture:
Cloud Computing Benefits
Enterprises would need to align their applications so as to exploit the architecture models that Cloud
Computing offers. Some of the typical benefits are listed below:
1. Reduced cost:- There are a number of reasons to attribute cloud technology with lower costs. The
billing model is pay-per-usage, and the infrastructure is not purchased, thus lowering maintenance
costs. Both initial and recurring expenses are much lower than in traditional computing.
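The pay-per-usage argument can be sketched with toy figures (entirely made up for illustration, not vendor pricing), comparing an owned server against renting metered capacity:

```python
# Hypothetical figures purely for illustration -- not real vendor pricing.
CAPEX_SERVER = 12000.0        # up-front cost of an owned server (USD)
MAINTENANCE_PER_MONTH = 200.0 # recurring maintenance for owned hardware
CLOUD_RATE_PER_HOUR = 0.25    # pay-as-you-go instance rate

def owned_cost(months):
    """Capital expense plus recurring maintenance."""
    return CAPEX_SERVER + MAINTENANCE_PER_MONTH * months

def cloud_cost(hours_per_month, months):
    """Metered billing: pay only for hours actually used."""
    return CLOUD_RATE_PER_HOUR * hours_per_month * months

# A workload that runs only 100 hours a month for a year:
print(owned_cost(12))       # 14400.0
print(cloud_cost(100, 12))  # 300.0
```

For a lightly used workload the metered model is far cheaper; the gap narrows as utilization approaches 24/7.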
2. Increased storage:-With the massive Infrastructure that is offered by Cloud providers today,
storage & maintenance of large volumes of datais a reality. Sudden workload spikes are also
managed effectively & efficiently, since the cloud can scale dynamically.
3. Flexibility:- This is an extremely important characteristic. With enterprises having to adapt, even
more rapidly, to changing business conditions, speed to deliver is critical. Cloud computing stresses
getting applications to market very quickly, by using the most appropriate building blocks
necessary for deployment.
Disadvantages of Cloud Computing
1. Downtime:- As cloud service providers take care of a number of clients each day, they can
become overwhelmed and may even come up against technical outages. This can lead to your
business processes being temporarily suspended. Additionally, if your internet connection is offline,
you will not be able to access any of your applications, server or data from the cloud.
2. Security:- Although cloud service providers implement the best security standards and industry
certifications, storing data and important files on external service providers always opens up risks.
Using cloud-powered technologies means you need to provide your service provider with access to
important business data. Meanwhile, being a public service opens up cloud service providers to
security challenges on a routine basis. The ease in procuring and accessing cloud services can also
give nefarious users the ability to scan, identify and exploit loopholes and vulnerabilities within a
system. For instance, in a multi-tenant cloud architecture where multiple users are hosted on the
same server, a hacker might try to break into the data of other users hosted and stored on the same
server. However, such exploits and loopholes are not likely to surface, and the likelihood of a
compromise is not great.
3. Vendor Lock-In:- Although cloud service providers promise that the cloud will be flexible to use
and integrate, switching cloud services is something that hasn’t yet completely evolved.
Organizations may find it difficult to migrate their services from one vendor to another. Hosting and
integrating current cloud applications on another platform may throw up interoperability and
support issues. For instance, applications developed on Microsoft Development Framework (.Net)
might not work properly on the Linux platform.
4. Limited Control:- Since the cloud infrastructure is entirely owned, managed and monitored by the
service provider, it transfers minimal control over to the customer. The customer can only control
and manage the applications, data and services operated on top of that, not the backend
infrastructure itself. Key administrative tasks such as server shell access, updating and firmware
management may not be passed to the customer or end user. Even so, the advantages of
cloud computing easily outweigh the drawbacks. Decreased costs, reduced downtime, and less
management effort are benefits that speak for themselves.
Applications of cloud computing:-
The applications of cloud computing are practically limitless. With the right middleware, a cloud
computing system could execute all the programs a normal computer could run. Potentially,
everything from generic word processing software to customized computer programs designed for
a specific company could work on a cloud computing system. Why would anyone want to rely on
another computer system to run programs and store data? Here are just a few reasons:
● Clients would be able to access their applications and data from anywhere at any time. They
could access the cloud computing system using any computer linked to the Internet. Data
wouldn't be confined to a hard drive on one user's computer or even a corporation's internal
network.
● It could bring hardware costs down. Cloud computing systems would reduce the need for
advanced hardware on the client side. You wouldn't need to buy the fastest computer with
the most memory, because the cloud system would take care of those needs for you. Instead,
you could buy an inexpensive computer terminal. The terminal could include a monitor,
input devices like a keyboard and mouse and just enough processing power to run the
middleware necessary to connect to the cloud system. You wouldn't need a large hard drive
because you'd store all your information on a remote computer.
● Corporations that rely on computers have to make sure they have the right software in place
to achieve goals. Cloud computing systems give these organizations company-wide access
to computer applications. The companies don't have to buy a set of software or software
licenses for every employee. Instead, the company could pay a metered fee to a cloud
computing company.
● Servers and digital storage devices take up space. Some companies rent physical space to
store servers and databases because they don't have it available on site. Cloud computing
gives these companies the option of storing data on someone else's hardware, removing the
need for physical space on the front end.
● Corporations might save money on IT support. Streamlined hardware would, in theory, have
fewer problems than a network of heterogeneous machines and operating systems.
2. Cluster computing:-
Clustering is a technique in which multiple computers are connected together to act as a single, more
powerful computing resource.
Architecture of cluster computing:-
3. Grid computing:-
Grid computing is a technique in which computer resources from various administrative domains
are combined to achieve a common goal. At present, companies are also moving their data to the
cloud to enable ubiquitous, on-demand access to a shared pool of computing resources.
Architecture of grid computing:-
PRACTICAL-2
AIM:- To study about cluster computing.
What is cluster computing?
Connecting two or more computers together in such a way that they behave like a single
computer. Clustering is used for parallel processing, load balancing and fault tolerance.
Clustering is a popular strategy for implementing parallel processing applications because it
enables companies to leverage the investment already made in PCs and workstations. In addition,
it's relatively easy to add new CPUs simply by adding a new PC to the network.
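The load-balancing idea above can be sketched in a few lines, with hypothetical node names: a dispatcher spreads incoming requests round-robin so the group of machines behaves like one server.

```python
from itertools import cycle

# Hypothetical cluster nodes; a real balancer would also track health and load.
nodes = ["node-a", "node-b", "node-c"]
dispatch = cycle(nodes)  # endless round-robin over the nodes

# Assign six incoming requests to nodes in turn.
assignments = [(req_id, next(dispatch)) for req_id in range(6)]
print(assignments)
# Each node receives every third request, evenly sharing the work.
```

Adding capacity is then just appending a new PC to the `nodes` list, which mirrors the "add a new PC to the network" point above.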
Introduction to cluster computing:-
Very often, applications need more computing power than a sequential computer can
provide. One way of overcoming this limitation is to improve the operating speed of processors and
other components so that they can offer the power required by computationally intensive
applications. Even though this is currently possible to a certain extent, future improvements are
constrained by the speed of light, thermodynamic laws, and the high financial cost of processor
fabrication. A viable and cost-effective alternative solution is to connect multiple processors
together and coordinate their computational efforts. The resulting systems are popularly known as
parallel computers, and they allow the sharing of a computational task among multiple processors.
There are three ways to improve performance:-
1. Work harder,
2. Work smarter, and
3. Get help.
In terms of computing technologies, the analogy to this mantra is that working harder is like
using faster hardware (high performance processors or peripheral devices). Working smarter
concerns doing things more efficiently and this revolves around the algorithms and techniques used
to solve computational tasks. Finally, getting help refers to using multiple computers to solve a
particular task.
Cluster computing architecture:-
A cluster is a type of parallel or distributed processing system, which consists of a collection
of interconnected stand-alone computers working together as a single, integrated computing
resource.
A computer node can be a single or multiprocessor system (PCs, workstations, or SMPs)
with memory, I/O facilities, and an operating system. A cluster generally refers to two or more
computers (nodes) connected together. The nodes can exist in a single cabinet or be physically
separated and connected via a LAN. An interconnected (LAN-based) cluster of computers can
appear as a single system to users and applications. Such a system can provide a cost-effective way
to gain features and benefits (fast and reliable services) that have historically been found only on
more expensive proprietary shared memory systems. The typical architecture of a cluster is shown
in Figure.
The following are some prominent components of cluster computers:
1. Multiple high-performance computers (PCs, workstations, or SMPs)
2. State-of-the-art operating systems (layered or micro-kernel based)
3. High-performance networks/switches (such as Gigabit Ethernet and Myrinet)
4. Network Interface Cards (NICs)
5. Fast communication protocols and services (such as Active and Fast Messages)
6. Cluster middleware (Single System Image (SSI) and System Availability Infrastructure)
The network interface hardware acts as a communication processor and is responsible for
transmitting and receiving packets of data between cluster nodes via a network/switch.
Communication software offers a means of fast and reliable data communication among cluster
nodes and to the outside world. Often, clusters with a special network/switch like Myrinet use
communication protocols such as active messages for fast communication among its nodes. They
potentially bypass the operating system and thus remove the critical communication overheads
providing direct user-level access to the network interface.
The cluster nodes can work collectively, as an integrated computing resource, or they can
operate as individual computers. The cluster middleware is responsible for offering an illusion of a
unified system image (single system image) and availability out of a collection of independent but
interconnected computers.
Programming environments can offer portable, efficient, and easy-to-use tools for the
development of applications. They include message-passing libraries, debuggers, and profilers. It
should not be forgotten that clusters could be used for the execution of sequential or parallel
applications.
Applications of cluster computing (parallel processing):-
Clusters have been employed as an execution platform for a range of application classes,
ranging from supercomputing and mission-critical ones, through to e-commerce and database-based ones.
Clusters are being used as execution environments for Grand Challenge
Applications (GCAs) such as weather modeling, automobile crash simulations, life sciences,
computational fluid dynamics, nuclear simulations, image processing, electromagnetics, data
mining, aerodynamics and astrophysics. These applications are generally considered intractable
without the use of state-of-the-art parallel supercomputers. The scale of their resource requirements,
such as processing time, memory, and communication needs distinguishes GCAs from other
applications. For example, the execution of scientific applications used in predicting life-threatening
situations such as earthquakes or hurricanes requires enormous computational power and storage
resources. In the past, these applications would be run on vector or parallel supercomputers costing
millions of dollars in order to calculate predictions well in advance of the actual events. Such
applications can be migrated to run on commodity off-the-shelf-based clusters and deliver
comparable performance at a much lower cost. In fact, in many situations expensive parallel
supercomputers have been replaced by low-cost commodity Linux clusters in order to reduce
maintenance costs and increase overall computational resources. Clusters are increasingly being
used for running commercial applications.
In a business environment, for example in a bank, many of its activities are automated.
However, a problem will arise if the server that is handling customer transactions fails. The bank’s
activities could come to a halt and customers would not be able to deposit or withdraw money from
their account. Such situations can cause a great deal of inconvenience and result in loss of business
and confidence in a bank. This is where clusters can be useful. A bank could continue to operate
even after the failure of a server by automatically isolating failed components and migrating
activities to alternative resources as a means of offering an uninterrupted service.
With the increasing popularity of the Web, computer system availability is becoming
critical, especially for e-commerce applications. Clusters are used to host many new Internet service
sites. For example, free email sites like Hotmail and search sites like HotBot (which uses Inktomi
technologies) use clusters. Cluster-based systems can be used to execute many Internet applications:
1. Web servers;
2. Search engines;
3. Email;
4. Security;
5. Proxy; and
6. Database servers.
In the commercial arena these servers can be consolidated to create what is known as an
enterprise server. The servers can be optimized, tuned and managed for increased efficiency and
responsiveness depending on the workload through various load-balancing techniques. A large
number of low-end machines (PCs) can be clustered along with storage and applications for
scalability, high availability, and performance. The leading companies building these systems are
Compaq, Hewlett-Packard, IBM, Microsoft and Sun.
Advantages:
• High performance
• Large capacity
• High availability
• Incremental growth
Disadvantages:
• Complexity
PRACTICAL-3
AIM:- To study about grid computing.
Introduction:-
Definition - What does Grid Computing mean?
Grid computing is a processor architecture that combines computer resources from various
domains to reach a main objective. In grid computing, the computers on the network can work on a
task together, thus functioning as a supercomputer. A grid works on various tasks within a network,
but it is also capable of working on specialized applications. It is designed to solve problems that
are too big for a supercomputer while maintaining the flexibility to process numerous smaller
problems. Computing grids deliver a multiuser infrastructure that accommodates the discontinuous
demands of large information processing.
Grid computing is a form of distributed computing based on the dynamic sharing of
resources between participants, organizations and companies, combining them to carry out
intensive computing applications or to process very large amounts of data.
A well-known example of grid computing in the public domain is the ongoing SETI@Home
(Search for Extraterrestrial Intelligence) project, in which thousands of people share the
unused processor cycles of their PCs in the vast search for signs of "rational" signals from outer
space.
Grid computing is applying the resources of many computers in a network to a single
problem at the same time - usually to a scientific or technical problem that requires a great number
of computer processing cycles or access to large amounts of data. Grid computing requires the use
of software that can divide and farm out pieces of a program to as many as several thousand
computers. Grid computing can be thought of as distributed and large-scale cluster computing and
as a form of network-distributed parallel processing.
Grid computing appears to be a promising trend for three reasons:-
(1) Its ability to make more cost-effective use of a given amount of computer resources.
(2) As a way to solve problems that can't be approached without an enormous amount of computing
power.
(3) Because it suggests that the resources of many computers can be cooperatively and perhaps
synergistically harnessed and managed as a collaboration toward a common objective.
Types of Grid:-
Computational grid:- A computational grid is focused on setting aside resources specifically for
computing power. In this type of grid, most of the machines are high-performance servers.
Scavenging grid:- A scavenging grid is most commonly used with large numbers of desktop
machines. Machines are scavenged for available CPU cycles and other resources. Owners of the
desktop machines are usually given control over when their resources are available to participate in
the grid.
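The scavenging idea can be sketched with a toy scheduler (machine names and idle flags are made up): work units are handed out round-robin, but only to desktops whose owners have left them idle.

```python
# Hypothetical desktop pool; True = owner has left the machine idle.
machines = {"desktop-0": True, "desktop-1": False, "desktop-2": True,
            "desktop-3": True, "desktop-4": False}
work_units = list(range(12))

# Scavenge: distribute work units round-robin over the idle machines only.
idle = [name for name, is_idle in machines.items() if is_idle]
schedule = {name: work_units[i::len(idle)] for i, name in enumerate(idle)}
print(schedule)
# {'desktop-0': [0, 3, 6, 9], 'desktop-2': [1, 4, 7, 10], 'desktop-3': [2, 5, 8, 11]}
```

A real scavenging grid (e.g. SETI@Home) would re-run this selection continuously, withdrawing work when an owner reclaims a machine.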
Data grid:- A data grid is responsible for housing and providing access to data across multiple
organizations. Users are not concerned with where this data is located as long as they have access to
the data. A data grid would allow them to share their data, manage the data, and manage security
issues such as who has access to what data.
Architecture:-
Applications:-
Grid computing has many application fields. It is being used more and more systematically,
and for many reasons. The first is the improvement of performance and the reduction of costs due to
the combining of resources. The possibility of creating virtual organizations to establish
collaboration between teams with scarce and costly data and resources is another.
Scientists, who use applications that require enormous resources in terms of computing or
data processing, are large consumers of computational grids. One finds for instance many grids in
particle-physics experiments. Nor are leading industries staying behind: grids are massively present
in the automobile and aeronautical business, where digital simulation plays an important part.
In practice, grids are very useful in crash simulations, as well as for computer-aided design.
More recently, grids have emerged in other areas with the purpose of optimising company business.
The aim is to combine material resources for several services by reallocating them in a dynamic
way depending on performance peaks.
This strategy offers considerable cost cutting thanks to better management of resources,
administrative tasks and maintenance. This last application field is of particular interest to France
Telecom, as we shall see.
Advantages:-
● Can solve larger, more complex problems in a shorter time
● Easier to collaborate with other organizations
● Make better use of existing hardware
Disadvantages:-
● Grid software and standards are still evolving
● Learning curve to get started
● Non-interactive job submission
Conclusion:-
Grid computing provides a framework and deployment platform that enables resource
sharing, accessing, aggregation, and management in a distributed computing environment based on
system performance, users' quality of services, as well as emerging open standards, such as Web
services. This is the era of service computing. Grid computing technologies are entering the
mainstream of computing in the financial services industry. Slowly but surely, grid computing is
changing the face of computing in our industry. Just as the Internet provided a means for explosive
growth in information sharing, grid computing provides an infrastructure leading to explosive
growth in the sharing of computational resources. This is making possible functionality that was
previously unimaginable: near-real-time portfolio rebalancing scenario analysis, risk analysis
models of seemingly limitless complexity, and content distribution with heretofore unparalleled
speed and efficiency.
PRACTICAL-4
AIM:- To study the ClusterSim simulation tool for cluster computing.
Introduction:-
Nowadays, clusters of workstations are widely used in academic, industrial and
commercial areas. Usually built with “commodity-off-the-shelf” hardware components and
freeware or shareware available on the web, they are a low-cost and high-performance
alternative to supercomputers. The performance analysis of different parallel job
scheduling algorithms, interconnection networks and topologies, heterogeneous nodes and
parallel jobs on real clusters requires: a long time to develop and change software; a high
financial cost to acquire new hardware; a controllable and stable environment; less
intrusive performance analysis tools etc. On the other hand, analytical modeling for the
performance analysis of clusters requires too many simplifications and assumptions.
Cluster Simulation Tool (ClusterSim)
The ClusterSim is a Java-based parallel discrete-event simulation tool for cluster
computing. It supports visual modeling and simulation of clusters and their workloads for
performance analysis. A cluster is composed of single or multi-processed nodes, parallel job
schedulers, network topologies and technologies. A workload is represented by users that
submit jobs composed of tasks described by probability distributions and their internal
structure. The main features of ClusterSim are:
● It provides a graphical environment to model clusters and their workloads.
● Its source code is available and its classes are extensible, providing a mechanism to implement new job scheduling algorithms, network topologies etc.
● A job is represented by some probability distributions and its internal structure (loop structures, CPU, I/O and MPI (communication) instructions). Thus, any parallel algorithm model and communication pattern can be simulated.
● It supports the modeling of clusters with heterogeneous or homogeneous nodes.
● Simulation entities (architectures and users) are independent threads, providing parallelism.
● Most of the collective and point-to-point MPI (Message Passing Interface) functions are supported.
● A network is represented by its topology (bus, switch etc.), latency, bandwidth, protocol overhead, error rate and maximum segment size.
● It supports different parallel job scheduling algorithms (space sharing, gang scheduling, etc.) and node scheduling algorithms (first-come-first-served (FCFS), etc.).
● It provides a statistical and performance module that calculates metrics such as mean node utilization, mean simulation time and mean job response time.
● It supports several probability distributions (Normal, Exponential, Erlang hyper-exponential, Uniform etc.) to represent the parallelism degree of the jobs and the inter-arrival time between job submissions.
● Simulation time and seed can be specified.
Architecture of the ClusterSim
The architecture of ClusterSim is divided into three layers: the graphical environment, the
entities, and the core. The first layer allows the modeling and simulation of clusters and their
workloads by means of a graphical environment. Moreover, it provides statistical and performance
data about each simulation. The second layer is composed of three entities: user, cluster and
node.
Graphical Environment
The graphical environment was implemented using Java Swing and the NetBeans 3.4.1
IDE. It is composed of a configuration and simulation execution interface, three
workload editors (user, job and task editors) and three architecture editors (cluster, node and
processor editors). Using these tools, it is possible to model, execute, save and modify
simulation environments and experiments. As with the ClusterSim editors, the
simulation model is divided between workload and architecture.
Based on the related works, we chose a hybrid workload model using probability
distributions to represent some parameters (parallelism degree and inter-arrival time) together
with a description of the internal structure of the jobs. The use of execution time as a parameter,
despite being found in execution logs, is valid only for a certain workload and architecture.
Moreover, it is influenced by many factors such as load, node processing power, network
overhead etc. Thus, the execution time of a job must be calculated during a simulation,
according to the simulated workload and architecture.
To avoid long execution traces, the jobs' inter-arrival time is also represented by a
probability distribution. Moreover, exponential and Erlang hyper-exponential distributions
are widely used in the academic community to represent job inter-arrival times.
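A small sketch of such a workload generator, using Python's standard library and an assumed mean inter-arrival time of 30 seconds (the figure is illustrative, not taken from ClusterSim):

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

MEAN_INTERARRIVAL = 30.0  # assumed mean gap between job submissions (s)

# Draw 5 inter-arrival gaps from an exponential distribution and accumulate
# them into absolute submission times, as a workload generator would.
gaps = [random.expovariate(1.0 / MEAN_INTERARRIVAL) for _ in range(5)]
arrivals = []
t = 0.0
for g in gaps:
    t += g
    arrivals.append(t)
print(arrivals)
```

Swapping `expovariate` for a hyper-exponential mixture would give the burstier arrival pattern the Erlang hyper-exponential model captures.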
Statistical and Performance Module
For each executed simulation, the statistical and performance module of ClusterSim
creates a log with the calculation of several metrics. The main calculated metrics are: mean
job and task response time; wait, submission, start and end time of each task; mean job
slowdown; mean node utilization; mean job reaction time and others.
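These metrics can be illustrated with a few made-up job records, using the usual definitions (response time = end - submission; slowdown = response time / run time):

```python
# Toy job records (times in seconds); all numbers are invented for illustration.
jobs = [
    {"submit": 0.0,  "start": 0.0,  "end": 40.0},
    {"submit": 10.0, "start": 40.0, "end": 70.0},
    {"submit": 20.0, "start": 70.0, "end": 90.0},
]

response = [j["end"] - j["submit"] for j in jobs]      # 40, 60, 70
wait     = [j["start"] - j["submit"] for j in jobs]    # 0, 30, 50
runtime  = [j["end"] - j["start"] for j in jobs]       # 40, 30, 20
slowdown = [r / t for r, t in zip(response, runtime)]  # 1.0, 2.0, 3.5

mean_response = sum(response) / len(response)
mean_slowdown = sum(slowdown) / len(slowdown)
print(mean_response, mean_slowdown)
```

Note how a short job that waited long (the third record) dominates the mean slowdown even though its response time is not the largest relative outlier.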
ClusterSim’s Entities
Each entity has specific functions in a simulation environment. The user entity is
responsible for submitting a certain number of jobs to the cluster following a pattern of
arrival interval. Besides, each job type has a specific probability of being submitted to the
cluster entity. This submission is made through the generation of a job arrival event. When
the cluster receives this event, it should decide to which nodes the tasks of a job should be
directed. So, there is a job management system scheduler that implements certain parallel
job scheduling algorithms. Other important classes belonging to the cluster entity are: MPI
manager, single system image and network.
The single system image works as an operating system of the cluster, receiving and
directing events to the classes responsible for their treatment. Besides, it periodically
generates the end-of-time-slice event to indicate to the node schedulers that another time
slice has ended.
The figure shows the event exchange diagram of ClusterSim, detailing the
interaction among the user, cluster and node entities. To simplify the diagram, some classes
were omitted.
A cluster entity is composed of several node entities. When receiving a job arrival
event, by means of the node scheduler class, the node entity puts the tasks destined to it into
a queue. On each clock tick, the scheduler is called to execute tasks on the processors of the
node. As each task is composed of CPU, I/O and MPI instructions, an event is generated at
the end of each of those macro instructions. A quantum is attributed to each task. When a
quantum expires, the processor generates an end-of-quantum event so that the node scheduler
can execute the necessary actions (change priorities, remove the task from the head of the
queue etc.). When the processor executes all the instructions of a task, an end-of-task event
is generated for the node scheduler.
ClusterSim’s Core
The core is composed of the JSDESLib (Java Simple Discrete Event Simulation
Library), a multithreaded discrete-event simulation library in Java, developed by our group,
whose main objective is to simplify the development of discrete-event simulation
tools.
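The essence of such a discrete-event core can be sketched in a few lines of Python (the real JSDESLib is Java and multithreaded; this single-threaded sketch shows only time-ordered event dispatch, with event names borrowed from the text):

```python
import heapq

class Simulator:
    """Minimal discrete-event loop: events carry a timestamp and a handler,
    and are always processed in global simulated-time order."""

    def __init__(self):
        self.clock = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so simultaneous events never compare handlers

    def schedule(self, delay, handler):
        heapq.heappush(self._queue, (self.clock + delay, self._seq, handler))
        self._seq += 1

    def run(self):
        while self._queue:
            self.clock, _, handler = heapq.heappop(self._queue)
            handler(self)  # handlers may schedule further events

log = []
sim = Simulator()
sim.schedule(5.0, lambda s: log.append(("job_arrival", s.clock)))
sim.schedule(2.0, lambda s: log.append(("end_of_time_slice", s.clock)))
sim.run()
print(log)  # [('end_of_time_slice', 2.0), ('job_arrival', 5.0)]
```

Although scheduled second, the end-of-time-slice event fires first because the loop orders by timestamp, not by insertion order.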
Verification and Validation of the ClusterSim
To verify and test ClusterSim, we simulated a simple workload composed of two jobs
and compared the results with an analytical analysis or manual execution of the same workload. In
the figure, each graph represents a job, where the nodes are the tasks and the edges indicate
exchanges of messages between the tasks. The value on each node and edge indicates the time
spent in seconds to perform the processing (CPU instructions) and communication. For
example, Job 2 represents a farm of processes or tasks, in which the master task sends data
to the slaves. They process these data and return them to the master process. As
ClusterSim does not use execution time as an input parameter, the execution time of the
jobs was converted into CPU instructions and bytes sent.
Simulation Results
To analyze the use of ClusterSim, we modeled, simulated and analyzed a case study
composed of 12 workloads and 12 clusters. Due to the limited number of pages, we will
only show the analysis based on one metric: mean nodes utilization.
Simulation Setup
1). Clusters
The clusters are composed of 16 nodes and a front-end node interconnected by a Fast
Ethernet switch. Each node has a Pentium III 1 GHz (0.938 GHz real frequency) processor.
In Table 4, we show the main values of the clusters features, obtained from benchmarks
and performance libraries (Sandra 2003, PAPI 2.3 etc.).
Clusters features and their respective values
2). Workloads
In ClusterSim, a workload is composed of a set of jobs represented by their types,
internal structures, submission probabilities, and inter-arrival time distributions. Due to the
lack of information about the internal structure of the jobs, we decided to create a synthetic
set of jobs. In the workload jobs, at each iteration, the master task sends a different message
to each slave task. In turn, the slaves process a certain number of instructions, according to
the previously defined granularity, and then return a message to the master task. The total
number of instructions to be processed by the job and the size of the messages are divided
equally among the slave tasks.
With regard to the degree of parallelism, which is represented by a probability
distribution, we considered jobs with between 1 and 4 tasks as having a low parallelism
degree and those with between 5 and 16 tasks as having a high parallelism degree. As usual,
we used a uniform distribution to represent the parallelism degree. Combining the
parallelism level, number of instructions, and granularity characteristics, we obtained 8
different basic job types.
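The sampling described above can be sketched as follows (a hypothetical illustration, not ClusterSim's code): the parallelism degree is drawn uniformly from the low or high range, and the 8 basic job types are the combinations of the three two-valued characteristics.

```python
import itertools
import random

def sample_parallelism(kind, rng):
    """Draw a job's number of tasks from a uniform distribution:
    'low' -> 1..4 tasks, 'high' -> 5..16 tasks."""
    lo, hi = (1, 4) if kind == "low" else (5, 16)
    return rng.randint(lo, hi)

# The 8 basic job types combine parallelism x instructions x granularity.
job_types = list(itertools.product(["low", "high"],
                                   ["few_instr", "many_instr"],
                                   ["fine", "coarse"]))

rng = random.Random(42)
degrees = [sample_parallelism("high", rng) for _ in range(5)]
```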
Results Presentation and Analysis
In this section, we present and analyze the performance of the clusters and their gang
scheduling algorithms. To analyze them, we compare clusters in which one gang scheduling
component is varied while the others are fixed. In the figure, we present the mean nodes
utilization for all workloads and clusters. Considering the packing schemes, when the
multiprogramming level is unlimited, first fit is better for the HL and LH workloads.
Initially, the best fit scheme finds the best slot for a job, but in the long term this
decision may prevent new jobs from entering more appropriate slots. In the case of the HL
and LH workloads, this chance increases, because the long jobs (with a low parallelism
degree) that remain after the execution of the short jobs (with a high parallelism degree) will
probably occupy columns in common, making it difficult to defragment the matrix. On the
other hand, first fit initially makes the matrix more fragmented and increases the
multiprogramming level, but in the long term it makes the matrix easier to defragment,
because the jobs have fewer columns in common. In the other cases, the best fit scheme
presents slightly better performance. In general, both packing schemes have equivalent
performance.
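The difference between the two packing schemes can be sketched on a simplified model in which the scheduling matrix is reduced to a list of free-column counts per time slot (a toy illustration, not the paper's algorithm): first fit takes the first slot that can hold the job, while best fit takes the feasible slot that leaves the fewest columns free.

```python
def first_fit(slots, need):
    """Return the index of the first slot with enough free node columns."""
    for i, free in enumerate(slots):
        if free >= need:
            return i
    return None

def best_fit(slots, need):
    """Return the index of the feasible slot leaving the fewest free columns."""
    feasible = [(free - need, i) for i, free in enumerate(slots) if free >= need]
    return min(feasible)[1] if feasible else None

slots = [8, 3, 4, 16]            # free node columns per time slot
assert first_fit(slots, 4) == 0  # first slot that fits
assert best_fit(slots, 4) == 2   # tightest fit (leaves 0 columns free)
```

The contrast in the text is visible even in this toy: best fit packs tightly now, first fit leaves slack that later helps defragmentation.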
Conclusions
In this paper, we proposed, implemented, verified, validated and analyzed the
simulation tool ClusterSim. It has a graphical environment that facilitates the modeling and
creation of clusters and workloads (parallel jobs and users) to analyze their performance by
means of simulation. Its hybrid workload model (probabilistic model plus structural
description) allows the representation of real parallel jobs (instructions, loops, etc.).
Moreover, it makes the simulation more deterministic than a purely probabilistic model. The
verification and validation of ClusterSim by means of manual execution and experimental
tests showed that ClusterSim provides mechanisms to repeat and modify parameters of real
experiments in a controllable and trustworthy environment. As shown in our case study,
we can create synthetic workloads and evaluate the performance of different cluster
configurations.
Built in Java and with its source code available, the classes of ClusterSim can be
extended, allowing the creation of new network topologies, parallel job scheduling
algorithms, etc.
The main contributions of this paper are the definition, proposal, implementation,
verification, validation and analysis of ClusterSim. Its main features are: a hybrid workload
model, a graphical environment, the modeling of heterogeneous clusters, and a statistical
and performance module. As future work we highlight: a network topology editor, support
for distributed simulation, simulation of grid architectures, and the generation of statistical
and performance graphics.
PRACTICAL-5
Aim: Create an Amazon server for storage and online processing.
First of all, create an account with Amazon. The steps are as follows:
Step 1: Open the website.
Step 2: Click "Create an AWS Account".
Step 3: If not registered, first register and then log in.
Now, log in.
Step 4: Sign up for AWS services. Now insert the details of your credit/debit card.
Step 5: Verify the details and create the account.
Amazon services:-
Amazon services have the following features:
● Flexible
● Cost-effective
● Scalable
● Elastic
● Secure
Amazon provides two cloud services:
I. Amazon EC2
II. Amazon S3
Amazon EC2:-
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute
capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal
friction. It provides you with complete control of your computing resources and lets you run on
Amazon’s proven computing environment. Amazon EC2 reduces the time required to obtain and boot
new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your
computing requirements change. Amazon EC2 changes the economics of computing by allowing you to
pay only for the capacity you actually use. Amazon EC2 provides developers the tools to build
failure-resilient applications and isolate themselves from common failure scenarios.
Benefits
I. Elastic Web-Scale Computing:-
Amazon EC2 enables you to increase or decrease capacity within minutes, not hours or
days. You can commission one, hundreds or even thousands of server instances
simultaneously. Of course, because this is all controlled with web service APIs, your
application can automatically scale itself up and down depending on its needs.
II. Completely Controlled:-
You have complete control of your instances. You have root access to each one, and you can
interact with them as you would any machine. You can stop your instance while retaining
the data on your boot partition and then subsequently restart the same instance using web
service APIs. Instances can be rebooted remotely using web service APIs. You also have
access to console output of your instances.
III. Flexible Cloud Hosting Services:-
You have the choice of multiple instance types, operating systems, and software packages.
Amazon EC2 allows you to select a configuration of memory, CPU, instance storage, and
the boot partition size that is optimal for your choice of operating system and application.
For example, your choice of operating systems includes numerous Linux distributions,
and Microsoft Windows Server.
IV. Designed for use with other Amazon Web Services:-
Amazon EC2 works in conjunction with Amazon Simple Storage Service (Amazon S3),
Amazon Relational Database Service (Amazon RDS), Amazon SimpleDB and Amazon
Simple Queue Service (Amazon SQS) to provide a complete solution for computing, query
processing and storage across a wide range of applications.
V. Reliable:-
Amazon EC2 offers a highly reliable environment where replacement instances can be
rapidly and predictably commissioned. The service runs within Amazon’s proven network
infrastructure and data centers. The Amazon EC2 Service Level Agreement commitment is
99.95% availability for each Amazon EC2 Region.
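That SLA commitment translates into a concrete downtime budget, which is a one-line calculation:

```python
def max_downtime_hours(sla, period_hours=365 * 24):
    """Maximum downtime permitted by an availability SLA over a period."""
    return (1 - sla) * period_hours

# 99.95% availability allows roughly 4.38 hours of downtime per year.
yearly = max_downtime_hours(0.9995)
```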
VI. Secure:-
Amazon EC2 works in conjunction with Amazon VPC to provide security and robust
networking functionality for your compute resources.
● Your compute instances are located in a Virtual Private Cloud (VPC) with an IP range
that you specify. You decide which instances are exposed to the Internet and which
remain private.
● Security Groups and network ACLs allow you to control inbound and outbound
network access to and from your instances.
● You can connect your existing IT infrastructure to resources in your VPC using
industry-standard encrypted IPsec VPN connections.
● You can provision your EC2 resources as Dedicated Instances. Dedicated Instances
are Amazon EC2 instances that run on hardware dedicated to a single customer for
additional isolation.
If you do not have a default VPC you must create a VPC and launch instances into that VPC
to leverage advanced networking features such as private subnets, outbound security group
filtering, network ACLs, Dedicated Instances, and VPN connections.
VII. Inexpensive:-
Amazon EC2 passes on to you the financial benefits of Amazon's scale. You pay a very
low rate for the compute capacity you actually consume. See Amazon EC2 Instance
Purchasing Options for a more detailed description.
● On-Demand Instances
● Reserved Instances
● Spot Instances
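The trade-off between these purchasing options comes down to simple arithmetic; the rates below are made-up illustrations, not real AWS prices:

```python
def on_demand_cost(hourly_rate, hours):
    """Pay-per-use: cost scales directly with consumed hours."""
    return hourly_rate * hours

def reserved_cost(upfront, hourly_rate, hours):
    """Reserved: an upfront fee buys a lower hourly rate."""
    return upfront + hourly_rate * hours

# Illustrative (not real AWS) prices for one instance running all year:
hours = 365 * 24
od = on_demand_cost(0.10, hours)       # 876.0
rv = reserved_cost(300, 0.04, hours)   # ~650.4
```

Reserved wins for steady full-time use; On-Demand wins for bursty or short-lived workloads, which is exactly the "pay only for what you use" economics described above.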
Amazon S3:-
Amazon Simple Storage Service (Amazon S3) provides developers and IT teams with secure, durable,
highly-scalable object storage. Amazon S3 is easy to use, with a simple web service interface to store
and retrieve any amount of data from anywhere on the web. With Amazon S3, you pay only for the
storage you actually use. There is no minimum fee and no setup cost.
Amazon S3 offers a range of storage classes designed for different use cases including Amazon S3
Standard for general-purpose storage of frequently accessed data, Amazon S3 Standard - Infrequent
Access (Standard - IA) for long-lived but less frequently accessed data, and Amazon Glacier for long-term archive. Amazon S3 also offers configurable lifecycle policies for managing your data throughout
its lifecycle. Once a policy is set, your data will automatically migrate to the most appropriate storage
class without any changes to your applications.
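A lifecycle policy of this kind can be expressed, for example, in the shape that boto3's `put_bucket_lifecycle_configuration` expects; the bucket, prefix, and day thresholds here are invented examples:

```python
# The rule shape boto3's put_bucket_lifecycle_configuration expects;
# the prefix and day thresholds are made-up examples.
lifecycle = {
    "Rules": [{
        "ID": "archive-logs",
        "Filter": {"Prefix": "logs/"},
        "Status": "Enabled",
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
            {"Days": 365, "StorageClass": "GLACIER"},     # long-term archive
        ],
    }]
}

# Applying it requires AWS credentials, so it is shown for context only:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
```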
Amazon S3 can be used alone or together with other AWS services such as Amazon Elastic Compute
Cloud (Amazon EC2) and AWS Identity and Access Management (IAM), as well as third-party storage
repositories and gateways. Amazon S3 provides cost-effective object storage for a wide variety of use
cases including cloud applications, content distribution, backup and archiving, disaster recovery, and big
data analytics.
Benefits
I. Durable:-
Amazon S3 provides durable infrastructure to store important data and is designed for
durability of 99.999999999% of objects. Your data is redundantly stored across multiple
facilities and on multiple devices in each facility.
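Eleven nines of durability can be turned into an expected annual loss figure with a one-line calculation:

```python
def expected_losses(n_objects, durability=0.99999999999):
    """Expected number of objects lost per year given annual durability."""
    return n_objects * (1 - durability)

# For 10 million stored objects this is ~0.0001 losses/year,
# i.e. on average a single object every 10,000 years.
loss = expected_losses(10_000_000)
```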
II. Low Cost:-
Amazon S3 allows you to store large amounts of data at a very low cost. Using lifecycle
management, you can set policies to automatically migrate your data to Standard - Infrequent
Access and Amazon Glacier as it ages to further reduce costs. You pay for what you need,
with no minimum commitments or up-front fees.
III. Available:-
Amazon S3 Standard is designed for up to 99.99% availability of objects over a given year
and is backed by the Amazon S3 Service Level Agreement, ensuring that you can rely on it
when needed. You can also choose an AWS region to optimize for latency, minimize costs,
or address regulatory requirements.
IV. Secure:-
Amazon S3 supports data transfer over SSL and automatic encryption of your data once it is
uploaded. You can also configure bucket policies to manage object permissions and control
access to your data using AWS Identity and Access Management (IAM).
V. Scalable:-
With Amazon S3, you can store as much data as you want and access it when needed. You
can stop guessing your future storage needs and scale up and down as required, dramatically
increasing business agility.
VI. Send Event Notifications:-
Amazon S3 can send event notifications when objects are uploaded to Amazon S3. Amazon
S3 event notifications can be delivered using Amazon SQS or Amazon SNS, or sent directly
to AWS Lambda, enabling you to trigger workflows, alerts, or other processing. For
example, you could use Amazon S3 event notifications to trigger transcoding of media files
when they are uploaded, processing of data files when they become available, or
synchronization of Amazon S3 objects with other data stores.
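A receiving AWS Lambda function sees a JSON event listing the affected objects. A minimal sketch of such a handler follows; the bucket name and object key are invented examples, and the sample event is trimmed down to the fields actually used:

```python
def handler(event, context=None):
    """A minimal AWS Lambda-style handler for S3 event notifications:
    extracts (bucket, key) pairs from each record."""
    uploads = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        uploads.append((s3["bucket"]["name"], s3["object"]["key"]))
    return uploads

# A trimmed-down sample of the event S3 delivers on upload:
sample = {"Records": [{
    "eventName": "ObjectCreated:Put",
    "s3": {"bucket": {"name": "media-in"},
           "object": {"key": "videos/clip.mp4"}},
}]}
# handler(sample) -> [("media-in", "videos/clip.mp4")]
```

A transcoding workflow, for instance, would kick off a job for each returned (bucket, key) pair.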
VII. High Performance:-
Amazon S3 supports multi-part uploads to help maximize network throughput and
resiliency, and lets you choose the AWS region to store your data close to the end user and
minimize network latency. And Amazon S3 is integrated with Amazon CloudFront, a
content delivery web service that distributes content to end users with low latency, high data
transfer speeds, and no minimum usage commitments.
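The part-splitting behind a multi-part upload can be sketched as follows; the 8 MB part size is an arbitrary example (S3 requires every part except the last to be at least 5 MB):

```python
def split_parts(total_size, part_size=8 * 1024 * 1024):
    """Byte ranges for a multi-part upload: every part except the
    last is exactly part_size bytes."""
    parts = []
    start = 0
    while start < total_size:
        end = min(start + part_size, total_size)
        parts.append((start, end))
        start = end
    return parts

ranges = split_parts(20 * 1024 * 1024)   # a 20 MB object -> 3 parts
```

Each range can then be uploaded in parallel, which is where the throughput and resiliency benefit comes from.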
VIII. Integrated:-
Amazon S3 is integrated with other AWS services to simplify uploading and downloading
data from Amazon S3 and make it easier to build solutions that use a range of AWS
services. Amazon S3 integrations include Amazon CloudFront, Amazon CloudWatch,
Amazon Kinesis, Amazon RDS, Amazon Glacier, Amazon EBS, Amazon DynamoDB,
Amazon Redshift, Amazon Route 53, Amazon EMR, Amazon VPC, Amazon KMS, and
AWS Lambda.
IX. Easy to use:-
Amazon S3 is easy to use, with a web-based management console and mobile app, and full
REST APIs and SDKs for easy integration with third-party technologies.
PRACTICAL-6
AIM: Install Eucalyptus in a virtual machine using VMware.
Launch VMware Workstation and follow the simple steps as shown below:
Select "Create New Virtual Machine" option
This will bring up the following dialog box. Select "Typical" and hit Next to continue.
Browse to the downloaded "Eucalyptus Faststart ISO" image and click Next when done.
Next, select "Linux" as the Guest Operating System and "CentOS 64-bit" as the version. This
Faststart ISO is based on CentOS 6.4 version and is a 64 Bit ISO.
Provide a suitable "Name" for your Virtual Machine. You can optionally browse the location where
you want to save your VM files.
Provide a disk size of at least 100 GB and select "store the virtual disk as a single file". Click
Next to continue.
In the next dialog, hit "Customize Hardware" to increase the RAM and CPU for your VM.
You can optionally remove unwanted devices such as the Printer, USB controller, etc. if you
don't require them. Click OK when done.
You are now ready to "Power ON" your Frontend Controller VM.
Once you power on your VM, the following boot screen appears with a few options. We need
to select the "Install CentOS 6 with Eucalyptus Frontend" option.
You will be prompted to run a "Disk Check Utility". Skip it for now.
Now we will be walked through a simple step-by-step installer to set up our Node Controller.
Click Next to begin
Select your appropriate "Language". Click Next to continue.
Select the appropriate "Keyboard" for your system. Click Next
You will be prompted to format your current disk. Select "Yes, discard any data"
In the next prompt, provide a suitable hostname for your Node Controller (in this case, euca-frontend). Fill in the static IP details for your VM as shown below.
NOTE: It is not recommended to use a DHCP network for any of the Eucalyptus components.
Always provide static IPs.
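For reference, a static address on CentOS 6 is configured in the interface file; a sketch with example addresses (adjust all values to your own network):

```
# /etc/sysconfig/network-scripts/ifcfg-eth0  (CentOS 6; addresses are examples)
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.50
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
```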
Select your nearest "city" for timezone settings. Click Next to continue.
Provide a suitable "Root password" for your system. You may get a warning that you are using a
weak password, you can ignore it and select "Use anyway" to continue. (But don't ignore it on
Production servers !!)
In the next dialog, you need to provide the "Public IP Range/List". This will be used as the
IP range for your Eucalyptus machine instances.
You can set what type of installation you want for your VM. I generally choose "Use all
space", but you can optionally provide your own partition layout if you want.
"Write changes to disk" when prompted. This will now begin the installation process.
The installation will take couple of minutes to complete..
Once your Frontend installation completes, Faststart will automatically create a Eucalyptus
Machine Image (EMI). This will be used later to deploy instances in our Private Cloud.
Once the installation completes, you will be asked to "Reboot" your system. Reboot it.
Once the VM reboots, there will be a lot of configuration going on. You will see Eucalyptus
services starting up as well.
Next, you will be asked to configure the Frontend. Click Forward to continue.
Accept the "License Information" and click Forward.
Next, you will be asked to provide the Node Controller information. You need to provide each
Node Controller's IP address, separated by spaces. When you click on Forward, you will need
to provide each node's root password.
Create a "User" account for your Frontend. Click Forward once done.
You need to "sync date and time" over the network. You can optionally provide your own NTP
server settings if your Eucalyptus Frontend is on an isolated network.
And there you have it !! Your configuration is now done.. Note down the User Consoleand Admin
Console credentials before you move forward.
You can now launch any browser, type in the credentials and view the Admin and User consoles
respectively.
That's it for now !! In the NEXT tutorial we will be looking at the User Console and how to go
about launching your very first Eucalyptus instance. Stay tuned for much more !
PRACTICAL-7
AIM:- To study various applications of IaaS, PaaS and SaaS.
SaaS: Software as a Service:-
Cloud application services, or Software as a Service (SaaS), represent the largest cloud market and
are still growing quickly. SaaS uses the web to deliver applications that are managed by a
third-party vendor and whose interface is accessed on the client side. Most SaaS applications can
be run directly from a web browser, without any downloads or installations required, although
some require plugins.
Because of the web delivery model, SaaS eliminates the need to install and run applications on
individual computers. With SaaS, it’s easy for enterprises to streamline their maintenance and
support, because everything can be managed by vendors: applications, runtime, data, middleware,
OSes, virtualization, servers, storage and networking.
Popular SaaS offering types include email and collaboration, customer relationship management,
and healthcare-related applications. Some large enterprises that are not traditionally thought of as
software vendors have started building SaaS as an additional source of revenue in order to gain a
competitive advantage.
SaaS Examples:
● Google Apps
● Salesforce
● Workday
● Concur
● Citrix GoToMeeting
● Cisco WebEx
Common SaaS Use-Case: Replaces traditional on-device software
Technology Analyst Examples: Bill Pray (Gartner), Amy DeMartine (Forrester)
PaaS: Platform as a Service:-
Cloud platform services, or Platform as a Service (PaaS), provide a framework for application
development while supplying cloud components to software. What developers gain with PaaS is a
framework they can build upon to develop or customize applications. PaaS makes the development,
testing, and deployment of applications quick, simple, and cost-effective. With this technology,
enterprise operations, or a third-party provider, can manage OSes, virtualization, servers, storage,
networking, and the PaaS software itself. Developers, however, manage the applications.
Enterprise PaaS provides line-of-business software developers a self-service portal for managing
computing infrastructure from centralized IT operations and the platforms that are installed on top
of the hardware. The enterprise PaaS can be delivered through a hybrid model that uses both public
IaaS and on-premise infrastructure or as a pure private PaaS that only uses the latter.
Similar to the way in which you might create macros in Excel, PaaS allows you to create
applications using software components that are built into the PaaS (middleware). Applications
using PaaS inherit cloud characteristics such as scalability, high availability, multi-tenancy, SaaS
enablement, and more. Enterprises benefit from PaaS because it reduces the amount of coding
necessary, automates business policy, and helps migrate apps to a hybrid model. For the needs of
enterprises and other organizations, Apprenda is one provider of a private cloud PaaS for .NET and
Java.
Enterprise PaaS Examples: Apprenda
Common PaaS Use-Case: Increases developer productivity and utilization rates while also
decreasing an application’s time-to-market
Technology Analyst Examples: Richard Watson (Gartner), Eric Knipp (Gartner), Yefim Natis
(Gartner), Stefan Ried (Forrester), John Rymer (Forrester)
IaaS: Infrastructure as a Service
Cloud infrastructure services, known as Infrastructure as a Service (IaaS), are self-service models
for accessing, monitoring, and managing remote datacenter infrastructures, such as compute
(virtualized or bare metal), storage, networking, and networking services (e.g. firewalls). Instead of
having to purchase hardware outright, users can purchase IaaS based on consumption, similar to
electricity or other utility billing.
Compared to SaaS and PaaS, IaaS users are responsible for managing applications, data, runtime,
middleware, and OSes. Providers still manage virtualization, servers, hard drives, storage, and
networking. Many IaaS providers now offer databases, messaging queues, and other services above
the virtualization layer as well. Some tech analysts draw a distinction here and use the IaaS+
moniker for these other options. What users gain with IaaS is infrastructure on top of which they
can install any required platform. Users are responsible for updating these if new versions are
released.
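The split of responsibilities across the three models described above can be summarized programmatically. The layer ordering follows the common stack diagram and is a simplification (e.g. data ownership under PaaS varies by provider):

```python
# Stack layers from most application-facing to most infrastructure-facing.
LAYERS = ["applications", "data", "runtime", "middleware", "os",
          "virtualization", "servers", "storage", "networking"]

def managed_by(model):
    """Map each layer to who manages it under a given service model:
    SaaS -> the provider manages everything; PaaS -> the user manages
    apps and data; IaaS -> the user manages everything down to the OS."""
    user_layers = {"saas": 0, "paas": 2, "iaas": 5}[model]
    return {layer: ("user" if i < user_layers else "provider")
            for i, layer in enumerate(LAYERS)}

iaas = managed_by("iaas")
# iaas["os"] == "user", iaas["virtualization"] == "provider"
```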
IaaS Examples:
● Amazon Web Services (AWS)
● Microsoft Azure
● Google Compute Engine (GCE)
● Joyent
Common IaaS Use-Case: Extends current data center infrastructure for temporary workloads
(e.g. increased Christmas holiday site traffic).
Technology Analyst Examples: Kyle Hilgendorf (Gartner), Drue Reeves (Gartner), Lydia
Leong (Gartner), Doug Toombs (Gartner), Gregor Petri (Gartner EU), Tiny Haynes (Gartner
EU), Jeffery Hammond (Forrester), James Staten (Forrester)