HPC CLUSTER Notes
HPC CLUSTER, VIRTUALIZATION AND CLOUD COMPUTING
HPC
High-performance computing (HPC) is the ability to process data and perform complex
calculations at high speeds. To put it into perspective, a laptop or desktop with a 3 GHz
processor can perform around 3 billion calculations per second. While that is much faster than
any human can achieve, it pales in comparison to HPC solutions that can perform quadrillions of
calculations per second.
One of the best-known types of HPC solutions is the supercomputer. A supercomputer contains
thousands of compute nodes that work together to complete one or more tasks. This is called
parallel processing. It’s similar to having thousands of PCs networked together, combining
compute power to complete tasks faster.
High-performance computing (HPC) is the use of supercomputers and parallel processing
techniques for solving complex computational problems. The term HPC is occasionally used as a
replacement for supercomputing, although technically a supercomputer is a system that performs
at or near the currently highest operational rate for computers.
A supercomputer is a computer with a high level of performance compared to a general-purpose computer.
Parallel computing is a type of computing architecture in which several processors execute or
process an application or computation simultaneously. Parallel computing helps in performing
large computations by dividing the workload between more than one processor, all of which
work through the computation at the same time. Most supercomputers employ parallel
computing principles to operate.
Parallel computing is also known as parallel processing.
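As a concrete illustration, here is a minimal Python sketch of the parallel processing idea: one large computation is divided into chunks, the chunks are worked on simultaneously by several processes, and the partial results are combined. The workload, chunk layout and worker count below are arbitrary choices for illustration, not something prescribed by these notes.

# Minimal sketch: divide a large computation across several worker processes.
# Real HPC codes typically use MPI across many nodes, but Python's standard
# multiprocessing module is enough to show the principle on one machine.
from multiprocessing import Pool

def partial_sum(bounds):
    """Compute one chunk of the overall workload (here, a sum of squares)."""
    start, end = bounds
    return sum(i * i for i in range(start, end))

if __name__ == "__main__":
    n = 10_000_000
    workers = 4                                  # number of processes (cores) to use
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    chunks[-1] = (chunks[-1][0], n)              # make sure the last chunk reaches n

    with Pool(processes=workers) as pool:
        results = pool.map(partial_sum, chunks)  # chunks are processed in parallel

    print("total =", sum(results))               # combine the partial results

On a real cluster the same pattern is usually expressed with a message-passing library such as MPI, so that the workers can run on different nodes rather than only on the cores of one machine.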
How Does HPC Work?
HPC solutions have three main components:
• Compute
• Network
• Storage
To build a high-performance computing architecture, compute servers are networked together
into a cluster. Software programs and algorithms are run simultaneously on the servers in the
cluster. The cluster is networked to the data storage to capture the output. Together, these
components operate seamlessly (without interruption) to complete a diverse set of tasks.
HPC technology is implemented in multidisciplinary areas including:
• Biosciences
• Geographical data
• Oil and gas industry modeling
• Electronic design automation
• Climate modeling
• Media and entertainment
The common term for a high-performance computer today is a cluster.
Cluster Computing
Cluster computing is simply two or more computers networked together to provide solutions as required. However, this idea should not be confused with the more general client-server model of computing, as the idea behind clusters is quite different.
OR
A computer cluster consists of a set of loosely or tightly connected computers that work
together so that in many respects they can be viewed as a single system
The components of a cluster are usually connected to each other through fast local area networks
(“LAN”) with each node (computer used as a server) running its own instance of an operating
system. Computer clusters emerged as a result of convergence of a number of computing trends
including the availability of low cost microprocessors, high-speed networks, and software for
high performance distributed computing. Compute clusters are usually deployed to improve
performance and availability over that of a single computer, while typically being more cost-effective than single computers of comparable speed or availability.
A cluster of computers joins the computational power of its compute nodes to provide a greater combined computational power. Therefore, rather than a simple client making requests of one or more servers as in the client-server model, cluster computing utilizes multiple machines to provide a more powerful computing environment, perhaps through a single operating system.
In its simplest form, an HPC cluster is intended to utilize parallel computing to apply more processing power to the solution of a problem.
HPC clusters will typically have a large number of computers (often called ‘nodes’) and, in
general, most of these nodes would be configured identically.
Compute nodes
A compute node is where all the computing is performed. Most of the nodes in a cluster are ordinarily compute nodes. In general, a compute node can execute one or more tasks, depending on the scheduling system.
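To make the role of the scheduling system concrete, the toy Python sketch below places queued jobs onto compute nodes that still have free task slots. The node names, slot counts and first-come, first-served policy are hypothetical; production clusters rely on batch schedulers such as Slurm or PBS rather than anything this simple.

# Toy sketch: assign queued jobs to compute nodes with free task slots.
from collections import deque

class ComputeNode:
    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots        # how many more tasks this node can run

nodes = [ComputeNode("node01", 2), ComputeNode("node02", 2)]
queue = deque(["job-A", "job-B", "job-C", "job-D", "job-E"])

placements = []
while queue:
    node = next((n for n in nodes if n.free_slots > 0), None)
    if node is None:
        break                          # cluster is full; remaining jobs stay queued
    job = queue.popleft()
    node.free_slots -= 1
    placements.append((job, node.name))

print(placements)                      # which job landed on which node
print("still queued:", list(queue))    # jobs waiting for a free slot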
Cluster Structure
It’s tempting to think of a cluster as just a bunch of interconnected machines, but when you begin
constructing a cluster, you’ll need to give some thought to the internal structure of the cluster.
This will involve deciding what roles the individual machines will play and what the
interconnecting network will look like.
CLUSTER APPROACHES
Symmetric Cluster
The simplest approach is a symmetric cluster. With a symmetric cluster, each node can function as an individual computer. This is extremely straightforward to set up: you just create a sub-network with the individual machines (or simply add the computers to an existing network) and add any cluster-specific software you'll need.
There are several disadvantages to a symmetric cluster. Cluster management and security can be more difficult, and workload distribution can become a problem, making it more difficult to achieve optimal performance.
Asymmetric Cluster
In an asymmetric cluster, one machine (the head node) sits between the users and the remaining nodes. Since all traffic must pass through the head, asymmetric clusters tend to provide a high level of security. If the remaining nodes are physically secure and your users are trusted, you'll only need to harden the head node.
The head often acts as a primary server for the remainder of the cluster. The primary disadvantage of this architecture comes from the performance limitations imposed by the cluster head. For this reason, a more powerful computer may be used for the head. While beefing up the head may be adequate for small clusters, its limitations will become apparent as the size of the cluster grows. An alternative is to incorporate additional servers within the cluster. For example, one of the nodes might function as an NFS server, a second as a management station that monitors the health of the cluster, and so on.
Cluster Architectures
Cluster architectures are classified as:
• High Performance Computing
• High Availability Computing
The choice of architecture depends on the type of application and its purpose. High availability (HA) clusters are used in mission-critical applications to provide constant availability of services to end users through multiple instances of one or more applications on many computing nodes.
High Performance Computing (HPC) clusters are built to improve throughput in order to handle
multiple jobs of various sizes and types or to increase performance.
The most common HPC clusters are used to shorten turnaround times on compute-intensive problems, either by running a job on multiple nodes at the same time or because the problem is simply too big for a single system.
The term "cluster" can take different meanings in different contexts. This article focuses on three
types of clusters:

Fail-over clusters

Load-balancing clusters

High-performance clusters
Fail-over clusters
The simplest fail-over cluster has two nodes: one stays active and the other stays on stand-by but
constantly monitors the active one. In case the active node goes down, the stand-by node takes
over, allowing a mission-critical system to continue functioning.
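The heartbeat idea behind such a two-node fail-over cluster can be sketched in a few lines of Python. The check and take-over functions are hypothetical placeholders standing in for whatever health check and service promotion a real cluster manager performs.

# Sketch: the stand-by node repeatedly checks the active node and takes over
# once several consecutive heartbeat checks have failed.
import time

MAX_MISSED = 3          # missed heartbeats tolerated before failing over
INTERVAL = 2            # seconds between heartbeat checks

def check_active():
    """Return True if the active node answers its heartbeat (placeholder)."""
    return True

def take_over():
    """Promote this stand-by node to active (placeholder)."""
    print("active node unreachable: taking over services")

def monitor():
    missed = 0
    while True:
        missed = 0 if check_active() else missed + 1
        if missed >= MAX_MISSED:
            take_over()
            break
        time.sleep(INTERVAL)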
Load-balancing clusters
Load-balancing clusters are commonly used for busy Web sites where several nodes host the
same site, and each new request for a Web page is dynamically routed to a node with a lower
load.
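A minimal Python sketch of that routing decision, assuming a hypothetical table of per-node active request counts, might look like this (real load balancers track load continuously and offer several policies, not only "fewest active requests"):

# Sketch: route each new request to the node with the fewest active requests.
active_requests = {"web01": 12, "web02": 7, "web03": 9}

def route_request():
    """Pick the least-loaded node and account for the new request."""
    node = min(active_requests, key=active_requests.get)
    active_requests[node] += 1
    return node

for _ in range(3):
    print("request routed to", route_request())   # the least-loaded node at that moment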
High-performance clusters
These clusters are used to run parallel programs for time-intensive computations and are of special interest to the scientific community. They commonly run simulations and other CPU-intensive programs that would take an inordinate amount of time to run on regular hardware.
Cluster Classifications
Clusters are classified according to the following features:
1. High Performance
2. Expandability and Scalability
3. High Throughput
4. High Availability
COMPONENTS FOR CLUSTERS
The components of clusters are the hardware and software used to build clusters and nodes. They are as follows:
• Processors
• Memory and Cache
• Disk and I/O
• System Bus
COMMON PROBLEMS FACED WITH HPC CLUSTERS:
• Incorrectly designed infrastructure
• Difficult cluster manageability
• Lack of proper monitoring tools for:
  o InfiniBand
  o Storage
  o Networking
Common uses of high-performance clusters
Almost every industry needs fast processing power. With the increasing availability of cheaper
and faster computers, more companies are interested in reaping the technological benefits. There
is no upper boundary to the needs of computer processing power; even with the rapid increase in
power, the demand is considerably more than what's available.
Life sciences research
Protein molecules are long, flexible chains that can take on a virtually infinite number of 3D
shapes. In nature, when put in a solvent, they quickly "fold" to their native states. Incorrect
folding is believed to lead to a variety of diseases like Alzheimer's; therefore, the study of protein
folding is of fundamental importance.
One way scientists try to understand protein folding is by simulating it on computers. In nature,
folding occurs quickly (in about one millionth of a second), but it is so complex that its
simulation on a regular computer could take decades. This focus area is a small one in an
industry with many more such areas, but it needs serious computational power.
Other focus areas from this industry include pharmaceutical modeling, virtual surgery training,
condition and diagnosis visualizations, total-record medical databases, and the Human Genome
Project.
Oil and gas exploration
Seismograms convey detailed information about the characteristics of the interior of the Earth
and seafloors, and an analysis of the data helps in detecting oil and other resources. Terabytes of
data are needed to reconstruct even small areas; this analysis obviously needs a lot of
computational power. The demand for computational power in this field is so great that
supercomputing resources are often leased for this work.
Other geological efforts require similar computing power, such as designing systems to predict
earthquakes and designing multispectral satellite imaging systems for security work.
Graphics rendering
Manipulating high-resolution interactive graphics in engineering, such as in aircraft engine
design, has always been a challenge in terms of performance and scalability because of the sheer
volume of data involved. Cluster-based techniques have been helpful in this area where the task
to paint the display screen is split among various nodes of the cluster, which use their graphics
hardware to do the rendering for their part of the screen and transmit the pixel information to a
master node that assembles the consolidated image for the screen.
Reasons to Use a Cluster
Clusters or combinations of clusters are used when content is critical or when services have to be available and/or processed as quickly as possible. Internet Service Providers (ISPs) and e-commerce sites often require high availability and load balancing in a scalable manner. Parallel clusters are heavily used in the film industry for rendering high-quality graphics and animations; recall that Titanic was rendered on such a platform in the Digital Domain laboratories. Beowulf clusters are used in science, engineering and finance to work on projects involving protein folding, fluid dynamics, neural networks, genetic analysis, statistics, economics, and astrophysics, among others. Researchers, organizations and companies use clusters because they need to increase their scalability, resource management, availability or processing power to supercomputing levels at an affordable price.
Reasons to Use HPC
HPC is primarily used for two reasons. First, thanks to the increased number of central
processing units (CPUs) and nodes, more computational power is available. Greater
computational power enables specific models to be computed faster, since more operations can
be performed per time unit. This is known as the speedup.
Second, in the case of a cluster, the amount of memory available normally increases in a linear
fashion with the inclusion of additional nodes. As such, larger and larger models can be
computed as the number of units grows. This is referred to as the scaled speedup. Applying such
an approach makes it possible to, in some sense, “cheat” the limitations posed by Amdahl’s law,
which considers a fixed-size problem. Doubling the amount of computational power and
memory allows for a task that is twice as large as the base task to be computed within the same
stretch of time.
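Both effects can be written down compactly. The formulas below are the standard statements of Amdahl's law (fixed-size problem) and of the scaled speedup usually attributed to Gustafson; the symbols p (the fraction of the work that can run in parallel) and N (the number of processors) are introduced here only for illustration.

% Amdahl's law: speedup of a fixed-size problem on N processors.
S_{\text{Amdahl}}(N) = \frac{1}{(1 - p) + \frac{p}{N}} \le \frac{1}{1 - p}

% Scaled speedup: the problem size grows with the machine.
S_{\text{scaled}}(N) = (1 - p) + p\,N

For example, with p = 0.95 and N = 100 processors, Amdahl's law limits the speedup of the fixed-size problem to about 16.8, while the scaled speedup of the correspondingly larger problem is 95.05.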
VIRTUALIZATION
Virtualization refers to the creation of a virtual resource such as a server, desktop, operating
system, file, storage or network.
The main goal of virtualization is to manage workloads by radically transforming traditional
computing to make it more scalable. Virtualization has been a part of the IT landscape for
decades now, and today it can be applied to a wide range of system layers, including operating
system-level virtualization, hardware-level virtualization and server virtualization.
Virtualization is a large umbrella of technologies and concepts that are meant to provide an abstract
environment — whether virtual hardware or an operating system — to run applications. The term
virtualization is often synonymous with hardware virtualization, which plays a fundamental role in
efficiently delivering Infrastructure-as-a-Service (IaaS) solutions for cloud computing. In fact,
virtualization technologies have a long trail in the history of computer science and have been available in
many flavors by providing virtual environments at the operating system level, the programming
language level, and the application level. Moreover, virtualization technologies provide a virtual
environment for not only executing applications but also for storage, memory, and networking.
Since its inception, virtualization has been sporadically explored and adopted, but in the last few years
there has been a consistent and growing trend to leverage this technology. Virtualization technologies
have gained renewed interest recently due to the confluence of several phenomena:
• Increased performance and computing capacity
• Underutilized hardware and software resources
• Lack of space
• Greening initiatives
• Rise of administrative costs
Virtualization is a broad concept that refers to the creation of a virtual version of something, whether
hardware, a software environment, storage, or a network. In a virtualized environment there are three
major components: guest, host, and virtualization layer. The guest represents the system component
that interacts with the virtualization layer rather than with the host, as would normally happen. The host
represents the original environment where the guest is supposed to be managed. The virtualization
layer is responsible for recreating the same or a different environment where the guest will operate.
Such a general abstraction finds different applications and, in turn, different implementations of virtualization technology. Virtualization is commonly implemented with hypervisor technology, a software or firmware element that can virtualize system resources.
Basic techniques used in virtualization
Emulation: This is a virtualization technique that reproduces the behavior of the computer hardware in a software program sitting in the operating system layer, which in turn lies on the hardware. Emulation provides enormous flexibility to the guest operating system, but the translation process is slow compared to a hypervisor and requires substantial hardware resources to run the software.
Virtual Machine Monitor or Hypervisor: A software layer that can monitor and virtualize the resources of a host machine according to the user's requirements. It is an intermediate layer between the operating system and the hardware. Basically, hypervisors are classified as native or hosted: a native (bare metal) hypervisor runs directly on the hardware, whereas a hosted hypervisor runs on the host operating system. The software layer creates virtual resources such as CPU, memory, storage and drivers.
Hypervisor Types
Type 1 (bare metal) hypervisors.
Type 1, or bare metal, hypervisors run directly on the server hardware. They have more control over the host hardware, thus providing better performance and security, and guest virtual machines run on top of the hypervisor layer. Several hypervisors on the market belong to this family, for example Microsoft Hyper-V, VMware vSphere ESXi Server, and Citrix XenServer.
Type 2 (hosted) hypervisors
These hypervisors run on top of the operating system as an application installed on the server, which is why they are often referred to as hosted hypervisors. In type 2 hypervisor environments, the guest virtual machines run on top of the hypervisor layer.
Looking at the type 2 hypervisor architecture, the hypervisor is installed on top of the operating system layer and therefore cannot access the hardware directly. This lack of direct access to the host's hardware increases overhead for the hypervisor; as a result, on the same hardware you can run more virtual machines on a type 1 hypervisor than on a type 2 hypervisor.
The second major disadvantage of type 2 hypervisors is that the hypervisor typically runs as a service of the host operating system (for example, a Windows service); if this service is killed, the virtualization platform is no longer available. Examples of type 2 hypervisors are Microsoft Virtual Server, Virtual PC, and VMware Player/VMware Workstation.
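As a small illustration of how management software interacts with a hypervisor, the Python sketch below lists the guest virtual machines known to a local hypervisor through libvirt. It assumes the libvirt Python bindings are installed and that a KVM/QEMU hypervisor is reachable at the qemu:///system URI; both are assumptions made for this example, not something stated in these notes.

# Sketch: ask a local hypervisor (via libvirt) which guest VMs it knows about.
import libvirt

conn = libvirt.open("qemu:///system")      # connect to the local hypervisor (assumed URI)
for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "stopped"
    print(dom.name(), state)
conn.close()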
Para-Virtualization: This technique provides special hypercalls that substitute for instructions of the host machine's instruction set architecture. It relies on communication between the hypervisor and the guest operating system to improve efficiency and performance. Accessing resources under para-virtualization is better than under the full virtualization model, since in full virtualization all resources must be emulated. The drawback of this technique is that the kernel of the guest operating system must be modified to use hypercalls, so this model is only suitable for open-source operating systems.
Para-virtualization does not emulate the hardware environment in software; instead, it organizes access to the hardware resources on behalf of the virtual machine. This type of hardware virtualization offers potential performance benefits when a guest operating system runs in the virtualized environment with modifications made to the guest being virtualized.
Para-virtualization
Full Virtualization: In this type of virtualization, a complete look-alike of the real hardware is virtualized to allow the software (the guest operating system) to function without any modification, as shown below.
Full virtualization
Types of virtualization
Virtualization is a broad term that addresses a wide range of core computing elements, but here we will mainly discuss three major types of virtualization, which are as follows:
• Server virtualization
• Client/desktop virtualization
• Storage virtualization
Virtualization types
Server Virtualization: In server virtualization, a single server performs the task of multiple servers by partitioning the resources of an individual server across multiple environments. The hypervisor layer allows multiple applications and operating systems to be hosted locally or remotely. The advantages of virtualization include cost savings, lower capital expenses, high availability and efficient use of resources.
Virtualization helps organizations consolidate various IT workloads so that they run independently on a single piece of hardware. This consolidation mainly achieves cost reduction by getting rid of dedicated hardware that runs only one application. Beyond the cost factor, server consolidation greatly helps to ensure that existing hardware does not stay underutilized and is used in a more productive way.
Client / Desktop Virtualization: Client virtualization technology enables the system administrator to virtually monitor and update client machines such as workstation desktops, laptops and mobile devices. It improves the management of client machines and enhances security to defend against hackers and cybercriminals. There are three types of client virtualization. First, remote or server-hosted virtualization, which is hosted on a server machine and operated by the client across a network. Second, local or client-hosted virtualization, in which the secured and virtualized operating environment runs on the local machine. Third, application virtualization, which provides ways to run an application outside the traditional manner; here an isolated virtualized environment or partitioning technique is used to run the application.
Storage Virtualization: This creates an abstraction of logical storage from physical storage. Three kinds of data storage are used in virtualization: DAS (Direct Attached Storage), NAS (Network Attached Storage) and SAN (Storage Area Network). DAS is the conventional method of data storage, where storage drives are directly attached to the server machine. NAS is a shared storage mechanism that connects through the network; it is used for file sharing, device sharing and backup storage among machines. A SAN is storage that is shared with different servers over a high-speed network. The hypervisor is the software package that controls access to the physical hardware of the host machine. There are two kinds of hypervisor models, hosted and bare metal (native): a hosted hypervisor instance operates on top of the host operating system, whereas a bare metal hypervisor operates directly on the hardware of the host machine.
Benefits of Virtualization
• Sharing of resources helps cost reduction
• Isolation: Virtual machines are isolated from each other as if they are physically separated
• Encapsulation: Virtual machines encapsulate a complete computing environment
• Hardware Independence: Virtual machines run independently of underlying hardware
• Portability: Virtual machines can be migrated between different hosts
Advantages of Virtualization
Security: A security breach on one virtual machine does not affect the other VMs because of isolation. This is achieved by giving each guest machine its own compact environment with separate security measures.
Reliability and Availability: When there’s a software failure in one virtual machine or guest
machine, it doesn’t affect other virtual machines.
• Cost: Virtualization is cost-effective because it consolidates small servers into a more powerful server. The cost effectiveness of virtualization extends to hardware, operations (manpower), floor space, and software licenses. The cost reduction created by virtual machines ranges from 29% to 64%.
• Adaptability to Workload Variations: When the workload changes or varies, resource and priority allocations can easily be shifted among virtual machines to match. Processors can also be moved from one virtual machine to another.
• Load Balancing: The software state of a VM is fully captured by the hypervisor, which makes it possible to migrate an entire virtual machine to another platform and so improves load balancing.
• Legacy Applications: Virtualization enables legacy applications to keep running on an old OS inside a guest machine. For example, if an enterprise decides to migrate to a different OS, it is possible to keep the old legacy applications running on the old OS in a VM or guest machine.
CLOUD COMPUTING
Nowadays we are witnessing the emergence of different technologies that enhance business and personal use; the most important one, especially from an enterprise perspective, is cloud computing. Cloud computing is a service model for IT provisioning, often based on virtualization and distributed computing technologies.
Cloud Basic Services
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction.
Cloud Types
Most people separate cloud computing into two distinct sets of models:
A. Deployment models: This refers to the location and management of the cloud's
infrastructure.
B. Service models: This consists of the particular types of services that you can access on a
cloud computing platform.
DEPLOYMENT MODELS
A deployment model defines the purpose of the cloud and the nature of how the cloud is located.
The NIST definition for the four deployment models is as follows:
1. Public cloud: The public cloud infrastructure is available for open public use or, alternatively, for a large industry group, and is owned by an organization selling cloud services.
2. Private cloud: The private cloud infrastructure is operated for the exclusive use of an
organization. The cloud may be managed by that organization or a third party. Private clouds
may be either on- or off-premises.
3. Hybrid cloud: A hybrid cloud combines multiple clouds (private, community, or public)
where those clouds retain their unique identities, but are bound together as a unit. A hybrid cloud
may offer standardized or proprietary access to data and applications, as well as application
portability.
4. Community cloud: A community cloud is one where the cloud has been organized to serve a
common function or purpose. It may be for one organization or for several organizations, but
they share common concerns such as their mission, policies, security, regulatory compliance
needs, and so on. A community cloud may be managed by the constituent organization(s) or by a
third party.
SERVICE MODELS
In the deployment model, different cloud types are an expression of the manner in which
infrastructure is deployed. You can think of the cloud as the boundary between where a client's
network, management and responsibilities ends and the cloud service provider's begins. As cloud
computing has developed, different vendors offer clouds that have different services associated.
Cloud is discussed in terms of services: three service models are available at different layers;
they are known as:
1. Infrastructure as a Service: IaaS provides virtual machines, virtual storage, virtual
infrastructure and other hardware assets as resources that clients can provision. The IaaS service
provider manages all the infrastructure, while the client is responsible for all other aspects of the
deployment. This can include the operating system, applications, and user interactions with the
system.
2. Platform as a Service: PaaS provides virtual machines, operating systems, applications,
services, development frameworks, transactions, and control structures. The client can deploy its
applications on the cloud infrastructure or use applications that were programmed using
languages and tools that are supported by the PaaS service provider. The service provider
manages the cloud infrastructure, the operating systems, and the enabling software. The client is
responsible for installing and managing the application that it is deploying.
3. Software as a Service: SaaS is a complete operating environment with applications,
management and the user interface. In the SaaS model, the application is provided to the client
through a thin client interface (a browser, usually), and the customer's responsibility begins and
ends with entering and managing its data and user interaction. Everything from the application
down to the infrastructure is the vendor's responsibility.
Examples of IaaS service providers include:
• Amazon Elastic Compute Cloud (EC2)
• Eucalyptus
• GoGrid
• FlexiScale
• Linode
• RackSpace Cloud
• Terremark
Examples of PaaS services are:
• Force.com
• GoGrid CloudCenter
• Google AppEngine
• Windows Azure Platform
Examples of SaaS cloud service providers are:
• GoogleApps
• Oracle On Demand
CHARACTERISTICS
The five essential characteristics that cloud computing systems must offer are:
1. On-demand self-service: A client can provision computer resources without the need for
interaction with cloud service provider personnel.
2. Broad network access: Access to resources in the cloud is available over the network using
standard methods in a manner that provides platform-independent access to clients of all types.
This includes a mixture of heterogeneous operating systems, and thick and thin platforms such as laptops, mobile phones, and PDAs.
3. Resource pooling: A cloud service provider creates resources that are pooled together in a
system that supports multi-tenant usage. Physical and virtual systems are dynamically allocated
or reallocated as needed. Intrinsic in this concept of pooling is the idea of abstraction that hides
the location of resources such as virtual machines, processing, memory, storage, and network
bandwidth and connectivity.
4. Rapid elasticity: Resources can be rapidly and elastically provisioned. The system can add
resources by either scaling up systems (more powerful computers) or scaling out systems (more
computers of the same kind), and scaling may be automatic or manual. From the standpoint of
the client, cloud computing resources should look limitless and can be purchased at any time and
in any quantity.
5. Measured service: The use of cloud system resources is measured, audited, and reported to
the customer based on a metered system. A client can be charged based on a known metric such
as amount of storage used, number of transactions, network I/O (Input/Output) or bandwidth,
amount of processing power used, and so forth. A client is charged based on the level of services
provided.
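A trivial Python sketch of that metering idea: recorded usage is multiplied by per-unit prices to produce the charge. The metric names and prices below are made up for illustration; real providers publish their own metrics and rates.

# Sketch: compute a metered bill from recorded usage and per-unit prices.
unit_prices = {
    "storage_gb_month": 0.02,      # per GB stored for a month (hypothetical rate)
    "requests_million": 0.40,      # per million requests (hypothetical rate)
    "bandwidth_gb": 0.09,          # per GB of network I/O (hypothetical rate)
}

usage = {"storage_gb_month": 500, "requests_million": 12, "bandwidth_gb": 250}

bill = sum(unit_prices[metric] * quantity for metric, quantity in usage.items())
print(f"monthly charge: ${bill:.2f}")   # 500*0.02 + 12*0.40 + 250*0.09 = 37.30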
Cloud Computing has numerous advantages. Some of them are listed below:
• One can access applications as utilities, over the Internet.
• One can manipulate and configure the applications online at any time.
• It does not require installing software to access or manipulate a cloud application.
• Cloud Computing offers online development and deployment tools and a programming runtime environment through the PaaS model.
• Cloud resources are available over the network in a manner that provides platform-independent access to any type of client.
• Cloud Computing offers on-demand self-service: resources can be used without interaction with the cloud service provider.
• Cloud Computing is highly cost effective because it operates at high efficiency with optimum utilization; it just requires an Internet connection.
• Cloud Computing offers load balancing that makes it more reliable.
Why are organizations interested in cloud computing?
Cloud computing can significantly reduce the cost and complexity of owning and operating
computers and networks. If an organization uses a cloud provider, it does not need to spend
money on information technology infrastructure, or buy hardware or software licenses. Cloud
services can often be customized and flexible to use, and providers can offer advanced services
that an individual company might not have the money or expertise to develop.
Threats in cloud computing and how they occur
Data breaches
When a virtual machine is able to access the data from another virtual machine on the same
physical host, a data breach occurs – the problem is much more prevalent when the tenants of the
two virtual machines are different customers.
Data loss
The data stored in the cloud could be lost due to a hard drive failure.
Account Hijacking
It’s often the case that only a password is required to access an account in the cloud and
manipulate the data, which is why the usage of two-factor authentication is preferred.
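A short Python sketch of that second factor, assuming the third-party pyotp library (an assumption for this example): a time-based one-time code is derived from a secret shared at enrolment, so a stolen password alone is no longer enough to hijack the account.

# Sketch: verify a time-based one-time password (TOTP) as a second factor.
import pyotp

secret = pyotp.random_base32()          # generated once at enrolment, stored for the account
totp = pyotp.TOTP(secret)

code_from_user = totp.now()             # in reality typed in from the user's phone or token
if totp.verify(code_from_user):
    print("second factor accepted")
else:
    print("second factor rejected")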
Insecure APIs
Various cloud services on the Internet are exposed by application programming interfaces. Since
the APIs are accessible from anywhere on the Internet, malicious attackers can use them to compromise the confidentiality and integrity of enterprise customers' data.
Denial of Service
An attacker can issue a denial of service attack against the cloud service to render it inaccessible,
therefore disrupting the service. There are a number of ways an attacker can disrupt the service
in a virtualized cloud environment, for example by using up all of its CPU, RAM, disk space or network bandwidth.
Malicious Insiders
Employees working at a cloud service provider could have complete access to the company's resources. Therefore cloud service providers must have proper security measures in place to track employee actions, such as viewing a customer's data. If cloud service providers don't follow the best security guidelines and don't implement a security policy, employees can gather confidential information from arbitrary customers without being detected.
Abuse of cloud services
One of cloud computing’s greatest benefits is that it allows even small organizations access to
vast amounts of computing power. It would be difficult for most organizations to purchase and
maintain tens of thousands of servers, but renting time on tens of thousands of servers from a
cloud computing provider is much more affordable. However, not everyone wants to use this
power for good. It might take an attacker years to crack an encryption key using his own limited hardware, but using an array of cloud servers he might be able to crack it in minutes.
Cloud computing security
Cloud computing security or, more simply, cloud security refers to a broad set of policies,
technologies, and controls deployed to protect data, applications, and the associated
infrastructure of cloud computing. It is a sub-domain of computer security, network security,
and, more broadly, information security.
Because of the cloud's very nature as a shared resource, identity management, privacy and access control are of particular concern. With more organizations using cloud computing and the associated cloud providers for data operations, proper security in these and other potentially vulnerable areas has become a priority for organizations contracting with a cloud computing provider.
Cloud computing security processes should address the security controls the cloud provider will incorporate to maintain the customer's data security, privacy and compliance with the necessary regulations. The processes will also likely include a business continuity and data backup plan in the case of a cloud security breach.
The three most important protection goals (confidentiality, integrity, and authenticity) are
introduced in the next few sections, and then explained in more detail with reference to selected
cloud computing scenarios.
Pillars of Cloud Computing Security
Confidentiality
The confidentiality of a system is guaranteed provided that it prevents the unauthorized gathering of information. In secure data systems, the "confidentiality" characteristic requires authorizations and checks to be defined, to ensure that information cannot be accessed by subjects who do not have the corresponding rights. This covers both access to stored data by authorized users and data transferred via a network. It must be possible to assign and withdraw the rights that are
necessary to process this data, and checks must be implemented to enforce compliance.
Cryptographic techniques and access controls based on strong authentication are normally used
to protect confidentiality.
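As a small illustration of the cryptographic side, the Python sketch below encrypts a record with a symmetric key before it is stored or transmitted, so confidentiality then rests on controlling access to the key. It assumes the third-party cryptography package; the sample data is invented.

# Sketch: protect confidentiality by encrypting data before storing or sending it.
from cryptography.fernet import Fernet

key = Fernet.generate_key()             # kept under access control, never with the data
f = Fernet(key)

ciphertext = f.encrypt(b"customer record: account 12345")
print(ciphertext)                       # unreadable without the key
print(f.decrypt(ciphertext))            # only an authorized key holder can recover the data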
Integrity
A system guarantees data integrity if it is impossible for subjects to manipulate the protected data
unnoticed or in an unauthorized way. Data, messages, and information are considered to have
integrity if they are trustworthy and cannot be tampered with. A cloud computing system assures
the integrity of the protected data if this information cannot be modified by third parties. If
integrity is specified as a protection goal for cloud services, then not only the cloud service itself that is accessed by the end user must achieve this goal, but also all other components within the cloud system.
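A minimal Python sketch of an integrity check using only the standard library: a keyed hash (HMAC) accompanies the data, and any modification by a third party makes the recomputed tag differ from the stored one. The key and data values are placeholders.

# Sketch: detect tampering by comparing a stored HMAC tag with a recomputed one.
import hashlib
import hmac

key = b"shared-secret-key"              # known only to the legitimate parties
data = b"monthly report, version 3"

tag = hmac.new(key, data, hashlib.sha256).hexdigest()

# Later, when the data is read back or received:
recomputed = hmac.new(key, data, hashlib.sha256).hexdigest()
print("integrity verified:", hmac.compare_digest(tag, recomputed))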
Authenticity
The authenticity of a subject or object is defined as its genuineness and credibility; these can be
verified on the basis of its unique identity and characteristic features. Information is authentic if
it can be reliably assigned to the sender, and if it can be proved that this information has not been
changed since it was created and distributed. A secure technique for identifying the
communication partners and mechanisms for ensuring authenticity are essential here.
Cloud security controls
Cloud security architecture is effective only if the correct defensive implementations are in place.
An efficient cloud security architecture should recognize the issues that will arise with security
management. The security management addresses these issues with security controls. These
controls are put in place to safeguard any weaknesses in the system and reduce the effect of an
attack. While there are many types of controls behind a cloud security architecture, they can
usually be found in one of the following categories.
Deterrent controls
These controls are intended to reduce attacks on a cloud system. Much like a warning
sign on a fence or a property, deterrent controls typically reduce the threat level by
informing potential attackers that there will be adverse consequences for them if they
proceed. (Some consider them a subset of preventive controls.)
Preventive controls
Preventive controls strengthen the system against incidents, generally by reducing if not
actually eliminating vulnerabilities. Strong authentication of cloud users, for instance,
makes it less likely that unauthorized users can access cloud systems, and more likely
that cloud users are positively identified.
Detective controls
Detective controls are intended to detect and react appropriately to any incidents that
occur. In the event of an attack, a detective control will signal the preventative or
corrective controls to address the issue. System and network security monitoring,
including intrusion detection and prevention arrangements, are typically employed to
detect attacks on cloud systems and the supporting communications infrastructure.
Corrective controls
Corrective controls reduce the consequences of an incident, normally by limiting the
damage. They come into effect during or after an incident. Restoring system backups in
order to rebuild a compromised system is an example of a corrective control.