hello world - Department of Computer Engineering


ETM 555


Lecture Notes

Version 5. / 2012


Part 1: Hardware/Software Systems, Grid / Cloud Computing



Parallel/Distributed Processing

High Performance Computing

Top 500 list

Grid computing

[Picture: Tianhe, the most powerful computer in the world in November 2010]


Von Neumann Architecture

[Diagram labels: RAM, Device, Device]

• sequential computer


History of Computer Architecture

4 Generations (identified by logic technology)

1. Tubes

2. Transistors

3. Integrated Circuits

4. VLSI (very large scale integration)

• Traditional mainframe/supercomputer performance increased by about 25% per year

• But microprocessor performance has increased by about 50% per year since the mid 80’s.


Moore’s Law

• “Transistor density doubles every 18 months”

• Moore is co-founder of


• About a 60% increase in transistor density per year

• Exponential growth

• PC costs decline.

• PCs are building bricks of all future systems.
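The link between "doubles every 18 months" and "about 60% per year" can be checked with a little arithmetic. The sketch below is only an illustration; the function names are made up for this example.

```python
def annual_growth_rate(doubling_months):
    """Yearly growth rate implied by a given doubling period in months."""
    return 2 ** (12 / doubling_months) - 1


def growth_factor(years, doubling_months=18):
    """Total growth factor after a number of years of steady doubling."""
    return 2 ** (12 * years / doubling_months)


# Doubling every 18 months corresponds to roughly 59% growth per year.
print(f"{annual_growth_rate(18):.1%} per year")
# Three years contain two doubling periods, so density grows by a factor of 4.
print(growth_factor(3))
```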


VLSI Generation


Bit Level Parallelism

(up to the mid 80’s)

• 4-bit microprocessors were replaced by 8-bit, 16-bit, 32-bit, etc.

• Doubling the width of the datapath reduces the number of cycles required to perform a full 32-bit operation

• By the mid 80’s the benefits of this kind of parallelism had been reaped (full 32-bit word operations combined with the use of caches)
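To see why a wider datapath needs fewer cycles, the sketch below adds two 32-bit numbers using an adder of a chosen width, one chunk at a time. It assumes, as a simplification, that each chunk costs exactly one cycle; the function name is hypothetical.

```python
def add_with_narrow_alu(a, b, width, total_bits=32):
    """Add two unsigned total_bits-wide integers with a width-bit adder.

    Returns (sum modulo 2**total_bits, number of adder cycles used).
    Assumption: one chunk is processed per cycle.
    """
    mask = (1 << width) - 1
    result, carry, cycles = 0, 0, 0
    for shift in range(0, total_bits, width):
        # Add the current chunk of each operand plus the carry from the last one.
        chunk = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (chunk & mask) << shift
        carry = chunk >> width
        cycles += 1
    return result & ((1 << total_bits) - 1), cycles


for width in (4, 8, 16, 32):
    total, cycles = add_with_narrow_alu(0x12345678, 0x0FEDCBA8, width)
    print(f"{width:2d}-bit datapath: {cycles} cycle(s), sum = {total:#010x}")
```

Each doubling of the datapath width halves the cycle count in this model, down to a single cycle at 32 bits.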


Instruction Level Parallelism

(mid 80’s to mid 90’s)

• Basic steps in instruction processing (instruction decode, integer arithmetic, address calculations) could be performed in a single cycle

• Pipelined instruction processing

• Reduced instruction set (RISC)

• Superscalar execution

• Branch prediction


Thread/Process Level Parallelism

(mid 90’s to present)

• On average, control transfers occur roughly once in every five instructions, so exploiting instruction-level parallelism at a larger scale is not possible

• Use multiple independent “threads” or processes

• Concurrently running threads, processes


Evolution of the Infrastructure

Electronic Accounting Machine Era: 1930-1950

General Purpose Mainframe and Minicomputer Era: 1959-


Personal Computer Era: 1981 – Present

Client/Server Era: 1983 – Present

Enterprise Internet Computing Era: 1992- Present



Memory Hierarchy



Real Memory





Sequential vs Parallel Processing

Sequential:

• physical limits reached

• easy to program

• expensive supercomputers

Parallel:

• “raw” power unlimited

• more memory, multiple cache

• made up of COTS, so cheap

• difficult to program


Amdahl’s Law

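• The serial percentage of a program (fraction $s$) is fixed, so the speedup obtained by employing parallel processing is bounded.

• Led to pessimism in the parallel processing community and prevented the development of parallel machines for a long time.

$$\text{Speedup} = \frac{1}{s + \frac{1 - s}{P}}$$

where $s$ is the serial fraction and $P$ is the number of processors.

• In the limit ($P \to \infty$):

$$\text{Speedup} = \frac{1}{s}$$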
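A small numeric sketch can make the bound concrete. It uses the standard statement of Amdahl's Law, with serial fraction s and P processors; the function name is chosen only for this example.

```python
def amdahl_speedup(s, p):
    """Speedup predicted by Amdahl's Law for serial fraction s on p processors."""
    return 1.0 / (s + (1.0 - s) / p)


# With 10% serial code, the speedup approaches but never exceeds 1/s = 10.
for p in (1, 10, 100, 1000, 10**6):
    print(f"P = {p:>7}: speedup = {amdahl_speedup(0.1, p):.3f}")
```

Adding processors beyond a few hundred barely helps here, which is the source of the pessimism mentioned above.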


Gustafson’s Law

• Serial percentage is dependent on the number of processors/input.

• Demonstrated a more than 1000-fold speedup using 1024 processors.

• Justified parallel processing
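The slide gives no formula, so the sketch below assumes the usual statement of Gustafson's Law: scaled speedup = s + P(1 - s), where s is the serial fraction of the run time measured on the parallel machine. The function name is illustrative.

```python
def gustafson_speedup(s, p):
    """Scaled speedup for serial fraction s (measured on the parallel run) and p processors."""
    return s + p * (1.0 - s)


# With a 2% serial fraction, 1024 processors give a speedup above 1000.
print(gustafson_speedup(0.02, 1024))
```

Because the problem size grows with the number of processors, the serial part shrinks in relative terms and the speedup grows almost linearly with P.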


Grand Challenge Applications

• Important scientific & engineering problems identified by the U.S. High Performance Computing & Communications Program (’92)

Flynn’s Taxonomy

• classifies computer architectures according to:

1. Number of instruction streams it can process at a time

2. Number of data elements on which it can operate simultaneously

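Data Streams: Single or Multiple

Instruction Streams: Single or Multiple

• Single instruction stream, single data stream: SISD

• Single instruction stream, multiple data streams: SIMD

• Multiple instruction streams, single data stream: MISD

• Multiple instruction streams, multiple data streams: MIMD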

SPMD Model

(Single Program Multiple Data)

• Each processor executes the same program asynchronously

• Synchronization takes place only when processors need to exchange data

• SPMD is an extension of SIMD (it relaxes synchronized instruction execution)

• SPMD is a restriction of MIMD (it uses only one source/object program)
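The SPMD idea can be sketched with threads standing in for processors: every worker runs the same function on its own slice of the data and synchronizes only at the point where partial results are exchanged. The names below are hypothetical, and Python threads illustrate the structure rather than real parallel speedup.

```python
import threading


def spmd_sum(data, num_workers=4):
    """Every worker runs the same program on its own slice, then all share results."""
    partial = [0] * num_workers
    totals = [0] * num_workers
    barrier = threading.Barrier(num_workers)

    def program(rank):
        # Same program for every worker; only the data slice differs.
        partial[rank] = sum(data[rank::num_workers])
        # Synchronize only because partial results are about to be exchanged.
        barrier.wait()
        totals[rank] = sum(partial)

    threads = [threading.Thread(target=program, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


print(spmd_sum(list(range(100))))
```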


Parallel Processing Terminology

• Embarrassingly Parallel:

- applications which are trivial to parallelize

- large amounts of independent computation

- little communication

• Data Parallelism:

- model of parallel computing in which a single operation can be applied to all data elements simultaneously

- amenable to SIMD or SPMD style of computation

• Control Parallelism:

- many different operations may be executed concurrently

- requires MIMD/SPMD style of computation
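The difference between the two styles can be sketched with a thread pool: data parallelism applies one operation to every element, while control parallelism runs different operations at the same time. This is a minimal illustration with made-up helper names; in CPython, threads show the structure but not a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def data_parallel_map(op, elements, workers=4):
    """Data parallelism: the same operation is applied to every element."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(op, elements))


def control_parallel(tasks, workers=4):
    """Control parallelism: different operations run concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fn, *args) for fn, args in tasks]
        return [f.result() for f in futures]


print(data_parallel_map(square, [1, 2, 3, 4]))
print(control_parallel([(sum, ([1, 2, 3],)), (max, ([4, 9, 2],)), (len, ("abc",))]))
```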


Parallel Processing Terminology

• Scalability:

- If the size of the problem is increased, the number of processors that can be used effectively can also be increased (i.e. there is no limit on parallelism).

- The cost of a scalable algorithm grows slowly as the input size and the number of processors are increased.

- Data parallel algorithms are more scalable than control parallel algorithms.

• Granularity:

- fine grain machines: employ massive number of weak processors each with small memory

- coarse grain machines: smaller number of powerful processors each with large amounts of memory


Shared Memory Machines

[Diagram: several processes (threads) attached to one Shared Address Space]


• Memory is globally shared, therefore processes (threads) see a single address space

• Coordination of accesses to locations is done by use of locks provided by thread libraries

• Example Machines: Sequent, Alliant, SUN Ultra, Dual/Quad Board Pentium PC

• Example Thread Libraries: POSIX threads, Linux threads.
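As a minimal sketch of lock-based coordination, the example below lets several threads update one shared counter. It uses Python's standard `threading` module rather than POSIX threads directly, and the function name is chosen only for this illustration.

```python
import threading


def locked_count(num_threads=4, increments=10_000):
    """Several threads increment one shared counter, guarded by a lock."""
    counter = {"value": 0}   # lives in the single shared address space
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:       # only one thread may update the location at a time
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


print(locked_count())
```

Without the lock, the read-modify-write on the counter could interleave between threads and lose updates.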


Shared Memory Machines

• can be classified as:

- UMA: uniform memory access

- NUMA: nonuniform memory access

based on the amount of time a processor takes to access local and global memory.

[Diagrams: processors (P) attached to memory through an interconnection network or bus in (a), and through an interconnection network in (b) and (c)]


Distributed Memory Machines

[Diagram: several processes, each with its own memory (M)]

• Each processor has its own local memory (not directly accessible by others)

• Processors communicate by passing messages to each other

• Example Machines: IBM SP2, Intel Paragon, COWs (clusters of workstations)

• Example Message Passing Libraries: PVM, MPI
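The message passing style can be imitated inside one Python process by giving each "node" a private value and an inbox queue. This is only an emulation for illustration and does not use PVM or MPI; all names are hypothetical.

```python
import queue
import threading


def ring_pass(num_nodes=4):
    """Each node sends its private value to its right neighbour and receives from the left."""
    inboxes = [queue.Queue() for _ in range(num_nodes)]
    received = [None] * num_nodes   # used only to collect results for display

    def node(rank):
        local_value = rank * 10                         # private local memory
        right = (rank + 1) % num_nodes
        inboxes[right].put((rank, local_value))         # send a message
        received[rank] = inboxes[rank].get()            # blocking receive

    threads = [threading.Thread(target=node, args=(r,)) for r in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


print(ring_pass())
```

No node reads another node's `local_value` directly; data moves only through explicit send and receive operations.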


Beowulf Clusters

• Use COTS: ordinary PCs and networking equipment

• Have the best price/performance ratio

[Picture: PC cluster]

What is Multi-Core Programming ?

Answer: It is basically parallel programming on a single computer box (e.g. a desktop, a notebook, a blade)


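Important Benefit of Multi-Core: Reduced Energy Consumption

Single core: 2 GHz

Dual core: 1 GHz + 1 GHz

Energy per cycle: $E = C \cdot V_{dd}^2$

With the supply voltage halved for the lower clock rate (the assumption implied by the factor 0.25):

$E' = C \cdot (0.5\,V_{dd})^2 = 0.25 \cdot C \cdot V_{dd}^2 = 0.25 \cdot E$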


Multi-Core Computing

A multi-core microprocessor is one which combines two or more independent processors into a single package, often a single integrated circuit.

A dual-core device contains only two independent microprocessors.


Comparison of Different Architectures

• Single Core Architecture: one CPU state, one execution unit

• (untitled diagram): two CPU states, each with its own execution unit

• Hyper-Threading Technology: two CPU states sharing one execution unit

• Multi-Core Architecture: two CPU states, each with its own execution unit

• Multi-Core Architecture with Shared Cache: two CPU states, each with its own execution unit

• Multi-Core with Hyper-Threading Technology: four CPU states sharing two execution units

Graphics Processing Units (GPUs)

GPU devotes more transistors to data processing


Hillis’ Thesis ’85

(back to the future !)

[Diagram: a piece of silicon in a sequential computer vs. a parallel computer]

• proposed “The Connection Machine” with a massive number of processors, each with small memory, operating in SIMD mode.

• CM-1 and CM-2 machines from Thinking Machines Corporation (TMC) were examples of this architecture, with 32K-128K processors.


Floating Point Operations for the CPU and the GPU


Memory Bandwidth for the CPU and the GPU


NVIDIA GPU Supports Various Languages or Application Programming Interfaces

Automatic Scalability

A multithreaded program is partitioned into blocks of threads that execute independently from each other, so that a GPU with more cores will automatically execute the program in less time than a GPU with fewer cores.
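A rough model shows why the same program finishes sooner on a larger GPU. The sketch assumes, purely for illustration, that every thread block takes the same time and that each multiprocessor runs one block at a time; the function names are made up.

```python
import math


def schedule_blocks(num_blocks, num_multiprocessors):
    """Assign independent thread blocks to multiprocessors, wave by wave."""
    waves = []
    for start in range(0, num_blocks, num_multiprocessors):
        end = min(start + num_multiprocessors, num_blocks)
        waves.append(list(range(start, end)))
    return waves


def execution_waves(num_blocks, num_multiprocessors):
    """Number of waves needed, i.e. relative run time in this simplified model."""
    return math.ceil(num_blocks / num_multiprocessors)


# The same 8-block program on a GPU with 2 and with 4 multiprocessors.
print(schedule_blocks(8, 2), execution_waves(8, 2))
print(schedule_blocks(8, 4), execution_waves(8, 4))
```

Because the blocks are independent, the program itself does not change; only the number of waves does.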


Grid of Thread Blocks


Memory Hierarchy

GPU Programming Model

Heterogeneous Programming

Serial code executes on the host while parallel code executes on the device.


Top 500 Most Powerful Computers List

• http://www.top500.org/list/2011/06


Grid Computing

• Provides access to computing power and various resources, just like accessing electrical power from the electrical grid

• Allows coupling of geographically distributed resources

• Provides inexpensive access to resources irrespective of their physical location or access point

• The Internet & dedicated networks can be used to interconnect distributed computational resources and present them as a single unified resource

• Resources: supercomputers, clusters, storage systems, data resources, special devices


Grid Computing

• The GRID is, in effect, a set of software tools which, when combined with hardware, would let users tap processing power off the Internet as easily as electrical power can be drawn from the electricity grid.

• Examples of Grids:

- TeraGrid (USA)

- EGEE Grid (Europe)

- TR-Grid (Turkey)


Power Grid

Compute Grid




Application areas: Civil Protection, Comp. Chemistry, Earth Sciences, High Energy Physics, Life Sciences, Material Sciences

Scale: >250 sites, 48 countries, >50,000 CPUs, >20 PetaBytes, >10,000 users, >150 VOs, >150,000 jobs/day



Virtualization is the abstraction of computer resources.

It can make a single physical resource, such as a server, an operating system, an application, or a storage device, appear to function as multiple logical resources.

It may also mean making multiple physical resources, such as storage devices or servers, appear as a single logical resource.

Server virtualization enables companies to run more than one operating system at the same time on a single machine.


Advantages of Virtualization

Most servers run at just 10-15% capacity; virtualization can increase server utilization to 70% or higher.

Higher utilization means fewer computers are required to process the same amount of work. Fewer machines mean less power consumption.

Legacy applications can also be run on older versions of an operating system.

Other advantages: easier administration, fault tolerance, security
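The consolidation argument can be checked with simple arithmetic. The sketch below assumes identical servers and a workload that can be packed freely onto fewer machines, which is an idealization; the function name is hypothetical.

```python
def servers_needed(num_servers, current_util_pct, target_util_pct):
    """Servers required to carry the same total work at a higher utilization.

    Utilizations are whole-number percentages; the result is rounded up.
    """
    total_work = num_servers * current_util_pct
    return -(-total_work // target_util_pct)   # ceiling division on integers


# 100 servers at 15% utilization carry the work of about 22 servers at 70%.
print(servers_needed(100, 15, 70))
```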


VMware Virtual Platform

[Diagram: two virtual machines (Virtual machine 1 with Apps 1 and OS 1, Virtual machine 2 with Apps 2 and OS 2), each seeing its own virtual hardware (x86, motherboard, disks, display, net ...), run on the VMware Virtual Platform, which runs on the real machine (x86, motherboard, disks, display, net ...)]

• VMware is now a 40 billion dollar company!

Cloud Computing

• Style of computing in which IT-related capabilities are provided “as a service”, allowing users to access technology-enabled services from the Internet ("in the cloud") without knowledge of, expertise with, or control over the technology infrastructure that supports them.

• General concept that incorporates software as a service (SaaS), Web 2.0 and other recent, well-known technology trends, in which the common theme is reliance on the Internet for satisfying the computing needs of the users.


Cloud Computing

Virtualisation provides separation between infrastructure and user runtime environment

Users specify virtual images as their deployment building blocks

Pay-as-you-go allows users to use the service when they want and only pay for what they use

Elasticity of the cloud allows users to start simple and explore more complex deployment over time

Simple interface allows easy integration with existing systems



Cloud: Unique Features

Ease of use


Runtime environment

Hardware virtualisation

Gives users full control



Cloud providers can buy hardware faster than you!



Cloud computing is about much more than technological capabilities.

Technology is the mechanism, but, as in any shift in business, the driver is economics.

Nicholas Carr, author of “The Big Switch”

Better Economics

We want to pay only for what we use

And we want to control it accurately.


Facing New Challenges

Complexity of modern IT infrastructures: physical servers, virtual machines, clusters, Grids, geographical distribution

Cost of electricity

Credit crunch

Further pressures to reduce costs

Openness to the acceptable security concept


Develop Test Release

Install Configure Operate

Develop Test Operate http://www....


Develop Test

Undifferentiated heavy lifting

• Hardware costs

• Software costs

• Maintenance

• Load balancing

• Scaling

• Utilization

• Idle machines

• Bandwidth management

• Server hosting

• Storage


• High availability


The 70/30 Switch

Finding Solutions

Improving utilisation rates through market based algorithms for resource allocation

Accessing external infrastructures on-demand

Using a single management platform for all computing resources


Cloud vs Grid

From the customer’s/end user’s point of view, they are the same

Grid/cloud market structure









Lower cost

Access to larger infrastructure

Faster calculations

More storage


Faster calculations

Easier provisioning

The Grid/Cloud


Very complicated


Lack of confidence





Improving Utilization

Cloud Computing:

(+) no need to own hardware, shared access, improved utilisation through pay-as-you-use

(-) incompatible platforms, ‘fair price’ is dubious to users

Enterprise/Departmental Grid:

(+) improves utilisation rates of physical servers, enables collaboration

(-) limited scalability, lack of interoperability between vendors, limited efficiency of policy based mechanisms

Virtual Servers:

(+) improved utilisation rates, better scalability, easy disaster recovery

(-) increased number of servers to manage, incompatible virtualization platforms

Hardware Servers:

(-) low utilisation rates, scalability problems




Grid and Clouds


Classic Grid Computing

• Why do we need it? (The Problem): To enable the R&D community to achieve its research goals in reasonable time. Computation over large data sets, or of parallelizable compute-intensive applications.

• Main Target: First, academia. Second, certain industries.

• Business Model (where does the money come from?): Sponsor-based (mainly government money). Industry pays. Internal implementations.

Cloud Computing

• Why do we need it? (The Problem): Reduce IT costs. On-demand scalability for all applications, including research, development and business applications.

• Main Target: Mainly industry.

• Business Model (where does the money come from?): Hosted by commercial companies, paid for by users. Based on economies of scale and expertise. Only pay for what you need, when you need it (On-Demand + Pay per Use).


Key differentiators:

• Open source – no vendor lock-in

• Scalability

Interfaces and Market

















Security and Trust

Customer SLA – compare Cost/Performance

Dynamic VM migration – Unique Universal IP

Clouds Interoperability

Data Protection & Recovery

Standards: Security

Management Tools

Integration with Internal Infrastructure

Small compact economical applications

Cost/Performance prediction and measurement

Keep it Transparent and Simple

Cloud Market

"The future is about having a platform in the cloud,"

Microsoft Chief Steve Ballmer said of the trend in a July 2008 e-mail to employees.

Cloud Market

“By 2012, 80 percent of Fortune 1000 companies will pay for some cloud computing service, 30 percent of them will pay for cloud computing infrastructure.”

Gartner, 2008

Why Now? (Economy)

- CIOs -> Do more with Less (Energy costs / Recession will boost it)

Lower cost for Scalability

Enterprise IT budget - Spending 80% on MAINTENANCE

On average, we utilize only 15% of our computing resource capacity

Peak Times economy

IT is not the enterprise’s core business

Psychology of Internet/Cloud trust (SalesForce, Gmail, Internet banking, etc.)

Ideal for Developers


Why Now? (Benefits)

Cost savings, leveraging economies of scale

Pay only for what you use

Resource flexibility

Rapid prototyping and market testing

Increased speed to market

Improved service levels and availability

Self-service deployment

Reduce lock-in and switching costs


Cloud Types

VM Based (EC2, GoGrid)

Storage Based (EMC, S3)

Customers Applications based (Google)

Cloud Applications based (SalesForce)

Grid Computing/HPC Applications

Mobile Clouds (iPhone UI, WEB APPS)

Private Clouds

Cloud of Clouds



Cloud Computing - The New IT Economy

Pay-per-Use for On-Demand Scalability

All major vendors are investing in Clouds

Cloud Trading Market will evolve

VM will be mobile across clouds

Mobile phones (iPhone) cloud users

International implications (Access to Data)

Example Cloud: Amazon Web Services

EC2 (Elastic Compute Cloud) is the computing service of Amazon

Based on hardware virtualisation

Users request virtual machine instances, pointing to an image (public or private) stored in S3

Users have full control over each instance (e.g. access as root, if required)

Requests can be issued via SOAP and REST



Example Cloud: Amazon Web Services

S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud

From a user’s point of view, S3 is independent from the other Amazon services

Data is built in a hierarchical fashion, grouped in buckets (i.e. containers) and objects

Data is accessible via various protocols

Elastic Block Store

Locally mounted storage

Highly available



Example Cloud: Amazon Web Services

Other AWS services:

SQS (Simple Queue Service)


Billing services: DevPay

Elastic IP (Static IPs for Dynamic Cloud Computing)

Multiple Locations



Example Cloud: Amazon Web Services

Pricing information http://aws.amazon.com/ec2/



EC2 – “Google of the Clouds”

According to Vogels (Amazon CTO), 370,000 developers have registered for Amazon Web Services since their start in 2002, and the company now spends more bandwidth on the developers than it does on e-commerce.


In the last two months of 2007, usage of Amazon Web Services grew by 40%

$131 million revenues in Q1 from AWS

60,000 customers

The majority of usage comes from banks, pharmaceuticals and other large corporations




Data Explosion

• An IDC estimate put the size of the “digital universe” at

- 0.18 zettabytes in 2006

- forecasting a tenfold growth by 2011 to 1.8 zettabytes

• The New York Stock Exchange generates about one terabyte of new trade data per day.

• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.

• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20 terabytes per month.

• The Large Hadron Collider near Geneva, Switzerland, produces about 15 petabytes of data per year.


Hadoop Projects


• Common: A set of components and interfaces for distributed filesystems and general I/O (serialization, Java RPC, persistent data structures).

• Avro: A serialization system for efficient, cross-language RPC, and persistent data storage.

• MapReduce: A distributed data processing model and execution environment that runs on large clusters of commodity machines.

• HDFS: A distributed filesystem that runs on large clusters of commodity machines.

• Pig: A data flow language and execution environment for exploring very large datasets. Pig runs on HDFS and MapReduce clusters.

Hadoop Projects


• Hive: A distributed data warehouse. Hive manages data stored in HDFS and provides a query language based on SQL (which is translated by the runtime engine to MapReduce jobs) for querying the data.

• HBase: A distributed, column-oriented database. HBase uses HDFS for its underlying storage, and supports both batch-style computations using MapReduce and point queries (random reads).

• ZooKeeper: A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building distributed applications.

• Sqoop: A tool for efficiently moving data between relational databases and HDFS.

RDBMS Compared to MapReduce

• MapReduce can be seen as a complement to an RDBMS

• MapReduce is a good fit for problems that need to analyze the whole dataset, in a batch fashion, particularly for ad hoc analysis.

• An RDBMS is good for point queries or updates, where the dataset has been indexed to deliver low-latency retrieval and update times of a relatively small amount of data.

• MapReduce suits applications where the data is written once, and read many times, whereas a relational database is good for datasets that are continually updated.
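The batch, whole-dataset style of MapReduce can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce phases, not Hadoop's API; all function names are made up for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values that share a key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    return sum(counts)


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)


print(word_count(["the grid", "the cloud the grid"]))
```

Every record is read once and in full, which suits write-once, read-many data better than the indexed point lookups an RDBMS is built for.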


RDBMS Compared to MapReduce


Amazon’s Cloud Load Balancing


Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances.

• http://docs.amazonwebservices.com/ElasticLoadBalancing/latest/DeveloperGuide/
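The idea of spreading incoming traffic across instances can be illustrated with a simple round-robin policy. This is an assumption made for the sketch and is not a description of how Elastic Load Balancing is implemented; the instance names are hypothetical.

```python
import itertools
from collections import Counter


def round_robin_balance(requests, instances):
    """Assign each incoming request to the next instance in turn."""
    assignment = {}
    next_instance = itertools.cycle(instances)
    for request in requests:
        assignment[request] = next(next_instance)
    return assignment


# Six requests spread over three hypothetical instances.
plan = round_robin_balance(range(6), ["i-a", "i-b", "i-c"])
print(plan)
print(Counter(plan.values()))
```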

