Introduction - Computer Information Systems

ICS362 – Distributed Systems
Dr. Ken Cosh
Week 1
Course Description
This course provides an introduction to
the basic issues in the design and
implementation of distributed systems.
Topics include communication,
processes, naming, synchronisation,
consistency and replication, fault
tolerance and security.
Course Objectives

On completion of this course students will be
able to:
– 3.1 Discuss key elements to consider when
managing Distributed Systems, such as security,
fault tolerance, consistency and replication.
– 3.2 Compare the differences between Object
Based Systems, File Systems, Web Based
Systems and Co-ordination Based Systems.
References


1) (Compulsory) Distributed Systems,
Principles and Paradigms, 2nd Edition,
Andrew S. Tanenbaum & Maarten Van
Steen, 2007.
2) Distributed Systems, Concepts and
Design, 4th Edition, George Coulouris, Jean
Dollimore, Tim Kindberg, 2005.
Topics










Introduction
Architectures
Processes
Communication
Naming
Synchronisation
Consistency / Replication
Fault Tolerance
Security
Example Systems
Assessment



1. Quizzes and Presentations - 30%
2. Midterm exam - 30%
3. Final exam - 40%
Course Info.

Mon / Wed
12:30-14:00
Room PC319
Office Hours: By Appointment

NOTE: Plagiarism = 0.



What is a Distributed System?


“A distributed system is a collection of
independent computers that appears to its
users as a single coherent system.”
(Tanenbaum)
“Hardware or software components located
at networked computers communicate and
coordinate their actions only by passing
messages” (Coulouris)
Key Features




Components that are autonomous
Users think they are dealing with a single
system
This requires some collaboration
Note: The challenges involved are
independent of the type of computers used.
Characteristics of DS




How it works is hidden from user.
Interaction is consistent & uniform
Scalability
Continuously available, even if some parts
are out of order
Layered Architecture

Commonly implemented through layers &
middleware
[Figure: layers, top to bottom]
Application A | Application B | Application C
Distributed System Layer (Middleware)
Local OS 1 | Local OS 2 | Local OS 3 | Local OS 4
Network
Goals


Make Resources Available
Hide the fact that resources are distributed
– Distribution Transparency
Be Open
Be Scalable
Make Resources Available

E.g. Printers, storage facilities, data, files,
webpages, networks etc.
– For economic reasons
– For collaboration reasons
– To create virtual organisations
This produces challenges
– Security
– Privacy
Distribution Transparency

An important goal of
distributed systems is
to hide the fact that
processes / resources
are physically
distributed, enabling
users to use the
system without
worrying about where
the resources are.
•Access Transparency
•Location Transparency
•Migration Transparency
•Relocation Transparency
•Replication Transparency
•Concurrency Transparency
•Failure Transparency
Access Transparency

Different resources may represent data in different
formats, but this shouldn’t be an issue for the user.
– A user on an Intel workstation sending data to a Sun
SPARC machine shouldn’t be concerned that Intel orders
its bytes in little endian format (low order bytes first) while
SPARC uses big endian format (high order bytes first).
Different file naming formats should also not be of
concern to the user, e.g. ‘/’ or ‘\’ as the path separator.
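Byte order is one such representation difference. The following is a minimal sketch, using Python's standard `struct` module, of how the same integer is laid out under each byte order and why the receiver must know which one was used. The value `0x0A0B0C0D` and the helper `decode` are illustrative assumptions, not part of any particular system.

```python
import struct

value = 0x0A0B0C0D  # a 32-bit integer chosen purely for illustration

little = struct.pack("<I", value)  # little endian: low-order byte first
big = struct.pack(">I", value)     # big endian: high-order byte first


def decode(payload: bytes, byte_order: str) -> int:
    """Decode a 4-byte unsigned integer using an explicit byte order."""
    return struct.unpack(byte_order + "I", payload)[0]


# Decoding with the matching byte order recovers the original value.
round_trip_ok = decode(little, "<") == value and decode(big, ">") == value

# Decoding little-endian bytes as if they were big endian gives a wrong number.
misread = decode(little, ">")
```

Middleware that provides access transparency typically fixes one agreed wire format so that neither side has to care what the other machine uses natively.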
Location Transparency

Location Transparency means that the physical
position of a resource is hidden from the user.
This is normally achieved through naming, where
only logical names are used:
– http://cis.payap.ac.th/index.php
Where is it (physically)?
Has it always been there?
Migration / Relocation Transparency

In the previous web address, you have no idea
whether index.php has always been on the
cis.payap.ac.th server, or when it might have moved
there. If resources can be moved without affecting
the way they are accessed, then migration
transparency is provided. If that movement occurs
while the resource is being accessed, then
relocation transparency is provided. Consider
moving around with a wireless laptop.
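A rough sketch of the idea: clients hold only a logical name, and a lookup step maps it to wherever the resource currently lives. The `NameService` class, the name `cis-homepage` and the addresses below are all hypothetical and only illustrate the principle.

```python
class NameService:
    """Hypothetical registry mapping logical names to physical locations."""

    def __init__(self) -> None:
        self._locations = {}

    def register(self, logical_name: str, physical_address: str) -> None:
        # Registering an existing name again models moving the resource.
        self._locations[logical_name] = physical_address

    def lookup(self, logical_name: str) -> str:
        return self._locations[logical_name]


names = NameService()
names.register("cis-homepage", "server-a.example:80")
before = names.lookup("cis-homepage")

# The resource migrates; the logical name used by clients does not change.
names.register("cis-homepage", "server-b.example:80")
after = names.lookup("cis-homepage")
```

The client code is identical before and after the move, which is the point of migration transparency.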
Replication Transparency

The efficiency of distributed systems can be
improved greatly by locating replicas (copies)
of a resource physically closer to a user.
Replication transparency enables the system
to do this, without the user knowing they are
using a replica.
Concurrency Transparency

A goal of distributed systems is often sharing
of resources between users. These users
may wish to access or even update the same
data at the same time (concurrently). An
important challenge when designing
distributed systems is how to deal with
concurrent accesses.
– How to maintain consistency when different users
use the same resource in different ways.
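As a small, single-machine analogy of the problem, the sketch below has several threads updating one shared balance. The `Account` class and `run_users` helper are hypothetical. A real distributed system needs distributed mechanisms rather than a local lock, but the underlying issue, making a read-modify-write step indivisible, is the same.

```python
import threading


class Account:
    """A shared resource updated by several users at once (hypothetical)."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # The lock makes the read-modify-write sequence one indivisible step,
        # so no update from another thread can be lost in between.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_users(account: Account, users: int = 8, deposits: int = 1000) -> int:
    """Let several 'users' deposit concurrently and return the final balance."""

    def work() -> None:
        for _ in range(deposits):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


final_balance = run_users(Account(0))
```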
Failure Transparency

“You know you have one when the crash of a
computer you’ve never heard of stops you from
getting any work done!”

Failure Transparency tries to mask failures such as
this.

It is difficult to distinguish between a resource that has
failed and a resource that is performing badly
(slowly).
– Consider opening a webpage: is it dead or painfully slow,
and how long should the browser wait?
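The webpage question can be made concrete with a timeout. In this sketch, `slow_server` is a simulated stand-in for a remote resource and `fetch` is a hypothetical client; the delays are arbitrary. Note that the client can only ever report a suspected failure, because a timeout looks the same whether the server is dead or merely slow.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_server(delay: float) -> str:
    """Simulated remote resource that answers after `delay` seconds."""
    time.sleep(delay)
    return "page"


def fetch(delay: float, timeout: float) -> str:
    """Wait at most `timeout` seconds for the simulated server to reply."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_server, delay)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # Dead or just slow? From the client's side the two are identical.
        return "suspected failure"
    finally:
        pool.shutdown(wait=False)


quick = fetch(delay=0.01, timeout=1.0)
late = fetch(delay=0.3, timeout=0.05)
```

Choosing the timeout is a trade-off: too short and healthy but slow servers are declared failed, too long and the user waits on a server that will never answer.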
Complete Transparency?

Complete Transparency isn’t always
completely necessary.
– E.g. expecting a daily newspaper to arrive at 7am local
time regardless of where in the world you are.
Nor is it always possible.
– Physics behind signal transmission.
Openness

A further goal of distributed systems is openness, meaning that any resource conforms to a set of open
standards. Doing so enables different parts of the
system to make use of required services.

This is normally achieved through modules which
offer services which are specified through interfaces,
using a standard IDL (Interface Definition Language).

The IDL specifies the syntax of a service. Harder
to specify is the semantics of what the service
actually does.
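The gap between syntax and semantics can be sketched with Python's `typing.Protocol` standing in for an IDL. This is an analogy only, not a real IDL, and `PrintService` and both printer classes are hypothetical. Both implementations match the interface exactly, yet they behave very differently.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PrintService(Protocol):
    """Hypothetical interface: only method names and types are specified."""

    def submit(self, document: bytes) -> int: ...


class QueueingPrinter:
    """Keeps every job and returns its position in the queue."""

    def __init__(self) -> None:
        self.jobs = []

    def submit(self, document: bytes) -> int:
        self.jobs.append(document)
        return len(self.jobs)


class DiscardingPrinter:
    """Matches the interface but silently throws the document away."""

    def submit(self, document: bytes) -> int:
        return 0


def send(service: PrintService, document: bytes) -> int:
    return service.submit(document)


queued = send(QueueingPrinter(), b"report")
discarded = send(DiscardingPrinter(), b"report")
```

Nothing in the interface tells a client which behaviour it will get, which is why semantics usually end up described informally in natural language.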
Openness

Distributed Systems should be complete and neutral,
and in doing so should be interoperable and
portable;
– Interoperability refers to how well 2 different systems
(possibly from different manufacturers) can co-exist, making
use of each other’s services.
– Portability refers to whether an application written for
system A can be used by system B.
Openness

Another feature of open systems is flexibility.
Systems should be flexible to enable users to
specialise their interactions without affecting
other users or components.

Flexibility is often achieved through
designing systems as a collection of small,
replaceable or adaptable components.
Scalability

A further goal of Distributed Systems is that
they should be scalable - that is that they can
grow;
– Scalable by size; more users or resources can be
added to the system.
– Scalable by location; resources and users may be
physically distant.
– Scalable by administration; the system remains easy
to manage as it grows.
Scalability

One problem often encountered when dealing with
scalability is centralisation.
– Centralised services
– Centralised data
– Centralised algorithms
Imagine how the internet would work if there were
only one single DNS table, and every address
resolution request had to be directed through that
computer.
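One way to avoid a single table is to partition it, so that each request touches only the part responsible for that name. The sketch below is a heavily simplified illustration of that idea: the `ZONES` table, the example names and the addresses (taken from the documentation-only 192.0.2.0/24 range) are placeholders, and real DNS resolution involves many servers and more steps than one dictionary lookup.

```python
# Hypothetical data: one small table per top-level domain instead of one
# central table. In a real deployment each table would live on its own servers.
ZONES = {
    "com": {"example.com": "192.0.2.10"},
    "th": {"example.ac.th": "192.0.2.20"},
    "edu": {"example.edu": "192.0.2.30"},
}


def resolve(name: str):
    """Route the lookup to the zone for the name's top-level domain."""
    top_level = name.rsplit(".", 1)[-1]
    zone = ZONES.get(top_level)
    if zone is None:
        return None  # no zone is responsible for this name
    return zone.get(name)


thai_address = resolve("example.ac.th")
unknown = resolve("example.org")
```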
Scalability

Another problem affecting scalability concerns
whether synchronous communication is actually
possible.
– Many existing systems were designed for synchronous
communication.
The laws of physics (including the speed of light)
limit the speed of communication between
physically distant resources.
– Leaving a ‘client’ blocked until a reply is sent back.
Scalability & Administration

What happens when a system needs to scale
across multiple, independent administrative
domains?
– Conflicting policies on:
Resource Usage
Management
Security
Solving Scalability (briefly & currently)

Hiding Communication Latencies
– Essentially asynchronous communication. Not waiting for a
reply, instead creating a special handler (thread) to
complete previous requests.
Distribution
– Splitting a component into smaller parts, e.g. DNS is split
into .com, .th, .edu etc.
Replication
– For example caching. A copy of the data closer to the
request.
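To illustrate hiding communication latency, the sketch below issues three simulated remote calls without waiting for each reply in turn. It uses `asyncio` coroutines rather than the handler threads mentioned above, which is a substitution made for brevity. The `remote_call` function and its 0.2 second delay are assumptions standing in for real network requests.

```python
import asyncio
import time


async def remote_call(name: str, latency: float) -> str:
    """Stand-in for a request to a distant resource (simulated delay)."""
    await asyncio.sleep(latency)
    return f"reply from {name}"


async def main():
    start = time.perf_counter()
    # All three requests are in flight at once; the client is not blocked
    # on the first reply before sending the second request.
    replies = await asyncio.gather(
        remote_call("a", 0.2),
        remote_call("b", 0.2),
        remote_call("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return replies, elapsed


replies, elapsed = asyncio.run(main())
```

Issued one after another, the three calls would take roughly the sum of their latencies; overlapped, the total is close to the slowest single call.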
Replication & Scalability

Replication can have a negative effect on
Scalability
– Consistency Problems
– How big a problem is this?
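A minimal sketch of the consistency problem, assuming a hypothetical `Origin` store and a `CachingReplica` that keeps local copies: once the origin changes, the replica keeps serving the old value until something invalidates it.

```python
class Origin:
    """Hypothetical authoritative copy of the data."""

    def __init__(self) -> None:
        self.data = {}

    def write(self, key: str, value: int) -> None:
        self.data[key] = value

    def read(self, key: str) -> int:
        return self.data[key]


class CachingReplica:
    """Hypothetical replica that caches values fetched from the origin."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache = {}

    def read(self, key: str) -> int:
        if key not in self.cache:
            self.cache[key] = self.origin.read(key)
        return self.cache[key]

    def invalidate(self, key: str) -> None:
        self.cache.pop(key, None)


origin = Origin()
origin.write("price", 100)
replica = CachingReplica(origin)

first = replica.read("price")   # fetched from the origin and cached
origin.write("price", 120)      # the origin changes
stale = replica.read("price")   # the replica still returns the old value
replica.invalidate("price")
fresh = replica.read("price")   # re-fetched after invalidation
```

Keeping replicas consistent means sending invalidations or updates to every copy, and that extra communication is itself a scalability cost. How much staleness is tolerable depends on the application.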
Complexity

Clearly designing a DS is a complex task. Some
common false assumptions adding to complexity:
– The network is reliable
– The network is secure
– The network is homogeneous
– The topology doesn’t change
– Latency is zero
– Bandwidth is infinite
– Transport cost is zero
– There is one administrator
Examples of DS

Distributed Computing Systems
– Cluster Computing
– Grid Computing
Distributed Information Systems
– Transaction Processing Systems
– Enterprise Application Integration
Distributed Pervasive Systems
Distributed Computing Systems


For high performance computing tasks
When the price/performance ratio of PCs and
Workstations improved, it was financially &
technically attractive to build supercomputers
by hooking up a collection of simple
computers on a high speed network.
Cluster Computing



Homogeneous hardware
Master node handles allocation of tasks and
user interface
E.g. Beowulf Linux clusters
Grid Computing

Heterogeneous Hardware
– No assumptions about hardware, OS, Networks,
Administrative domains, security policies
Resources from different organisations are
brought together to allow collaboration,
essentially realising a virtual organisation.
– Towards Service Oriented Architectures
Distributed Information Systems

When Business Information Systems moved
into a networked environment.
– Sharing data between functional units
– Sharing functionality both internally and externally
Transaction Processing Systems

Consider a transaction as an operation on a
database.
– Handled through Remote Procedure Calls (RPCs)
Each transaction should have 4
characteristics (ACID)
– Atomic
– Consistent
– Isolated
– Durable
ACID

Atomic
– Either the whole transaction happens, or none of it.
Consistent
– Certain invariants must remain true, e.g. the total amount
of money in a bank must remain the same before and after
internal transfers (even if momentarily during the transaction
this isn’t true).
Isolated
– Two concurrently running transactions should not interfere
with each other.
Durable
– Once a transaction commits, there is no going back.
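The bank transfer example can be tried out on a single machine with Python's built-in `sqlite3` module. This is only a local sketch, not a distributed transaction system. The `accounts` table, the names `alice` and `bob`, and the `transfer` helper are assumptions made for illustration. The credit is deliberately applied before the debit so that a failed debit shows the whole transaction being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts as one atomic transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back on error.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit is undone too.
        return False


def balances(conn) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)
after = balances(conn)
```

After the rejected transfer the balances are exactly what they were before it started (atomicity), and the total across both accounts is unchanged throughout (the consistency invariant).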
Enterprise Application Integration

Applications are built on top of databases –
separated from the databases.
– So these applications may need to communicate
with each other.
Which leads to different communication
middleware
– RPC
– Remote Method Invocations (RMI)
Distributed Pervasive Systems

Thus far systems have been ‘stable’, i.e. relatively
permanent fixed nodes with high quality connections.
– Pervasive systems integrate mobile / embedded computing
devices.
– Small, battery-powered, mobile, wirelessly connected nodes
which blend into their environment.
– Nodes should be able to discover local services and react
accordingly
E.g. Home Systems, Electronic Health Care
Systems, Sensor Networks