ICS 2403 Distributed Systems Course: Principles & Applications

ICS 2403 DISTRIBUTED SYSTEMS
ICS 2403 DISTRIBUTED SYSTEMS (45 CONTACT HOURS)
Pre-requisite BIT 2108 Computer Networks
Course Purpose
Students will examine the principles, techniques, and practices relevant to the design and
implementation of distributed systems through hands-on experience.
Learning Outcomes
By the end of this course, the student should be able to:
i. Discuss concurrency, independent failure of components, and the lack of a global clock
ii. Place distributed systems in a realistic context through examples: Internet, intranet,
mobile computing
iii. Motivate the benefits of resource sharing and discuss Web challenges including
heterogeneity, openness, security, scalability, failure handling, concurrency, transparency
iv. Use the acquired knowledge to develop a simple client-server application.
Course Description
Overview of distributed computing; computational models, communication complexity, design
and analysis of distributed algorithms and protocols, fault-tolerant protocols, synchronous
computations. Applications such as communication in data networks, control in distributed systems
such as election and distributed mutual exclusion, and manipulation of distributed data such as
ranking. Java Remote Method Invocation (RMI), Remote Procedure Call (RPC), Common Object
Request Broker Architecture (CORBA).
Teaching Methodology
Lectures, laboratory exercises, assignments and a class project
Instructional Materials
LCD projectors, computers, white boards, appropriate software
Course Assessment
30% Continuous Assessment (Tests 10%, Assignment 10%, Practical 10%)
70% End of Semester Examination.
Course Text Books
1. Andrew Tanenbaum (2002). Distributed Systems, Prentice-Hall, ISBN 456-7755
BY MASESE
2. M.L. Liu (2004). Distributed Computing: Principles and Applications. Pearson
Addison-Wesley, ISBN 456-67738438
3. Alan C. Shaw, Lubomir F. Bic (2002). Operating Systems Principles. Prentice Hall.
ISBN: 0130266116.
Reference Text Books
1. Andrew S. Tanenbaum (1994). Distributed Operating Systems, Prentice-Hall, 1994,
ISBN:456-88594
2. Sape J. Mullender (1993). Distributed Systems, 2nd Edition, ACM Press, ISBN:
043077585
3. Charles Crowley (1996). Operating Systems: A Design-Oriented Approach. Irwin
Professional Publishing. ISBN: 0256151512
Course Journals
1. Acta Informatica ISSN 0001-5903
2. Advances in Computational Mathematics ISSN 1019-7168
3. Advances in Data Analysis and Classification ISSN 1862-5347
4. Annals of Software Engineering ISSN 1022-7091
Reference Journals
1. Journal of Computer Science and Technology ISSN 1000-9000
2. Journal of Science and Technology ISSN 1860-4749
3. Central European Journal of Computer Science ISSN 1896-1533
4. Cluster Computing ISSN 1386-7857
COURSE UNIT: DISTRIBUTED SYSTEMS
CODE: ICS 2403
Lecturer: MASESE CHUMA
CONTACT: 0782526000
Objective
By the end of this session, students will be able to:
i. Discuss the concepts of distributed systems
ii. Define a distributed system
iii. Define a centralized system
iv. Describe the characteristics of centralized systems
v. Compare centralized and distributed systems
vi. Explain the advantages of distributed systems over centralized systems
vii. Describe the characteristics of a distributed system
viii. State the advantages of distributed systems
ix. State the disadvantages of distributed systems
x. Discuss the challenges in the design of distributed systems
INTRODUCTION
Computation began on a single processor. This uniprocessor computing can be termed centralized
computing. As the demand for processing capability grew, multiprocessor systems came into
existence. The advent of multiprocessor systems led to the development of distributed systems
with a high degree of scalability and resource sharing. Modern parallel computing can be viewed
as a subset of distributed computing.
WHAT IS A DISTRIBUTED SYSTEM?
A distributed system is a collection of independent computers that appears to its users as a
single coherent system [Tanenbaum and van Steen, 2007].
A distributed system is a collection of autonomous computational entities conceived as a single
coherent system by its designer.
A distributed system is a system in which hardware or software components located at networked
computers communicate and coordinate their actions only by message passing [Coulouris].
A distributed system is a collection of computer programs that utilize computational resources
across multiple, separate computation nodes to achieve a common, shared goal. The term is closely
related to distributed computing and distributed databases. A distributed system relies on
separate nodes that communicate and synchronize over a common network. These nodes typically
represent separate physical hardware devices but can also represent separate software processes
or other recursively encapsulated systems. Distributed systems aim to remove bottlenecks or
central points of failure from a system.
The definition of a distributed system covers two aspects:
➢ Hardware: the machines linked in a distributed system are autonomous.
➢ Software: a distributed system gives its users the impression that they are dealing
with a single system.
Distributed computing is a field of computer science that studies distributed systems. A
distributed system consists of multiple autonomous computers that communicate through a
computer network. The computers interact with each other in order to achieve a common goal. A
computer program that runs in a distributed system is called a distributed program, and
distributed programming is the process of writing such programs.
Distributed computing also refers to the use of distributed systems to solve computational
problems. In distributed computing, a problem is divided into many tasks, each of which is solved
by one or more computers.
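The idea of dividing a problem into tasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the worker threads stand in for separate computers, and the function names (split_into_tasks, solve_task, distributed_sum) are hypothetical, not part of any course material or library.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(numbers, task_count):
    """Divide the input into roughly equal, contiguous chunks (one per task)."""
    chunk_size = max(1, -(-len(numbers) // task_count))  # ceiling division
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def solve_task(chunk):
    """Work done by one 'node': here, summing its share of the data."""
    return sum(chunk)


def distributed_sum(numbers, node_count=4):
    """Split the problem, solve the parts concurrently, then combine the results."""
    tasks = split_into_tasks(numbers, node_count)
    # Assumption: each worker thread stands in for a separate computer.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(solve_task, tasks))
    return sum(partial_results)


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101))))  # 5050
```

In a real distributed program the chunks would travel over the network to other machines, so the cost of sending data and collecting results would also have to be considered.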
A distributed system is a collection of independent computers, interconnected via a network,
capable of collaborating on a task.
Distributed computing is computing performed in a distributed system. Distributed computing has
become increasingly common due to advances that have made both machines and networks cheaper
and faster.
Some examples of distributed systems:
♦ Local Area Network and Intranet
♦ Database Management System
♦ Automatic Teller Machine Network
♦ Internet/World-Wide Web
♦ Mobile and Ubiquitous Computing
Motivation
The following key points are the driving forces behind distributed systems (DS):
• Inherently distributed computations: A DS can process computations at geographically
remote locations.
• Resource sharing: Hardware, databases, and special libraries can be shared between
systems without each system owning a dedicated copy or replica. This is cost-effective and reliable.
• Access to geographically remote data and resources: As mentioned previously,
computations may happen at remote locations. Resources such as centralized servers can
also be accessed from distant locations.
• Enhanced reliability: A DS provides enhanced reliability because it runs on multiple copies
of resources. Distributing resources across distant locations makes them less susceptible
to faults. The term reliability comprises:
1. Availability: the resource or service should be accessible at all times.
2. Integrity: the value or state of the resource should be correct and consistent.
3. Fault tolerance: the ability to recover from system failures.
• Increased performance/cost ratio: The resource sharing and remote access
features of a DS naturally increase the performance/cost ratio.
• Scalability: The number of systems operating in a distributed environment can be
increased as demand increases.
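The link between multiple copies and availability can be made concrete with a small calculation. The sketch below assumes that copies fail independently of each other, which real systems only approximate, and the function name replicated_availability is hypothetical. Under that assumption, a resource is unavailable only when every copy is down at once, so its availability is 1 - (1 - a)^n for n copies that are each available a fraction a of the time.

```python
def replicated_availability(node_availability, replicas):
    """Probability that at least one of `replicas` independent copies is up.

    Assumption: copies fail independently, so all of them are down at the
    same time with probability (1 - node_availability) ** replicas.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return 1.0 - (1.0 - node_availability) ** replicas


if __name__ == "__main__":
    for copies in (1, 2, 3):
        print(copies, round(replicated_availability(0.99, copies), 6))
```

With each copy available 99% of the time, two independent copies give roughly 99.99% availability and three give roughly 99.9999%.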
2. Centralized vs. Distributed Computing
WHAT IS A CENTRALIZED SYSTEM?
A centralized system is a type of system where all the important tasks like processing data,
storing information, and making decisions are done by a single main computer or server. This
means that there is one central place that controls and manages all the resources and important
choices for the whole system. In such systems, all resources, data, and functionalities are
managed and controlled from this central point.
CHARACTERISTICS OF CENTRALIZED SYSTEMS
• Single Point of Control: In a centralized system, there is a single point of control and
authority. This central entity typically makes all decisions and manages all resources.
• Centralized Data Management: All data and resources are stored and managed centrally.
This means that all data processing, storage, and retrieval activities occur within the central
system.
• Hierarchical Structure: Centralized systems often have a hierarchical structure, with
lower-level nodes or entities reporting to and receiving instructions from the central
authority.
• Communication Flow: Communication within a centralized system typically flows from
peripheral nodes or entities to the central node.
• Simplicity in Management: Centralized systems are relatively simpler to manage and
administer since all control and decision-making are centralized. This can lead to efficient
coordination and streamlined operations.
For Example:
Many businesses operate with centralized IT infrastructures where data centers or servers
centrally manage resources such as file storage, application hosting, and network services.
Use Cases of Centralized Systems
• Small Office Network: Many offices use one main computer to run things. This main
computer stores files for all workers. It also helps computers access the network. The main
computer checks workers are who they say. Using one main computer makes it simpler to
manage everything. It also allows all workers to use things the same way.
• Traditional Client-Server Architecture: A lot of older programs like email, websites, and
databases work one way. Clients talk to one main server to get what they need. This setup
has a center. Computers connect to the main spot to get services or info.
• Standalone Applications: Apps running on one machine do everything locally. They process
and store things without needing other machines. This is a centralized system. All the work
happens on the single machine you are using.
Centralized vs. Distributed Systems
The differences between centralized and distributed systems are summarized below:
Aspect | Centralized System | Distributed System
Control | Centralized control and authority | Decentralized control and authority
Resource Management | All resources managed centrally | Resources distributed across multiple nodes
Communication | Communication flows to central node | Direct communication between nodes
Fault Tolerance | Single point of failure | Redundancy, less vulnerable to single points of failure
Scalability | Limited scalability due to centralization | Highly scalable, new nodes can be added easily
Complexity | Relatively simpler to manage | More complex to manage
ADVANTAGES OF DISTRIBUTED SYSTEMS OVER CENTRALIZED SYSTEMS
1. RELIABILITY: If one machine crashes, the system as a whole can still survive. This gives
higher availability and improved reliability.
2. SPEED: A distributed system may have more total computing power than a mainframe.
3. OPEN SYSTEM: An open system is always ready to communicate with other systems. An open
system that scales has an advantage over a perfectly closed and self-contained system.
4. ECONOMICS: A collection of microprocessors offers better price/performance than
mainframes, which makes it a cost-effective way to increase computing power.
5. INCREMENTAL GROWTH: Computing power can be added in small increments (modular
expandability).
6. INHERENT DISTRIBUTION: Some applications are inherently distributed, e.g. a supermarket chain.
7. ANOTHER DRIVING FORCE: The existence of a large number of personal computers and the need
for people to collaborate and share information.
Why do we use distributed systems?
The alternative to using a distributed system is to have a huge centralized system, such as a
mainframe. For many applications there are a number of economic and technical reasons that make
distributed systems much more attractive than their centralized counterparts.
Cost. Better price/performance as long as commodity hardware is used for the component
computers.
Performance. By using the combined processing and storage capacity of many nodes,
performance levels can be reached that are beyond the range of centralized machines.
Scalability. Resources such as processing and storage capacity can be increased incrementally.
Reliability. By having redundant components, the impact of hardware and software faults on
users can be reduced.
Inherent distribution. Some applications, such as email and the Web (where users are spread
out over the whole world), are naturally distributed. This includes cases where users are
geographically dispersed as well as when single resources (e.g., printers, data) need to be shared.
CHARACTERISTICS OF A DISTRIBUTED SYSTEM
1. Concurrency: In a distributed system, multiple nodes can carry out operations
simultaneously, enabling parallel processing and better performance.
2. High Scalability: Distributed systems may scale horizontally by adding more computers
to the network and can support a high number of nodes. This enables them to handle rising
consumer expectations and accept increasing workloads.
3. Fault-tolerance: Distributed systems are built to be fault-tolerant. The system can
continue to function even if one or more nodes go down by shifting the workload to the
nodes that are still up and running.
4. Transparency: By hiding the underlying architectural complexity of the system,
distributed systems seek to provide users and applications with transparency. This covers
location transparency, failure transparency, and access transparency.
5. Heterogeneity: Nodes in distributed systems frequently have various hardware and
software configurations. They might employ various programming languages, run on
various operating systems, or have differing processing speeds. To ensure interoperability
and coordination across the nodes, managing this heterogeneity is a challenge.
6. Consistency and synchronization: A major challenge is ensuring data consistency and
state synchronization among distant nodes. In order to handle concurrent updates and
ensure data consistency, distributed systems use a variety of mechanisms, including
distributed algorithms, consensus protocols, and distributed transactions.
7. Security and Privacy: Authentication, access control, data integrity, and confidentiality
are security issues that must be addressed in distributed systems. In designing a distributed
system, it is crucial to ensure secure communication and to protect sensitive data.
Benefits of distributed systems
Distributed systems offer a number of advantages over monolithic, or single, systems:
• Scalability and flexibility. It is easier to add computing power as the need for services
grows. In most cases today, you can add servers to a distributed system on the fly,
increasing performance and further reducing time to completion.
• Fault tolerance. Distributed systems reduce the risks involved with having a single point
of failure, bolstering reliability and fault tolerance.
• Reliability. A well-designed distributed system can withstand failures in one or more of
its nodes without severely impacting performance. In a monolithic system, the entire
application goes down if the server goes down.
• Speed. Heavy traffic can bog down a single server, degrading performance for everyone.
The scalability of distributed databases and other distributed systems makes them easier
to maintain and helps them sustain high performance levels.
• Geo-distribution. Distributed content delivery is both intuitive for any internet user and
vital for global organizations.
Advantages of Distributed System
1. Improved Performance: Distributed systems make parallel processing possible: tasks are
split up and carried out simultaneously by several nodes. Compared with a single,
centralized system, this results in quicker execution times and better performance.
Because the work is distributed over several nodes, resources are used more efficiently
and the system can accommodate heavier workloads or higher user demand.
2. Load Balancing: Load balancing strategies can be used in distributed systems to divide
the workload among nodes in a fair manner. As a result, resources are used to their full
potential and performance is enhanced. This prevents any one node from becoming
overburdened with work. The effective scalability of resources as needed is made possible
through load balancing, which also helps prevent bottlenecks.
3. Data Replication and Data Locality: Replicating data across several nodes is a
common technique in distributed systems. It raises data availability and lowers the
risk of data loss or unavailability caused by node failures. Data can also be kept
close to the nodes or users that access it frequently, lowering network latency and
enhancing system performance.
4. Redundancy and Disaster Recovery: Redundancy and disaster recovery capabilities can
be offered by distributed systems. The system is better able to recover from errors or
disasters when data and tasks are replicated. By ensuring that there are backup resources
or nodes accessible in the event of failures, redundancy helps to reduce downtime and data
loss.
5. Flexibility and Modularity: Distributed systems allow for freedom in design and
modularity. The system can be built from microservices or loosely coupled
components, which makes it simpler to create, deploy, and manage. This modular design
encourages flexibility in system architecture and evolution and lets components scale
independently. This flexibility helps provide a better user experience and faster
processing of user requests.
6. Geographic Distribution and Reduced Latency: Data and services can be placed closer
to end consumers because of distributed systems' ability to span several different
geographic regions. The system can lower latency and speed up reaction times by putting
nodes in various areas. Services like content delivery networks (CDNs) or real-time
applications that demand low-latency interactions will particularly benefit from this.
7. Resource sharing: Distributed systems allow multiple users and programs to share
resources. Computing resources, such as processing power, memory, and storage, can be
used efficiently and shared across the system, which optimizes resource allocation.
8. Flexibility and extensibility: Distributed systems allow for the addition or removal of
nodes without affecting the overall system. This enables easy scaling and adaptation to
changing requirements and workloads.
9. Increased data availability: Distributed systems can replicate and spread data across
numerous nodes, boosting data availability and accessibility. Even if specific nodes are
inaccessible, data can still be accessed from other nodes.
10. Collaboration and coordination: Multiple people or entities can collaborate and
coordinate using distributed systems. They serve as a platform for sharing resources,
communicating, and synchronizing tasks, facilitating effective teamwork.
11. Improved fault isolation: Failures or faults in one component or node can be isolated and
restricted in a distributed system, preventing them from affecting the entire system. This
enhances system stability and decreases the impact of failures.
12. Enhanced security: Distributed systems provide increased security capabilities by
utilizing distributed security techniques. By disseminating data and processing, it becomes
more difficult for unauthorized entities to compromise the entire system.
13. Easier software development: Distributed systems encourage modular and decentralized
software development. Developers can work on independent components or services that
can be easily merged into the larger system. This increases development productivity and
makes system maintenance and updates easier.
14. Increased reliability: Distributed systems are less prone to full failures or data loss when
data is duplicated across numerous nodes. Even if one node fails, the system can still
function with the remaining nodes.
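The load balancing idea from item 2 of the list above can be illustrated with round-robin assignment, one of the simplest strategies. This is a minimal sketch, not a description of any particular product: the class name RoundRobinBalancer and the node names are hypothetical, and real balancers usually also track node health and current load.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hand out nodes in a fixed rotation so no single node is overburdened."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(nodes)

    def assign(self, request):
        """Return the node that should handle `request` (the next in rotation)."""
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
    assignments = [balancer.assign(f"request-{i}") for i in range(9)]
    print(Counter(assignments))  # each node receives 3 of the 9 requests
```

Round-robin treats all requests as equally expensive. When request costs differ widely, strategies based on measured load may spread the work more fairly.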
Disadvantages of Distributed System
1. Increased communication overhead: Distributed systems often demand frequent
communication and coordination among nodes. This communication cost might degrade
system performance and deplete network bandwidth.
2. Higher latency: The distributed design of the system adds extra communication costs,
which might result in higher latency as compared to centralized solutions. Network delays
and message forwarding can all have an impact on the system's total reaction time.
3. Increased development and maintenance complexity: In comparison to centralized
systems, developing and maintaining distributed systems can be more difficult and
time-consuming. Coordinating and synchronizing activities across numerous nodes, as well
as resolving failure scenarios, requires extra work and knowledge.
4. Network dependency: Data interchange and coordination in distributed systems are
primarily reliant on network connectivity. Network failures or latency issues can have a
substantial influence on the system's performance and availability.
5. Cost and complexity of infrastructure: The networking hardware, servers, and storage
required for distributed systems can be expensive and difficult to set up and manage.
6. Debugging and troubleshooting: Compared with a centralized system, locating and
fixing problems in a distributed system can be more difficult. Advanced monitoring and
diagnostic tools are necessary to resolve issues or performance bottlenecks
affecting several nodes.
7. Scalability limitations: Although distributed systems are very scalable, there may be some
restrictions depending on the system's design and architecture. There can be scalability
bottlenecks in some apps or components that are difficult to get around.
8. Software compatibility: Multiple software components frequently operate on various
nodes in distributed systems. It might be difficult to ensure compatibility and easy
integration between these components, especially if they were created by various teams or
organizations.
9. Security risks: Compared to centralized systems, distributed systems face more security
threats. Managing access control, authentication, and data confidentiality across several
nodes is more difficult and more prone to flaws.
10. Consistency and data integrity: It can be difficult to guarantee consistency and data
integrity among distributed nodes. It takes careful planning and implementation of
techniques, such as distributed transactions or consensus protocols, to achieve global
consistency in a distributed system.
11. Dependency on network stability: A reliable network infrastructure is crucial for
distributed systems. Network failures or disturbances may reduce system availability or
even make the system completely unavailable.
12. Complexity of failure handling: In a distributed system, handling errors can be
challenging. Robust fault-tolerance methods and careful design are needed for failure
scenarios to detect faults, start recovery mechanisms, and preserve consistency among
nodes.
13. Lack of global view: It is difficult to monitor and manage distributed systems because
they lack a centralized global view of the entire system. Decentralized monitoring and
management solutions are necessary for administrators.
CHALLENGES IN THE DESIGN OF DISTRIBUTED SYSTEM
The following challenges are faced when designing a distributed system.
1. Heterogeneity: The underlying network infrastructure, computer hardware and software (e.g.
operating systems), and programming languages (in particular data representation) all vary.
2. Openness
❖ Ensuring extensibility and maintainability of the systems
❖ Adherence to standard interface
3. Security
❖ Privacy
❖ Authentication
❖ Availability
4. Scalability
❖ Handling increasing number of files and users
❖ Growth of storage space.
5. Handling of failures
❖ Detection (may be impossible)
❖ Exception handling (e.g. time-outs when waiting for a web resource)
❖ Redundancy of data storage
❖ Redundant routes in network
❖ Replication of name tables in multiple domain name servers
6. Concurrency
❖ Consistent scheduling of concurrent threads (so that dependencies are preserved, e.g. in
concurrent transactions)
❖ Avoidance of deadlock and livelock problems.
7. Transparency: concealing the heterogeneous and distributed nature of the system so that it
appears to the user like one system.
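The concurrency challenge in item 6 can be illustrated on a single machine with threads. The sketch below shows one common way to avoid deadlock, which is to make every thread acquire locks in the same global order. The Account class, the transfer function, and the ordering by account_id are all hypothetical choices made for this example.

```python
import threading


class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    """Move `amount` between two different accounts without risking deadlock."""
    if source is target:
        raise ValueError("source and target must be different accounts")
    # Always lock the account with the smaller id first. Two opposite
    # transfers then request the locks in the same order, so neither can
    # hold one lock while waiting forever for the other.
    first, second = sorted((source, target), key=lambda acc: acc.account_id)
    with first.lock, second.lock:
        if source.balance < amount:
            return False
        source.balance -= amount
        target.balance += amount
        return True


def run_demo(rounds=1000):
    """Run opposite transfers concurrently and return the final balances."""
    a, b = Account(1, 1000), Account(2, 1000)
    t1 = threading.Thread(target=lambda: [transfer(a, b, 1) for _ in range(rounds)])
    t2 = threading.Thread(target=lambda: [transfer(b, a, 1) for _ in range(rounds)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance


if __name__ == "__main__":
    print(run_demo())  # the total across both accounts is still 2000
```

If each thread instead locked its own source account first, one thread could hold the lock on account 1 while the other holds the lock on account 2, and both would wait indefinitely.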
Resource sharing and the web challenges
Resources such as printers, scanners, and machines may be shared.
Terms used in the web
A. Service: a distinct part of a computer system that manages a collection of related resources
and presents their functionality to users. For instance, we access shared files through a file
service and send documents to printers through a printing service.
B. Server: a running program on a networked computer that accepts requests from programs
running on other computers to perform a service and responds appropriately.
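The notion of a server accepting requests and responding can be sketched with Python's standard socket module. This is a minimal illustration, not a production design: the upper-casing "service", the function names, and the use of the loopback address are assumptions made for the example, and each request is assumed to fit in a single small message.

```python
import socket
import threading


def handle_client(connection):
    """Serve one request: read some text and reply with it in upper case."""
    with connection:
        request = connection.recv(1024)  # assumes a short, single-chunk request
        connection.sendall(request.upper())


def start_server(host="127.0.0.1"):
    """Start a server on a free port; return the listening socket and its port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 asks the operating system for a free port
    listener.listen()

    def serve():
        while True:
            try:
                connection, _address = listener.accept()
            except OSError:
                break  # the listening socket was closed
            # One thread per client, so a slow client does not block the others.
            threading.Thread(target=handle_client, args=(connection,), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return listener, listener.getsockname()[1]


def send_request(port, text, host="127.0.0.1"):
    """Client side: connect, send a request, and return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(text.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


if __name__ == "__main__":
    server_socket, server_port = start_server()
    print(send_request(server_port, "hello server"))  # HELLO SERVER
    server_socket.close()
```

Here the client and the server run in one process for convenience. On a real network they would run on different computers, and the client would connect to the server's address instead of 127.0.0.1.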
WWW (World Wide Web)
It is an evolving system for publishing and accessing resources and services across the Internet.
Web browsers such as Mozilla Firefox and Internet Explorer are used to retrieve and
view documents of many types, view video streams, and so on.
Properties of WWW (World Wide Web)
1. It is an open system: it can be extended and implemented in new ways without disturbing
its existing functionality.
2. The web is open with respect to the types of resources that can be published and shared on it.
Web characteristics
Heterogeneity
The Internet enables users to access services and run applications over a heterogeneous
collection of computers and networks. Heterogeneity applies to the following:
a) Computer networks
b) Computer hardware
c) Operating systems
d) Programming languages
The Internet consists of many different sorts of networks. Their differences are masked by the
fact that all of the computers attached to them use the Internet protocols to communicate with
one another.
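Differences in data representation are one concrete form of heterogeneity: machines may store the bytes of the same integer in different orders. The sketch below shows how agreeing on one wire format hides that difference, using Python's standard struct module. The helper names encode_for_network and decode_from_network are hypothetical; the "!" format prefix is struct's notation for network (big-endian) byte order.

```python
import struct


def encode_for_network(value):
    """Pack an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def decode_from_network(data):
    """Unpack an unsigned 32-bit integer that was sent in network byte order."""
    (value,) = struct.unpack("!I", data)
    return value


if __name__ == "__main__":
    # The same number has different byte layouts on different hardware.
    print(struct.pack("<I", 1))  # little-endian layout: b'\x01\x00\x00\x00'
    print(struct.pack(">I", 1))  # big-endian layout:    b'\x00\x00\x00\x01'
    # Sender and receiver agree on one layout, whatever their hardware uses.
    print(decode_from_network(encode_for_network(2403)))  # 2403
```

Because both sides convert to and from the agreed format, neither needs to know how the other represents integers internally.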
Openness
This characteristic determines whether the system can be extended and re-implemented in various
ways. The openness of a distributed system is determined primarily by the degree to which new
resource-sharing services can be added and made available for use by a variety of client programs.
Security
1. Information security: it has three components:
a. Confidentiality: protection against disclosure to unauthorized individuals.
b. Integrity: protection against alteration and corruption.
c. Availability: protection against interference with the means to access the resources.
Example: In banking, users send their credit card numbers across the Internet.
2. Denial of service attacks: a security problem in which a user deliberately disrupts a
service for some reason.
3. Security of mobile code: mobile code needs to be handled with care.
Failure handling
Hardware, software, or programs may produce incorrect results or stop before completing an
intended computation because of a system failure.
The following techniques can be employed in dealing with failure.
a. Detecting failures: some failures can be detected.
b. Masking failures: some failures that have been detected can be hidden or made less severe.
c. Redundancy: services can be made to tolerate failures by the use of redundant components.
For example, there should always be at least two different routes between any two routers on
the Internet.
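The three techniques above can be combined in a short sketch: a client detects that a replica has failed, masks the failure by trying a redundant replica, and only reports an error when every replica has failed. All names here (ReplicaUnavailable, fetch_with_failover, the two sample replicas) are hypothetical, and the replicas are plain functions standing in for remote servers.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot answer (for example after a time-out)."""


def fetch_with_failover(replicas, key):
    """Try each (name, fetch) replica in turn; return (value, detected_failures)."""
    failures = []
    for name, fetch in replicas:
        try:
            return fetch(key), failures
        except ReplicaUnavailable as error:
            # Detection: record the failure, then fall through to the next
            # replica so that the caller never sees it (masking).
            failures.append((name, str(error)))
    raise ReplicaUnavailable(f"all replicas failed: {failures}")


def broken_replica(key):
    """Stand-in for a server that has crashed or timed out."""
    raise ReplicaUnavailable("timed out")


def healthy_replica(key):
    """Stand-in for a server that answers normally."""
    return f"value-for-{key}"


if __name__ == "__main__":
    replicas = [("primary", broken_replica), ("backup", healthy_replica)]
    value, failures = fetch_with_failover(replicas, "x")
    print(value)     # value-for-x
    print(failures)  # [('primary', 'timed out')]
```

The caller still receives an answer even though the primary failed, which is the redundancy described in item c. When no replica responds, the failure can no longer be masked and is reported instead.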
Scalability
A system is scalable if it remains effective when there is a significant increase in the number
of resources and the number of users. For a system with n users to be scalable, the quantity of
physical resources required to support them should be at most O(n), that is, proportional to n.
Transparency
This is concealment from the user and the application programmer of the separation of components
in a distributed system so that the system is perceived as a whole rather than collection of
independent components.
Risks of distributed systems
The challenges of distributed systems create a number of corresponding risks.
• Security. Distributed systems are as vulnerable to attack as any other system, and their
distributed nature creates a much larger attack surface that exposes organizations to threats.
• Risk of network failure. Distributed systems depend on public networks to transmit
and receive data. If one segment of the internet becomes unavailable or overloaded,
distributed system performance may decline.
• Governance and control issues. Distributed systems lack the governability of monolithic,
single-server-based systems, creating auditing and compliance issues around data privacy
laws. Globally distributed environments make it challenging to provide certain
levels of assurance and to understand exactly where data resides.
• Cost control. Unlike centralized systems, distributed systems let administrators add
capacity easily as needed, which can also increase costs. Pricing for cloud-based
distributed computing systems is based on usage (such as the amount of memory and
CPU power consumed over time). If demand suddenly spikes, you might face a massive bill.
Examples of distributed systems
Here are some very common examples of distributed systems:
• Telecommunications networks that support mobile and internet networks
• Graphical and video-rendering systems
• Scientific computing, such as protein folding and genetic research
• Airline and hotel reservation systems
• Multiuser video conferencing systems
• Cryptocurrency processing systems (e.g. Bitcoin)
• Peer-to-peer file-sharing systems
• Distributed community computing systems
• Multiplayer video games
• Global, distributed retailers and supply chain management
Distributed Systems History and OS Models
Minicomputer model: In this model, each user has a local machine. The machines are
interconnected, but the connection may be transient (e.g., dialing over a telephone network). All
the processing is done locally but you can fetch remote data like files or databases.
Workstation model: In this model, you have local area networks (LANs) that provide a
connection nearly all of the time. An example of this model is the Sprite operating system. You
can submit a job to your local workstation. If your workstation is busy, Sprite will automatically
transmit the job to another idle workstation to execute the job and return the results. This is an
early example of resource sharing where processing power on idle machines is shared.
Client-server model: This model evolved from the workstation model. In this model, there are
powerful workstations that serve as dedicated servers, while the clients are less powerful and rely
on the servers to do their jobs.
Processor pool model: In this model, the clients become even less powerful (thin clients). The
server is a pool of interconnected processors. The thin clients rely on the server by sending almost
all their tasks to the server.
Cluster computing systems / Data centers: In this model, the server is a cluster of servers
connected over high-speed LAN.
Grid computing systems: This model is similar to cluster computing systems except that the
server is now distributed in location and is connected over a wide area network (WAN) instead of
LAN.
WAN-based clusters / distributed data centers: Similar to grid computing systems, but now it
is clusters/data centers rather than individual servers that are interconnected over WAN.
Virtualization and data center
Cloud computing: Infrastructures are managed by cloud providers. Users only lease resources on
demand and are billed on a pay-as-you-go model.
Emerging Models - Distributed Pervasive Systems: The nodes in this model are no longer
traditional computers but smaller nodes with microcontrollers and networking capabilities. They
are very resource constrained and present their own design challenges. For example, today’s car
can be viewed as a distributed system as it consists of many sensors, and they communicate over
LAN. Other examples include home networks, mobile computing, personal area networks, etc.
Applications and Real-World Examples of Distributed Computing
Distributed computing is not just a theoretical concept; it has practical applications across various
industries and sectors. Here are some notable examples and applications:
•
Big Data Analytics: Distributed computing is fundamental in big data. It allows for the processing
and analysis of vast datasets that are beyond the capacity of a single machine.
Frameworks like Apache Hadoop and Spark are used for this purpose, distributing data processing
tasks across multiple nodes.
•
Cloud Computing: Services like Amazon Web Services (AWS), Microsoft Azure, and Google
Cloud Platform rely on distributed computing to offer scalable and reliable cloud services. These
platforms host applications and data across numerous servers, ensuring high availability and
redundancy.
•
Scientific Research: Many scientific projects require immense computational power. Distributed
computing enables researchers to solve complex scientific problems by utilizing the combined
power of multiple computers. An example is the SETI@home project (Search for Extraterrestrial
Intelligence), which has used the idle processing power of volunteered computers worldwide.
•
Financial Services: The financial sector employs distributed computing for high-frequency
trading, risk management, and real-time fraud detection, where rapid processing of massive
amounts of data is crucial.
•
Internet of Things (IoT): In IoT, distributed computing helps manage and process data from
countless devices and sensors, enabling real-time data analysis and decision-making.
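The Big Data Analytics item above describes frameworks that split one processing task across many nodes. A minimal single-process sketch of that map-then-reduce pattern follows. It is not Hadoop or Spark code: the names `map_chunk`, `reduce_counts`, and `word_count` are hypothetical, and threads stand in for cluster nodes purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk: str) -> Counter:
    """Map step: count the words in one chunk (one 'node' handles one chunk)."""
    return Counter(chunk.lower().split())


def reduce_counts(partials) -> Counter:
    """Reduce step: merge the partial counts produced by every node."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(chunks, workers: int = 3) -> Counter:
    # Assumption: threads play the role of cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


if __name__ == "__main__":
    data = ["a distributed system", "a single system image", "distributed data"]
    print(word_count(data))
```

Each chunk is processed independently, so the map step needs no coordination; only the final merge brings the partial results together.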
Advantages of Distributed Computing
Distributed Computing offers several significant advantages over traditional single-system
computing. These include:
•
Scalability: Distributed systems can easily grow with workload and requirements, allowing for
the addition of new nodes as needed.
•
Availability: These systems exhibit high fault tolerance. If one computer in the network fails, the
system continues to operate, ensuring consistent availability.
•
Consistency: Although data is spread over multiple computers, distributed systems use replication
and coordination to keep it consistent across nodes, supporting the reliability and accuracy of information.
•
Transparency: Users interact with a distributed system as if it were a single entity, without
needing to manage the complexities of the underlying distributed architecture.
•
Efficiency: Distributed systems offer faster performance and optimal resource utilization,
effectively managing workloads and preventing system failures due to volume spikes or underuse
of hardware.
Types of Distributed Computing Architecture
Distributed computing consists of various architectures, each with unique characteristics and use
cases. The main types include:
•
Client-Server Architecture: This common structure divides functions into clients and servers.
Clients handle limited processing and requests, while servers manage data and resources. It offers
security and ease of management but can face bottlenecks in high-traffic situations.
•
Three-Tier Architecture: It adds a middle layer (application servers) between clients and
database servers, reducing communication bottlenecks and improving performance.
•
N-Tier Architecture: Involves multiple client-server systems working together, often used in
modern enterprise applications.
•
Peer-to-Peer Architecture: Assigns equal responsibilities to all networked computers, popular in
content sharing, file streaming, and blockchain networks.
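Since the client-server architecture is the one the course asks students to implement, a small sketch may help. It assumes a toy line-based protocol (the client sends one line, the server replies with the same line in upper case) and runs both sides on the local machine. The names `UpperHandler`, `start_server`, and `request` are hypothetical.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Server side: read one line from the client and reply in upper case."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def start_server():
    # Port 0 asks the OS for any free port; 127.0.0.1 keeps the demo local.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), UpperHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request(address, text: str) -> str:
    """Client side: send one line and wait for the one-line reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(text.encode() + b"\n")
        return conn.makefile("rb").readline().decode().strip()


if __name__ == "__main__":
    server = start_server()
    print(request(server.server_address, "hello server"))
    server.shutdown()
    server.server_close()
```

The server holds the resource (here, the upper-casing service) and every client must go through it, which is also why a single server can become the bottleneck mentioned above.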
Parallel Computing vs. Distributed Computing
While often used interchangeably, parallel and distributed computing have distinct characteristics:
Parallel Computing: Involves multiple processors carrying out calculations simultaneously,
typically within a single machine or tightly coupled system. All processors have access to shared
memory, facilitating quick information exchange.
Distributed Computing: Consists of multiple computers (or nodes), each with its own private
memory, working on a common task. These nodes communicate via message passing, making it a
more loosely coupled system compared to parallel computing. This structure is ideal for tasks
distributed across different geographic locations or separate systems.
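The contrast between shared memory and message passing can be sketched in a few lines. This is an illustration only: both functions run as threads inside one process, and the queue merely stands in for a network. The names `shared_memory_sum` and `message_passing_sum` are hypothetical.

```python
import queue
import threading


def _run(target, parts):
    """Start one thread per part and wait for all of them to finish."""
    threads = [threading.Thread(target=target, args=(p,)) for p in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum(values, workers: int = 4) -> int:
    """Parallel style: workers update one shared variable guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work(part):
        nonlocal total
        subtotal = sum(part)
        with lock:  # shared memory needs mutual exclusion
            total += subtotal

    _run(work, [values[i::workers] for i in range(workers)])
    return total


def message_passing_sum(values, workers: int = 4) -> int:
    """Distributed style: each node keeps private state and sends one message."""
    inbox = queue.Queue()  # assumption: the queue plays the role of the network

    def node(part):
        inbox.put(sum(part))  # the message is the only interaction

    _run(node, [values[i::workers] for i in range(workers)])
    return sum(inbox.get() for _ in range(workers))


if __name__ == "__main__":
    numbers = list(range(1000))
    print(shared_memory_sum(numbers), message_passing_sum(numbers))
```

In the first version correctness depends on the lock; in the second, no worker ever touches another worker's data, which is the property that lets the nodes live on separate machines.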
Review note
Transparency in distributed systems
Transparency is the concealment from the user and the application programmer of the separation
of the components of a distributed system (i.e., a single image view). Transparency is a strong
property that is often difficult to achieve. There are a number of different forms of transparency
including the following:
1. Access Transparency: Local and remote resources are accessed in the same way
2. Location Transparency: Users are unaware of the location of resources
3. Migration Transparency: Resources can migrate without name change
4. Replication Transparency: Users are unaware of the existence of multiple copies of
resources
5. Failure Transparency: Users are unaware of the failure of individual components
6. Concurrency Transparency: Users are unaware of sharing resources with others
Note that complete transparency is not always desirable due to the trade-offs with performance
and scalability, as well as the problems that can be caused when confusing local and remote
operations. Furthermore, complete transparency may not always be possible since nature imposes
certain limitations on how fast communication can take place in wide-area networks.
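Access, location, and migration transparency can be illustrated with a small sketch. Everything here is hypothetical (`LocalStore`, `RemoteStoreStub`, `NameService`), and the "remote" store is simulated inside the same process; a real stub would send requests over the network.

```python
class LocalStore:
    """A store on the local machine."""

    def __init__(self):
        self._files = {}

    def write(self, name: str, data: str) -> None:
        self._files[name] = data

    def read(self, name: str) -> str:
        return self._files[name]


class RemoteStoreStub:
    """Client-side stub with the same interface as LocalStore (access transparency)."""

    def __init__(self, remote: LocalStore):
        self._remote = remote  # assumption: stands in for a network connection

    def write(self, name: str, data: str) -> None:
        self._remote.write(name, data)

    def read(self, name: str) -> str:
        return self._remote.read(name)


class NameService:
    """Maps a name to whichever store holds it (location transparency)."""

    def __init__(self):
        self._location = {}

    def bind(self, name: str, store) -> None:
        self._location[name] = store

    def read(self, name: str) -> str:
        return self._location[name].read(name)

    def migrate(self, name: str, new_store) -> None:
        # The resource moves but its name stays the same (migration transparency).
        new_store.write(name, self.read(name))
        self.bind(name, new_store)


if __name__ == "__main__":
    local, other_machine = LocalStore(), LocalStore()
    local.write("report.txt", "draft 1")
    names = NameService()
    names.bind("report.txt", local)
    names.migrate("report.txt", RemoteStoreStub(other_machine))
    print(names.read("report.txt"))
```

The caller only ever uses `names.read("report.txt")`. The sketch also hints at the trade-off noted above: a remote read hidden behind the same call can be far slower than a local one, and the caller cannot tell.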
Are decentralized systems a subset of distributed systems?
No. In these notes, decentralized systems are treated as a superset of distributed systems: all
distributed systems are decentralized, but not every decentralized system is a distributed system.
Examples of distributed systems include parallel machines and networked machines. Distributed
systems have the following advantages:
1. Resource sharing. Distributed systems enable communication over the network and resource
sharing across machines (e.g., a process on one machine can access files stored on a different
machine).
2. Economic. Distributed systems lead to better economics in terms of price and performance. It
is usually more cost-effective to buy multiple inexpensive small machines and share the resources
across those machines than buying a single large machine.
3. Reliability. Distributed systems have better reliability compared to centralized systems. When
one machine in a distributed system fails, there are other machines to take over its task, and the
whole system can still function. It is also possible to achieve better reliability with a distributed
system by replicating data on multiple machines.
4. Scalability. As the number of machines in a distributed system increases, all of the resources
on those machines can be utilized which leads to performance scaling up. However, it is usually
hard to achieve linear scalability due to various bottlenecks.
5. Incremental growth. If an application becomes more popular and more users use the
application, more machines can be added to its cluster to grow its capacity on demand. This is an
important reason why the cloud computing paradigm is so popular today.
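One common way to see why linear scalability is hard to reach (point 4 above) is Amdahl's law, which these notes do not name; it is brought in here as an assumption. If only a fraction p of the work can be spread over n machines, the speedup is 1 / ((1 - p) + p / n). The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, machines: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallelisable fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if machines < 1:
        raise ValueError("machines must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)


if __name__ == "__main__":
    # Illustrative values only: 90% of the work is assumed to be parallelisable.
    for n in (1, 2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With 10% of the work serial, the speedup can never exceed 10 no matter how many machines are added, so the serial part acts as the bottleneck.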
Types of Distributed Systems
Distributed Computing Systems
•
Many distributed systems are configured for high-performance computing. Cluster
computing: essentially a group of high-end systems connected through a LAN.
Distributed Information Systems
•
The vast majority of distributed systems in use today are forms of traditional information
systems that now integrate legacy systems. Example: transaction processing systems.
Distributed Pervasive Systems
•
A next generation of distributed systems is emerging in which the nodes are small,
mobile, and often embedded as part of a larger system.
What are the criteria (metrics) of a distributed computer system?
i.
Latency – network delay before any data is sent
ii.
Bandwidth – maximum channel capacity (analogue communication Hz, digital
communication bps)
iii.
Granularity – relative size of units of processing required. Distributed systems operate
best with coarse-grained units of work because communication is generally slow compared
to processing speed
iv.
Processor speed
v.
Reliability – ability to continue operating correctly for a given time
vi.
Fault tolerance – resilience to partial system failure
vii.
Security – policy to deal with threats to the communication or processing of data in a
system
viii.
Administrative/management domains – issues concerning the ownership and access to
distributed systems components
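Latency, bandwidth, and granularity interact in a simple way that a short calculation can show. The sketch below assumes the usual first-order model (transfer time = latency + size / bandwidth) and a rough rule of thumb that a unit of work is coarse-grained enough when it computes for longer than it takes to ship its data. Both function names and the sample numbers are hypothetical.

```python
def transfer_time(size_bytes: int, latency_s: float, bandwidth_bps: float) -> float:
    """Seconds to deliver a message: fixed delay plus size divided by capacity."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + (size_bytes * 8) / bandwidth_bps


def is_coarse_grained(compute_s: float, size_bytes: int,
                      latency_s: float, bandwidth_bps: float) -> bool:
    """Rule of thumb (an assumption): computation should outweigh communication."""
    return compute_s > transfer_time(size_bytes, latency_s, bandwidth_bps)


if __name__ == "__main__":
    # Illustrative numbers: 1 MB of input, 50 ms latency, 8 Mbit/s link.
    print(transfer_time(1_000_000, 0.05, 8_000_000))
    print(is_coarse_grained(10.0, 1_000_000, 0.05, 8_000_000))
```

For small messages the latency term dominates, and for large ones the bandwidth term does, which is why the two metrics are listed separately.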
Applications of distributed computing and newer challenges
1. Mobile systems
2. Sensor networks
3. Ubiquitous or pervasive computing
4. Peer-to-peer computing
5. Publish-subscribe, content distribution, and multimedia
6. Distributed agents
7. Distributed data mining
8. Grid computing
9. Security in distributed systems
1. Mobile systems
Mobile systems typically use wireless communication, which is based on electromagnetic waves
and uses a shared broadcast medium. Because the characteristics of communication are different,
mobile systems raise a set of problems such as:
a. routing,
b. location management,
c. channel allocation,
d. localization and position estimation,
e. the overall management of mobility.
There are two popular architectures for a mobile network:
1. Base-station approach,
also known as the cellular approach, wherein a cell, which is the geographical region
within range of a static but powerful base transmission station, is associated with that
base station.
2. Ad-hoc network approach, where there is no base station.
All responsibility for communication is distributed among the mobile nodes, wherein
mobile nodes have to participate in routing by forwarding packets of other pairs of
communicating nodes.
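In the ad-hoc approach, a packet usually reaches its destination through intermediate nodes. The sketch below finds a fewest-hops route with breadth-first search over a map of which nodes are in radio range of each other. It is a centralized illustration of multi-hop forwarding, not a real ad-hoc routing protocol; the function name `find_route` and the sample topology are hypothetical.

```python
from collections import deque


def find_route(links, source, target):
    """Return a fewest-hops path from source to target, or None if unreachable."""
    previous = {source: None}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                frontier.append(neighbour)
    return None


if __name__ == "__main__":
    # Assumed topology: each node lists the nodes within its radio range.
    in_range = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
    print(find_route(in_range, "A", "D"))
```

Here nodes B and C forward packets on behalf of the pair A and D, which is the sense in which every mobile node has to take part in routing.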
2. Sensor networks
A sensor is a processor with an electro-mechanical interface that is capable of sensing physical
parameters, such as temperature, velocity, pressure, humidity, and chemicals. Sensors may be
mobile or static; sensors may communicate wirelessly, although they may also communicate
across a wire when they are statically installed.
3. Ubiquitous or pervasive computing
The intelligent home and the smart workplace are some examples of ubiquitous
environments. Ubiquitous systems are essentially distributed systems; recent advances in
technology allow them to leverage wireless communication and sensor and actuator mechanisms.
4. Peer-to-peer computing
• Peer-to-peer (P2P) computing represents computing over an application layer network wherein
all interactions among the processors are at a “peer” level, without any hierarchy among the
processors.
• P2P computing arose as a paradigm shift from client–server computing where the roles among
the processors are essentially asymmetrical.
• P2P networks are typically self-organizing, and may or may not have a regular structure to the
network.
5. Publish-subscribe, content distribution, and multimedia
In a dynamic environment where the information constantly fluctuates, there needs to be:
i. an efficient mechanism for distributing this information (publish),
ii. an efficient mechanism to allow end users to indicate interest in receiving specific kinds
of information (subscribe),
iii. an efficient mechanism for aggregating large volumes of published information and
filtering it as per the user’s subscription filter
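The three mechanisms listed above (publish, subscribe, filter) can be sketched with an in-memory broker. This is a single-process illustration under assumed names (`Broker`, `subscribe`, `publish`); a real publish-subscribe system would deliver messages over the network.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]
Filter = Callable[[Message], bool]


class Broker:
    """Keeps each subscriber's filter and delivers only matching messages."""

    def __init__(self):
        self._subscriptions: List[Tuple[Filter, List[Message]]] = []

    def subscribe(self, wanted: Filter) -> List[Message]:
        """Register interest; the returned list acts as the subscriber's inbox."""
        inbox: List[Message] = []
        self._subscriptions.append((wanted, inbox))
        return inbox

    def publish(self, message: Message) -> int:
        """Distribute a message; return how many subscribers received it."""
        delivered = 0
        for wanted, inbox in self._subscriptions:
            if wanted(message):  # apply the subscription filter
                inbox.append(message)
                delivered += 1
        return delivered


if __name__ == "__main__":
    broker = Broker()
    sports = broker.subscribe(lambda m: m["topic"] == "sports")
    everything = broker.subscribe(lambda m: True)
    broker.publish({"topic": "sports", "text": "final score"})
    broker.publish({"topic": "weather", "text": "rain"})
    print(len(sports), len(everything))
```

Publishers and subscribers never refer to each other directly; the broker and the filters are the only coupling between them.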
6. Distributed agents
Agents collect and process information, and can exchange such information with other agents.
Challenges in distributed agent systems include coordination mechanisms among the agents,
controlling the mobility of the agents, and their software design and interfaces.
7. Distributed data mining
In many situations the data is necessarily distributed and cannot be collected in a single
repository, for example because it is too massive to collect and process at a single repository in real time.
8. Grid computing
Grid computing is a subset of distributed computing in which a virtual supercomputer comprises
machines on a network connected by some bus, mostly Ethernet or sometimes the Internet. Idle
CPU cycles of machines connected to the network are made available to others.
9. Security in distributed systems
The traditional challenges of security in a distributed setting include:
confidentiality (ensuring that only authorized processes can access certain
information),
authentication (ensuring the source of received information and the identity of the
sending process),
availability (maintaining allowed access to services despite malicious actions).
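Of the three challenges above, authentication is the easiest to show in a few lines. The sketch below uses a message authentication code (HMAC with SHA-256) from the Python standard library, under the assumption that sender and receiver already share a secret key. It addresses only authentication of the sender and integrity of the message, not confidentiality or availability. The helper names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Sender: compute a tag that only a holder of the shared key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    shared_key = b"demo key, not for real use"  # assumption: agreed in advance
    message = b"transfer 100 to account 42"
    tag = sign(shared_key, message)
    print(verify(shared_key, message, tag))
    print(verify(shared_key, b"transfer 900 to account 42", tag))
```

How the two processes obtain the shared key in the first place is the key management problem and is outside this sketch.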
A model of distributed computations
A distributed system consists of a set of processors that are connected by a
communication network.
The communication network provides the facility of information exchange among
processors.
The processors do not share a common global memory and communicate solely by
passing messages over the communication network.
Discuss the transparency requirements of a distributed system.
Transparency deals with hiding the implementation policies from the user, and can be
classified as follows:
•
Access transparency hides differences in data representation on different systems and
provides uniform operations to access system resources.
•
Location transparency makes the locations of resources transparent to the users.
•
Migration transparency allows relocating resources without changing names.
•
Relocation transparency is the ability to relocate resources while they are being
accessed.
•
Replication transparency does not let the user become aware of any replication.
•
Concurrency transparency deals with masking the concurrent use of shared resources for
the user.
•
Failure transparency refers to the system being reliable and fault-tolerant.
List the algorithmic challenges in designing a distributed system.
•
Designing useful execution models and frameworks
•
Dynamic distributed graph algorithms and distributed routing algorithms
•
Time and global state in a distributed system
•
Synchronization/coordination mechanisms
•
Group communication, multicast, and ordered message delivery
•
Monitoring distributed events and predicates
•
Distributed program design and verification tools
•
Debugging distributed programs
•
Data replication, consistency models, and caching
What do you understand by load balancing in a distributed environment?
The goal of load balancing is to gain higher throughput and reduce the user-perceived
latency.
Load balancing may be necessary because of a variety of factors, such as high network traffic or
a high request rate causing the network connection to be a bottleneck, or high computational load.
The objective is to service incoming client requests with the least turnaround time.
The following are some forms of load balancing:
• Data migration: The ability to move data (which may be replicated) around in the system, based
on the access pattern of the users.
• Computation migration: The ability to relocate processes in order to perform a redistribution of
the workload.
• Distributed scheduling: This achieves a better turnaround time for the users by using idle
processing power in the system more efficiently.
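One simple policy for distributed scheduling is to send each new request to the server that currently has the least outstanding work. The sketch below is one possible policy among many, not a method prescribed by these notes; the class `LeastLoadedBalancer` and its methods are hypothetical.

```python
class LeastLoadedBalancer:
    """Tracks outstanding work per server and picks the least loaded one."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}

    def assign(self, cost: int = 1) -> str:
        """Choose a server for a request of the given cost."""
        # min() returns the first server among equals, following insertion order.
        server = min(self.load, key=self.load.get)
        self.load[server] += cost
        return server

    def finish(self, server: str, cost: int = 1) -> None:
        """Record that a request of the given cost has completed."""
        self.load[server] -= cost


if __name__ == "__main__":
    balancer = LeastLoadedBalancer(["s1", "s2", "s3"])
    for cost in (1, 1, 1, 5, 1, 1):
        print(balancer.assign(cost), balancer.load)
```

The policy needs up-to-date load information, which in a real system must itself be collected over the network and may be slightly stale.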
Explain in detail the design issues of a distributed system.
The following functions must be addressed when designing and building a distributed
system:
1. Communication
2. Processes
3. Naming
4. Synchronization
5. Data storage and access
6. Consistency and replication
7. Fault tolerance
8. Security
9. Applications Programming Interface (API) and transparency
10. Scalability and modularity
1. Communication
This task involves designing appropriate mechanisms for communication among the processes in
the network. Some example mechanisms are: remote procedure call (RPC), remote object
invocation (ROI), message-oriented communication versus stream-oriented communication.
2. Processes
Some of the issues involved are: management of processes and threads at clients/servers; code
migration; and the design of software and mobile agents.
3. Naming
Devising easy to use and robust schemes for names, identifiers, and addresses is essential for
locating resources and processes in a transparent and scalable manner.
4. Synchronization
Mechanisms for synchronization or coordination among the processes are essential. Mutual
exclusion is the classical example of synchronization. In addition, synchronizing physical clocks
and devising logical clocks that capture the essence of the passage of time are further forms of
synchronization.
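A logical clock can be very small. The sketch below follows Lamport's well-known scheme, which these notes do not name; it is used here as an assumed example of a logical clock. Each process keeps a counter, increments it on every local event and send, and on receipt jumps past the timestamp carried by the message. The class name `LamportClock` is hypothetical.

```python
class LamportClock:
    """Per-process logical clock in the style of Lamport timestamps."""

    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """A local event: advance the counter by one."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """On receipt, move past both the local time and the message timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


if __name__ == "__main__":
    p, q = LamportClock(), LamportClock()
    p.tick()
    stamp = p.send()
    print(q.receive(stamp))
```

The rule in `receive` guarantees that a message is always received at a later logical time than it was sent, even though the two processes share no physical clock.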
5. Data storage and access
Schemes for data storage, and implicitly for accessing the data in a fast and scalable manner across
the network are important for efficiency.
Traditional issues such as file system design have to be reconsidered in the setting of a distributed
system.
6. Consistency and replication
To avoid bottlenecks, to provide fast access to data, and to provide scalability, replication of data
objects is highly desirable.
7. Fault tolerance
Fault tolerance requires maintaining correct and efficient operation in spite of any failures of links,
nodes, and processes.
Process resilience, reliable communication, distributed commit, checkpointing and recovery,
agreement and consensus, failure detection, and self-stabilization are some of the mechanisms to
provide fault-tolerance.
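Failure detection, one of the mechanisms listed above, is often built on heartbeats: every node reports in periodically, and a node that stays silent for longer than a timeout is suspected of having failed. The sketch below passes the current time in explicitly so that it stays deterministic; the class `HeartbeatDetector` and the timeout value are hypothetical.

```python
class HeartbeatDetector:
    """Suspects any node whose last heartbeat is older than the timeout."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node: str, now: float) -> None:
        """Record that a heartbeat from the node arrived at time 'now'."""
        self.last_seen[node] = now

    def suspected(self, now: float):
        """Return the nodes that have been silent for longer than the timeout."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)


if __name__ == "__main__":
    detector = HeartbeatDetector(timeout=3.0)
    detector.heartbeat("n1", now=0.0)
    detector.heartbeat("n2", now=0.0)
    detector.heartbeat("n1", now=4.0)  # n2 has stopped reporting
    print(detector.suspected(now=5.0))
```

A silent node is only suspected, not known to have failed: a slow network produces the same silence as a crashed node, which is why the timeout has to be chosen with care.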
8. Security
Distributed systems security involves various aspects of cryptography, secure channels, access
control, key management – generation and distribution, authorization, and secure group
management.
9. Applications Programming Interface (API) and transparency
Transparency deals with hiding the implementation policies from the user, and can be
classified as follows
•
Access transparency hides differences in data representation on different systems and
provides uniform operations to access system resources.
•
Location transparency makes the locations of resources transparent to the users.
•
Migration transparency allows relocating resources without changing names.
•
Relocation transparency is the ability to relocate resources while they are being
accessed.
•
Replication transparency does not let the user become aware of any replication.
•
Concurrency transparency deals with masking the concurrent use of shared resources for
the user.
•
Failure transparency refers to the system being reliable and fault-tolerant.
10. Scalability and modularity
•
The algorithms, data (objects), and services must be as distributed as possible.
•
Various techniques such as replication, caching and cache management, and asynchronous
processing help to achieve scalability.
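Caching, one of the techniques named above, keeps recently used data close to the client so that repeated requests do not travel to the server. The sketch below shows a small least-recently-used cache placed in front of a slow fetch function. The eviction policy and the names `LRUCache` and `fetch` are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Caches results of a slow fetch, evicting the least recently used entry."""

    def __init__(self, capacity: int, fetch):
        self.capacity = capacity
        self._fetch = fetch  # stands in for a request to a remote server
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
        return value


if __name__ == "__main__":
    cache = LRUCache(capacity=2, fetch=lambda key: key.upper())
    for key in ("a", "b", "a", "c", "b"):
        cache.get(key)
    print(cache.hits, cache.misses)
```

The sketch ignores cache management issues such as invalidating an entry when the server's copy changes, which is where caching meets the consistency concerns discussed earlier.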