Master/Slave Architecture Pattern
Source: Pattern-Oriented Software Architecture, Vol. 1, Buschmann et al.
Problem
• A system must be fault-tolerant (i.e., it keeps running even when
individual components fail)
– Example: an e-commerce website should continue operating even if a
database server goes down
• A system must produce accurate results with high probability, even when
individual components fail (e.g., safety-critical systems)
– Example: Airplane control software must produce accurate results,
even if a piece of hardware malfunctions
• A system needs to solve computationally intensive problems (e.g.,
problems for which there are no efficient algorithms, or problems with
large data sets)
Solution
[Class diagram: Master (+Service(), -DivideWork(), -CallSlaves(),
-CombineResults()) is associated with Slave (+SubService()); multiplicity is
1 Master to 2..* Slaves]
• A Master component divides work into subtasks
• The Master delegates subtasks to multiple independent Slaves
• Slaves compute results for their subtasks in parallel, and return their
partial results to the Master
• The Master computes the final result based on the partial results
returned by the Slaves
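The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the method names mirror the operations in the class diagram, while the sum-of-squares subtask, the interleaved slicing, and the use of a thread pool are hypothetical choices made for the example, not part of the pattern description.

```python
from concurrent.futures import ThreadPoolExecutor


class Slave:
    """Computes a partial result for one subtask."""

    def sub_service(self, numbers):
        # Hypothetical subtask: sum the squares of one chunk.
        return sum(n * n for n in numbers)


class Master:
    """Divides the work, delegates it to the slaves, and combines the results."""

    def __init__(self, slaves):
        if len(slaves) < 2:
            raise ValueError("the pattern calls for at least two slaves")
        self._slaves = slaves

    def service(self, numbers):
        subtasks = self._divide_work(numbers)
        partials = self._call_slaves(subtasks)
        return self._combine_results(partials)

    def _divide_work(self, numbers):
        # One interleaved slice per slave, so every element lands in exactly one subtask.
        count = len(self._slaves)
        return [numbers[i::count] for i in range(count)]

    def _call_slaves(self, subtasks):
        # Each slave runs in its own worker thread.
        with ThreadPoolExecutor(max_workers=len(self._slaves)) as pool:
            futures = [
                pool.submit(slave.sub_service, subtask)
                for slave, subtask in zip(self._slaves, subtasks)
            ]
            return [f.result() for f in futures]

    def _combine_results(self, partials):
        return sum(partials)


master = Master([Slave(), Slave(), Slave()])
total = master.service(list(range(1, 11)))
print(total)  # 385
```

The client only ever calls service(); how the work is split and recombined stays private to the Master.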
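Dynamic Behavior
[Sequence diagram: the Client calls Service on the Master. The Master runs
DivideWork, then CallSlaves, which invokes SubService on Slave 1 and Slave 2.
Each Slave returns its result, the Master runs CombineResults, and the final
result is returned to the Client.]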
Solution : Fault Tolerance
• Master delegates execution of the service to each slave
• As soon as the first Slave finishes, its result is returned to the client
• Slaves are kept in sync, so if one of them fails, the system keeps
running
• Reliability is achieved through replication
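A minimal sketch of the fault-tolerance variant, assuming the replicas are plain Python callables. The names healthy_replica, failed_replica, and fault_tolerant_service are hypothetical, and the sketch interprets "first Slave terminates" as "first Slave that finishes successfully". Keeping replicas in sync is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def healthy_replica(x):
    return x * 2


def failed_replica(x):
    # Simulates a component that has gone down.
    raise ConnectionError("replica unavailable")


def fault_tolerant_service(replicas, x):
    """Send the same request to every replica; return the first successful result."""
    errors = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica, x) for replica in replicas]
        for future in as_completed(futures):
            try:
                return future.result()
            except Exception as exc:
                # This replica failed; keep waiting for another one.
                errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


answer = fault_tolerant_service([failed_replica, healthy_replica, healthy_replica], 21)
print(answer)  # 42
```

Note that leaving the with block still waits for the remaining replicas to finish. A real system would more likely cancel or ignore them.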
Solution : Computational Accuracy
• Slaves are not identical
• Requires at least three slaves
• Each slave provides a different implementation of the service
• Master delegates execution of the service to each slave
• Master compares results from slaves
• If results agree, return to client
• If results disagree, take appropriate action
– Generate exception
– Have slaves "vote" (return most common result)
[Class diagram: Master (+Service(), -DivideWork(), -CallSlaves(),
-CombineResults()) is associated, 1 to 3..*, with the «interface»
AbstractSlave (+SubService()), which is implemented by ConcreteSlaveA,
ConcreteSlaveB, and ConcreteSlaveC]
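The voting scheme can be sketched as follows. The class names follow the diagram, but the squaring service, the deliberately faulty ConcreteSlaveC, and the strict-majority rule are assumptions made for illustration. The slaves are called one after another here to keep the focus on comparing results.

```python
from abc import ABC, abstractmethod
from collections import Counter


class AbstractSlave(ABC):
    @abstractmethod
    def sub_service(self, x):
        """Return the square of x."""


class ConcreteSlaveA(AbstractSlave):
    def sub_service(self, x):
        return x * x


class ConcreteSlaveB(AbstractSlave):
    def sub_service(self, x):
        return x ** 2


class ConcreteSlaveC(AbstractSlave):
    def sub_service(self, x):
        # Deliberately faulty implementation, to exercise the vote.
        return x + x


class Master:
    def __init__(self, slaves):
        if len(slaves) < 3:
            raise ValueError("voting needs at least three slaves")
        self._slaves = slaves

    def service(self, x):
        results = [slave.sub_service(x) for slave in self._slaves]
        winner, votes = Counter(results).most_common(1)[0]
        # Require a strict majority; otherwise report the disagreement.
        if votes * 2 <= len(results):
            raise RuntimeError(f"no majority among results {results}")
        return winner


master = Master([ConcreteSlaveA(), ConcreteSlaveB(), ConcreteSlaveC()])
print(master.service(5))  # 25
```

With three slaves, one faulty implementation is outvoted. If all three disagree, the Master raises an exception instead of guessing.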
Solution : Parallel Computation
• A complex or very large problem needs to be solved
• Use multiple CPUs/Cores and parallelism to substantially reduce the
time required to solve the problem
– Master divides problem into subtasks
– Slaves solve subtasks in parallel
– Master combines partial results into final problem solution
• Can be much faster than a sequential solution, provided the problem
partitions into independent subtasks
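As one possible illustration of divide, solve, and combine, the sketch below sorts a list by sorting chunks in worker threads and merging them. The function name parallel_sort and the workers parameter are hypothetical. In standard CPython builds, threads do not execute Python bytecode in parallel, so this shows the structure rather than a real speedup; a process pool or a cluster would be the usual substitute for CPU-bound work.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=4):
    """Sort by splitting into chunks, sorting each chunk in a worker, and merging."""
    if not values:
        return []
    # Master divides the problem into roughly equal contiguous chunks.
    size = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Slaves solve the subtasks; pool.map preserves chunk order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # Master combines the partial results with a k-way merge.
    return list(heapq.merge(*sorted_chunks))


data = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0]
print(parallel_sort(data))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```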
Known Uses
• Fault Tolerance
– Hardware and software with "failover" capabilities
– Network routers, Database servers, Web servers, …
• Computational Accuracy
– Any safety-critical system (airplanes, spacecraft, nuclear reactors)
• Parallel Computation
– Parallel sorting algorithms
– Distributed compilation
– Parallel test execution
– Rendering farms (animation companies)
– Factoring large numbers into prime factors
– SETI (Search for Extraterrestrial Intelligence)
• Use Internet-connected computers to analyze radio telescope data
Consequences
• Provides fault tolerance, computational accuracy, and system speedup
• Master/Slave is not always feasible because some tasks cannot be
partitioned
• Defining an AbstractSlave interface provides flexibility to add or
exchange slaves without affecting the Master implementation
• Hard to implement, mainly because of the concurrency and coordination
involved
• Implementation will not be very portable if it strongly depends on the
particular hardware configuration being used (e.g., optimization)
Implementation
• Implementation can be relatively complex
• Master/Slave systems are highly concurrent since we want slaves to work in parallel
• Slaves are implemented as separate threads or processes
• Masters and Slaves may run on one multi-processor machine, or they may be distributed
across several computers (e.g., a cluster)
• How will data be transferred between Master and Slave? Does each slave need its own
copy of the data, or can they share?
– Shared memory
– Shared database
– Over a network
• There are tools and libraries available that make building parallel, distributed systems
easier
– PVM (Parallel Virtual Machine) – library for implementing parallel algorithms
– MPI (Message Passing Interface) – standard API for implementing parallel algorithms
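To make the data-transfer question concrete, here is a small sketch in which slaves run as threads and exchange messages with the master through queues, each slave receiving its own copy of its chunk. It uses only the Python standard library rather than PVM or MPI. The names master, slave, and STOP, and the task-id message format, are assumptions made for this example.

```python
import queue
import threading

STOP = None  # sentinel telling a slave to shut down


def slave(tasks, results):
    """Pull subtasks from the task queue until the stop sentinel arrives."""
    while True:
        item = tasks.get()
        if item is STOP:
            break
        task_id, numbers = item
        results.put((task_id, sum(numbers)))


def master(chunks, slave_count=3):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=slave, args=(tasks, results))
        for _ in range(slave_count)
    ]
    for t in threads:
        t.start()
    # Each subtask carries its own copy of the data, tagged with a task id.
    for task_id, chunk in enumerate(chunks):
        tasks.put((task_id, list(chunk)))
    # One stop sentinel per slave, queued after all the real work.
    for _ in threads:
        tasks.put(STOP)
    for t in threads:
        t.join()
    # Results can arrive in any order, so restore task order by id.
    partials = dict(results.get() for _ in chunks)
    ordered = [partials[i] for i in range(len(chunks))]
    return ordered, sum(ordered)


print(master([[1, 2, 3], [4, 5], [6]]))  # ([6, 9, 6], 21)
```

Copying each chunk avoids shared mutable state at the cost of extra memory. With shared memory the slaves would instead read from one common structure and the master would pass only index ranges.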