Course Overview Slides

Distributed Systems
(15-440)
Mohammad Hammoud
December 4th, 2013
Course Objectives
The course aims to provide an in-depth, hands-on understanding of:
– The principles on which distributed systems are based
– The principles on which distributed systems are optimized
– Distributed system programming models and analytics engines
– How modern distributed systems meet the demands of contemporary distributed applications
List of Topics
Considered: a reasonably critical and comprehensive understanding.
Thoughtful: a fluent, flexible and efficient understanding.
Masterful: a powerful and illuminating understanding.
1. Architectures and Communications
2. Naming
3. Synchronization
4. Consistency and Replication
5. Fault Tolerance
6. Programming Models
7. Distributed File Systems
8. Virtualization
Course Content
▪ Course Overview and Introduction (2 Lectures):
– Why distributed systems?
– Defining distributed systems
– Course overview and intended learning outcomes
– Trends in distributed systems
• High performance platforms
• Mobile and ubiquitous computing
• Cloud computing
• Etc.
– Challenges in designing distributed systems
• Heterogeneity, openness, security, scalability, reliability, concurrency, transparency and quality of service
▪ Architectural Models (1 Lecture):
– Client-server, peer-to-peer, tiered and layered architectures
▪ Networking (1 Lecture):
– Types of networks
– Networking principles:
• Packet transmission
• Network layers (physical, data-link, network and transport layers)
• Congestion control
▪ Communication Paradigms (1 Lecture):
– Socket communication
• TCP and UDP sockets
– Remote invocation
• RPC and RMI
– Indirect communication
• Message-queuing, publish-subscribe, and group communication systems
▪ Naming (2 Lectures):
– Flat naming
• Broadcasting, forwarding pointers, home-based naming, and distributed hash tables
– Structured naming
• Hierarchical name spaces, name resolution, linking and mounting
– Attribute-based naming
• LDAP and RDF
▪ Synchronization (3 Lectures):
– Time synchronization
• Physical clocks (UTC, Cristian and Berkeley algorithms, and Network Time Protocol)
• Logical clocks (Lamport and vector clocks)
– Distributed mutual exclusion
• Permission-based
• Token-based
– Election algorithms
• Bully and Ring algorithms
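As a rough illustration of the logical clocks listed above, the following is a minimal sketch of a Lamport clock in Python. The class and method names (`LamportClock`, `tick`, `send`, `receive`) are hypothetical and chosen only for this example; they do not come from the course material.

```python
class LamportClock:
    """Logical clock for a single process, following Lamport's two rules."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment the counter before each local event.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Rule 2: move past the sender's timestamp, then count the receive event.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two hypothetical processes, p and q, exchanging one message.
p, q = LamportClock(), LamportClock()
p.tick()                     # local event at p
stamp = p.send()             # p sends a message carrying its timestamp
q.tick()                     # unrelated local event at q
received = q.receive(stamp)  # q's clock jumps ahead of the send timestamp
```

The receive timestamp always exceeds the send timestamp, so the ordering respects causality. The converse does not hold: a smaller timestamp does not imply that one event happened before another, which is the gap vector clocks address.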
▪ Consistency and Replication (3 Lectures):
– Data-centric consistency models:
• Continuous, sequential and causal models
– Client-centric consistency models:
• Eventual consistency and client consistency guarantees
– Replica management:
• Server and content replication and placement strategies
– Consistency protocols:
• Primary-based, replicated-write and cache coherence protocols
▪ Distributed Programming Models (4 Lectures):
– Classical programming models
• Shared-memory and message-passing models
– MPI Library
• Point-to-point and group communication routines
– Hadoop MapReduce, Google’s Pregel and CMU’s GraphLab
• The parallelism models
• The programming models
• The architectural models
• The computational models
• Task/Vertex/Job scheduling
• Distributed application suitability
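To give a feel for the MapReduce programming model named above, here is a small single-process sketch of a word count in plain Python. It does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions made for this illustration, and a real engine would run the phases in parallel across machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit one (word, 1) pair per word in the input split.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework would do
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["a b a", "b c"])
```

The programmer supplies only the map and reduce functions; partitioning, grouping and scheduling are left to the framework, which is what makes the model suit data-parallel batch jobs.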
▪ Fault-Tolerance (3 Lectures):
– Failure models
• Crash, omission, timing, response and Byzantine models
– Process resilience and agreement protocols
• Lamport’s agreement protocol
– Reliable communication
• Request-reply reliable communication (Request-reply call semantics)
• Group reliable communication (Virtual synchrony and atomic multicasting)
– Recovery (Checkpointing and message-logging)
▪ Distributed File Systems (2 Lectures):
– DFS Aspects:
• Architectures (Client-server, cluster-based, and symmetric architectures)
• Processes (Stateless vs. stateful processes)
• Communication (RPC2)
• Naming (Constructing global name spaces)
• Synchronization (Semantics of file sharing)
• Consistency and replication (Client-side caching, server-side replication and versioning)
• Fault-tolerance (Quorum-based mechanisms)
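The quorum-based mechanisms mentioned above can be sketched with the usual voting constraints for N replicas, a read quorum R and a write quorum W: R + W > N and W > N/2. The Python below is a minimal sketch under that assumption; the function names and the `(version, value)` reply format are hypothetical and not taken from any particular file system.

```python
def is_valid_quorum(n, r, w):
    """Check the two standard voting constraints for n replicas.

    r + w > n  : every read quorum overlaps every write quorum,
                 so a read always sees at least one latest copy.
    2 * w > n  : any two write quorums overlap, so two conflicting
                 writes cannot both succeed.
    """
    return r + w > n and 2 * w > n


def read_latest(replies, r):
    """Pick the newest value from (version, value) replies of a read quorum."""
    if len(replies) < r:
        raise ValueError("read quorum not reached")
    return max(replies, key=lambda reply: reply[0])


# Hypothetical configuration: 5 replicas, read quorum 2, write quorum 4.
ok = is_valid_quorum(5, 2, 4)
latest = read_latest([(3, "old"), (4, "new")], r=2)
```

Choosing a small R makes reads cheap at the cost of larger write quorums, and the reverse also holds; the constraints only guarantee that the sets overlap.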
▪ Virtualization (1 Lecture):
– Why Virtualization in distributed systems?
– Virtualization types
• Full virtualization vs. para-virtualization
– Virtual machine types
• Process VMs vs. system VMs