Reliable IP Multicast:
status and selected topics
A CSE620 Presentation
YE PU & YAN SUN
Overview
• Introduction
• Reliable Multicast Protocols
• Case Studies
• Multicast congestion control
• Routing for Multicast
• The MBone and the Internet2
• Summary
5/28/2016 5:01:12 PM
Reliable Multicast
Introduction
Why Multicast?
– In many emerging applications, one sender transmits to a group of receivers simultaneously
Why Reliable?
– Audio/video applications do not require reliability
– Many other exciting applications do, e.g. remote whiteboard (WB), collaborative VR, data dissemination
[Figure: unicasting compared with multicasting]
Reliable Multicast: Basic Questions
• What is the "right" definition of reliable multicast?
• Is there a baseline (e.g., reliable delivery of all data)?
• Should ordering/causality be part of the networking semantics of reliable multicast?
• Where should the line be drawn between network- and application-level functionality?
Design Approaches
• How important is scalability (large number of participants)?
• Are there fundamental differences from one setting to another (1-many vs many-many) that require different approaches?
• Are separate designs, each optimized for a different scenario, the way to go?
• Can one protocol (or protocol framework) fit all requirements? Will n protocols (or frameworks) fit k (n<k) scenarios?
Framework
• Is there value (and if so, what is it) in developing a common framework (a la RTP) in which various reliable multicast protocols can be built?
• What should that framework look like?
• In terms of IETF, is there any part (which one) that should be standardized?
-- ACM SIGCOMM Multicast Workshop, Stanford, August 27, 1996
Reliability Mechanism: Who’s Responsible?
Sender Initiated
• Sender is responsible for packet loss detection
• Based on positive acknowledgements (ACKs)
• ACK implosion in large-scale multicast; poor scalability
Receiver Initiated
• Receiver is responsible for packet loss detection
• Based on negative acknowledgements (NACKs)
• Alleviates ACK implosion; better performance
• Potential for NACK implosion
[Figure labels: ACK implosion; NACK trigger; NACK implosion]
Loss Recovery: What did ja say?
Loss Recovery: Detection and retransmission of lost packets
Global Recovery
• Repairs are multicast to the entire group
• Efficient where loss is concentrated at the backbone gateway
Local Recovery
• Try to recover from packet loss without going all the way to the source
• Responses are multicast within a scope just large enough to reach each affected receiver
Forward Error Correction (FEC)
• Retransmit error-correcting codes instead of original packet data
• A single repair packet can repair different packet losses simultaneously
Feedback Control
Feedback Control: Mechanism that restricts the amount of feedback generated by the multicast group
Structure Based
Rely on a designated receiver (DR) to process and filter feedback traffic
Timer Based
Delay each retransmission request by a random interval, uniformly distributed between zero and the one-way trip time to the source
Multicast Protocols by 1997
[Table: comparison of multicast protocols. Columns: Protocol; Data Propagation; Reliability Mechanism; Repair Request; Retransmission; Flow Control; Locus of Control; Feedback Control; Ordering; Group Management; Target Application. Protocols: RBP, MTP, MTP-2, RMP, XTP, URGC, RTP, SRM, LBRM, RAMP, TRM, Muse, MDP, AFDP, TMTP, RMTP, MFTP, STORM]
Case Study: SRM
SRM: Scalable Reliable Multicast
• Originally designed for wb (shared whiteboard tool)
• Currently operational over the MBone
• Receiver-reliable, NACK-based
• Any receiver can multicast a NACK or repair packet
SRM Loss Recovery Principle
Data-driven Recovery
(sequence gap detection)
Control-driven Recovery
(session message sequence)
• Source assigns a unique sequence number to each packet
• A NACK is generated when missing data is detected
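The two detection paths above can be sketched as follows. This is a minimal illustration, assuming per-source sequence numbers that start at 0 and increase by 1; the class and method names (`LossDetector`, `on_data`, `on_session_report`) are hypothetical and not taken from SRM.

```python
class LossDetector:
    """Tracks one source's sequence numbers and reports missing packets."""

    def __init__(self):
        self.highest_seen = -1   # highest sequence number known to exist
        self.received = set()    # sequence numbers actually received

    def _advance(self, seq):
        # Everything between the old high-water mark and seq that we do
        # not hold is missing and is a candidate for a NACK.
        missing = [s for s in range(self.highest_seen + 1, seq + 1)
                   if s not in self.received]
        self.highest_seen = max(self.highest_seen, seq)
        return missing

    def on_data(self, seq):
        """Data-driven: a gap shows up when a later data packet arrives."""
        self.received.add(seq)
        return self._advance(seq)

    def on_session_report(self, highest_seq):
        """Control-driven: a session message reveals the source's latest
        sequence number, exposing a lost tail of a burst."""
        return self._advance(highest_seq)
```

Data-driven detection alone cannot notice a lost final packet, because no later packet arrives to expose the gap; the session-report path covers that case.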
SRM: Session Messages
• Each member multicasts periodic session messages that report the sequence number state for active sources
• Session messages let receivers detect the loss of the last packet in a burst
• Members also use session messages to determine the current participants of the session
• Average session message bandwidth: 5% of data bandwidth
SRM NACK Suppression
• A NACK is multicast to the entire group
• Other receivers missing the same data hear it and suppress their own NACKs
• Simultaneous detection of packet loss: each receiver waits a random delay, and the receiver with the smallest delay wins
SRM: Loss Recovery Algorithm
Loss Detection
1. Set the backoff parameter b = 1
2. Upon missing data D from host S, choose a random delay t on 2^b [C1 d(S), (C1+C2) d(S)], where d(S) is the estimated one-way delay to S
3. Schedule a request packet, REQ_D, for transmission in t seconds
4. If REQ_D is received from some other host before t seconds, set b = b + 1 and restart the request timer
5. Otherwise, if the data D or the repair reply REP_D is received before t seconds, cancel REQ_D
6. Otherwise, send REQ_D after t seconds
Retransmission
1. Upon receipt of REQ_D from host A, if D is locally available, choose a random delay t on [C1 d(A), (C1+C2) d(A)]
2. Schedule the repair packet REP_D for transmission in t seconds
3. If REP_D is received before t seconds, cancel the repair timer
4. Otherwise, send REP_D after t seconds
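The timer logic in the steps above can be sketched as follows. This is only an illustration: the default values of C1 and C2, the `RequestTimer` class, and the explicit `now` arguments are assumptions made for the example, not values or interfaces given in the slides.

```python
import random


def srm_delay(d, b=0, c1=2.0, c2=2.0, rng=random):
    """Draw a delay uniformly from 2^b * [C1*d, (C1+C2)*d].

    d is the estimated one-way delay to the relevant host; b=0 gives
    the un-backed-off interval used for repairs.
    """
    scale = 2 ** b
    return rng.uniform(scale * c1 * d, scale * (c1 + c2) * d)


class RequestTimer:
    """Request timer for one missing data item D from source S."""

    def __init__(self, d_s, now=0.0):
        self.b = 1                      # step 1: backoff parameter
        self.d_s = d_s
        self.active = True
        self.expires = now + srm_delay(d_s, self.b)   # steps 2-3

    def on_request_heard(self, now):
        # Step 4: another host asked first, so back off and restart.
        self.b += 1
        self.expires = now + srm_delay(self.d_s, self.b)

    def on_data_or_repair(self):
        # Step 5: the data or a repair arrived, so cancel the request.
        self.active = False

    def should_send(self, now):
        # Step 6: send REQ_D once the timer expires uncancelled.
        return self.active and now >= self.expires
```

Scaling the interval by the distance estimate means hosts nearer the source tend to fire first, and the random spread lets one request suppress the rest.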
Case Study: RMTP
RMTP: Reliable Multicast Transport Protocol
• Designed for file dissemination (single sender)
• Deployed in AT&T’s billing network
• Based on a hierarchical structure
• A special Designated Receiver (DR) is responsible for sending ACKs to the sender
RMTP: Network Topology
• Receivers are grouped into local regions
• Source multicasts packets to receivers
• Receivers periodically unicast ACKs to their AP/DR
• DR provides local repair if the data is available
• Each DR unicasts its own ACK to its parent, the next DR in the hierarchy, which consolidates ACK traffic
• Source determines retransmissions based on the status sent by the DRs
RMTP: ACK Processing & Retransmission
[Figure: a sender's send window within the send sequence space (labels: send window, avail_win, swin_lb, send_next, packets sent but not yet acknowledged) and a receiver's receive window]
RMTP: Formation of Local Region
• RMTP assumes there is some information about the approximate location of receivers
• Some receivers and servers are chosen as DRs
• Each DR periodically sends a special packet, SEND_ACK_TOME, in which the TTL field is set to a predetermined value (say 64)
• Each receiver chooses the DR whose SEND_ACK_TOME arrives with the largest TTL value, i.e. the DR fewest hops away
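The selection rule in the last bullet can be sketched in a few lines. The function name `choose_dr` and the dictionary layout are hypothetical, and the sketch assumes every DR starts from the same initial TTL.

```python
def choose_dr(heard):
    """Pick a designated receiver from the SEND_ACK_TOME packets heard.

    heard maps a DR identifier to the TTL left in its packet on arrival.
    With a common initial TTL, the largest remaining TTL marks the DR
    that is the fewest router hops away.
    """
    if not heard:
        return None
    return max(heard, key=heard.get)
```

For example, `choose_dr({"dr-a": 58, "dr-b": 61, "dr-c": 49})` selects `"dr-b"`, whose packet crossed only three hops if the initial TTL was 64.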
Case Study: PGM
PGM: Pragmatic General Multicast
• Router-assisted to provide scaling
• Provides no notion of group membership
• NACK-based, with suppression
PGM: Data Packet Types
ODATA: original content data
NACK: selective negative acknowledgement
NCF: NACK confirmation
RDATA: retransmission data (repair)
SPM: source path message
TSI: each PGM packet contains a Transport Session Identifier (TSI) that identifies the session and the source of the data
PGM: NACK/NCF Dialogue
• After a random delay, a NACK is unicast hop by hop from router to router upstream towards the source
• A PGM-aware router keeps forwarding the NACK until it sees an NCF or RDATA
• Only one NACK is forwarded for every packet loss
• Source multicasts NCFs to the whole group to provide NACK reliability
PGM: Source Path Message
• SPMs are multicast downstream, interleaved with ODATA
• PGM-aware routers use SPMs to determine the unicast path for forwarding NACKs
• Receivers use SPMs to determine the last PGM-aware router, to which they forward NACKs
PGM: Retransmission
Sender
• Retransmits immediately after getting a NACK
Router
• Maintains retransmission state for every interface on which a NACK was received
• Forwards a retransmission only on the interfaces on which a NACK for it was received
PGM-aware Router Features
• Routers intercept SPMs and use them to establish source path state for the corresponding source and group
• Routers forward only the first copy of any NACK they receive to the upstream PGM-aware router, to constrain NACK forwarding
• Routers discard exact duplicates of any NACK for which they already have repair state
• Routers use NACKs to maintain repair state consisting of a list of interfaces upon which a given NACK was received, and return the RDATA only on those interfaces
• Routers can also optionally redirect NACKs to a designated local retransmitter (DLR) rather than the source
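The NACK elimination and repair-state behaviour listed above can be sketched as a small data structure. This is an illustration only: the class `PgmRouterState`, its method names, and the use of plain strings for the TSI and interfaces are assumptions, and real PGM routers also handle NCFs, timers, and state expiry, which are omitted here.

```python
from collections import defaultdict


class PgmRouterState:
    """Per-router repair state keyed by (TSI, sequence number)."""

    def __init__(self):
        # (tsi, seq) -> set of downstream interfaces that sent a NACK
        self.repair_state = defaultdict(set)

    def on_nack(self, tsi, seq, iface):
        """Record the NACK; return True only if it should go upstream."""
        key = (tsi, seq)
        first = key not in self.repair_state
        self.repair_state[key].add(iface)
        return first   # later copies are absorbed by the router

    def on_rdata(self, tsi, seq):
        """Return the interfaces to send RDATA on, then drop the state."""
        return sorted(self.repair_state.pop((tsi, seq), set()))
```

Only the first NACK for a given loss travels upstream, and the repair comes back only toward the parts of the tree that asked for it.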
Congestion Control
Why Congestion Control?
• Available bandwidth must be shared fairly among multiple best-effort flows over a shared link
TCP Congestion Control
• Multiplicative decrease at the indication of congestion
• Linear increase when there is no congestion
• Encourages fair sharing of bandwidth
• No safeguard against aggressive flows (end-to-end feedback controlled)
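The increase/decrease rule above, and why it encourages fair sharing, can be sketched as follows. This is a simplified model rather than TCP itself: the window is counted in packets, both flows are assumed to see every congestion signal at the same time, and the names `aimd_step` and `share_link` are hypothetical.

```python
def aimd_step(cwnd, congested, increase=1.0, decrease=0.5, floor=1.0):
    """One additive-increase / multiplicative-decrease window update."""
    if congested:
        # Multiplicative decrease, never dropping below the floor.
        return max(floor, cwnd * decrease)
    # Linear (additive) increase while there is no congestion.
    return cwnd + increase


def share_link(w1, w2, capacity, rounds):
    """Run two AIMD flows over one link for a number of rounds.

    Both flows see congestion whenever their combined window exceeds
    the link capacity.
    """
    for _ in range(rounds):
        congested = (w1 + w2) > capacity
        w1 = aimd_step(w1, congested)
        w2 = aimd_step(w2, congested)
    return w1, w2
```

Additive increase leaves the gap between the two windows unchanged, while each multiplicative decrease halves it, so two flows that start far apart drift toward equal shares.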
Multicast without CC
• Non-TCP-compatible flows can lock out competing TCP flows
• Simultaneous congestion collapses
• Need an end-to-end, feedback-based, TCP-compatible congestion control mechanism
Control Metrics
Fairness - How the algorithm shares bandwidth with other connections, and how it discriminates against connections of different lengths. This is the closest thing to the "performance" of a connection
Safety - How wide a range of operating conditions the algorithm can support without causing the network to go into an unstable operating range
Responsiveness - How fast the algorithm adapts to changes in the network load
Variability (or accuracy) - How consistent the performance of the algorithm is in a given environment, i.e. what is the variance in throughput?
Scalability - How do these metrics scale in the face of large groups?
Control Approaches
Window-based: “Slow start” TCP-style sliding
window algorithm
Rate-adaptive: Adjust transmission rate upon
receipt of NACKs
Forward Error Correction (FEC): Rarely used
due to encoding/decoding overhead
MTCP: Hierarchical Congestion Control
• Internal tree nodes act as sender's agents (SAs)
• Receivers send feedback to their SAs
• SAs send a summary of the congestion level of their children to their parents
[Figure: Hierarchical congestion reports. A tree rooted at the Source with multicast routers MR 1 to MR 10, SAs at internal nodes, and group members at the leaves; group members send ACKs, and SAs send ACKs and summaries up the tree]
MTCP: Hierarchical CC (cont’d)
Window Based Control
• Sender controls its rate based on the summaries it receives
Congestion Window Adjustment (when CWND goes down)
• RTD timeout
• Fast retransmission (in conjunction with selective acknowledgment)
• Three NACKs for the same packet reduce the window (note that not every loss causes CWND to go down by half)
• Based on the TCP Vegas scheme (i.e., a long RTT causes it to go down)
Forward-error Correction Coding (FEC)
• "Simultaneous repair": utilize (n,k) block codes
• Packet stream is grouped into platoons of n packets each
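To make "simultaneous repair" concrete, here is a minimal sketch using the simplest possible block code: a single XOR parity packet over k equal-length data packets (that is, n = k + 1). This is a stand-in chosen for brevity, not the code the slides have in mind; practical schemes use stronger erasure codes that tolerate more than one loss per block. The function names are hypothetical.

```python
def xor_parity(packets):
    """Return the bytewise XOR of a list of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover(received, parity):
    """Rebuild the one missing packet in a block.

    received is the list of k data packets with exactly one entry set
    to None for the packet that was lost.
    """
    present = [p for p in received if p is not None]
    # XOR of the parity with all surviving packets yields the lost one.
    return xor_parity(present + [parity])
```

The same parity packet repairs whichever single packet a receiver lost, so one multicast repair serves receivers with different losses.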
FEC/ARQ
[Figure: a Sender and two Receivers]
• On detected loss, the receiver NACKs the platoon rather than the individual packet
• If each receiver indicates the number m of packets lost from that platoon, then the responder can merely send m of k parity packets
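With several receivers reporting, the responder's choice of how many parity packets to send can be sketched as below. The sketch assumes an erasure code in which any parity packet can stand in for any lost data packet of the platoon, so the largest reported loss count is enough for everyone; the function name and arguments are hypothetical.

```python
def parities_to_send(nacks, parities_available):
    """Number of parity packets to multicast for one platoon.

    nacks maps each reporting receiver to m, the number of packets it
    lost from the platoon. The result is capped by how many parity
    packets exist for the platoon.
    """
    if not nacks:
        return 0
    return min(max(nacks.values()), parities_available)
```

For example, if three receivers report losses of 1, 3, and 2 packets, three parity packets cover all of them.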
Proactive FEC/ARQ
[Figure: a Sender and two Receivers]
Proactive: Send some repairs before loss
Proactive factor: r
• Sender sends round(rk) packets
• Receivers NACK to get additional repairs
Multicast Routing
• Requires a significant amount of state and complexity in routers (at least per-group state information and often even per-source information) => very slow deployment and use by Internet standards
• Dense Mode: Sender broadcasts traffic and triggers prune messages (DVMRP, PIM-DM)
• Sparse Mode: Group members explicitly send join messages (MOSPF, CBT, PIM-SM)
Advantage
– Less routing state to keep (only routers on the multicast path keep state)
– Explicit join: multicast traffic flows only across links leading to identified receivers
Disadvantage
– Single point of failure at the rendezvous point (RP)
– Hot spot of multicast traffic at the RP and non-optimal paths on the multicast tree
Multicast Routing in Early MBone
[Figure: two multicast applications (each a sender or receiver) connected through multicast routers MR3 and MR4 and non-multicast routers R1 and R2. Caption: MBone on non-multicast capable Internet]
1. MR3 and MR4, running the multicast router daemon (mrouted), support IGMP. mrouted encapsulates multicast datagrams in unicast datagrams to send, and decapsulates multicast datagrams from the unicast datagrams it receives
2. R1 and R2 are non-multicast-enabled routers. They forward unicast-encapsulated multicast packets just like any other unicast datagram
DVMRP
• First protocol developed to support multicast routing
• Tree is constructed on demand using a “broadcast and prune” technique
• Reverse Path Forwarding (RPF) ensures no loops in the tree and only shortest paths included
• RPF uses unicast routing table
• Does not scale to support multicast groups that are sparsely distributed over a large network
[Figure: DVMRP example. Source on the local subnet of MR 1; multicast routers MR 1 to MR 8 arranged by hop count (1 to 4) from the source; four group members]
1. the message reaches router 1
2. the message reaches routers 2,3, and 4
3. routers 3 and 4 exchange messages. Each one just drops the message, because
it didn’t arrive over the interface that gives the shortest path back to the source
4. the message reaches router 7. Router 7 realizes it is a leaf router and there are
no group members on its subnet, so it sends a prune message back to router 6,
the upstream router. Router 6, in turn, sends a prune message to router 4. Router
3 also sends a prune message to router 1
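The drop in step 3 and the pruning in step 4 can be sketched as a per-router forwarding decision. This is a simplified illustration, not DVMRP itself: the routing table is a plain dictionary from source address to outgoing interface, and all names are hypothetical.

```python
def rpf_accept(route_to_source, source, arrival_iface):
    """Reverse-path check: accept a packet only if it arrived on the
    interface this router would use to send unicast traffic back to
    the source."""
    return route_to_source.get(source) == arrival_iface


def forward(ifaces, route_to_source, pruned, source, arrival_iface):
    """Return the interfaces a multicast packet is forwarded on."""
    if not rpf_accept(route_to_source, source, arrival_iface):
        return []   # off the shortest reverse path: drop the packet
    # Flood everywhere except the arrival interface and any interface
    # from which a prune has been received.
    return [i for i in ifaces if i != arrival_iface and i not in pruned]
```

A copy arriving over a sideways link fails the check and is dropped, which is what keeps the broadcast phase loop-free.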
MOSPF
• Intended for use within a single routing domain
• Dependent on the use of OSPF
• Tree is only calculated when a router receives the first datagram in a stream
• All routers calculate exactly the same tree
• Does not scale well due to periodic flooding of group membership reports
[Figure: MOSPF example with multicast routers MR 1 to MR 9 and four group members]
1. MR 1 computes the tree - it knows the members of the group via IGMP and hence knows the path to MR 4 is via MR 2, the path to MR 8 is via MR 5, etc.
2. MR 2 computes the tree - determines the path to MR 4 is direct and the path to MR 8 is via MR 5; MR 3 computes the tree - determines the path to MR 9 is direct
3. MR 5 computes the tree - determines the path to MR 8 is direct
Note that the multicast transmission triggers this process (i.e. it is data driven), and each router, when it receives a message, calculates exactly the same distribution tree as its predecessors and uses it to forward the message.
Core Based Tree (CBT)
• A single tree that is shared by all members of the group. Multicast traffic for the entire group is sent and received over the same tree, regardless of the source
• Significant savings in terms of the amount of multicast state information that is stored in individual routers
• Concentration of traffic around the core
• Load balancing might be achieved by using more than one core
PIM-SM
• Initial group-shared tree construction similar to CBT
• Supports both group-shared tree and shortest-path tree
• Relies on unicast routing tables to adapt to network topology changes
• Independent of the particular unicast routing protocol
[Figure: PIM-SM example with eight multicast routers (MR)]
1. The sender at Source 2 registers at the Rendezvous Point multicast router, RPt
2. A receiver joins at RPt; there is now a bigger shared tree
3. The receiver is receiving lots of data from Source 2. The receiver sends an explicit join to Source 2 to construct a shortest-path route
Interdomain Multicast Routing
Near-term Solution - PIM-SM/MBGP/MSDP:
• Multicast Border Gateway Protocol (MBGP): provides for multicast the route aggregation and abstraction, as well as the hop-by-hop policy routing, that the Border Gateway Protocol (BGP) provides for unicast
• Multicast Source Discovery Protocol (MSDP): works by having representatives in each domain announce to other domains the existence of active sources. MSDP is run in the same router as a domain's RP (or one of the RPs)
Long-term Solution - BGMP/MAAA:
• Border Gateway Multicast Protocol (BGMP): first proposed as a long-term solution to Internet-wide, inter-domain multicast
• Multicast Address Allocation Architecture (MAAA): consists of the Multicast Address-Set Claim (MASC) protocol (domain level), the Address Allocation Protocol (AAP) (within a domain), and the Multicast Address Dynamic Client Allocation Protocol (MADCAP) (for requesting addresses from a Multicast Address Allocation Server (MAAS))
Alternative Solution - Root Addressed Multicast Architecture (RAMA)
The MBone
• A virtual network layered on top of the physical Internet to support routing of IP multicast packets
• Initially a test bed for multicast
• Extensively exploits tunnels
• Routing mainly with DVMRP
“MBONE is truly the start of mass-communication that may supplant television. Used well, it could become an important component of mass communication.” -- John December
The Internet2
Internet2 is a collaboration among more than 100 U.S. universities to develop networking and advanced applications for learning and research.
Multicast goal: design and implement a deployment strategy that provides a consistent and ubiquitous multicast service within the Internet2 community.
Internet2 Multicast-Peering Sites: Abilene, vBNS, NREN, DREN, ESnet, CANARIE, TEN-155/34 (DANTE), NORDUnet, SURFnet, APAN
Abilene is an advanced backbone network that connects regional network aggregation points, called gigaPoPs, to support the work of Internet2 universities as they develop advanced Internet applications.
vBNS maintains a native IP multicast service via a PIM sparse-dense mode configuration among all vBNS Cisco routers. MBGP routing is used internally in combination with an MBGP default route representing MBone sources. vBNS belongs to MCI WorldCom.
The Internet2 (cont’d)
For Internet2, the plan has always been to do multicast “the right way” insofar as is possible given the currently available set of protocols. As a result, the multicast deployment plan follows guidelines set forth by the Internet2 Multicast Working Group.
Guidelines
• All multicast deployed in Internet2 must be native and sparse mode
• No tunnels are allowed
• All routers must support inter-domain multicast routing using MBGP/MSDP
[Figure: Multicast on Abilene Network]
[Figure: Multicast on vBNS]
Summary
• IP Multicast is emerging as an increasingly important topic for the future Internet
• Achieving reliability: ACKs vs NACKs, Local Recovery, FEC, …
• Reliable multicast protocols: SRM, RMTP, and PGM
• Multicast congestion control
• Routing in multicast: DVMRP, MOSPF, CBT, PIM-SM
• Interdomain multicast and multicast deployment on the MBone and the Internet2