Yoram Ofek
Synchrodyne Networks, Inc.
E-mail: ofek@synchrodyne.com
Phone: (917) 601-7180
© 2002 Yoram Ofek May 2002 1
Person-to-Person Communications: Typically with Rate
Machine-to-Machine Communications: Typically No Rate
Applications with some notion of rate:
Most demanding: interactive - streaming media - voice/video
end-to-end delay < 100 ms
continuous play - i.e., periodic
Will also satisfy non-interactive applications: playback, large file transfers (Machine-to-Person)
With the transition from circuit to packet switching, the rate per person will increase by 3-4 orders of magnitude:
from 10^4 b/s to 10^8 b/s
(Computing) Machines are still evolving rapidly
e.g., capabilities: “Moore’s Law” - new applications
General characteristic: bursty - unpredictable in time/space
All bits should be transferred correctly with no “shaping”:
max. throughput (burst) AND min. delay & loss
e.g., distributed/parallel computing, data processing
[Figure: traffic shape at the source vs. traffic shape at the destination, both over time t; transfer with minimum distortions, no “shaping”]
How to support the two generic traffic types:
1. Ring networks
Machine-to-Machine
Person-to-Person
2. Convergence routing
Machine-to-Machine
3. Time-driven - switched networks
Person-to-Person
Integration of Machine-to-Machine using UTC
4. Dynamic optical networking
Person-to-Person
Integration of Machine-to-Machine using UTC
Token rings were introduced first (e.g., IBM, FDDI)
Why token rings?
Can support:
1) Bursty data (asynchronous) with no rate, (no) loss, (low) latency, fairness, multicast (Machine-to-Machine)
2) Periodic real-time with rate and delay guarantees, multicast (Person-to-Person)
Packets are removed at destinations: slotted or insertion ring
Concurrent transmission
Throughput grows with locality
all nodes can transmit simultaneously to their neighbors
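The claim that throughput grows with locality can be illustrated with a small calculation. This is a minimal sketch, assuming a unidirectional ring with destination removal in which every node sends to the node a fixed number of hops downstream and all flows share each link equally; the function names are hypothetical.

```python
def ring_link_loads(num_nodes, hops):
    """Number of flows crossing each link when every node sends to the
    node `hops` positions downstream (packets are removed at the destination)."""
    loads = [0] * num_nodes
    for src in range(num_nodes):
        for k in range(hops):
            # Link i connects node i to node (i + 1) % num_nodes.
            loads[(src + k) % num_nodes] += 1
    return loads


def aggregate_throughput(num_nodes, hops):
    """Total throughput in units of one link's capacity, assuming each
    flow gets an equal share of the most loaded link (an assumption)."""
    return num_nodes / max(ring_link_loads(num_nodes, hops))


if __name__ == "__main__":
    for hops in (1, 3, 6):
        print(hops, aggregate_throughput(12, hops))
```

With 12 nodes, sending to the immediate neighbor lets all 12 links carry traffic at full rate, while sending halfway around the ring leaves an aggregate of only two link capacities.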
[Figure: ring of 12 nodes, numbered 1 to 12]
MetaRing: Fairness with Spatial Bandwidth Reuse
SAT (token) gives predefined transmission quota
Rotates in the opposite direction
Held intermittently if the node is not SATisfied
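The SAT rule on this slide can be sketched per node as follows. This is a simplified illustration, not the published MetaRing algorithm: it assumes a single hypothetical quota value per node and treats a node as SATisfied when it either has nothing to send or has used up its quota since the last SAT visit.

```python
from dataclasses import dataclass


@dataclass
class Node:
    quota: int        # packets allowed between SAT visits (hypothetical parameter)
    backlog: int = 0  # packets waiting to be transmitted
    sent: int = 0     # packets sent since the last SAT visit

    def send(self) -> bool:
        """Transmit one packet if the quota allows it."""
        if self.backlog > 0 and self.sent < self.quota:
            self.backlog -= 1
            self.sent += 1
            return True
        return False

    def satisfied(self) -> bool:
        """SATisfied: nothing left to send, or the quota is used up."""
        return self.backlog == 0 or self.sent >= self.quota

    def on_sat(self) -> bool:
        """Handle SAT arrival: forward it (True) or hold it (False)."""
        if not self.satisfied():
            return False  # SAT is held until the node becomes SATisfied
        self.sent = 0     # forwarding SAT renews the quota
        return True


if __name__ == "__main__":
    node = Node(quota=2, backlog=3)
    node.send()
    print(node.on_sat())  # held: quota not used up and packets still waiting
    node.send()
    print(node.on_sat())  # forwarded: quota used up
```

Holding the SAT only while a node is starved is what bounds how far ahead any other node can get during one rotation.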
[Figure: slotted or insertion ring with Nodes 1 to 6, each with an insertion buffer (IB); the SAT signal circulates around the ring]
Equal throughput after each SAT rotation - with multiple variants
Multiple-SAT operation for simple fault recovery
SAT/SAT’ for graceful degradation to (multi) bus operation
SAT signal provides for:
Bounded delay with no loss of bursty data (Machine-to-Machine)
Integration of real-time traffic with known rate (Person-to-Person)
MetaRing is the underlying network for IBM storage area network (SAN) products (also ANSI SSA - X3T10 standard)
Multi-billion business for IBM
Spatial reuse rings are currently very active:
Cisco SRP/DPT and IEEE 802.17
Is MetaRing a panacea? NO
To transfer with max. throughput (burst) AND min. delay and loss
[Figure: traffic shape at the source vs. traffic shape at the destination, both over time t; transfer with minimum distortions, no “shaping”]
TCP/IP: unstable/unpredictable throughput/delay/loss
Cannot be done over fixed routes (congestion and loss)
Dynamic routing:
e.g., “Hot-Potato” (P. Baran),
Manhattan Street Network - deflection routing (N. Maxemchuk)
MetaNet Convergence Routing with Sense of Direction
Invented by Yoram Ofek and Moti Yung
Virtual ring embedding
Link types:
Ring - part of virtual ring/s
Thread - all other links
Embedding methods, e.g.:
Simple Hamiltonian Circuit
Euler tree traversal
Sequential Numbering of Virtual Nodes: VN0, VN1, VN2, …
Multiple partial virtual rings
[Figure: virtual ring embedded over switching nodes A to I, with virtual nodes labeled VN0 through VN15 along the ring]
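The Euler tree traversal embedding named on this slide can be sketched as below. This is an illustrative reading, assuming the virtual ring is obtained by walking a spanning tree depth-first and numbering every visit as one virtual node; the function name and the adjacency-list input format are hypothetical.

```python
def euler_tour_ring(tree, root):
    """Return the cyclic sequence of physical nodes visited by an Euler
    tour of `tree` (an adjacency-list dict). Each entry is one virtual node."""
    tour = []

    def visit(node, parent):
        tour.append(node)
        for child in tree[node]:
            if child != parent:
                visit(child, node)
                tour.append(node)  # coming back up counts as another visit

    visit(root, None)
    # The final return to the root closes the ring, so it is not a new virtual node.
    return tour[:-1] if len(tour) > 1 else tour


if __name__ == "__main__":
    tree = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
    ring = euler_tour_ring(tree, "A")
    for i, node in enumerate(ring):
        print(f"VN{i} on node {node}")
```

A tree spanning n nodes yields 2(n - 1) virtual nodes under this construction, so a physical node can host several virtual nodes of the same ring.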
MetaNet Convergence Routing over Switched Network
Packet routing paradigm:
1. Packets are forwarded to an idle output link “closer” to their destinations, with a “sense of direction” along the virtual ring(s)
2. Virtual (buffer insertion) ring traffic gets priority to continue on the virtual ring links
MetaNet Convergence Routing over Switched Network
SHORT-CUT Routing:
Example: a packet arriving at VN6 on node C with destination H can short-cut to VN8
Diametric routing in light load
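The short-cut decision can be sketched as a choice among idle output links by remaining distance along the virtual ring. This is a simplified illustration under stated assumptions: a single virtual ring of 16 virtual nodes, a hypothetical destination virtual node 9 standing in for node H, and no modeling of the priority that ring traffic gets over newly inserted traffic.

```python
def ring_distance(src_vn, dst_vn, ring_size):
    """Hops still to travel from src_vn to dst_vn in the ring direction."""
    return (dst_vn - src_vn) % ring_size


def choose_next(current_vn, dest_vn, outputs, ring_size):
    """Pick the idle output that gets closest to the destination.

    `outputs` is a list of (next_vn, is_idle) pairs covering both the ring
    link and any thread links. Returns None if no idle link makes progress,
    in which case the packet waits."""
    best_vn = None
    best_dist = ring_distance(current_vn, dest_vn, ring_size)
    for next_vn, is_idle in outputs:
        dist = ring_distance(next_vn, dest_vn, ring_size)
        if is_idle and dist < best_dist:
            best_vn, best_dist = next_vn, dist
    return best_vn


if __name__ == "__main__":
    # At VN6: ring link leads to VN7, a thread link leads to VN8.
    print(choose_next(6, 9, [(7, True), (8, True)], 16))   # takes the short-cut
    print(choose_next(6, 9, [(7, True), (8, False)], 16))  # falls back to the ring
```

Because every accepted hop strictly reduces the ring distance, a packet in this sketch can never move away from its destination, which is the intuition behind convergence.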
[Figure: short-cut from VN6 on node C to VN8, shown on the virtual ring embedded over nodes A to I]
MetaNet Convergence Routing over Switched Network
Broadcast-with-feedback:
Requirements:
asynchronous - without arbitration
losslessness
correctness
complete coverage
packet copied only once
complete feedback to the source
When short-cuts or jumps are possible, the packets are DUPLICATED
[Figure: broadcast over the virtual ring embedded on nodes A to I]
MetaNet Convergence Routing over Switched Network
Summary:
Support traffic from bursty sources with no rates:
No packet loss
Machine-to-Machine
Bounded delay
Fairness
Person-to-Person
Broadcast and multicast (with feedback)
However, limitations remain:
1) on size - it is not a global network!
2) does not support person-to-person communications with known rates
How to support the two generic traffic types:
1. Ring networks
Machine-to-Machine
Person-to-Person
2. Convergence routing
Machine-to-Machine
3. Time-driven - switched networks
Person-to-Person
Integration of Machine-to-Machine using UTC
4. Dynamic optical networking
Person-to-Person
Integration of Machine-to-Machine using UTC
Time-Driven Priority over Switched Network
How to support communications with known rate on a global network?
Person-to-Person
Flow (congestion) control methods:
Rate control at the network’s boundaries, with statistical multiplexing inside the network - e.g., ATM (J. Turner)
Inside the network with scheduling based on local clocks: deadline scheduling (D. Ferrari), GPS - generalized processor sharing (A. Parekh, R. Gallager)
Inside the network with scheduling based on global time:
UTC - Coordinated Universal Time:
TIME-DRIVEN PRIORITY
Based on pipeline forwarding
Pipeline: optimal method, independent of a specific realization
Successfully deployed with optimal efficiency in factories (automotive) and computers (CPU)
NOW: pipeline in global networks, thanks to GPS, which provides UTC
Super-cycle UTC second with 80k Time-frames
[Figure: one super-cycle spans a UTC second and contains 80 time cycles (Time Cycle 0 to Time Cycle 79); each time cycle contains 1000 time frames of duration T_f, numbered 1 to 1000; the super-cycle starts at the beginning of a UTC second]
• Time-of-day or UTC – coordinated universal time - with accuracy of
5
s
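The time structure on this slide (one super-cycle per UTC second, 80 time cycles of 1000 time frames each) implies T_f = 1 s / 80,000 = 12.5 µs. The sketch below maps an offset within a UTC second to its time cycle and time frame; the function name and the use of integer nanoseconds are assumptions made for illustration.

```python
FRAMES_PER_CYCLE = 1000
CYCLES_PER_SUPER_CYCLE = 80
SUPER_CYCLE_NS = 1_000_000_000  # one UTC second
TF_NS = SUPER_CYCLE_NS // (FRAMES_PER_CYCLE * CYCLES_PER_SUPER_CYCLE)  # 12,500 ns


def locate_time_frame(ns_into_second):
    """Return (time_cycle, time_frame) for an offset within a UTC second.

    Time cycles are numbered 0..79 and time frames 1..1000."""
    if not 0 <= ns_into_second < SUPER_CYCLE_NS:
        raise ValueError("offset must lie within one UTC second")
    frame_index = ns_into_second // TF_NS  # 0..79,999 within the super-cycle
    cycle = frame_index // FRAMES_PER_CYCLE
    frame = frame_index % FRAMES_PER_CYCLE + 1
    return cycle, frame


if __name__ == "__main__":
    print(TF_NS)
    print(locate_time_frame(12_500_000))
```

Every switch that knows UTC can evaluate this mapping locally, which is what lets all switches agree on the current time frame without exchanging messages.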
1. Immediate forwarding
2. 2-frame forwarding
3. Arbitrary forwarding
[Figure: a packet arriving at an output port in time frame i of a time cycle is forwarded from that port in frame i+1 (immediate), in frame i+2 (2-frame), or in another predefined frame (arbitrary)]
[Table: blocking probability per scheme (immediate, 2-frame, arbitrary-frame), in terms of h (# of hops), small p, and q = 1 - p]
Sender-receiver synchronization
[Figure: videoconferencing path from the Sender (video frame capture) through Node B and Node C to the Receiver (video frame display) in real time, showing the size of successive MPEG I pictures and P pictures]
Time-driven priority videoconferencing with complex periodicity scheduling
Face-to-face quality
Scales the globe
IP and “Best Effort” service are unchanged
Time-driven priority scales the globe (Person-to-Person):
Jitter: bounded by 2*Tf
Independent of the network size, traffic load, flow rate
End-to-end delay: 2*h*Tf + propagation delay
No loss
Can easily be integrated with:
MetaNet convergence routing (Machine-to-Machine)
Optimized for interactive streaming media
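The delay and jitter figures on this slide can be turned into numbers with a short calculation. This is a sketch under assumptions: T_f = 12.5 µs is derived from the 80k time frames per UTC second mentioned elsewhere in the talk, and the hop count and propagation delay in the example are made-up inputs.

```python
def end_to_end_delay_ns(hops, tf_ns, propagation_ns):
    """End-to-end delay from the slide: 2 * h * Tf + propagation delay."""
    return 2 * hops * tf_ns + propagation_ns


def jitter_bound_ns(tf_ns):
    """Jitter bound from the slide: 2 * Tf, independent of hops and load."""
    return 2 * tf_ns


if __name__ == "__main__":
    tf_ns = 12_500                    # assumed time-frame duration (12.5 µs)
    hops, prop_ns = 10, 25_000_000    # hypothetical path: 10 hops, 25 ms propagation
    print(end_to_end_delay_ns(hops, tf_ns, prop_ns) / 1e6, "ms")
    print(jitter_bound_ns(tf_ns) / 1e3, "µs")
```

In this example the switching contribution is 0.25 ms against 25 ms of propagation, so the end-to-end delay stays well under the 100 ms target for interactive media.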
Fractional λ Switching for Dynamic Optical Networking
Objective: to utilize UTC in the optical domain
In static optical networking, all data units on the optical channel are switched in the same way, while
in dynamic optical networking, each data unit on the optical channel may be switched differently
[Figure: optical network map with nodes SEA, SF, LA, STL, and NYC, annotated with the number of λs]
Save λs & Grooming & Small or No Memory
[Figure: optical network map with nodes SEA, SF, LA, STL, and NYC, connected by Fractional λ Pipes (FλPs)]
Number of λs = (Aggregate capacity needed) / (10 Gb/s)
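The wavelength count on this slide is a single division. A minimal sketch follows, assuming the result is rounded up to a whole number of wavelengths (the slide gives only the ratio) and using a hypothetical function name.

```python
import math


def lambdas_needed(aggregate_gbps, per_lambda_gbps=10):
    """Number of wavelengths = aggregate capacity needed / 10 Gb/s,
    rounded up because a wavelength cannot be partially provisioned
    (the rounding is an assumption, not stated on the slide)."""
    return math.ceil(aggregate_gbps / per_lambda_gbps)


if __name__ == "__main__":
    print(lambdas_needed(95))
```

The point of the slide is that the count depends on the aggregate demand rather than on the number of node pairs, since many flows can share one wavelength as fractions.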
Header processing only at the edges
[Figure: edge devices (wireless base station, ADSL DSLAM at the central office, cable modem head-end, server farm for web and VoD, edge/access router (POP) for IP/MPLS) are labeled with small fractions; the switching core is labeled with large fractions]
Pipeline forwarding of whole time frames
No header processing
Banyan based switch structure - optimal
Super-cycle UTC second with 80k Time-frames
[Figure: super-cycle of 80 time cycles (Time Cycle 0 to Time Cycle 79), each with 1000 time frames of duration T_f, starting at the beginning of a UTC second]
See pipeline forwarding (PF) animation over FλPs
Why dynamic: Fractional λ Switching
The Optical Links are Memory
A mesh of linear delay lines
How to preserve pipeline forwarding?
Delay between switches = integer number of time frames
UTC
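One way to read the rule that the delay between switches must be an integer number of time frames: each input adds just enough delay to round the link's propagation delay up to the next time-frame boundary. The sketch below computes that padding; the function names are hypothetical and T_f = 12.5 µs is an assumed value derived from 80k time frames per UTC second.

```python
def alignment_delay_ns(propagation_ns, tf_ns):
    """Extra delay that makes propagation + alignment a multiple of Tf."""
    return (-propagation_ns) % tf_ns


def aligned_link_frames(propagation_ns, tf_ns):
    """Total link delay after alignment, expressed in whole time frames."""
    return (propagation_ns + alignment_delay_ns(propagation_ns, tf_ns)) // tf_ns


if __name__ == "__main__":
    tf_ns = 12_500  # assumed time-frame duration (12.5 µs)
    print(alignment_delay_ns(30_000, tf_ns), aligned_link_frames(30_000, tf_ns))
```

The padding is always smaller than one time frame, which is consistent with the talk's point that the links themselves act as the memory.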
[Figure: switch architecture with a switch controller driven by time-of-day or UTC; Inputs 1 to N each pass through optical alignment into the optical switching fabric, which feeds Outputs 1 to N; time frames t-3 to t+2 of duration T_f are shown on the links]
Idle time: safety margin between two time frames
T_f: time frame
Time frame payload: a predefined number of data units
Scalability
                      Multistage      Crossbar
Switching elements    a*N*lg_a(N)     N^2
For N=256, a=4        4K              64K        (factor of 16)
For N=1024, a=4       20K             1,000K     (factor of 50)
Blocking
Multiple time frames: low blocking probability
DWDM: many parallel routes, low blocking probability
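The switching-element counts in the comparison above can be reproduced directly from the two formulas, a*N*lg_a(N) for the multistage structure and N^2 for the crossbar. This is a small check, with hypothetical function names, that assumes N is an exact power of a.

```python
def multistage_elements(n, a):
    """a * N * log_a(N) switching elements; N must be a power of a."""
    stages, size = 0, 1
    while size < n:
        size *= a
        stages += 1
    if size != n:
        raise ValueError("N must be an exact power of a")
    return a * n * stages


def crossbar_elements(n):
    """N^2 switching elements."""
    return n * n


if __name__ == "__main__":
    for n in (256, 1024):
        m, c = multistage_elements(n, 4), crossbar_elements(n)
        print(n, m, c, c / m)
```

For N=256 this gives 4,096 against 65,536 (a factor of 16), and for N=1024 it gives 20,480 against 1,048,576 (a factor of about 51), matching the rounded figures on the slide.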
[Figure: Four Channels per Link; x-axis: average utilization [%], 50% to 100%; y-axis: 0% to 80%; curves for 1 TF, 4 TFs, 64 TFs, and 1000 TFs]
[Figure: periodic schedule on Switch i (TF1 to TF8) and periodic schedule on Switch j (TF1 to TF8)]
Schedules: always aligned with a bounded error (typically < 1 µsecond)
Scheduling and Switching without UTC Alignment
Circuit Switching, e.g., SONET
[Figure: periodic schedule on Switch i and periodic schedule on Switch j]
Schedules: no alignment
Thus, delay (memory) per switch = 1 Time Cycle
[Figure: IP/MPLS and SONET interfaces driven by UTC; MPLS packets from a network processor (MPLS) port and SONET STS-1 frames from a SONET DMUX (STS-1) port are mapped onto Fractional λ Pipes FλP 1, FλP 2, ..., FλP i; see animation]
Person-to-Person: Typically with Rate
Fractional λ Switching
Time-driven Priority
MetaNet Convergence Routing
Machine-to-Machine: Typically No Rate