Last Lecture: Network Layer

1. Design goals and issues
2. Basic Routing Algorithms & Protocols
3. Addressing, Fragmentation and reassembly
4. Internet Routing Protocols and Inter-networking
   – Intra- and Inter-domain Routing Protocols
   – Introduction to BGP
   – Why is routing so hard to get right? ✔
5. Router design
6. Congestion Control, Quality of Service
7. More on the Internet’s Network Layer

Credits: slides from Jennifer Rexford, Nick Feamster, Hari Balakrishnan, Timothy Griffin ICNP’02 Tutorial, Xin Hu & Z. Morley Mao

SUNY at Buffalo; CSE 489/589 – Modern Networking Concepts; Fall 2010; Instructor: Hung Q. Ngo

This Lecture: Network Layer

1. Design goals and issues
2. Basic Routing Algorithms & Protocols
3. Addressing, Fragmentation and reassembly
4. Internet Routing Protocols and Inter-networking
5. Router design
   1. Short history ✔
   2. Router architectures
   3. Address lookup problem
6. Congestion Control, Quality of Service
7. More on the Internet’s Network Layer

Router Design: Short History

• What is an Internet router?
• What limits performance: Memory access time
• The early days: Modified computers
• Programmable against uncertainty
• The middle years: Specialized for performance
• Needed new architectures, theory, and practice
• So how did we do?

Slides from Nick McKeown @ Stanford

Ada Lovelace


Routers process headers

[Figure: a packet, split into Data and Head(er).]

Definitions

[Figure: a router with linecards numbered 1 to N (8 shown), each attached to a line of rate R.]

N = number of linecards. Typically 8-32 per chassis

R = line-rate. 1Gb/s, 2.5Gb/s, 10Gb/s, 40Gb/s, 100Gb/s

Capacity of router = N x R
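As a quick worked example of the capacity formula, the sketch below multiplies the linecard count by the line rate. The function name `router_capacity_gbps` and the sample values (16 linecards at 10 Gb/s) are illustrative assumptions, not figures from the slides.

```python
def router_capacity_gbps(num_linecards: int, line_rate_gbps: float) -> float:
    """Aggregate capacity = N x R, in Gb/s."""
    return num_linecards * line_rate_gbps


# Hypothetical chassis: N = 16 linecards, R = 10 Gb/s each.
print(router_capacity_gbps(16, 10.0), "Gb/s")
```

The same function shows why capacity can grow along two axes: more linecards per chassis (N) or faster lines (R).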


What a Router Chassis Looks Like

Cisco CRS-1 (6ft x 19" x 2ft)
Capacity: 1.2Tb/s
Power: 10.4kW
Weight: 0.5 Ton
Cost: $500k

Juniper M320 (3ft x 17" x 2ft)
Capacity: 320 Gb/s
Power: 3.1kW

What a Router Line Card Looks Like

• 10-Port GigE (for Cisco 12000 Series)
• 1-Port OC48 (2.5 Gb/s) (for Juniper M40)
• 4-Port 10 GigE (for Cisco CRS-1)

Power: about 150 Watts

Dimensions marked in the figure: 2in, 10in, 21in

A State of the Art Edge Router

Cisco’s ASR 9000

Linecard: Cisco 100 GE, 100Gbps

A State of the Art Core Router

Juniper T1600: 17.43 x 37.45 x 31 in, 606 lbs

Juniper TX Matrix +: 21.4 x 52 x 36.2 in, 900 lbs; interconnects up to 16 T1600 chassis into a single routing entity

Max Capacity: 25 Tbps

[Figure: a packet made of a Header followed by Data (01000111100010101001110100011001).]

Header

1. Internet Address

2. Age

3. Checksum to protect header

Lookup internet address

Check and update age

Check and update checksum
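The three per-packet steps above can be sketched for an IPv4 header, where the "age" is the TTL byte. This is a minimal illustration, not how a linecard implements it: the names `ipv4_checksum`, `lookup`, `forward`, and `make_header`, the linear-scan lookup, and the sample table are all assumptions made for the example.

```python
import ipaddress
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over 16-bit words (0 for a valid header)."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def lookup(table, addr):
    """Longest-prefix match by linear scan (illustrative, not fast)."""
    matches = [(net.prefixlen, port) for net, port in table if addr in net]
    return max(matches)[1] if matches else None


def forward(header: bytearray, table):
    """Return (output_port, updated_header), or None to drop the packet."""
    if ipv4_checksum(bytes(header)) != 0:   # check checksum
        return None
    if header[8] <= 1:                      # check age (TTL)
        return None
    port = lookup(table, ipaddress.ip_address(bytes(header[16:20])))
    if port is None:
        return None
    out = bytearray(header)
    out[8] -= 1                             # update age
    out[10:12] = b"\x00\x00"                # update checksum from scratch
    out[10:12] = struct.pack("!H", ipv4_checksum(bytes(out)))
    return port, out


def make_header(src: str, dst: str, ttl: int = 64) -> bytearray:
    """Build a 20-byte IPv4 header with a valid checksum (test helper)."""
    h = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, ttl, 6, 0,
                              ipaddress.ip_address(src).packed,
                              ipaddress.ip_address(dst).packed))
    h[10:12] = struct.pack("!H", ipv4_checksum(bytes(h)))
    return h


# Hypothetical forwarding table: (prefix, output port).
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "port0"),
    (ipaddress.ip_network("192.168.0.0/16"), "port1"),
    (ipaddress.ip_network("192.168.1.0/24"), "port2"),
]

port, new_header = forward(make_header("10.0.0.1", "192.168.1.7"), TABLE)
print(port, "TTL is now", new_header[8])
```

Every step touches memory at least once per packet, which is why the later slides focus on memory access time.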


Barebones Router

Router Control and Management


Bottlenecks

Memory, memory, …


DRAM as bottleneck

[Figure: a "DRAM then" chip next to a "DRAM now" chip, each with Address and Data lines.]

DRAMs are designed to maximize the number of bytes

Access time (“speed”) has stayed pretty much constant

In 11 years: from 50ns down to 20ns, much slower than Moore’s law
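To see why a 20ns access time matters, the sketch below compares it with the time a minimum-size packet occupies the line at each line rate listed earlier. The 40-byte minimum packet size is an assumption introduced for this example; the 20ns figure is the one quoted above.

```python
DRAM_ACCESS_NS = 20      # access time quoted on the slide
MIN_PACKET_BYTES = 40    # assumption: minimum-size IP packet


def packet_time_ns(packet_bytes: int, line_rate_gbps: float) -> float:
    """Time one packet occupies the line, in ns (bits / (Gb/s) = ns)."""
    return packet_bytes * 8 / line_rate_gbps


for rate in (1, 2.5, 10, 40, 100):
    t = packet_time_ns(MIN_PACKET_BYTES, rate)
    print(f"{rate:>5} Gb/s: {t:6.1f} ns per packet, "
          f"{t / DRAM_ACCESS_NS:5.2f} DRAM accesses fit in that time")
```

Under these assumptions, at 40 Gb/s and above a new packet can arrive before a single DRAM access has completed.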


Outline

• What is an Internet router?
• What limits performance: Memory access time
• The early days: Modified computers
• Programmable against uncertainty
• The middle years: Specialized for performance
• Needed new architectures, theory, and practice
• So how did we do?
• The present: Internet showing its age
• Simple model breaking down

1st Generation Routers are Modified Computers

[Figure: lines of rate R enter and leave a single computer.]

Must run at rate N x R

1st Generation Routers

[Figure: a CPU with its Route Table and off-chip Buffer Memory, connected over a Shared Bus to three Line Interfaces, each with a MAC. The figure marks the bottlenecks.]

Typically <0.5Gb/s aggregate capacity

Innovation #1: Linecards have routing tables

Prevents central table from becoming a bottleneck at high speeds

Complication: Must update forwarding tables on the fly.

2nd Generation Routers

[Figure: a CPU with Route Table and Buffer Memory, connected to three Line Cards, each with its own Buffer Memory, Fwding Cache, and MAC. Lines run at rate R.]

Typically <5Gb/s aggregate capacity

Problems with Early Routers

• Function more important than speed

• 1993 (WWW) changed everything

• We badly needed

– Some new architecture

– Some theory

– Some practice

Innovation #2: Switched Backplane

Using a switching fabric:

• Input ports can simultaneously connect to output ports in one time slot (in a 1-to-1 manner)

Advantage: Exploits parallelism

Disadvantage: Need scheduling algorithm

Switching Fabrics Allow for Parallel Transfer

3rd Generation Router: Switch

[Figure: N x R OQ switch. At each of the N inputs, Header Processing looks up the IP address in the Address Table and updates the header of each packet (Data + Hdr). Each of the N outputs has a Queue in Packet Buffer Memory, which runs at N times line rate.]

Simple Model to View an OQ-Switch

[Figure: router R1 with Links 1 to 4. Each of the four ingress links (link rate R) feeds the queues of the four egress links (link rate R).]

Characteristics of an OQ-Switch: Nice!

Arriving packets are immediately written into the output queue, without intermediate buffering.

The flow of packets to one output does not affect the flow to another output.

An OQ switch is work conserving: an output line is always busy when there is a packet in the switch for it.

OQ switches have the highest throughput and the lowest average delay.

The rate of individual flows and the delay of packets can be controlled (with Weighted Fair Queueing + leaky bucket).

Example OQ: Shared Memory Switch (SMS)

[Figure: N ingress links and N egress links, each at rate R, all attached to a single, physical memory device.]

Required Memory Bandwidth

Basic OQ switch:

Consider an OQ switch with N different physical memories, and all links operating at rate R bits/s.

In the worst case, packets may arrive continuously from all inputs, destined to just one output.

Worst-case memory bandwidth requirement for each memory is (N+1)R bits/s: N simultaneous writes plus one read.

Shared Memory Switch:

Maximum memory bandwidth requirement for the memory is 2NR bits/s: N writes plus N reads.

Also, single point of failure!
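The two bandwidth requirements can be compared numerically. The sketch below is only arithmetic on the formulas above; the function names and the sample switch (N = 16 ports at R = 10 Gb/s) are assumptions chosen for illustration.

```python
def oq_memory_bw(n_ports: int, rate: float) -> float:
    """Per-output memory in a basic OQ switch: N writes + 1 read."""
    return (n_ports + 1) * rate


def shared_memory_bw(n_ports: int, rate: float) -> float:
    """Single shared memory: N writes + N reads."""
    return 2 * n_ports * rate


# Hypothetical switch: N = 16 ports, R = 10 Gb/s.
N, R = 16, 10.0
print("each OQ memory:", oq_memory_bw(N, R), "Gb/s")
print("shared memory :", shared_memory_bw(N, R), "Gb/s")
```

Both requirements grow linearly in N, which is the scaling problem the following slides return to.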


How Fast Can A “Practical” SMS Be?

[Figure: ports 1 to N attached over a 200 byte bus to a Shared Memory built from 5ns SRAM.]

• 5ns per memory operation
• Two memory operations per packet
• Therefore, up to 160Gb/s (200 bytes x 8 bits every 2 x 5ns)
• In practice, closer to 80Gb/s

Note: SRAM is very expensive!

Commercial routers with SM architecture:

Juniper’s E-series/ERX edge routers

M-series/M20, M40, and M160 core routers
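The 160Gb/s bound for the shared memory switch follows from the bus width and the SRAM access time given above. The sketch below reproduces that arithmetic; the function name and parameter names are assumptions, and the model ignores every overhead that brings the practical figure closer to 80Gb/s.

```python
def shared_memory_throughput_gbps(bus_bytes: int, access_ns: float,
                                  ops_per_packet: int = 2) -> float:
    """Bits moved per packet slot divided by the slot length in ns.

    One write and one read per packet means each bus-wide word
    occupies the memory for ops_per_packet * access_ns nanoseconds.
    """
    return bus_bytes * 8 / (access_ns * ops_per_packet)


print(shared_memory_throughput_gbps(200, 5), "Gb/s")
```

Doubling the bus width or halving the access time doubles the bound, which is why such designs rely on wide buses and fast, expensive SRAM.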


OQ Example: Shared Medium Switch

Pro: good delay & throughput, broadcast/multicast possible

Con: high bus speed (NR), high memory bandwidth ((N+1)R), high-speed address filters (NR)

Commercial routers: Cisco 7500 series


Summary of OQ Switches

OQ switches are ideal
• Work-conserving
• Maximize throughput
• Minimize expected delay
• Permit delay guarantees for constrained traffic

OQ switches don’t scale well
• Require N memory writes per time slot (output speedup of N)
• Memory bandwidth is a bottleneck
• Parallelism is not straightforward

[Figure: input interfaces connected through a backplane to output interfaces.]

IQ Switch

[Figure: an input-queued switch running at 1 x R, with an Arbiter scheduling transfers from the input interfaces across the backplane to the output interfaces.]

Only input interfaces store packets

Advantages
• Easier to build (store packets at inputs if there is contention at outputs)
• Relatively easy to design algorithms

Disadvantages
• In general, hard to achieve high utilization

We can show that:
• Requires input/output speedup of 2 to achieve 100% throughput
• With higher speedup, rate guarantee is possible too!

Main IQ Problem: Head of Line Blocking


HoL Blocking Leads to Low Throughput

Karol-Hluchyj-Morgan (IEEE Trans. Comm. 87): throughput saturates at 2 − √2 ≈ 58.6%

[Figure: plot against Load (0% to 100%); one curve is labeled "The best that any queueing system can achieve."]
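The roughly 58% saturation throughput of a FIFO input-queued switch can be checked with a small simulation. The sketch below assumes a saturated switch (every input always has a packet waiting), uniformly random destinations, and random tie-breaking at each output; the function name and parameters are illustrative, and the result for a finite number of ports is somewhat above the large-N limit.

```python
import random


def hol_saturation_throughput(n_ports: int, slots: int, seed: int = 0) -> float:
    """Fraction of output capacity used by a saturated FIFO IQ switch."""
    rng = random.Random(seed)
    # hol[i] = destination output of the packet at the head of input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    served = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves exactly one contending input.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next packet reaches the head
            served += 1
        # Losing inputs keep the same head-of-line packet: HoL blocking.
    return served / (n_ports * slots)


print(f"N=32: {hol_saturation_throughput(32, 2000):.3f}")
```

Packets queued behind a blocked head-of-line packet cannot be served even when their own output is idle, which is what caps the throughput.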

Solution to HoL: Virtual Output Queues

A Router with Virtual Output Queues

Each input keeps a separate queue for every output, so a packet never waits behind a packet headed to a different output.

With VOQs, an IQ switch can reach the best that any queueing system can achieve.

Caveat: it has to run a sophisticated scheduling algorithm to compute a “maximum weight matching”!

[Figure: plot against Load (up to 100%).]

Jim Dai & Balaji Prabhakar (2000) showed that an IQ switch using a maximum weight matching algorithm can achieve a throughput of up to 100% for arbitrarily distributed input traffic, as long as

(i) it obeys the strong law of large numbers, and

(ii) it does not oversubscribe any input or output.

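More Precise Statement of the Result

$A_{ij}(n)$: number of packets that have arrived at input $i$ destined to output $j$ up to time $n$

Strong law of large numbers assumption: $\lim_{n\to\infty} A_{ij}(n)/n = \lambda_{ij}$ with probability 1; $\lambda_{ij}$ is called the arrival rate at $\mathrm{VOQ}_{ij}$

$D_{ij}(n)$: number of departures from $\mathrm{VOQ}_{ij}$ up to time $n$

Non-overloading assumption: $\sum_{i} \lambda_{ij} \le 1$ for every output $j$, and $\sum_{j} \lambda_{ij} \le 1$ for every input $i$

Want a scheduling algorithm to be efficient: $\lim_{n\to\infty} D_{ij}(n)/n = \lambda_{ij}$ with probability 1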

Maximum Weight Matching (MWM)

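[Figure: arrivals $A_1(n), \dots, A_N(n)$ at the N inputs are split into per-VOQ arrivals $A_{11}(n), \dots, A_{NN}(n)$ with queue occupancies $L_{11}(n), \dots, L_{NN}(n)$. The schedule $S^*(n)$ connects inputs to outputs and produces departures $D_1(n), \dots, D_N(n)$. The “Request” graph is a bipartite graph whose edge $(i, j)$ carries weight $L_{ij}(n)$; the scheduler picks a maximum weight match in it.]

$S^*(n) = \arg\max_{S(n)} \left( L(n)^T S(n) \right)$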
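To make the definition of $S^*(n)$ concrete, the sketch below finds a maximum weight matching by brute force over all N! permutations. It is purely illustrative: the function name and the 3 x 3 occupancy matrix are assumptions, and no real scheduler would enumerate permutations.

```python
from itertools import permutations


def max_weight_matching(L):
    """Brute-force MWM over a square matrix of VOQ occupancies.

    L[i][j] is the occupancy of the VOQ at input i for output j.
    Returns (match, weight) where match[i] is the output given to input i.
    """
    n = len(L)

    def weight(perm):
        return sum(L[i][perm[i]] for i in range(n))

    best = max(permutations(range(n)), key=weight)
    return best, weight(best)


# Hypothetical VOQ occupancies for a 3 x 3 switch.
L = [
    [5, 4, 0],
    [6, 0, 0],
    [0, 3, 1],
]
match, total = max_weight_matching(L)
print(match, total)  # (1, 0, 2) 11
```

Here input 0 is matched to output 1 even though its longest queue is for output 0, because serving input 1's queue of 6 gives a larger total weight.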

Problem with Running a MWM Algorithm

The best known algorithm is still too slow!!!

n: # of inputs/outputs, m = # of edges in the bipartite graph

(Because we have to compute a MWM every few nano-seconds!)


From Maximum to Maximal to Randomized!

In practice: maximal matching, not maximum

Maximal matching algorithms:
• Wavefront Arbiter (WFA)
• Parallel Iterative Matching (PIM)
• iSLIP

Justification: Dai & Prabhakar [INFOCOM 2000]
• Give the fabric a speedup of 2 (thus CIOQ) and even a maximal matching yields 100% throughput too!

Several other works use randomized algorithms:
• M. Mitzenmacher, B. Prabhakar, and D. Shah (FOCS 02)
• P. Giaccone, B. Prabhakar, and D. Shah (INFOCOM 02)
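The difference between a maximal and a maximum matching can be seen on a tiny request graph. The sketch below uses a plain greedy pass, which is not WFA, PIM, or iSLIP; it only illustrates what "maximal" means. The function name and the request set are assumptions.

```python
def greedy_maximal_matching(requests):
    """Greedy maximal matching on a set of (input, output) requests.

    A request (i, j) means the VOQ at input i for output j is nonempty.
    The result is maximal: no further request can be added without
    reusing an input or an output. It need not be maximum in size.
    """
    used_inputs, used_outputs, match = set(), set(), []
    for i, j in sorted(requests):
        if i not in used_inputs and j not in used_outputs:
            match.append((i, j))
            used_inputs.add(i)
            used_outputs.add(j)
    return match


# Hypothetical requests: input 0 wants outputs 0 and 1, input 1 wants output 0.
requests = {(0, 0), (0, 1), (1, 0)}
print(greedy_maximal_matching(requests))  # [(0, 0)]
```

The greedy result has size 1, while the maximum matching {(0, 1), (1, 0)} has size 2. A maximal matching is much cheaper to compute, and the speedup of 2 compensates for the lost matches.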

Evolution of IQ-Switching Till Early 2000

Theory:
• Input Queueing (IQ): 58% [Karol, 1987]
• IQ + VOQ, maximum weight matching: 100% [M et al., 1995]
• Different weight functions, incomplete information, pipelining: 100% [Various]
• Randomized algorithms: 100% [Tassiulas, 1998]
• IQ + VOQ, maximal size matching, speedup of two: 100% [Dai & Prabhakar, 2000]

Practice:
• Input Queueing (IQ)
• IQ + VOQ, sub-maximal size matching, e.g. PIM, iSLIP
• Various heuristics, distributed algorithms, and amounts of speedup

Third Generation Routers

[Figure: a “Crossbar” switched backplane connecting Line Cards (each with Local Buffer Memory, Fwding Table, and MAC) and a CPU Card holding the Routing Table.]

Typically <50Gb/s aggregate capacity

Mimicking OQ Switch for 100% Throughput

[Figure: an OQ switch (N x R) next to an IQ switch (1 x R).]

Are they equivalent? No.

Combined Input-Output Queue (CIOQ)

[Figure: an OQ switch (NR) next to a CIOQ switch running at ? x R under a scheduling Algorithm.]

Now are they equivalent? Yes, if it runs 2 times faster (2R).

CIOQ Switches

Both input and output interfaces store packets

Advantages
• Easy to build
• Utilization 1 can be achieved with limited input/output speedup (≤ 2)

Disadvantages
• Harder to design algorithms
• Two congestion points
• Need to design flow control

With an input/output speedup of 2, a CIOQ switch can emulate any work-conserving OQ switch [G+98, SZ98]; this requires running a stable marriage matching algorithm

Or, a maximal matching algorithm

[Figure: input interfaces connected through a backplane to output interfaces.]

Stable Marriage Problem

Consider N women and N men

Each woman/man ranks each man/woman in the order of their preferences

Stable matching: a matching with no blocking pairs

Blocking pair: let p(i) denote the partner of i. Then (k, p(j)) is a blocking pair if there are matched pairs (k, p(k)) and (j, p(j)) such that k prefers p(j) to p(k), and p(j) prefers k to j

Gale-Shapley Algorithm (GSA)

As long as there is a free man m:
• m proposes to the highest-ranked woman w on his list to whom he hasn’t proposed yet
• If w is free, m and w are engaged
• If w is engaged to m’ and w prefers m to m’, w releases m’ and becomes engaged to m
• Otherwise m remains free

A stable matching exists for every set of preference lists

Complexity: worst-case O(N^2)

Example

Men's preference lists (most preferred first):
1: 2 4 3 1
2: 1 4 3 2
3: 4 3 2 1
4: 1 2 4 3

Women's preference lists (most preferred first):
1: 1 4 3 2
2: 3 1 4 2
3: 1 2 3 4
4: 2 1 4 3

If men propose to women, the stable matching (man, woman) is

(1,2), (2,4), (3,3), (4,1)

What is the stable matching if women propose to men?
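The Gale-Shapley steps above can be run directly on the example preference lists. This is a minimal sketch: the function name `gale_shapley` and the dictionary representation are assumptions, and it relies on both sides having equal size with complete preference lists.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return {proposer: receiver} for the proposer-optimal stable matching.

    Both arguments map an id to a list of ids on the other side,
    most preferred first.
    """
    # rank[r][p] = position of proposer p on receiver r's list (lower is better).
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # next index to propose to
    engaged_to = {}                                # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                      # r was free
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                      # r releases current
            free.append(current)
        else:
            free.append(p)                         # p remains free
    return {p: r for r, p in engaged_to.items()}


men = {1: [2, 4, 3, 1], 2: [1, 4, 3, 2], 3: [4, 3, 2, 1], 4: [1, 2, 4, 3]}
women = {1: [1, 4, 3, 2], 2: [3, 1, 4, 2], 3: [1, 2, 3, 4], 4: [2, 1, 4, 3]}

print(sorted(gale_shapley(men, women).items()))
# [(1, 2), (2, 4), (3, 3), (4, 1)]
```

Calling `gale_shapley(women, men)` swaps the roles and returns a mapping from women to men, which is one way to check an answer to the question above.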

OQ Emulation with a Speedup of 2

Input preference list: list of cells at that input, ordered in the inverse order of their arrival

Output preference list: list of all input cells to be forwarded to that output, ordered by the times they would be served in an OQ schedule

Use GSA to match inputs to outputs

Outputs initiate the matching

Can emulate all work-conserving schedulers

Example

[Figure: four panels (a) to (d) showing inputs 1 to 3 and outputs a to c. Cells are labeled by output and service order. In panel (a), input 1 holds c.2, b.2, b.1, a.1; input 2 holds c.1, a.2; input 3 holds b.3, c.3. Panels (b) to (d) show cells a.1, c.1, and b.3 being matched and moved to outputs a, b, and c.]

4th Generation Router

Multirack; optics inside

[Figure: Linecards connected to the Switch by optical links, 100s of metres long.]

More 4th Generation Routers

• Alcatel 7670 RSP
• Juniper TX8/T640
• Avici TSR
• Cisco CRS-1

Power consumption per chassis

[Figure: power consumption per chassis by year, 1990 to 2004; vertical axis from 0 to 16.]

A Typical High Speed Router Today

CIOQ Architecture
• Input/output speedup ≤ 2

Input interface
• Perform packet forwarding (and classification)

Output interface
• Perform packet (classification and) scheduling

Backplane
• Switching fabric; speedup N
• Schedule packet transfer from input to output

5th Generation routers?

Load-balancing over passive optics

• Electronic processing at R a
• Very scalable. Petabits?

© Nick McKeown 2006