Virtual LAN as a Network Control Mechanism
Tzi-cker Chiueh
Computer Science Department
Stony Brook University
EdgeNet2006 Summit
Ethernet Routing
- Spanning tree topology
- Source learning to populate the forwarding table
- Flood (broadcast) a frame if the destination MAC is not in the forwarding table
- Question: how to control the routes on large L2 networks of commodity Ethernet switches? Answer: VLAN
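To make the learning/flooding behavior concrete, here is a minimal Python sketch of a learning switch; the LearningSwitch class, port numbers, and MAC strings are illustrative, not something from the talk.

```python
# A minimal sketch of Ethernet source learning and flooding.
# Ports and MAC addresses are placeholders.

class LearningSwitch:
    def __init__(self, ports):
        self.ports = set(ports)   # physical ports on the switch
        self.fdb = {}             # forwarding table: source MAC -> ingress port

    def receive(self, frame, in_port):
        # Source learning: remember which port the sender lives behind.
        self.fdb[frame["src"]] = in_port

        # Known destination: forward on the learned port only.
        out = self.fdb.get(frame["dst"])
        if out is not None and out != in_port:
            return [out]

        # Unknown destination (or broadcast): flood on every other port.
        return [p for p in self.ports if p != in_port]


sw = LearningSwitch(ports=[1, 2, 3, 4])
print(sw.receive({"src": "aa:aa", "dst": "bb:bb"}, in_port=1))  # flood: [2, 3, 4]
print(sw.receive({"src": "bb:bb", "dst": "aa:aa"}, in_port=2))  # learned: [1]
```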
Virtual LAN (IEEE 802.1Q)
- Originally proposed to support multiple IP subnets on an L2 network without L3 routers
- A VLAN limits the scope of a broadcast packet
- 4-byte 802.1Q header inserted between the source MAC address and the Type/Length field
  - 2-byte 802.1Q tag type (TPID) = 0x8100
  - 3 bits for priority (IEEE 802.1P)
  - 1 bit for the Canonical Format Indicator
  - 12 bits for the VLAN ID
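As an illustration of the header layout above, here is a small Python sketch that packs and parses the 4-byte 802.1Q tag; the field values (priority 5, VLAN 100) are arbitrary examples.

```python
# Build and parse the 4-byte 802.1Q tag: 2-byte TPID (0x8100) followed by the
# 2-byte TCI holding priority (3 bits), CFI (1 bit), and VLAN ID (12 bits).
import struct

def pack_dot1q(priority: int, cfi: int, vlan_id: int) -> bytes:
    tci = (priority & 0x7) << 13 | (cfi & 0x1) << 12 | (vlan_id & 0xFFF)
    return struct.pack("!HH", 0x8100, tci)

def unpack_dot1q(tag: bytes):
    tpid, tci = struct.unpack("!HH", tag)
    assert tpid == 0x8100, "not an 802.1Q tag"
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF

tag = pack_dot1q(priority=5, cfi=0, vlan_id=100)
print(tag.hex())            # 8100a064
print(unpack_dot1q(tag))    # (5, 0, 100)
```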
VLAN in Practice
- The 802.1Q tag is added at the hosts or at the edge switches
- Packets are exchanged between two VLANs through a router
- Conceptually, each VLAN is like a physical LAN that has its own
  - Spanning tree
  - L2 routing table
- 802.1S allows a per-VLAN spanning tree
- The number of VLANs supported in real switches is in the hundreds
- VLAN membership is specified per port or per host
- Configuration can be done through SNMP, web requests, or the CLI (see the SNMP sketch below)
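As a concrete illustration of the SNMP option, here is a hedged sketch that creates a static VLAN entry through the standard Q-BRIDGE-MIB (RFC 4363) using the pysnmp high-level API; the switch address, community string, VLAN ID, and name are placeholders, and real switches may require vendor-specific MIBs instead.

```python
# A sketch only: assumes a switch that implements Q-BRIDGE-MIB (RFC 4363)
# and the classic pysnmp 4.x hlapi. All addresses and values are placeholders.
from pysnmp.hlapi import (
    SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, OctetString, Integer, setCmd,
)

VLAN_ID = 100
STATIC_NAME = "1.3.6.1.2.1.17.7.1.4.3.1.1"        # dot1qVlanStaticName
STATIC_ROW_STATUS = "1.3.6.1.2.1.17.7.1.4.3.1.5"  # dot1qVlanStaticRowStatus

error_indication, error_status, error_index, var_binds = next(setCmd(
    SnmpEngine(),
    CommunityData("private"),                 # write community (placeholder)
    UdpTransportTarget(("192.0.2.1", 161)),   # switch management address (placeholder)
    ContextData(),
    ObjectType(ObjectIdentity(f"{STATIC_NAME}.{VLAN_ID}"), OctetString("viking-vlan-100")),
    ObjectType(ObjectIdentity(f"{STATIC_ROW_STATUS}.{VLAN_ID}"), Integer(4)),  # createAndGo
))
print(error_indication or "VLAN %d created" % VLAN_ID)
```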
Viking Project
- Goal: a network resource management system for a campus-wide L2 network backbone or Metro Ethernet services
- A large number of low-port-density switches vs. a small number of high-port-density switches
  - Larger geographic coverage
  - More cost-effective (economies of scale)
  - More redundancy at the physical connectivity level
  - Higher aggregate backplane throughput
Problem with Existing Ethernet
- Main problem: a single spanning tree
  - Inefficient link utilization (redundant links are blocked)
  - Inflexible routing
  - Long failure recovery times
Traffic Engineering
- Constantly measure the traffic load matrix
- Compute an active-backup path pair for each node pair to balance load among links and use shorter links whenever possible → a mesh rather than a tree
- Force a path's route by setting up a dedicated logical VLAN for it → ATM-like behavior on Ethernet
- Need to combine multiple logical VLANs into one physical VLAN, which corresponds to a spanning tree; the active and backup paths belong to different VLANs (a path-computation sketch follows)
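A minimal sketch of the active/backup path computation under stated assumptions: it uses networkx, an illustrative four-switch topology, and static link costs in place of the measured load matrix, so it only gestures at Viking's actual algorithm.

```python
# Compute an active path and a link-disjoint backup path per node pair.
# Topology and costs are illustrative placeholders.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("s1", "s2", 1), ("s2", "s3", 1), ("s1", "s4", 2),
    ("s4", "s3", 2), ("s2", "s4", 1),
])

def active_backup(graph, src, dst):
    # Active path: shortest by link cost.
    active = nx.shortest_path(graph, src, dst, weight="weight")
    # Backup path: shortest path that avoids the active path's links.
    pruned = graph.copy()
    pruned.remove_edges_from(zip(active, active[1:]))
    try:
        backup = nx.shortest_path(pruned, src, dst, weight="weight")
    except nx.NetworkXNoPath:
        backup = None
    return active, backup

active, backup = active_backup(G, "s1", "s3")
print("active:", active)   # ['s1', 's2', 's3']
print("backup:", backup)   # ['s1', 's4', 's3']
```

Each of the two paths would then be mapped to its own logical VLAN, as the slide describes.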
Big Picture
- Each host in a single IP subnet participates in multiple VLANs, and uses different VLANs to reach different destinations
- Fast failure recovery: switch to a different 802.1S VLAN to reach a destination when the current VLAN fails (a host-side sketch follows)
  - The failure recovery time of the Viking prototype is less than 500 msec, most of which is spent on SNMP trap delivery
- Next step: edge-based traffic shaping and 802.1P for QoS guarantees
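To illustrate the per-destination VLAN switching idea, here is a hedged host-side sketch; the VlanSelector class, destination addresses, and VLAN IDs are hypothetical, not the Viking prototype's code.

```python
# Each destination maps to an (active VLAN, backup VLAN) pair; a failure
# notification (e.g. relayed from an SNMP trap) flips the mapping.

class VlanSelector:
    def __init__(self):
        # destination host -> (active VLAN, backup VLAN); values are placeholders
        self.paths = {
            "10.0.0.7": (100, 200),
            "10.0.0.9": (101, 201),
        }

    def vlan_for(self, dst):
        return self.paths[dst][0]

    def on_vlan_failure(self, failed_vlan):
        # Move every destination currently using the failed VLAN to its backup.
        for dst, (active, backup) in self.paths.items():
            if active == failed_vlan:
                self.paths[dst] = (backup, active)

sel = VlanSelector()
print(sel.vlan_for("10.0.0.7"))   # 100
sel.on_vlan_failure(100)          # e.g. triggered by a failure notification
print(sel.vlan_for("10.0.0.7"))   # 200
```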
IGMP Snooping
- Why: avoid using L2 broadcast when supporting L3 multicast
- How: snoop on IGMP packets to infer an L2 distribution tree for an IP multicast group on top of an L2 network's spanning tree (a snooping-table sketch follows)
- Supported by most commodity Ethernet switches
- Real switches can only track a small number of IP multicast groups
- Configuration: send IGMP packets to the root, which acts as the default router
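The per-group state that IGMP snooping maintains can be sketched as follows; the IgmpSnooper class, group address, and port numbers are illustrative, not the switches' actual data structures.

```python
# Per-switch IGMP snooping state: multicast group -> set of member ports,
# updated by membership reports and leaves; data frames go only to members.
from collections import defaultdict

class IgmpSnooper:
    def __init__(self):
        self.members = defaultdict(set)   # group IP -> member ports

    def on_membership_report(self, group, port):
        self.members[group].add(port)

    def on_leave(self, group, port):
        self.members[group].discard(port)

    def forward_ports(self, group, in_port):
        # Forward only to member ports instead of flooding the VLAN.
        return sorted(self.members[group] - {in_port})

snoop = IgmpSnooper()
snoop.on_membership_report("239.1.1.1", port=3)
snoop.on_membership_report("239.1.1.1", port=5)
print(snoop.forward_ports("239.1.1.1", in_port=1))   # [3, 5]
snoop.on_leave("239.1.1.1", port=3)
print(snoop.forward_ports("239.1.1.1", in_port=1))   # [5]
```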
Cassini Project
- Goal: leverage commodity Ethernet switches as the building block for a storage area network
- Multicast is an important primitive
- Idea: use VLAN/IGMP snooping to support tree-based L2 multicast
- Transparent Reliable Multicast (a sender-side sketch follows):
  - Multiple L3 connections (e.g. TCP) layered on top of an L2 multicast connection
  - ACK/retransmission on each individual L3 unicast connection
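Here is a hedged sender-side sketch of the Transparent Reliable Multicast bookkeeping described above; the class name, callbacks, and receiver names are placeholders rather than the Cassini implementation.

```python
# Data is sent once over the L2 multicast tree; ACKs and retransmissions
# use each receiver's own unicast connection. Callbacks are placeholders.

class ReliableMulticastSender:
    def __init__(self, receivers, send_multicast, send_unicast):
        self.receivers = set(receivers)
        self.send_multicast = send_multicast
        self.send_unicast = send_unicast
        self.unacked = {}        # seq -> (payload, receivers still missing it)
        self.next_seq = 0

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = (payload, set(self.receivers))
        self.send_multicast(seq, payload)          # one copy on the multicast tree

    def on_ack(self, receiver, seq):
        if seq not in self.unacked:
            return
        payload, missing = self.unacked[seq]
        missing.discard(receiver)
        if not missing:
            del self.unacked[seq]                  # every receiver has it

    def retransmit_timeouts(self):
        # Resend only to receivers that have not ACKed, over their unicast links.
        for seq, (payload, missing) in self.unacked.items():
            for receiver in missing:
                self.send_unicast(receiver, seq, payload)

sender = ReliableMulticastSender(
    receivers=["host-a", "host-b"],
    send_multicast=lambda seq, data: print("mcast", seq),
    send_unicast=lambda r, seq, data: print("ucast", r, seq),
)
sender.send(b"block-42")
sender.on_ack("host-a", 0)
sender.retransmit_timeouts()     # retransmits seq 0 to host-b only
```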
Conclusion
- Commodity Ethernet switches offer many innovative features that can be exploited
- CLI, SNMP, or HTTP provides the possibility of on-the-fly reconfiguration according to workload and/or hardware health status
- Interesting application scenarios:
  - Large-scale L2 networks
  - Storage area networks
  - Compute cluster interconnects: program-specific topology
Thank You!
Questions?
Mariner Project
- Goal: leverage advanced features of commodity Gigabit Ethernet switches to build scalable compute cluster interconnects (~1000 nodes)
- Programmable application-specific interconnect topology
- Fault management: asynchronous state checkpointing and pessimistic message logging
- Scalable multicast state management