VPLS
Yaakov (J) Stein November 2004
Chief Scientist
RAD Data Communications
Tunneling Ethernet
VPNs
MPLS and PWs
L2VPNs
LDP vs. BGP
Generalizations
L3VPNs
Y(J)S VPLS Slide 2
Ethernet is the most popular LAN, but Ethernet cannot be made into a WAN
– Ethernet is limited in distance between stations
– Ethernet is limited in number of stations on a segment
– Ethernet is inefficient in finding the destination address
– Ethernet only prunes network topology, it does not route
so the architecture that has emerged is Ethernet private networks connected by public networks of other types (e.g. IP)
[Figure: two LANs interconnected across a WAN]
this model is sensible when the traffic carries a single given higher layer
the Ethernet header is removed at ingress and a new header is added at egress
this model is not transparent Ethernet LAN interconnect
– Ethernet LANs with multiple higher layer packet types (e.g. IPv4, IPv6, IPX, SNA, CLNP, etc.) can't be interconnected
– raw L2 Ethernet frames cannot be sent
– the Ethernet layer is terminated at WAN ingress
– the traffic is no longer Ethernet at all
[Figure: Ethernet LANs on both sides, with a WAN that is not Ethernet in between]
users with multiple sites want to connect their LANs so that all locations appear to be on the same LAN
this requires tunneling of all Ethernet L2 frames (not only IP) between one LAN and another
the entire Ethernet frame needs to be preserved (except perhaps the FCS, which can be regenerated at egress)
[Figure: Ethernet LANs connected across a network X that carries Ethernet inside X]
for simplicity, let's think of an IP network

the traditional architecture uses the following packet formats:
on each LAN:  | Eth hdr | IP hdr | payload | Eth FCS |
in the WAN:   | WAN L2 hdr | IP hdr | payload |

the VPN model (Ether-IP) uses the following packet formats:
on each LAN:  | Eth hdr | IP hdr | payload | Eth FCS |
in the WAN:   | WAN L2 hdr | IP hdr | Eth hdr | IP hdr | payload | Eth FCS * |
Ethernet frames can be carried over various WANs
– HDLC: not standardized, Cisco-HDLC
– FR: RFC 2427 / STD 55 (ex RFC 1490)
– ATM: RFC 2684 (ex RFC 1483), LANE
the entire Ethernet frame (or IP packet) is used as payload
Ethernet over SONET/SDH (EoS) and low-rate TDM
the entire Ethernet frame is placed in the SONET/SDH payload
Formats:
– Generic Framing Procedure (GFP) [SDH and OTN: G.7041, PDH: G.8040]
– Virtual Concatenation (VC) with/without Link Capacity Adjustment Scheme (LCAS)
– Link Access Procedure for SDH (LAPS)
unlike POS, EoS allows bandwidth sharing between Ethernet ports
but SONET/SDH is an expensive infrastructure
Service Providers (SPs) with packet switched networks (PSNs) want to offer customers a site interconnect service
since the private networks are interconnected over a public PSN, this results in a Virtual Private Network
unlike the traditional WAN architecture, the entire Ethernet frame must be tunneled through the PSN
hence it is sometimes called Transparent LAN Service (TLS)
[Figure: VPN reference model: customer network, Customer Edge (CE), AC, Provider Edge (PE), provider network, Provider Edge (PE), AC, Customer Edge (CE), customer network; the figure contrasts a physical link with an emulated link between customer networks]
AC = Attachment Circuit
[Figure: provider network of P and PE devices interconnecting the sites of customer 1 and customer 2, each site made up of C devices behind a CE]
Key
C   Customer router/switch
CE  Customer Edge router/switch
P   Provider router/switch
PE  Provider Edge router/switch
[Figure: customer networks attached to the SP network; the address 192.115.243.19 appears in more than one customer network, alongside 192.115.243.79]
– proprietary leased-line (not virtual)
– Frame Relay over E1/T1
– ATM over E1 or multiple-E1
– IPSec tunnel
– L2TP tunnel
– 2547bis
– VPWS / VPLS
– routing (topology determination) using L3 (IP) protocols
– path setup (label binding and distribution)
– data forwarding
– speeds up forwarding
– decreases forwarding table size (by using local labels)
– load balance by explicitly setting up paths
– complete separation of routing and forwarding algorithms, so no new routing algorithm is needed (but a new signaling algorithm may be needed)
MultiProtocol Label Switching (MPLS)
– is multiprotocol - from above and below
– can run on IP router or ATM switch with only SW upgrade (but HW helps)
– supports a label stack
– support for traffic engineering and QoS guarantees
[Figure: a Label Switched Path (LSP) from the ingress Label Edge Router (LER) through Label Switched Routers (LSRs) to the egress LER; L3 routers attach to the LERs over L3 links; downstream and upstream directions are marked]
label switching is needed in the core, access can be L3 forwarding *
the core interfaces the access at the edge (ingress, egress)
LSR  router that can perform label switching
LER  LSR with non-MPLS neighbors (LSR at edge of core network)
LSP  unidirectional path used by label switched forwarding (ingress to egress)
* not every packet needs label switching (e.g. only small number of packets, no QoS)
unlike TCP, the connection-oriented (CO) layer lies under the connectionless (CL) layer
if there is a broadcast L2 (e.g. Ethernet), the CO layer lies above it
layer stack, top to bottom:
– higher layers
– layer 3 (e.g. IP)
– label switching (layer 2.5), carried in a shim header
– layer 2 (e.g. Ethernet)
– physical layer
hence, MPLS switching is sometimes called layer 2.5 switching
when a shim header is needed, its format should be:
| Label (20b) | Exp (3b) | S (1b) | TTL (8b) |
Label: there are 2^20 different labels (+ 2^20 multicast labels)
Exp (CoS): left undefined by IETF WG; was CoS in Cisco Tag Switching; could influence packet queuing
Special (reserved) labels
0  IPv4 explicit null
1  router alert
2  IPv6 explicit null
3  implicit null
Stack bit: S=1 indicates bottom of label stack
  S=0 top label
  S=0 another label
  S=0 yet another label
  S=1 bottom label
TTL: decrementing hop count used to eliminate infinite routing loops; generally copied from/to IP TTL field
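The shim header layout can be made concrete with a few lines of Python. This is only a sketch of the bit packing described on this slide; the function names are invented for illustration and the default TTL of 64 is an arbitrary assumption.

```python
import struct

def pack_shim(label, exp=0, s=0, ttl=64):
    """Pack one 32-bit shim header: Label(20b) | Exp(3b) | S(1b) | TTL(8b)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return struct.pack("!I", (label << 12) | (exp << 9) | (s << 8) | ttl)

def unpack_shim(data):
    """Return (label, exp, s, ttl) from the first 4 bytes of data."""
    (word,) = struct.unpack("!I", data[:4])
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF

def pack_stack(labels, ttl=64):
    """Pack a label stack, top label first; only the last entry gets S=1."""
    last = len(labels) - 1
    return b"".join(pack_shim(label, s=int(i == last), ttl=ttl)
                    for i, label in enumerate(labels))

def unpack_stack(data):
    """Read shim headers until the bottom-of-stack bit; return (labels, rest)."""
    labels, offset = [], 0
    while True:
        label, _exp, s, _ttl = unpack_shim(data[offset:offset + 4])
        labels.append(label)
        offset += 4
        if s:
            return labels, data[offset:]
```

For example, `pack_stack([100, 200, 300])` yields three 4-byte entries, and only the last one has S=1.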
binding:
– label assigned by downstream LSR
– per port or per LSR label space
– control driven vs. data driven (traffic driven)
distribution:
– upstream label distribution
– piggyback label distribution on routing protocols (e.g. BGP)
– Label Distribution Protocol (LDP)
forwarding:
– read top label L
– consult Incoming Label Map (forwarding table)
– perform label stack operation (pop L, swap L → M, swap L → M and push N)
– forward based on L's Next Hop Label Forwarding Entry
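The forwarding steps above can be sketched as a table lookup followed by a stack operation. The data structure and the next-hop names below are hypothetical and chosen only for illustration; a real LSR keeps considerably more state per entry.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Nhlfe:
    """Simplified Next Hop Label Forwarding Entry (illustrative fields only)."""
    op: str                        # "pop", "swap" or "swap_push"
    next_hop: str                  # hypothetical next-hop name
    swap_to: Optional[int] = None  # M in the slide's notation
    push: Optional[int] = None     # N in the slide's notation

def forward(stack: List[int], ilm: Dict[int, Nhlfe]) -> Tuple[List[int], str]:
    """Apply one LSR hop to a label stack (top label first)."""
    top, rest = stack[0], stack[1:]   # read top label L
    entry = ilm[top]                  # consult the Incoming Label Map
    if entry.op == "pop":
        new_stack = rest
    elif entry.op == "swap":
        new_stack = [entry.swap_to] + rest
    elif entry.op == "swap_push":
        new_stack = [entry.push, entry.swap_to] + rest
    else:
        raise ValueError(f"unknown operation {entry.op!r}")
    return new_stack, entry.next_hop

# Hypothetical Incoming Label Map for one LSR.
ILM = {
    17: Nhlfe("swap", "lsr-b", swap_to=42),
    18: Nhlfe("pop", "ler-c"),
    19: Nhlfe("swap_push", "lsr-d", swap_to=50, push=60),
}
```

With this table, `forward([17, 99], ILM)` swaps only the top label and leaves the rest of the stack untouched.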
[Figure: customers 1 and 2 attached to the SP network, both using the address 192.115.243.19; packet format in the SP network: | MPLS label | IP header | payload |]
assume customers 1 and 2 use overlapping IP addresses
then C-routers have inconsistent tables
ingress PE-router pushes a label
P-routers see only the MPLS label, not the LAN IP addresses, so there is no ambiguity
PE-routers know how to map CE LANs
[Figure: CEs attached by ACs to two PEs, which are connected through P routers]
each LAN mapped to pair of (unidirectional) LSPs
supports all LAN traffic types (CE is Ethernet switch, not IP router)
each Ethernet frame encapsulated with MPLS label
supports various AC technologies
scaling problem: requires large number of LSPs
P-routers need to reserve resources for each LAN instance
[Figure: CEs attached by ACs to two PEs, which are joined by a transport tunnel carrying PWs]
transport MPLS tunnel set up between PEs
multiple PWs may be set up inside tunnel
PWs are bidirectional
Ethernet frame encapsulated with 2 labels:
| MPLS (outer) label | PW (inner) label | Ethernet frame |
P-routers do not reserve resources for each VPN instance
encapsulation via “Martini drafts” draft-ietf-pwe3-xxx-encap
L2 can be Ethernet, but also ATM or FR
setup via PW control protocol draft-ietf-pwe3-control-protocol
based on targeted LDP
Problems:
– supports only point-to-point LAN interconnect (VPWS)
– need to manually configure PW for every VPN instance
– need to set up 2 unidirectional tunnels for every pair of PEs
| outer label | inner label | control word | Ethernet frame |
• outer label specifies MPLS tunnel
• inner label contains PW label, to support multiple Ethernet PWs in a single MPLS tunnel
• optional control word
  – enables detection of out-of-order and lost packets
  – format: | 0000 | reserved | Sequence Number (16b) |
• Ethernet frame
  – by default no FCS trailer (but there is a separate “FCS retention” draft)
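As a rough illustration of this layout, the following Python sketch builds an Ethernet PW packet from a frame. It assumes the control word is one 32-bit word (four zero bits, twelve reserved bits, 16-bit sequence number) and reuses the 32-bit shim format shown earlier; the function names and the default TTL are invented for the example.

```python
import struct

def shim(label, s, ttl=64, exp=0):
    """One MPLS shim entry: Label(20b) | Exp(3b) | S(1b) | TTL(8b)."""
    return struct.pack("!I", (label << 12) | (exp << 9) | (s << 8) | ttl)

def control_word(seq):
    """Assumed 32-bit control word: 0000 | reserved (12b) | sequence number (16b)."""
    return struct.pack("!I", seq & 0xFFFF)

def encapsulate(frame, tunnel_label, pw_label, seq=None):
    """Prepend outer (tunnel) label, inner (PW) label and optional control word.

    The frame is expected without its FCS trailer, per the default above.
    """
    header = shim(tunnel_label, s=0) + shim(pw_label, s=1)  # PW label is bottom of stack
    if seq is not None:
        header += control_word(seq)
    return header + frame

def decapsulate(packet, has_control_word):
    """Return (pw_label, sequence number or None, Ethernet frame)."""
    (inner,) = struct.unpack("!I", packet[4:8])
    pw_label = inner >> 12
    if has_control_word:
        (cw,) = struct.unpack("!I", packet[8:12])
        return pw_label, cw & 0xFFFF, packet[12:]
    return pw_label, None, packet[8:]
```

The egress PE uses only the inner label to pick the PW, which is why several PWs can share one tunnel.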
[Figure: two CEs, each attached by an AC to a PE, with the PEs connected across the provider network]
Virtual Private Wire Service (VPWS) is a L2 point-to-point service
it emulates a wire supporting the Ethernet physical layer
set up MPLS tunnel between PEs
set up Ethernet PW inside tunnel
CEs appear to be connected by a single L2 circuit
(can also make VPWS for ATM, FR, etc.)
[Figure: three PEs, each with a CE attached by an AC; for clarity only one VPN is shown]
VPLS emulates a LAN over an MPLS network
set up MPLS tunnel between every pair of PEs (full mesh)
set up Ethernet PW inside tunnels, for each VPN instance
CEs appear to be connected by a single LAN
PE must know where to send Ethernet frames … but this is what an Ethernet bridge does
[Figure: three PEs, each containing a VPLS module (V) and a bridging module (B), with a CE attached to each]
a VPLS-enabled PE has, in addition to its MPLS functions:
– VPLS code module (IETF drafts)
– Bridging module (standard IEEE 802.1D learning bridge)
SP network (inside rectangle) looks like a single Ethernet bridge!
Note: if CE is a router, then PE only sees 1 MAC per customer location
PE maintains a separate bridging module for each VPN (VPLS instance)
VPLS bridging module must perform:
– MAC learning
– MAC aging
– flooding of unknown MAC frames
– replication (for unknown/multicast/broadcast frames)
unlike a true bridge, Spanning Tree Protocol (STP) is not used
– limited traffic engineering capabilities
– scalability limitations
– slow convergence
forwarding loops are avoided by split horizon
– PE never forwards packet from MPLS network to another PE
– not a limitation since there is a full mesh of PWs, so always send directly to the right PE
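The bridging behaviour listed above (learning, aging, flooding, split horizon) can be sketched as a small Python class. This is a toy model, not the draft's specification: the class name, the port names and the 300-second aging time are assumptions made for the example.

```python
class VplsBridge:
    """Per-VPLS-instance learning bridge inside one PE (illustrative only)."""

    def __init__(self, acs, pws, max_age=300.0):
        self.acs = set(acs)      # attachment circuits toward CEs
        self.pws = set(pws)      # pseudowires toward the other PEs
        self.max_age = max_age   # aging time in seconds (assumed value)
        self.table = {}          # MAC -> (port, time last seen)

    def _age(self, now):
        """MAC aging: drop entries not refreshed within max_age."""
        self.table = {mac: (port, seen)
                      for mac, (port, seen) in self.table.items()
                      if now - seen <= self.max_age}

    def receive(self, in_port, src, dst, now):
        """Return the set of ports on which a copy of the frame is sent."""
        self._age(now)
        self.table[src] = (in_port, now)           # MAC learning
        is_group = int(dst.split(":")[0], 16) & 1  # multicast/broadcast bit
        known = self.table.get(dst)
        if known and not is_group:
            out = {known[0]}                       # known unicast
        else:
            out = self.acs | self.pws              # flood and replicate
        out = out - {in_port}
        if in_port in self.pws:
            out = out - self.pws                   # split horizon: never PW to PW
        return out
```

A frame that arrives on a PW is therefore only ever delivered to local ACs, which is what removes the need for STP in the full mesh.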
Y(J)S VPLS Slide 31
VPLS signaling: establishes PWs between PEs per VPLS
VPLS autodiscovery: locates PEs participating in VPLS instance
encapsulation: obtains frame from bridge, encapsulates Ethernet frame, and injects packet into PW
decapsulation: retrieves packet from PW, removes PW encapsulation, and forwards Ethernet frame to bridge
[Figure: three CEs attached to three PEs around a provider network marked “?”]
in L2VPN
– CEs appear to be connected by single L2 network
– PEs are transparent to L3 routing protocols
– CEs are routing peers
in L3VPN
– CE routers appear to be connected by a single L3 network
– CE is routing peer of PE, not remote CE
– PE maintains routing table for each VPN
mechanisms may be simplified if Ethernet frames carry only IP traffic
enables upgrade of IP routers to support VPLS-like services
in this case CE devices are routers, not switches
frames are still forwarded based on MAC DA (not L3VPN)
but MAC forwarding tables updated via PW signaling, not 802.1D
PE snoops IP and ARP frames to discover CEs connected to it
– creates (AC, VPN-ID, IP-addr, MAC-addr) entry
– creates PWs to all PEs participating in VPN-ID
– sends entries to these PEs
Address Resolution Protocol (ARP) messages are proxied rather than being carried transparently
– PE searches entries it has received
can support different AC types (Ethernet and FR)
– ARP Mediation ensures proper mapping
both use TCP for reliable transport (LDP uses UDP for hellos)
both are hard-state protocols
both use TLV format for parameters
BGP
– multiprotocol (IPv4, IPv6, IPX, MPLS)
– highly complex protocol
– provides routing / label distribution
– built-in autodiscovery mechanism
LDP
– MPLS only
– simpler protocol
– only label distribution
– extendable for autodiscovery
| header (19B): marker (16B), length (2B), type (1B) | data (variable) |
marker: can be used for authentication (TCP MD5 signature)
length: total BGP PDU length, including header
type:
– OPEN (for session initialization)
– UPDATE (add, change and withdraw routes)
– NOTIFICATION (return error messages, terminate session)
– KEEPALIVE (heartbeat)
KEEPALIVE packet consists of 19B header only
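The header layout is simple enough to build and parse by hand. The sketch below uses the type codes from the BGP-4 specification (OPEN=1, UPDATE=2, NOTIFICATION=3, KEEPALIVE=4) and an all-ones marker, which is what BGP-4 uses when the marker carries no authentication data; the helper names are invented for the example.

```python
import struct

MARKER = b"\xff" * 16   # all ones when the marker is not used for authentication
HEADER_LEN = 19
TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

def build_message(msg_type, data=b""):
    """marker (16B) | length (2B, includes header) | type (1B) | data."""
    return MARKER + struct.pack("!HB", HEADER_LEN + len(data), msg_type) + data

def parse_message(pdu):
    """Return (type name, data) after checking the length field."""
    _marker, length, msg_type = struct.unpack("!16sHB", pdu[:HEADER_LEN])
    if length < HEADER_LEN or length != len(pdu):
        raise ValueError("bad length field")
    return TYPES[msg_type], pdu[HEADER_LEN:]

keepalive = build_message(4)   # header only, no data
```

Here `keepalive` is exactly 19 bytes long, matching the statement above.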
Y(J)S VPLS Slide 37
idle – no session (awaiting session initialization)
connect – attempting to connect to peer
active – started TCP 3-way handshake (router busy)
open sent – have sent OPEN message
open confirm – after receiving OPEN message from peer (awaiting KEEPALIVE)
established – BGP session up and running
| version (1B) | my AS (2B) | hold time (2B) | BGP-ID (4B) | op len (1B) | opt parameters (variable) |
version – 3 or 4
my AS – identifier of autonomous system
hold time – max time (sec) between receipt of messages
BGP-ID – sender's BGP identifier
op len – length (bytes) of optional parameters
opt parameters – TLVs
| WR len (2B) | withdrawn routes (var) | PA len (2B) | path attributes (var) | NLRI (var) |
Withdrawn Routes – list of routes no longer to be used (NLRI format, see below)
Path Attributes – route specific information (see next page)
Network Layer Reachability Information – (classless) routing information
NLRI entry format: | len (1B) | prefix (variable) |
the NLRI is a list of address prefixes
each prefix must be masked from the left to the length specified
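A short Python sketch of the NLRI encoding may help. It relies on a detail from the BGP-4 specification that the slide does not spell out: the length byte counts bits, and the prefix occupies the minimum number of whole bytes needed to hold that many bits. The function names are invented, and only IPv4 prefixes are handled.

```python
import ipaddress

def encode_nlri(prefixes):
    """Encode IPv4 prefixes as a list of | len (1B, in bits) | prefix | entries."""
    out = bytearray()
    for text in prefixes:
        net = ipaddress.ip_network(text)   # raises ValueError if host bits are set
        nbytes = (net.prefixlen + 7) // 8  # smallest number of bytes holding the prefix
        out.append(net.prefixlen)
        out += net.network_address.packed[:nbytes]
    return bytes(out)

def decode_nlri(data):
    """Inverse of encode_nlri; returns prefixes as 'a.b.c.d/len' strings."""
    prefixes, i = [], 0
    while i < len(data):
        plen = data[i]
        nbytes = (plen + 7) // 8
        raw = data[i + 1:i + 1 + nbytes]
        if len(raw) != nbytes:
            raise ValueError("truncated NLRI")
        address = ipaddress.IPv4Address(raw + bytes(4 - nbytes))  # pad with zero bytes
        prefixes.append(f"{address}/{plen}")
        i += 1 + nbytes
    return prefixes
```

A /24 therefore costs four bytes on the wire and a /8 only two, which is why the prefix field is variable length.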
| flags (1B) | type code (1B) |
flag bits:
– W bit: if 1, attribute must be recognized by all BGP implementations; if W=1 and attribute is unrecognized, BGP sends notification and session is closed
– transitive bit: if 1 and attribute is unrecognized, it is passed along, else silently ignored; well-known attributes are always transitive
– partial bit: for optional transitive attributes only
– extended length bit: =0 attribute length is 1B, =1 length is 2B
| error code (1B) | error subcode (1B) | data (var) |
all notification messages cause BGP session to close
error codes include:
– message header error
– open message error
– update message error
– hold timer expired
– state machine error
– other fatal error
| header (10B): version (2B), length (2B), LDP-ID (6B) | messages (variable) |
version – presently 1
length – PDU length, excluding version and length fields
LDP-ID – identifies label space of sending LDP peer
– LSR-ID (4B): globally unique LSR ID
– label space ID (2B): for per-port label spaces (zero for per-platform label spaces)
messages – zero or more TLVs (see next page)
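The PDU header can be packed as follows. This is a sketch only; writing the 4-byte LSR-ID as a dotted quad (a router ID) is an assumption of the example, as are the function names.

```python
import ipaddress
import struct

def build_ldp_pdu(lsr_id, label_space=0, messages=b"", version=1):
    """version (2B) | length (2B) | LDP-ID (6B) | messages.

    The length field excludes the version and length fields themselves.
    """
    ldp_id = ipaddress.IPv4Address(lsr_id).packed + struct.pack("!H", label_space)
    length = len(ldp_id) + len(messages)
    return struct.pack("!HH", version, length) + ldp_id + messages

def parse_ldp_pdu(pdu):
    """Return (version, lsr_id, label_space, messages)."""
    version, length = struct.unpack("!HH", pdu[:4])
    lsr_id = str(ipaddress.IPv4Address(pdu[4:8]))
    (label_space,) = struct.unpack("!H", pdu[8:10])
    return version, lsr_id, label_space, pdu[10:4 + length]
```

An empty PDU from a per-platform label space is 10 bytes long with a length field of 6.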
| type (2B) | length (2B) | message-ID (4B) | mandatory parameters (variable) | optional parameters (variable) |
type: | U | message code |
U – unknown message bit; if message type is unknown to receiver:
  U=0 – receiver returns notification to sender
  U=1 – receiver silently ignores
length – message length, excluding type and length fields
Message-ID – unique ID for message (for matching with returned notification)
if there are mandatory parameters, they must appear in a specific order
optional parameters may appear in any order
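The message header and the handling of the U bit can be sketched in Python. The split of the type field into a 1-bit U flag and a 15-bit message code, and the KeepAlive code 0x0201, are taken from the LDP specification rather than from the slide; the function names and the small set of "known" codes are invented for the example.

```python
import struct

KEEPALIVE = 0x0201        # KeepAlive message code per the LDP specification
KNOWN_CODES = {KEEPALIVE}  # hypothetical: the only code this toy receiver knows

def build_message(code, message_id, params=b"", u=0):
    """type (U + code, 2B) | length (2B) | message-ID (4B) | parameters.

    The length field excludes the type and length fields themselves.
    """
    type_field = (u << 15) | (code & 0x7FFF)
    return struct.pack("!HHI", type_field, 4 + len(params), message_id) + params

def parse_message(data):
    """Return (u, code, message_id, params)."""
    type_field, length, message_id = struct.unpack("!HHI", data[:8])
    return type_field >> 15, type_field & 0x7FFF, message_id, data[8:4 + length]

def handle(data):
    """Decide what a receiver does with a message, following the U-bit rule."""
    u, code, _message_id, _params = parse_message(data)
    if code in KNOWN_CODES:
        return "process"
    return "ignore" if u else "notify"
```

A sender that is unsure whether its peer supports a new message type would set U=1 so that older peers drop it quietly.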
Hello (UDP, for discovery)
Initialization (specifies LDP version, label space range, parameters)
KeepAlive (heart beat)
Notification (error, e.g. unsupported version, unknown/malformed msg, timer expired)
Address (LSR advertises its interface IP address(es) to peers)
Address Withdraw (LSR revokes previously advertised interface IP address)
Label Mapping (downstream LSR advertisement of a label mapping for a FEC)
Label Withdraw (downstream LSR informing that binding is revoked)
Label Request (upstream LSR request for binding in downstream-on-demand mode)
Label Release (upstream LSR informing that binding no longer needed)
Label Abort Request (upstream LSR asks to revoke request before satisfied)
LSR periodically transmits hello UDP messages
– multicast to “all routers on subnet” group
– targeted to preconfigured IP address
LSRs listen on the LDP UDP port (646) for hello messages
when LSR receives hello from another LSR
– it opens a TCP connection to that other LSR, or (for extended discovery)
– it unicast transmits a hello back to the other LSR
LSR with higher ID sends session initialization message
other LSR accepts (sends keepalive) or rejects
informative or keepalive messages are then sent
customers may want their SP to take an active role in managing their networks
Provider Provisioned VPN (PPVPN) refers to a VPN for which the SP participates in management and provisioning
by provisioning we mean (at least):
– setting up the ACs (often manual configuration)
– assigning global VPN-ID to VPN instances
– discovery of all PEs that participate in a VPN instance
– associating AC with VPN at PE
– providing PEs with information needed to set up tunnels
– configuring tunnels with necessary characteristics
we have assumed that each PE knows which PEs participate in a particular VPN instance
manual configuration is problematic logistically
autodiscovery refers to automatically finding all PEs in a given VPN
each PE "discovers" other PEs by means of some protocol
– BGP (to be discussed later)
– RADIUS (Remote Authentication Dial In User Service)
  CEs = RADIUS users, PEs = Network Access Servers (NAS)
  PE can authenticate CEs and find other PEs
– targeted LDP (“Stokes draft”, now abandoned)
  advertise FEC in LDP
  new TLV in label mapping message contains VPN-id, P or PE, capabilities
a PW is a bidirectional entity (two LSPs in opposite directions)
a PW connects two forwarders
PW setup via targeted LDP signaling
2 different LDP TLVs can be used
– PWid FEC
– Generalized ID FEC
to use PWid FEC
– both sides of PW provisioned with a unique (32b) value
– each PW endpoint independently initiates LSP set up
– LSPs bound together into a single PW
for each forwarder we have a PE-unique Attachment Identifier (AI)
<PE, AI> must be globally unique
frequently useful to group a set of forwarders into an attachment group, where PWs may only be set up among members of a group
then Attachment Identifier (AI) consists of
– Attachment Group Identifier (AGI) (which is basically a VPN-id)
– Attachment Individual Identifier (AII)
the LSPs making up the PW are
< PE1, (AGI, AII1), PE2, (AGI, AII2) > and
< PE2, (AGI, AII2), PE1, (AGI, AII1) >
we also need to define
– Source Attachment Identifier (SAI = AGI+SAII)
– Target Attachment Identifier (TAI = AGI+TAII)
receiving PE can map TAI uniquely to AC
Double Sided Provisioning
– each AC provisioned with local name, remote PE address, and remote name
– during signaling, local name is sent as SAII, remote name as TAII (AGI = null)
– to connect 2 ACs by a PW: local name = remote name (PWid FEC), or local name of each must be remote name of the other
Single Sided Provisioning with Discovery
– each AC provisioned with local name (VPN-id) and AII
– during signaling, local name is sent as AGI
– to connect 2 ACs by a PW:
  both must have the same VPN-id
  only one needs to be provisioned with remote name (local name of other AC)
  neither needs to be provisioned with the address of the remote PE
– during auto-discovery procedure:
  each PE advertises its <VPN-id, local AII> pairs
  each PE compares its local <VPN-id, remote AII> pairs with <VPN-id, local AII> pairs from other PEs
  if match then need to connect
  local name sent as SAII, remote AII sent as TAII, VPN-id as AGI
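The matching step of single sided provisioning can be sketched as a comparison of sets of pairs. The data layout, PE names and VPN-ids below are hypothetical, and the sketch assumes that the SAII sent in signaling is the AC's local AII.

```python
def discover_pws(local_acs, advertised):
    """Match local <VPN-id, remote AII> pairs against advertised <VPN-id, local AII> pairs.

    local_acs:  list of (vpn_id, local_aii, remote_aii or None) for this PE's ACs
    advertised: dict mapping remote PE name -> set of (vpn_id, aii) pairs
    Returns one signaling request per match, with the AGI/SAII/TAII to send.
    """
    requests = []
    for vpn_id, local_aii, remote_aii in local_acs:
        if remote_aii is None:
            continue  # this side was not provisioned with a remote name
        for pe, pairs in sorted(advertised.items()):
            if (vpn_id, remote_aii) in pairs:
                requests.append({"remote_pe": pe, "AGI": vpn_id,
                                 "SAII": local_aii, "TAII": remote_aii})
    return requests

# Hypothetical example: only one side knows the remote name.
local_acs = [("vpn-7", "site-A", "site-B"), ("vpn-7", "site-C", None)]
advertised = {"PE2": {("vpn-7", "site-B")}, "PE3": {("vpn-8", "site-B")}}
```

Note that the address of the remote PE is never provisioned; it falls out of the match against that PE's advertisement.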
every VPLS instance is assigned a unique VPN-id
PEs are preconfigured or find each other using auto-discovery
if PE detects VPN-id to which it belongs, it sets up a PW
during signaling
– VPN-id is sent as the AGI field
– SAII and TAII are set to null
ex “Lasserre-VKompella draft”, now draft-ietf-l2vpn-vpls-ldp
authors: Marc Lasserre (Riverstone) and Vach Kompella (Alcatel)
supported by Cisco, Nortel, Alcatel, Riverstone, Extreme, Luminous, Corrigent, Hatteras, Overture, RAD
use LDP for
– PW setup and tear-down signaling
– explicit withdrawal of MACs (force relearning)
full mesh of targeted LDP sessions between VPLS-enabled PEs
automatically establish a full mesh of Ethernet PWs
participating PE sends an unsolicited label mapping message to every other PE, specifying VPN-ID (preferably with generalized PWid FEC element)
if receiving PE accepts, it sends a label mapping message back
ex “Kompella draft”, now draft-ietf-l2vpn-vpls-bgp
authors: Kireeti Kompella, Yakov Rekhter (Juniper)
uses BGP4 (with multiprotocol extensions) for:
– autodiscovery (uses Route Target extended community as VPN-ID)
– PW setup and tear-down (uses Network Layer Reachability Information)
– force MAC relearning (uses Relearn Sequence Number TLV)
protocol essentially identical to RFC2547bis (to be discussed later)
define demultiplexor = VPN-ID + ingress PE
VPLS Edge (VE) advertises VPLS NLRIs for each VPLS instance
NLRI defines demultiplexors for all PEs in VPLS instance
extended attribute encodes PE capabilities
if new PE joins VPLS
– new NLRI seamlessly adds new label, or
– coalesce to a single NLRI with temporary service disruption
PE sets up PW when it receives an NLRI for VPLS
to leave VPLS instance, PE withdraws NLRI
remote PEs remove PWs
[Figure: CEs attached to U-PEs, which connect across an access network to N-PEs at the edge of the VPLS core; CEs also attached directly to PEs]
L2VPN framework allows decomposition of PE
User-Facing PE (U-PE) performs Bridge (B) functions
– MAC learning, forwarding decisions
Network-Facing PE (N-PE) performs VPLS (V) functions
– establishes tunnels, PWs
U-PE is inexpensive CLE, good for MTU (multi-tenant unit) applications
[Figure: hierarchical VPLS (HVPLS): MTUs attached by spokes to PEs, with the PEs forming the VPLS core]
straight VPLS has a problem: on the order of N^2 PWs are used,
which means N^2 LDP sessions, and N^2 floods and replications
to improve scalability, can use hub-and-spoke topology
if VPLS is in multi-tenant buildings, local PE is MTU
HVPLS PEs are full mesh, but do not perform bridging
spoke PW set up between PE and MTU (note end-point is virtual bridge)
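A little arithmetic shows the size of the saving. The sketch below counts bidirectional PWs for a single VPLS instance and assumes, for simplicity, that every MTU has exactly one spoke PW to one hub PE; both function names are invented for the example.

```python
def full_mesh_pws(n):
    """PWs needed to fully mesh n PEs: one per pair, i.e. n(n-1)/2, which grows as N^2."""
    return n * (n - 1) // 2

def hvpls_pws(hub_pes, mtus):
    """Full mesh among the hub PEs plus one spoke PW per MTU (assumed single-homed)."""
    return full_mesh_pws(hub_pes) + mtus

# 100 edge devices in a flat mesh versus 10 hub PEs serving 90 MTUs.
flat = full_mesh_pws(100)
hierarchical = hvpls_pws(10, 90)
```

With these numbers the flat design needs 4950 PWs and the hierarchical one 135.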
Y(J)S VPLS Slide 59
Y(J)S VPLS Slide 60
presently most popular provider managed VPN
originally specified in RFC 2547, updated in draft called 2547bis
transports IPv4 (IPv6) traffic in MPLS tunnels
uses BGP for route distribution, since SPs commonly use BGP for routing
2547 is not an overlay model
– CE routers at different sites are not routing peers
– they do not directly exchange routing information
– they don’t even need to know of each other
– so customer needn’t manage a backbone or virtual backbone
– no inter-site routing problems
Y(J)S VPLS Slide 61
only PE routers maintain VPN information
P routers needn’t maintain any customer routing information
C routes either manually configured in PE or advertised to PE using BGP, OSPF, etc.
PE advertises routes to remote PEs using BGP
remote PEs advertise routes to their CEs using BGP, OSPF, etc.
IP address overlap solved using Route Distinguisher (RD)
[Figure: C routers behind CEs attached to PEs, which are joined through P routers; CE is an IP router; CE is peer to PE, not peer to remote CE; packet format in the SP network: | SP label | ext label | IP packet |]
Virtual router (peering) model, not tunneling
PE maintains Virtual Route Forwarding (VRF) table for each VPN
BGP (with multiprotocol extensions) used for label distribution
in order to support private IP addresses, PE prepends 8B Route Distinguisher (unique to site) to IP address
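How the Route Distinguisher removes address overlap can be shown in a few lines. The sketch uses the type 0 RD layout from the BGP/MPLS IP VPN specification (2B type, 2B AS number, 4B assigned number); the AS number 65000 and the assigned numbers are made-up values.

```python
import ipaddress
import struct

def make_rd(asn, assigned_number):
    """8-byte Route Distinguisher, type 0: type (2B) | AS number (2B) | assigned number (4B)."""
    return struct.pack("!HHI", 0, asn, assigned_number)

def vpn_ipv4(rd, address):
    """12-byte VPN-IPv4 address: 8B RD followed by the 4B IPv4 address."""
    return rd + ipaddress.IPv4Address(address).packed

# Two customers using the same address end up with distinct VPN-IPv4 addresses.
customer1 = vpn_ipv4(make_rd(65000, 1), "192.115.243.19")
customer2 = vpn_ipv4(make_rd(65000, 2), "192.115.243.19")
```

The last four bytes of `customer1` and `customer2` are identical, but the 12-byte values differ, so the PE can keep both routes.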
Y(J)S VPLS Slide 63
L2VPN                              | L3VPN
C switch connects to L2 circuits   | C router peers with SP router
BGP or LDP                         | BGP
all L3 traffic types               | limited to IP traffic
only Ethernet L2                   | supports different L2 technologies
Cs responsible for routing         | SP responsible for routing
“overlay model”                    | “peer model”
simple customer-SP interface       | complex customer-SP interface
C peering scales as VPN size       | C peering independent of VPN size
scaling problem                    | scales well