Managed Virtual Private LAN Services

Enrique J. Hernandez-Valencia, Pramod Koppol, and Wing Cheong Lau

Within the data communications industry, there is renewed interest in a managed transport service that provides an optimized service interface to packet-oriented applications. This transport service is widely envisioned as an intelligent, flexible, and easily provisionable service platform that must also enable the dynamic creation of new revenue-generating data offerings for service providers. Recent standardization activities have focused on a managed virtual private network model as the basis for such a service. In particular, the so-called layer 2 VPN (L2VPN) service model based on Ethernet/MPLS is appealing because it combines the simplicity and ubiquity of Ethernet with the powerful service creation capabilities of MPLS. This paper discusses the various L2VPN architectures being proposed by the industry and identifies the key building blocks required to realize such services. © 2003 Lucent Technologies Inc.

Introduction

Customers with multiple geographically distributed networks have traditionally constructed virtual private networks (VPNs) by interconnecting their sites via leased circuits. Leased circuits could be physical (dedicated time division multiplexing [TDM] circuits) or virtual (frame relay/asynchronous transfer mode [FR/ATM]) point-to-point connections from a network service provider. Such inter-site networks are typically routed networks and need specialized networking tools and skilled staff to operate and maintain network connectivity. Another approach for connecting multiple sites is to realize an inter-site bridged network instead of a routed network. Such layer 2 multi-site local area networks (LANs) are the subject of this paper. Traditionally, two basic schemes have been employed to realize multi-site bridged LANs.
One scheme uses LAN extension mechanisms, where an Ethernet segment connecting two bridges, one at each of two geographically distributed sites, is extended via a point-to-point circuit or a virtual connection (VC) that provides Ethernet frame transport over a wide area network (WAN). The LAN extension mechanism is transparent to the network service provider. Yet it may turn out to be an expensive option for customers due to the cost of multiple point-to-point circuits. A second mechanism that addresses these issues is based on LAN emulation within the service provider network. Providers of LAN emulation services have typically used a star-like topology where the hub of the star is a multicast-capable server within the provider network and each site is a spoke. This mechanism makes better use of the network provider resources. Yet it has its own limitations, such as a single point of failure, limited scalability, and, so far, limited management and administration tools.

Bell Labs Technical Journal 7(4), 61–76 (2003) © 2003 Lucent Technologies Inc. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). • DOI: 10.1002/bltj.10034

Layer 2 VPNs (L2VPNs), including LAN extension mechanisms and virtual/physical leased lines to support virtual private networks, have traditionally been offered over ATM/FR and/or SONET/SDH transport networks. Given the increasing desire on the part of service providers to reduce their operational expenses, as well as to offer new revenue-generating services, two recent developments may enable them to offer managed L2VPN services. The first development is the introduction of IP/MPLS control plane enhancements that make it possible for service providers to offer LAN extension-like mechanisms and leased line-based virtual networks using the Virtual Pseudo-Wire Service (VPWS).
The second development is geared toward realizing a novel scheme, referred to as Virtual Private LAN Service (VPLS), an optimization of the traditional Transparent LAN Service (TLS) for managed Ethernet LANs. While the VPWS scheme is applicable to any mechanism based on non-broadcast connectivity, the VPLS scheme is specifically targeted toward Ethernet LANs. In the VPWS model, nothing changes for the L2VPN customers. Virtual circuits may be constructed from a variety of packet-switching technologies including IEEE VLANs, IP/GRE, and MPLS. The VPWS model is geared toward reducing operational costs for providers offering such a service. The VPLS model is a new L2VPN scheme wherein the service provider network acts as a distributed Ethernet switch. VPLS provides address learning and virtual bridging services but does not require the service provider to participate in the customer spanning tree computation procedures.

In this paper, we first discuss the basics of the VPWS and VPLS models. We then focus on the VPLS model and discuss the various alternative procedures proposed for address learning and configuration within the service provider network.

L2VPN Models

A generic model for an L2VPN-capable transport network is illustrated in Figure 1. The customer device attaching to the service provider network, whether an Ethernet/MPLS switch or an IP router, is referred to as the customer edge (CE) device. The service provider’s access device responsible for enabling the L2VPN service is referred to as the provider edge (PE) device. Any other devices in the provider network are referred to as provider (P) core/internal devices.

Figure 1. Generic reference model for an L2VPN. (Sites of VPN A and VPN B attach through customer edge devices and attachment circuits to provider edge devices at the boundary of an IP/MPLS backbone made up of provider core devices.)

Panel 1. Abbreviations, Acronyms, and Terms
AGI - attachment group identifier
AI - attachment identifier
ATM - asynchronous transfer mode
BGP - border gateway protocol
BPDU - bridge PDU
CE - customer edge
DNS - domain name server
D-TLS - decoupled TLS
FR - frame relay
GFP - generic framing procedure
H-VPLS - hierarchical VPLS
IEEE - Institute of Electrical and Electronics Engineers
IETF - Internet Engineering Task Force
IP - Internet protocol
L2PE - layer 2 PE
L2VPN - layer 2 VPN
LAN - local area network
LDP - label distribution protocol
LPE - logical provider edge
MAC - media access control
MPLS - multi-protocol label switching
PDU - protocol data unit
PE - provider edge
PPVPN - provider provisioned VPN
QoS - quality of service
RSVP - resource reservation protocol
SDH - synchronous digital hierarchy
TDM - time division multiplexing
TLS - Transparent LAN Service
VC - virtual connection
VCID - VC identifier
VFS - virtual forwarding/switching
VLAN - virtual LAN
VPLS - Virtual Private LAN Service
VPN - virtual private network
VPNID - VPN identifier
VPWS - Virtual Pseudo-Wire Service
VSI - virtual switch instance
WAN - wide area network

The VPWS Model

As a service offering from the point of view of a VPN customer, the VPWS model is not new. What is new is the open IP/MPLS control plane procedures specified to configure and manage the VPN service. These innovations, from an IP/MPLS provider’s point of view, enable a traditional layer 3 provider to enter the layer 2 managed services offering space. Significantly, the provider can do so in a much more operationally efficient and scalable manner. Figure 2 shows the basic components of a VPWS solution. Typically the PE device would support at least one virtual switch instance (VSI) per CE device.
The VSI would typically emulate a layer 2 repeater function. VPWS is the basic construct for what is referred to as CE-based L2VPNs. This VPN, based on the FR/ATM VPN model, assumes that usage of the VPN VCs is under control of the CE, not the PE. Various transport models can be emulated, including emulated Ethernet bridging services. Further characteristics of such L2VPNs are discussed elsewhere [1, 14].

Figure 2. Basic VPWS components. (Each VSI emulates a point-to-point connection, e.g., a virtual repeater.)

The VPLS Model

In this section, we describe the building blocks of the VPLS model and discuss the functionality needed at each of these building blocks. How this functionality is realized is the subject of the next section.

Figure 3 shows the basic components of a VPLS solution. Sites S1 through S3 from, say, customer A, are interconnected by the service provider via a VPLS instance referred to as VPN A. A similar configuration may be created to interconnect the sites from customer B. It is important to note that the VPLS scheme assumes that the packets transported across a provider network are Ethernet MAC frames. If this is not the case, appropriate encapsulation procedures need to be defined. For ease of exposition, we will assume that the customer network is an Ethernet-based layer 2 network.

Figure 3. Basic VPLS components. (VFSs in full mesh connectivity form a virtual distributed bridge across the provider network.)

Each site connects to the service provider cloud using an attachment circuit to a virtual forwarding/switching (VFS) entity within a provider edge (PE) device. Each PE may contain many VFSs. All VFSs within the provider network that have sites belonging to the same virtual private LAN are required to have full mesh connectivity among them. A virtual connection identifier, referred to as a VC label, identifies a connection between two VFSs. VC labels are local identifiers. Thus, the tuple (PE 1, PE 2, VC label, VC type) uniquely identifies each virtual connection. A PE may contain multiple VFSs, and two PEs may have a number of VCs that connect VFSs within the PEs. Such VCs are carried within tunnels established between the two PEs. Note that there can be multiple tunnels between any two PEs.

The use of tunnels achieves at least two important objectives: first, it alleviates any scalability concerns associated with realizing a large number of VCs within the network; and, second, it facilitates separation of the control plane procedures used to establish tunnels and VCs. The second objective is important because, although tunnels are generic in nature, VCs as defined in this context are solely for the purpose of realizing a specific service, in this case VPLS. By separating the control plane procedures for tunnels and VCs, the provider can reuse existing tunneling mechanisms while building new service-specific VC establishment procedures. Such procedures will be implemented on all the PE devices and will be transparent to the other nodes within the provider network. Details of the VPLS model are the subject of the next section.

VPLS Detailed Overview

In this section we discuss the functional requirements associated with Ethernet L2VPNs, their MAC address learning and bridging constraints, and their implications for L2VPN-based bridge emulation services.
L2VPN Functional Requirements

In the VPLS model, the following are questions of interest:
• When a customer MAC frame arrives at a PE, how are the corresponding attachment circuit and VFS identified?
• How does a VFS learn and “unlearn” customer MAC addresses?
• When a new site is added to a VPN, how do all PEs involved in the VPN come to know of it?
• Are all customer MAC frames treated equally or do special procedures need to be followed for some special MAC frames (such as control frames)?

Besides endpoint connectivity, two other generic aspects of VPLS require special consideration:
• How is multiplexing/de-multiplexing of VCs into tunnels performed?
• Do all attachment circuits need to be of the same kind?

The answer to the first of these two questions depends on the properties of the provider network. In an IP/MPLS network, a tunnel can be constructed from various technologies, including MPLS. The VC is implemented as an MPLS label. In the case where both the tunnel and VC are constructed via MPLS, a two-level label stack would be required. In other cases, for example when the tunnel is constructed via IP, IP-over-IP procedures can be used. If the network is a hybrid Ethernet/TDM transport network [4], then a tunnel may be viewed as a “block” of preset TDM circuits where each component circuit maps to a VC label. Clearly, there are network utilization issues that may make VPLS over a particular transport technology unrealistic. A more realistic scenario in which a transport network is involved is in the provision of an attachment circuit.

The answer to the second question is a qualified no. Clearly, an access link into a service provider network may be viewed as a sub-network in its own right as long as it is able to deliver the native customer protocol data units (PDUs). Yet, the control plane mechanisms may not be readily available in all kinds of technologies.
In this paper, however, we assume that all access links are implemented via native Ethernet PHYs. Details of other types of attachment circuits are given elsewhere [7].

Generalizing the above requirements into control and data plane mechanisms leads us to the following loose classification of functionality:
• Control plane functions: Tunnel and VC related signaling, VPN membership discovery, and address exchange (if any).
• Data plane functions: MAC address (un)learning, Ethernet frame encapsulation/de-capsulation, broadcasting/multicasting, VLAN processing, and traffic management (if any).

In this paper, we focus on the primary functions of MAC address learning and VPN membership management, including endpoint identification and auto-discovery.

MAC Address Learning

Since an Ethernet-based L2VPN must emulate a layer 2 broadcast domain, in theory, unicast communications between any pair of end hosts within an L2VPN can be accomplished via VPN-wide broadcasts. However, unnecessary broadcasts should be avoided in order to conserve bandwidth. It is therefore necessary for L2VPN services to support MAC address learning, especially for multipoint L2VPNs. The purpose is to establish the mapping between an end host, identified by its MAC address and/or IEEE VLAN-tag, and the VPN endpoint that it resides behind. Before we begin our discussion of the various alternative approaches to providing MAC address learning in L2VPNs, let us review how MAC address learning is supported in conventional bridged networks. We will focus on the architectural properties that are required to realize MAC address learning.

Ethernet bridging. In a conventional bridged network, a unicast frame is delivered via a layer 2 broadcast throughout the entire bridged network (or the entire VLAN) if the whereabouts of the destination end host, identified by its MAC address (and VLAN-tag), are unknown.
When a frame arrives at each intermediate bridge along its path, the mapping between its source MAC address, say M, and its ingress port/interface i.d. (or equivalently, the i.d. of the ingress physical link) is recorded, i.e., learned, by the bridge. Once a bridge learns this (MAC address, port i.d.) binding, it can use it to determine the egress port/link of a future incoming frame with destination MAC address M. However, in order for this learning scheme to work, the following conditions must be satisfied:

1. Bidirectional connectivity: The connectivity between any pair of neighboring bridges must be bidirectional. This condition is required to ensure the existence of symmetric routes for frame delivery. In other words, for any pair of end hosts, say S and D, within the network, if the ordered list of intermediate bridges taken by a frame to travel from S to D is B1, B2, B3, ..., Bn, the reverse path, i.e., Bn, ..., B3, B2, B1, should exist and be able to carry frames from D to S.

2. Path uniqueness: There must be a unique, bidirectional route, i.e., the ordered list of bidirectional links, for data delivery between any pair of end hosts within the network. This condition is to ensure the one-to-one nature of the mapping between a MAC address and its associated ingress/egress port/link i.d. for each intermediate bridge along the data path. Conditions 1 and 2 together guarantee the existence of a unique bidirectional route between each pair of end hosts. This, in turn, guarantees that the (MAC address, port/link i.d.) mapping learned from the forward data path can be used for egress port determination of the frames traveling on the reverse path.

3. Controlled flooding of broadcast frames: The delivery of a broadcast frame within the network must not lead to excessive/uncontrollable flooding.
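The learning and forwarding behavior described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not an implementation of any standard: the class name `LearningBridge`, the integer port numbers, and the string MAC addresses are all hypothetical, and aging of learned entries is omitted.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningBridge:
    """Toy transparent bridge (hypothetical model): learns MAC -> port bindings."""

    def __init__(self, active_ports: List[int]) -> None:
        # Only ports lying on the spanning tree ("active" ports) forward frames.
        self.active_ports = list(active_ports)
        self.table: Dict[str, int] = {}

    def receive(self, in_port: int, src: str, dst: str) -> List[int]:
        """Return the egress ports for a frame that arrived on in_port."""
        # Learn: the source MAC is reachable through the ingress port.
        self.table[src] = in_port
        out = self.table.get(dst)
        if dst == BROADCAST or out is None:
            # Broadcast or unknown destination: flood on every active port
            # except the arriving one.
            return [p for p in self.active_ports if p != in_port]
        # Known destination: use the learned port, but never send a frame
        # back out of the port it arrived on.
        return [] if out == in_port else [out]
```

For instance, with active ports 1, 2, and 3, a first frame from host `aa` to an unknown host `bb` arriving on port 1 is flooded to ports 2 and 3; once `bb` replies on port 2, later frames from `aa` to `bb` leave on port 2 only. The reverse-path reuse of the learned binding is exactly what conditions 1 and 2 make safe.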
In a conventional bridged network, conditions 1 through 3 are satisfied by forming a logical spanning tree topology to deliver data frames over a physical network of bridges joined by bidirectional physical links. Special control messages, referred to as bridge protocol data units (BPDUs), are flooded over the network to discover physical connectivity among the bridges, the bridge i.d.’s, and the administrative weight associated with each link. Algorithms such as IEEE 802.1D [5] are used to generate the logical connectivity tree for unicast forwarding purposes. During the formation of the spanning tree, a selected set of physical bridge ports is put in standby mode and restrained from forwarding data frames while the rest of the bridge ports remain active for forwarding. Connectivity/reachability between any pair of bridge ports in the network is guaranteed by the spanning nature of the data delivery tree. These properties, together with the bidirectional nature of the physical links between adjacent bridges, ensure that condition 1 is satisfied. Condition 2 is satisfied by the definition of the logical tree topology. The loop-free property of a spanning tree also allows for efficient support of network-wide layer 2 broadcast while eliminating the possibility of excessive flooding. In particular, when a bridge receives a broadcast frame from one of its ports, it simply sends a copy of the frame to all of its active ports, except the arriving one. By active, we mean a port that lies along a branch of the spanning tree.

Bridging in L2VPNs. To support MAC address learning in L2VPNs, the architectural requirements outlined by conditions 1 to 3 still apply. In addition, the following constraints are imposed:
• The bidirectional physical connectivity between each pair of neighboring bridges is now replaced by bidirectional virtual circuit connectivity between the neighboring PEs.
In practice, this bidirectional virtual circuit may also be realized by a pair of unidirectional virtual connections. For example, in the case of MPLS, the bidirectional virtual circuit is realized in the form of two unidirectional VC-LSPs, possibly having different paths between the neighboring PEs.
• The role of layer 2 bridges in conventional bridged networks is replaced by PEs supporting the L2VPN service. By L2VPN PE, we mean a network element that operates on the layer 2 header of a user frame (as compared to a P device, which only processes the headers of the virtual circuit and/or the outer tunnel layer).

Notice that it is not necessary for neighboring PEs to be physically connected to each other. Two PEs are considered to be neighbors in the context of L2VPN provided that (1) they are connected by some bidirectional virtual circuit(s) and (2) the virtual circuit passes through one or more intermediate network elements (P devices) that do not operate on the layer 2 header of the user frames, e.g., do not perform MAC address learning or MAC-address-based forwarding. To further enhance scalability and network manageability while supporting L2VPN, multiple virtual circuits between the same pair of neighboring PEs can be multiplexed into a single tunneling virtual circuit (also known as the outer tunnel). The identity of an individual VC is visible at its terminating PEs, but not at the intermediate P device(s), which only operate at the outer tunnel level.

Realizing MAC Address Learning in L2VPN

Under the multiple virtual circuit setup discussed above, MAC address learning in L2VPN can be realized in various ways, as discussed below.

Approach 1: MAC address learning through a spanning tree consisting of PEs connected via bidirectional virtual circuits. This approach generalizes the scheme employed by conventional bridged Ethernet networks. The role of a physical bridge port/link i.d. is replaced by the identifier of a bidirectional virtual circuit.
As a frame is transmitted from its ingress PE to its egress PE, it may traverse multiple VC hops, each connecting a pair of neighboring PEs. Every intermediate PE, if any, along the data path will need to perform MAC-address-based forwarding and MAC address learning. MAC address learning is conducted by recording the binding between the source MAC address, say M, of a user data frame arriving at a PE through an incoming unidirectional virtual circuit and the ID of the associated VC, say VC1. Once this mapping is created, any future data frame arriving at the same PE with destination MAC address M will be forwarded to the outgoing VC that forms a bidirectional virtual circuit pair with VC1.

Since this approach is a direct generalization of the spanning tree approach taken by conventional bridged networks, it also inherits its constraints. These include:
• Limited flexibility to perform traffic engineering due to the logical spanning tree constraint, which potentially results in inefficient network utilization.
• The need for an algorithm to maintain the spanning tree in an automatic, distributed, and fault-tolerant manner.
• Relatively long recovery time upon network failure. This is mostly dictated by the convergence time of the spanning tree algorithm (this deficiency is addressed by faster tree management procedures such as the IEEE Rapid Spanning Tree Protocol/IEEE 802.1w), especially when applied to a WAN environment.
• Scalability concerns with the spanning tree approach as the number of PEs and L2VPNs within a network increases.
• The need for new/special hardware support in all PEs along the data path. By new/special, we mean support beyond what is required in conventional layer 2 bridges/switches. In the context of MPLS-based L2VPNs, special hardware is needed to learn the binding between the incoming VC-label of a frame and its corresponding source MAC address.
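As a rough illustration of the learning step in approach 1, the following Python sketch records the binding between a source MAC address and the incoming unidirectional VC, then resolves a destination MAC address to the outgoing VC paired with it. The class name `SpanningTreePe` and the VC identifiers are hypothetical, and the pairing table is assumed to be known to the PE in advance.

```python
from typing import Dict, Optional


class SpanningTreePe:
    """Toy PE for approach 1 (hypothetical model): learning keyed by incoming VC."""

    def __init__(self, vc_pairs: Dict[str, str]) -> None:
        # Maps each incoming unidirectional VC to the outgoing VC that forms
        # a bidirectional virtual circuit pair with it (assumed pre-configured).
        self.vc_pairs = dict(vc_pairs)
        self.mac_to_in_vc: Dict[str, str] = {}

    def learn(self, src_mac: str, in_vc: str) -> None:
        # Record the binding (source MAC, incoming VC).
        self.mac_to_in_vc[src_mac] = in_vc

    def egress_vc(self, dst_mac: str) -> Optional[str]:
        # A frame destined for a learned MAC leaves on the VC paired with
        # the VC on which that MAC was seen; None means "not yet learned".
        in_vc = self.mac_to_in_vc.get(dst_mac)
        return None if in_vc is None else self.vc_pairs[in_vc]
```

A `None` result corresponds to the unknown-destination case, where the frame would have to be flooded along the spanning tree.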
From a practical perspective, the spanning-tree-based approach has, so far, not been adopted by any L2VPN proposals.

Approach 2: Local address learning plus mapping distribution to remote PEs. Under this approach, when a layer 2 data frame arrives at a PE from a customer-facing interface, its corresponding L2VPN, as well as the ingress VPN endpoint identifier, is first determined based on pre-provisioned L2VPN information. Information including the incoming physical port i.d., as well as layer 2 or higher-layer header information of the frame, can be used for its classification. The ingress PE then reads the source MAC address of the frame and binds it to its corresponding L2VPN endpoint. This is sometimes referred to as the local learning procedure. Subsequently, the PE distributes the locally learned mapping of MAC addresses and VPN endpoint identifiers to other PEs in the network that are hosting the same set of L2VPNs. The mapping distribution is usually done via an out-of-band control channel. For instance, in proposals such as the transparent virtual LAN service by Lasserre [8], extensions to LDP are proposed to carry the mapping (MAC address, VPN endpoint) via target-specific LDP sessions. Such sessions are set up in advance between any pair of PEs that host different endpoints of an L2VPN. The mappings (MAC address, VPN endpoint) of multiple VPNs hosted by the same PE pair are multiplexed into a single target-specific LDP session. The same LDP sessions are also used for distributing the mapping between VPN endpoints and their corresponding inner MPLS labels, also known as the VC labels. When a remote PE receives the mapping (MAC address, VPN endpoint identifier), it uses these mappings to provide unicast service as follows: When a frame arrives at a PE from a customer-facing port, its destination MAC address is used as the key to look up its destination VPN endpoint from the first mapping.
Once the destination VPN endpoint is identified, the corresponding VC-label is looked up from the second mapping and the frame can then be delivered toward its VPN endpoint using the VC-label (as well as other existing outer-tunnel mechanisms, if applicable).

One advantage of this approach is that it can be readily implemented using standard layer 2 bridging hardware. This is because local MAC address learning is done only at the customer-facing ports of the ingress PEs. Furthermore, the approach imposes no constraints on the data-path connectivity/forwarding mechanism amongst the PEs for the support of MAC address learning. In contrast, if MAC address learning has to be performed along the data path within the provider network (as described in the next subsection), it would require either (1) a loop-free data forwarding topology, i.e., a spanning tree, connecting the PEs or (2) other additional forwarding constraints on the data path. For instance, in the case of [8, 9], a full mesh of tunnel LSPs is required to create one-VC-hop data paths amongst all the PEs and every PE must also implement the so-called split-horizon mechanism to eliminate excessive flooding of broadcast frames. (By split-horizon, we mean different treatment of broadcast frames depending on the type of ingress interface/port. To be more specific, if the broadcast frame arrives from a customer-facing interface, the PE should broadcast it to all other endpoints, both local and remote, of the corresponding L2VPN. However, if the broadcast frame arrives from a network-facing interface, the PE should only broadcast it to the local endpoints of the corresponding L2VPN, i.e., those residing on the same PE.)

On the other hand, the use of software-based remote address-mapping distribution may increase unnecessary broadcast traffic (and thus bandwidth consumption) due to delay in transmitting/propagating the remote address mapping.
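The split-horizon rule described above can be sketched in a few lines of Python. This is an illustrative model only: the class name `VplsPe` and the endpoint and PE names are hypothetical, and a single L2VPN per PE is assumed for brevity.

```python
from typing import List, Set


class VplsPe:
    """Toy PE applying split-horizon to broadcast frames (hypothetical model)."""

    def __init__(self, local_endpoints: Set[str], remote_pes: Set[str]) -> None:
        self.local_endpoints = set(local_endpoints)  # customer-facing endpoints
        self.remote_pes = set(remote_pes)            # reached over network-facing VCs

    def broadcast_targets(self, ingress: str) -> List[str]:
        """Return where a broadcast frame received on `ingress` is replicated."""
        if ingress in self.local_endpoints:
            # Customer-facing ingress: replicate to all other local endpoints
            # and to every remote PE of the same L2VPN.
            others = sorted(self.local_endpoints - {ingress})
            return others + sorted(self.remote_pes)
        if ingress in self.remote_pes:
            # Network-facing ingress: replicate to local endpoints only,
            # never back into the full mesh.
            return sorted(self.local_endpoints)
        raise ValueError(f"unknown ingress interface: {ingress}")
```

Because every PE is one VC hop from every other PE in the full mesh, the ingress PE alone is responsible for reaching all remote PEs, so a receiving PE that never forwards back into the mesh cannot create a loop.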
The frequency of remote address mapping distribution has to be carefully determined based on the tradeoffs among
• Bandwidth overhead due to unnecessary broadcast as a result of delayed propagation of remote mapping information,
• Bandwidth overhead for distributing the remote mapping, and
• CPU loading on PEs to support remote address mapping distribution.

If the control channel, i.e., the LDP session, has a different route than the data path between a pair of PEs, additional service failure modes may occur when either one, but not both, of the paths fails. (One may argue, however, that such problems need to be handled anyway since the control channel usually needs to support other additional functions such as VC-label distribution between PEs.) Regarding the support of L2VPN-wide broadcast operation, the full-mesh PE-to-PE configuration is not as efficient as the spanning tree approach: in the full-mesh approach, data frames have to be duplicated at the ingress PE to support broadcasting, whereas the spanning tree approach can defer frame duplications further down the tree to conserve bandwidth.

Approach 3: Local plus remote MAC address learning at the edge PEs only. In this approach, a full mesh of VCs is first established between all pairs of PEs sharing the same set(s) of VPNs to ensure one-hop PE-to-PE data delivery within the provider network. Such a setup eliminates the need to create a logical spanning tree while still satisfying conditions 1 and 2. Condition 3, i.e., the elimination of excessive flooding during frame broadcast, is achieved by combining the one-hop PE-to-PE VC connectivity with the split-horizon forwarding constraint described in the previous section. With one-VC-hop PE-to-PE connections, there is no need for any core P device to perform MAC address learning. For a customer-connecting PE, local MAC address learning is performed in the same way as in a conventional bridged network for frames arriving from customer-facing ports.
Yet a PE still needs special/new hardware support to perform the so-called remote address learning, i.e., to record the binding between the source MAC address and the incoming virtual circuit identifier of data frames that arrive from a network-facing interface. Approach 3 is attractive in the sense that it requires neither the continual running of the spanning tree algorithm (as in approach 1) nor the remote address mapping distribution (as in approach 2). Nonetheless, the requirement for a full mesh of PE-to-PE VCs in the data-delivery topology invoked by approaches 2 and 3 does pose a serious scalability concern. Yet, the full-VC-mesh constraint has been the centerpiece of most basic (non-hierarchical) L2VPN proposals to date.

VPN Endpoint Identification

To define and support site-specific services, it is essential for the provider of an L2VPN service to be able to identify and reference the endpoints within an L2VPN. From the service provider perspective, a VPN endpoint is a customer-facing logical interface of an L2VPN that resides on one of the PEs within the provider network. For a point-to-point L2VPN, the two endpoints can be identified either by a network-wide unique identifier of the corresponding point-to-point VC or by two separate network-wide unique identifiers, one for each VPN endpoint. The former approach is taken by Martini [11] and Lasserre [8] while the latter approach is adopted by Rosen [13] and Lau [10]. In the Martini and Lasserre proposals, the so-called VCID is used to identify the VC of a point-to-point L2VPN, where the VCID must be network-wide unique. This VCID, in effect, serves as the VPN identifier (VPNID) in the case of a point-to-point L2VPN as there is only one bidirectional VC per L2VPN. In Rosen’s proposal, the attachment group identifier (AGI) can be used as a VPNID while the attachment identifiers (AI), together with the IP addresses of their corresponding hosting PEs, are used to uniquely differentiate endpoints within an L2VPN.
In Lau, each L2VPN is identified by a network-wide unique VPNID per RFC 2685, and the semantics of the VCID, as defined in Martini, is altered so that the VCID serves as an endpoint identifier within a VPN. As such, the VCID has to be unique only within the same VPN. While the endpoint-identifier approach taken by Rosen and Lau can be readily extended to support single-sided provisioning of multipoint L2VPNs, the corresponding extension of the VCID approach in Martini, viz., assigning a different VCID for each point-to-point VC within a multipoint L2VPN, is problematic and thus undesirable. This is because such an extension would require both sides of a VC to agree on the use of the same VCID in advance, which, in turn, would require the provisioning of O(N^2) VCIDs for an L2VPN with N endpoints. Worse still, this approach also makes single-sided provisioning impossible as the addition of a new endpoint to an existing VPN would require the provisioning of new VCIDs in all PEs participating in this VPN. The generalization of VCID semantics by Lau is an attempt to overcome the aforementioned cumbersome provisioning procedures for L2VPNs while minimizing required protocol changes (in terms of changes in LDP extensions) with respect to the Martini framework.

Alternatively, Lasserre and Kompella [9] address the problem by using the VCID field in Martini to convey the VPNID. Here, the same VCID (as well as VPNID) is provisioned at every endpoint of an L2VPN and there is no explicit identifier for each individual endpoint within an L2VPN. Although this approach does enable single-sided provisioning for multipoint L2VPNs, it forces a hosting PE to hide the identities of multiple endpoints of the same VPN from other remote PEs. The resultant inability to identify/reference individual VPN endpoints can lead to difficulties in supporting per-site quality of service (QoS) requirements, especially under the so-called pipe model [2].
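To make the O(N^2) provisioning cost concrete, the short Python sketch below compares the number of identifiers needed under the two schemes. It assumes, for illustration only, that the N endpoints are fully meshed and that each bidirectional point-to-point VC consumes exactly one agreed VCID; the function names are hypothetical.

```python
def per_vc_vcids(n_endpoints: int) -> int:
    """VCIDs needed when every point-to-point VC in a full mesh gets its own VCID."""
    # One VCID per unordered pair of endpoints: N(N-1)/2, i.e., O(N^2).
    return n_endpoints * (n_endpoints - 1) // 2


def per_endpoint_ids(n_endpoints: int) -> int:
    """Identifiers needed when each endpoint carries its own identifier."""
    # One identifier per endpoint: O(N).
    return n_endpoints


def vcids_added_for_new_endpoint(n_existing: int) -> int:
    """New VCIDs to provision when one endpoint joins n_existing endpoints."""
    # The newcomer needs one new VC, hence one new VCID, toward every
    # existing endpoint, so every existing endpoint must be touched.
    return per_vc_vcids(n_existing + 1) - per_vc_vcids(n_existing)
```

Under these assumptions a 10-endpoint L2VPN needs 45 per-VC VCIDs against 10 per-endpoint identifiers, and adding an eleventh endpoint requires 10 new VCIDs, one at each existing endpoint, which is why single-sided provisioning is not possible with per-VC identifiers.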
In general, the Lasserre–Kompella scheme also requires additional bridging operations to be performed at a destination PE that hosts more than one endpoint of the same L2VPN. (See references in Lau for further discussions on the implications and limitations of the Lasserre–Kompella scheme.)

VPN Membership Discovery

To enable automatic provisioning of L2VPN services, it is necessary for the provider network to support VPN endpoint/membership discovery. VPN endpoint/membership discovery amounts to the task of maintaining and/or distributing the following mapping within the provider network(s):

Mapping A: IDs of the list of VPNs and/or VPN endpoints → IP address of the hosting PE

One rudimentary way to realize VPN membership discovery is as follows. First, a full mesh of control channels is set up between all pairs of PEs within the provider network. Each PE then distributes the IDs of the VPNs, VPN endpoints, and/or VCs that it hosts to all other PEs within the network using the established full mesh of control channels. Based on the distributed information, each PE can construct and store the entire Mapping A of the network. Such an approach is proposed by Lasserre in [9], where a target-specific LDP session is set up between every pair of PEs within the network to serve as the control channel. LDP extensions based on the Martini framework are used for distributing VPNIDs, VPN endpoint identifiers, and/or VCIDs, together with the corresponding VC labels, via the downstream-unsolicited label distribution mode of LDP.

A key drawback of this rudimentary approach is its poor scalability due to the reliance on a full mesh of control channels among all PEs within the network. While this approach may be feasible for a small-scale deployment, it becomes burdensome as the network grows and/or when one has to support VPNs that span multiple provider networks or administrative domains.
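The rudimentary full-mesh scheme can be sketched in a few lines of Python. This is a simplified illustration and not the LDP-based mechanism itself: the advertisement format, the PE addresses (taken from the 192.0.2.0/24 documentation range), the VPN identifiers, and all function names are assumptions made for the example.

```python
from collections import defaultdict


def full_mesh_sessions(n_pes: int) -> int:
    """Control channels needed when every pair of PEs keeps one session."""
    return n_pes * (n_pes - 1) // 2


def build_mapping_a(advertisements):
    """Build Mapping A (VPN ID -> set of hosting PE addresses).

    `advertisements` is an iterable of (pe_ip, vpn_id) pairs, standing in for
    what each PE would announce over its control channels.
    """
    mapping = defaultdict(set)
    for pe_ip, vpn_id in advertisements:
        mapping[vpn_id].add(pe_ip)
    return dict(mapping)


def relevant_peers(mapping, local_pe):
    """PEs that share at least one VPN with `local_pe`."""
    peers = set()
    for members in mapping.values():
        if local_pe in members:
            peers |= members - {local_pe}
    return peers


if __name__ == "__main__":
    ads = [
        ("192.0.2.1", "vpn-a"), ("192.0.2.2", "vpn-a"),
        ("192.0.2.2", "vpn-b"), ("192.0.2.3", "vpn-b"),
        ("192.0.2.4", "vpn-c"),
    ]
    mapping_a = build_mapping_a(ads)
    print(full_mesh_sessions(4))
    print(sorted(relevant_peers(mapping_a, "192.0.2.1")))
```

In this toy network every PE stores all three VPN entries and takes part in six sessions, even though the PE at 192.0.2.1 shares a VPN with only one other PE.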
The full-mesh scheme also unnecessarily and inefficiently requires all PEs in the network to construct and maintain the entire Mapping A. Note that, in principle, a PE would only need to know the whereabouts of other PEs sharing the same VPN(s).

Recently, more scalable and efficient approaches have been proposed to support VPN membership discovery. In [15], extensions to BGP are proposed to support the distribution of Mapping A throughout the network. The key advantage of a BGP-based solution is the reuse of BGP facilities to support Internet-scale information distribution across multiple domains. A drawback is BGP’s own scalability and complexity. In [15], the originating PE uses an extended version of BGP to distribute the IDs of the list of VPNs and/or VPN endpoints that it hosts to the rest of the network. By introducing the notion of a VPNID extended community, one can use the BGP route-filtering mechanism to allow one PE to find out about other PEs sharing the same VPN(s). More recently, a DNS-based approach for VPN membership discovery was proposed by Heinanen [3], in which Mapping A is provisioned into, and maintained by, one or more DNS servers. In this case, a PE can find other PEs in the same VPN by issuing a DNS query. A key advantage of this approach is its simplicity: by reusing the readily available, distributed database infrastructure supported by the hierarchy of DNS servers, one can reap the benefit of Internet-wide scalability while avoiding the hassle of configuring and running BGP.

Enhancements to VPLS Connectivity Model

The connectivity model of the basic VPLS discussed earlier presumes a full mesh of tunnels among all communicating PEs. This connectivity model, typically referred to under the generic term non-decoupled distributed VPLS, creates a tight relationship between access network topology and the logical connectivity in the core network.
The model suffers from scalability and reliability concerns in that it places a heavy burden on the PEs in terms of the number of logical tunnels and the signaling overhead they must support. Solutions in this category include those proposed by Lasserre, V. Kompella [7] and Rosen [13]. An alternative deployment mode, referred to under the generic term distributed decoupled VPLS, promotes a looser coupling between access and core devices and addresses the scalability concerns. In this model the access and core transport networks are largely independent, and a full mesh of transport tunnels is not required among all PEs. For instance, the core network can be built out of IP/MPLS technology while the access network can be built from technologies such as FR/ATM, switched virtual LANs, or direct connectivity. Thus, the access network can be built from simpler devices such as Ethernet bridges and SONET/SDH multiservice platforms, while the core network can be built out of more sophisticated label-switched routers (which may or may not be MPLS enabled). Solutions in this category include the initial hierarchical VPLS model [6], the original decoupled TLS proposal [7], and the hybrid logical PE architecture [16].

Hierarchical VPLS

The hierarchical VPLS (H-VPLS) reference model is illustrated in Figure 4. Decoupling of PE functions is accomplished by creating a two-tier hierarchy of spoke and hub PEs. Spoke PEs connect directly to a selected number of hub PEs, say, via Martini-based VCs. The hub PEs perform the same layer 2 processing functions as the spoke PEs (e.g., MAC address learning) but also retain the traditional full mesh core connectivity of the non-decoupled VPLS.
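The scaling benefit of the two-tier hierarchy can be illustrated by counting VCs for a single VPN. The Python sketch below is a rough comparison under stated assumptions: every PE participates in the VPN, hub PEs are fully meshed, and each spoke PE attaches to a fixed number of hub PEs. The function and parameter names are hypothetical.

```python
def flat_full_mesh_vcs(n_pes: int) -> int:
    """VCs for one VPN when all PEs are fully meshed (non-decoupled model)."""
    return n_pes * (n_pes - 1) // 2


def hvpls_vcs(n_hubs: int, n_spokes: int, hubs_per_spoke: int = 1) -> int:
    """VCs for one VPN in a two-tier hub-and-spoke hierarchy.

    Hub PEs keep a full mesh among themselves; each spoke PE adds one VC
    per hub PE that it attaches to.
    """
    hub_mesh = n_hubs * (n_hubs - 1) // 2
    spoke_links = n_spokes * hubs_per_spoke
    return hub_mesh + spoke_links


if __name__ == "__main__":
    # 100 PEs in total, arranged either flat or as 10 hubs plus 90 spokes.
    print(flat_full_mesh_vcs(100))
    print(hvpls_vcs(n_hubs=10, n_spokes=90))
    print(hvpls_vcs(n_hubs=10, n_spokes=90, hubs_per_spoke=2))
```

Adding a spoke PE in the hierarchical case adds only its own spoke VCs, whereas adding a PE to the flat mesh adds one VC toward every existing PE.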
Spoke VCs may also be expanded to include any layer 2 tunneling mechanism other than MPLS. This aspect allows expanding the scope of the first tier of PEs to include non-bridging VPLS PE routers at the metro access/core network boundary. The non-bridging PE router may extend a spoke VC to the spoke PE via a layer 2 switch technology such as IEEE VLANs, FR, or ATM.

[Figure 4. H-VPLS reference model. Customer edge devices at the sites of VPN A and VPN B attach to spoke PE devices in the access networks; each spoke PE has one VC to a hub PE, and the hub PEs have full mesh connectivity across the IP/MPLS backbone.]

Decoupled TLS

The decoupled TLS (D-TLS) reference model is illustrated in Figure 5. As with the H-VPLS model, decoupling of PE functions is accomplished by creating a two-tier hierarchy of spoke and hub PEs. Here, however, customers attach to a layer 2 PE (L2PE), which acts as a metro/wide area bridge connecting the LAN at a customer site to other L2PEs and PEs that are connected to other LANs belonging to the same customer. Functions such as MAC address learning (including learning across the metro area from other L2PEs) and building a spanning tree, both on the LAN side and on the metro side, are relegated to the L2PE. Functions such as discovering other L2PEs connected to a given customer are relegated to the PE, but PEs do not participate directly in layer 2 processing functions (e.g., MAC address learning). Yet relevant VPLS configuration information, such as QoS, still needs to be exchanged between the PE and its L2PEs.

Logical PE

The logical PE (LPE) reference model is illustrated in Figure 6. There can be multiple switched Ethernet transport domains within a logical PE. A switched Ethernet transport domain is equivalent to a broadcast domain.
All the devices within the switched Ethernet transport domain receive Ethernet broadcast messages sent over the same switched Ethernet transport domain. Multiple switched Ethernet transport domains can be contained within a carrier’s metro access network. For example, a particular 802.1Q VLAN in a carrier’s metro access network A is a switched Ethernet transport domain. PE-cores are envisioned as highly scalable routers/Ethernet VPLS switches with IP/MPLS-based capabilities. PE-cores, like any other PEs, can provide VPLS, layer 2, and layer 3 VPN services. PE-cores are connected to each other over the core network (e.g., MPLS) through P devices, as in the usual VPLS architecture.

[Figure 5. D-TLS reference model. Customer edge devices attach to layer 2 PE (L2PE) devices in the access networks; PEs have full mesh connectivity across the IP/MPLS backbone, and L2PEs have full mesh connectivity via the PEs on the same VPLS.]

[Figure 6. Logical PE reference model. Customer edge devices attach to PE-edge devices (L2PEs) in the access networks; L2PEs have meshed connectivity, and PE-core devices have full mesh connectivity across the IP/MPLS backbone.]

Decoupled Proposals: A Brief Comparison

A brief comparison of the three main proposals for distributed VPLS can be made in terms of underlying network topology (connectivity), ease of configuration, and data forwarding characteristics.

• Connectivity: All of the extension proposals for VPLS are based on the concept of a full mesh of transport tunnels, and associated control channels, between PE devices.
VPN membership discovery and VC label distribution functions use a single protocol, but not necessarily the same one (LDP vs. MP-BGP). H-VPLS and D-TLS also assume point-to-point connectivity between hub and spoke PE devices. LPE, however, allows for multipoint connectivity between “spoke” PEs and their hub PE.

• Configuration: In H-VPLS and LPE, configuration data for each new spoke PE device is provisioned only at its hub PE. This information is then propagated automatically to other PEs supporting the same VPNs, via either a common protocol (MP-BGP in H-VPLS) or separate membership distribution protocols for spoke and hub PEs (in LPE). In D-TLS, however, configuration information is required at all hub PEs for each new spoke PE introduced into the network.

• Forwarding: In D-TLS and LPE, VFS instances exist only in the spoke PE devices. Thus, MAC address learning and VLAN tag processing need to happen only at the spoke PE devices. Since LPE supports multipoint connectivity, traffic between spoke PEs need not traverse the hub PE device. In H-VPLS, however, VFS instances, and hence MAC and VLAN tag processing, must be supported at both hub and spoke PEs.

VPWS/VPLS in SONET/SDH Transport Networks

The VPWS model, as discussed earlier, is mainly geared toward simplifying the operational procedures for service providers in establishing multisite full mesh connectivity. Partial mesh connectivity can also be achieved by applying policies or constraints to the control plane procedures used for neighbor discovery and VC/tunnel setup. Given the point-to-point nature of the VPWS model, it is easily extended to apply within a transport network. Indeed, the generalized port-based VPN [17] procedures recently proposed in the IETF have this as their objective. It must be noted here that the multisite VPN achieved using such procedures does not make any assumptions about what is carried in the bearer transport.
What is achieved is essentially a set of TDM pipes interconnecting multiple customer sites.

Ethernet LAN service over a transport network may also be feasible in the decoupled VPLS model and other similar models where the transport network may also be involved in address learning and broadcasting. Here the VC tunnels are built out of SONET/SDH channels. Packet-level multiplexing can be supported either via Ethernet VLAN tags or the generic framing procedure (GFP) tags [4]. The impediment to Ethernet LAN offerings over a transport network may be the perceived lack of statistical multiplexing gains in this model (except at the hub nodes). While this concern is valid for current implementations, it may not be a concern in the context of next-generation SONET/SDH metro networks. Studies reported in [12] can serve as a basis for addressing this concern.

QoS support in L2VPNs

Although VPN architectures proposed so far have mostly focused on providing a scalable control plane solution, the issue of QoS support has received limited attention. It is noteworthy that QoS support is a fundamental issue not only because it is a key feature and requirement for commercial deployment but also because of its potential architectural impact due to the interplay and tradeoffs between QoS support and scalability. For instance, in the H-VPLS architecture, the presence of multiple endpoints of the same VPN at a local PE is hidden from other remote PEs to enhance signaling scalability. Such a design, however, could hinder the support of the QoS pipe model between all pairs of endpoints within a VPN. Below, we highlight some of the QoS-related open issues that can have considerable impact on the current L2VPN proposals:

• The lack of QoS signaling support for inner VCs between VPN endpoints.
Most current solutions, such as the Martini approach, only apply aggregated resource reservation/QoS signaling (via RSVP-TE or CR-LDP) for the outer tunnel between a source/destination PE pair. The outer tunnel, which usually carries multiple inner VCs, is terminated at the loopback address of the destination PE. A separate signaling protocol, such as LDP or BGP (with appropriate extensions), is used to set up the inner VCs between the actual VPN endpoints, but without the ability to specify any QoS-related attributes. As a result, inner-VC-level resource reservation cannot be signaled or made within the destination PE. This, in turn, prevents the support of truly single-sided provisioning of L2VPNs under the QoS pipe model.

• The lack of binding between inner VCs and their outer tunnel. The use of different signaling protocols for setting up inner VCs and outer tunnels makes it difficult for the destination PE to establish such a binding. Notice that this binding is essential for the support of parallel traffic engineering tunnels between a pair of PEs. It would also play a key role in supporting fast restoration/re-routing of outer tunnels upon network failure.

• The support of multiple QoS levels within a VPN, facilitating, e.g., the coexistence of guaranteed bandwidth and best-effort services within a VPN.

In summary, much work remains to be done for the full support of QoS in managed VPNs. We refer readers to [2] for ongoing developments and future directions on this subject.

Closing Remarks

In this paper we have discussed the basic functional components of an L2VPN service. We also reviewed and critiqued various solutions currently under consideration in the IETF Provider Provisioned VPN Working Group.
Although proprietary implementations of some of the various proposals are already hitting the marketplace, most of the standardization work is still in progress, and further technical refinements to the various proposals should be expected. Given the strong interest from the vendor and service provider communities, and the pace of evolution of the L2VPN service definition, solution requirements, and framework, a comprehensive solution to L2VPN services is likely to emerge in the not-too-distant future.

References

[1] W. Augustyn (ed.), “Requirements for Virtual Private LAN Services (VPLS),” Internet Engineering Task Force, Mar. 2000, <http://www.ietf.org/internet-drafts/draft-ietf-ppvpn-vpls-requirements-01.txt>.
[2] F. Chiussi, J. Clerc, S. Ganti, W. Lau, B. Nandi, N. Seddigh, and S. Van den Bosch, “Framework for QoS in Provider Provisioned VPNs,” Internet Engineering Task Force, June 2002, <http://www.ietf.org/internet-drafts/draft-chiussi-ppvpn-qos-framework-00.txt>.
[3] J. Heinanen, “DNS/LDP Based VPLS,” Internet Engineering Task Force, Jan. 2002, <http://www.ietf.org/internet-drafts/draft-heinanen-dns-ldp-vpls-00.txt>.
[4] E. Hernandez-Valencia, “Hybrid Transport Solutions for TDM/Data Networking Services,” IEEE Commun. Mag., 40:5 (2002) 104–112.
[5] Institute of Electrical and Electronics Engineers, IEEE Standard for Information Technology, Telecommunications and Information Exchange Between Systems, IEEE Standard for Local and Metropolitan Area Networks, Common Specifications, Media Access Control (MAC) Bridges, IEEE 802.1D, 1998 ed. (ISO/IEC 15802–3:1998).
[6] S. Khandekar (ed.), “Hierarchical Virtual Private LAN Service,” Internet Engineering Task Force, June 2002, <http://www.ietf.org/internet-drafts/draft-khandekar-ppvpn-H-VPLS-mpls-00.txt>.
[7] K. Kompella (ed.), “Decouple Virtual Private LAN Service,” Internet Engineering Task Force, May 2002, <http://www.ietf.org/internet-drafts/draft-kompella-ppvpn-D-TLS-01.txt>.
[8] M.
Lasserre (ed.), “Transparent VLAN Services over MPLS,” Internet Engineering Task Force, Aug. 2001, <http://www.ietf.org/internet-drafts/draft-lasserre-tls-mpls-00.txt>.
[9] M. Lasserre and V. Kompella, “Virtual Private LAN Services over MPLS,” Internet Engineering Task Force, Mar. 2002, <http://www.ietf.org/internet-drafts/draft-lasserre-vkompella-ppvpn-vpls-01.txt>.
[10] W. Lau and F. Chiussi, “Extensions for QoS Support in MPLS-based Transparent LAN Services,” Internet Engineering Task Force, Mar. 2002, <http://www.ietf.org/internet-drafts/draft-lau-ppvpn-qos-tls-mpls-00.txt>.
[11] L. Martini (ed.), “Transport of Layer 2 Frames over MPLS,” Internet Engineering Task Force, July 2002, <http://www.ietf.org/internet-drafts/draft-martini-l2circuit-trans-mpls-07.txt>.
[12] P. Molinero-Fernandez and N. McKeown, “TCP Switching: Exposing Circuits to IP,” IEEE Micro, Jan./Feb. 2002, 2–9.
[13] E. Rosen, “Single-Sided Signaling for L2VPNs,” Internet Engineering Task Force, Feb. 2002, <http://www.ietf.org/internet-drafts/draft-rosen-ppvpn-l2-signaling-01.txt>.
[14] E. Rosen (ed.), “An Architecture for L2VPNs,” Internet Engineering Task Force, July 2001, <http://www.ietf.org/internet-drafts/draft-ietf-ppvpn-l2vpn-00.txt>.
[15] H. Ould-Brahim (ed.), “Using BGP as an Auto-Discovery Mechanism for Network-Based VPNs,” Internet Engineering Task Force, Aug. 2002, <http://www.ietf.org/internet-drafts/draft-ietf-ppvpn-bgpvpn-auto-03.txt>.
[16] H. Ould-Brahim, B. Radoaca, M. Chen, and P. Menezes, “VPLS/LPE L2VPNs: Virtual Private LAN Services Using Logical PE Architecture,” Internet Engineering Task Force, Apr. 2002, <http://www.ietf.org/internet-drafts/draft-ouldbrahim-l2vpn-lpe-01.txt>.
[17] H. Ould-Brahim and Y. Rekhter (eds.), “GVPN: Generalized Provider-provisioned Port-based VPNs using BGP and GMPLS,” Internet Engineering Task Force, Mar. 2002, <http://www.ietf.org/internet-drafts/draft-ouldbrahim-ppvpn-gvpn-bgpgmpls-01.txt>.

(Manuscript approved December 2002)

ENRIQUE J.
HERNANDEZ-VALENCIA is a distinguished member of technical staff in the ONG CTO Group at Lucent Technologies in Holmdel, New Jersey, where he works on architecture and system engineering aspects of optical networking systems. He received his B.Sc. degree in electrical engineering from the Universidad Simon Bolivar, Caracas, Venezuela, and his M.Sc. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, California. Dr. Hernandez-Valencia has many years of experience in the design and development of high-speed communications systems and services. He is a member of IEEE, ACM, and Sigma Xi.

PRAMOD KOPPOL is a technical manager in the Networking Research Laboratory at Bell Labs in Holmdel, New Jersey. He received a Ph.D. degree in computer science from North Carolina State University in Raleigh. His current research is in highly scalable and robust IP routing protocols and their implementations. In a prior assignment, as the director of software, he was responsible for the overall development and testing of software for an IP services switch. Before that, as a distinguished member of technical staff, he was actively involved in systems software architecture and development for ATM and IP/MPLS switches. Dr. Koppol’s research interests are in networking systems software, architecture and development of large-scale software systems, routing and signaling in the Internet, network management, and network architectures.

WING CHEONG LAU is a member of technical staff in the Performance Analysis Department at Bell Labs in Holmdel, New Jersey. He received a B.S. degree in electrical engineering (with honors) from The University of Hong Kong and M.S. and Ph.D. degrees in electrical and computer engineering from The University of Texas at Austin. At Bell Labs, he serves as a technology/performance consultant and a technical contributor to various business units and product development teams.
His research interests include high-speed protocol design, network resource management, traffic characterization, network planning and optimization, queuing theory, system modeling, and performance analysis. Dr. Lau has more than ten U.S. patents filed/issued. He is a senior member of IEEE and a member of ACM and Tau Beta Pi. ◆