Optical BGP Networks
Discussion Paper

Contributors:
Bill St. Arnaud, CANARIE Inc., bill.st.arnaud@canarie.ca
Rene Hatem, CANARIE Inc., rene.hatem@canarie.ca
Wade Hong, Carleton University, xiong@physics.carleton.ca
John Coulter, Cisco Canada Inc., jcoulter@cisco.com
Marc Blanchet, Viagenie Inc., marc.blanchet@viagenie.qc.ca
Abdul Abdalla, Bell Nexxia Inc., abdul.abdalla@bellnexxia.com
Ian MacDonald, JDS Uniphase Inc., ian.macdonald@ca.jdsunph.com
Florent Parent, Viagenie Inc., florent.parent@viagenie.qc.ca
Tom Tam, Bell Nexxia Inc., thomas.tam@bellnexxia.com
Mike Weir, Globalknowledge Inc., mike.weir@globalknowledge.com

DRAFT
March 20, 2000
Revised Draft July 16, 2000

Abstract

As customer owned dark fiber networks combined with low cost WDM systems become prevalent, “customer empowered” optical network architectures, where the customer controls the optical routing using exterior routing protocols, may become possible. OBGP is a proposed extension to BGP for the manipulation of optical cross connects (OXCs) that permits them to be automatically set up and configured as BGP speaking devices to support multiple direct optical lightpaths between many different autonomous domains. With OBGP the routing component of a network may be distributed to the edge of the network, while the packet classification and forwarding is done in the core. OBGP may also allow customers at the edge to control a subset of lightpaths within another network’s wavelength cloud so that they can manage their own lightpath routing within that cloud. With the large number of adjacencies possible using OBGP, lightpaths themselves may be used as a direct peering and transit mechanism between consenting ISPs. As a result the big driver for large capacity DWDM systems may not be bandwidth but wavelength adjacency and transit services. The exchange of lightpaths may also provide a simpler mechanism for settlement in peering and transit between ISPs.
In the future there may even be wavelength commodity markets where ISPs can trade wavelengths and adjacencies on the open market. The proposed protocol extensions may also allow for the deployment of “carrier free” networks, where the customers at the edge control and route lightpaths across a carrier’s optical wavelength cloud.

Definitions

In this paper the words wavelength and lightpath have different meanings. A wavelength is a single instance of a wavelength on a point to point Dense Wave Division Multiplex (DWDM) or Coarse Wave Division Multiplex (CWDM) system. A lightpath is a contiguous optical path which may be made up of one or more wavelengths connected across optical switches.

Dark fiber is a fiber strand that is provisioned by a carrier, or other supplier, with no equipment attached to the ends of the fiber. It is the customer’s responsibility to attach the appropriate equipment at the end of the fiber and “light” it for whatever purpose they choose. A dim wavelength is similar to dark fiber, except that the wavelength is provisioned and managed by a carrier. The customer still has the responsibility of attaching their own equipment at the end of the wavelength.

Objectives

Traditionally large enterprise customers and smaller Internet Service Providers (ISPs) had only one default path to a much larger upstream ISP or carrier who aggregated their traffic and managed the interface to other networks. With the explosion of customer owned dark fiber and Wave Division Multiplexing (WDM) systems, large enterprise customers and ISPs may soon have the choice of multiple paths between themselves and other networks. Their network managers will now have the capability to manage their own interconnection directly to other customers and networks independently of an intermediate carrier or service provider. BGP is the protocol of choice for the management of links between autonomous management domains.
BGP management can become quite complex, particularly when a customer is trying to manage many independent links. As such, the automatic setup and configuration of BGP becomes increasingly important as the number of possible direct interconnections increases. In essence that is the main objective of this paper: to propose a mechanism for the automatic configuration and setup of BGP to support many independent and parallel optical links between autonomous Internet domains.

One possible solution is to treat each optical cross connect as a direct path between a pair of BGP speakers. However, this significantly increases the complexity of any single BGP session, particularly where there may be many parallel lightpaths. The alternate solution is to treat each optical cross connect as an independent virtual BGP router with only one input port and one output port. A virtual BGP router can then be set up for each optical cross connect and separate BGP sessions initiated with its peers. This approach is believed to be much more scalable as each virtual BGP router configuration can be easily cloned from other virtual BGP routers.

To date there has been little effort in addressing the requirement to configure, set up and manage wavelengths between domains or to allow enterprises at the edge to manage their own wavelength configuration across a “wavelength cloud”. The conventional solution to date is for a carrier to operate a wavelength cloud and offer a managed lightpath service to the customers at the edge as shown in Fig 1. Enterprise networks or smaller ISPs that are connected at the edge to these optical clouds will generally have a limited view into the network and virtually no control of how their lightpath is routed. A number of mechanisms have been proposed for the management and control of such “wavelength cloud” systems [LRA00].
Most of these systems have been designed on variations of link state interior routing protocols such as OSPF, IS-IS [KRA00] and PNNI [ATM96] or complementary extensions of MPLS such as MPLambdaS [ARD99]. For complex single domain networks these protocols allow for the optimized configuration and establishment of lightpaths across a single management domain. Because these networks provide “common carriage” to many downstream customers they require survivable, fast restorable lightpaths. An essential attribute of these networks is the capability to instantiate and route “end to end” optical channels in near real time and to provide capabilities that enhance network survivability [LRA00].

To date there has been little work done on developing protocols for the management of wavelengths between separate management domains. In addition, customer control and routing of wavelengths across a wavelength cloud is still not possible. This paper attempts to address these issues by proposing extensions to the well known BGP routing protocol for the management and signalling of wavelengths between autonomous domains. In addition, with OBGP, enterprise customers at the edge will be able to manage their own direct peerings with other networks across a wavelength cloud as shown in Fig 2.0. Rather than aggregating traffic, the central network cloud allows customers at the edge to manage their own wavelength routing by controlling specifically assigned cross connects on the optical switches within the cloud. They can thereby extend their own network domains across the central network cloud to directly connect and peer with other network domains attached to the cloud.

There are a number of objectives to this exercise:
1. To provide enterprise customers and ISPs a set of tools for the automatic setup and configuration of multiple BGP sessions to other autonomous domains;
2.
To allow the setup of optical cross connects with other peers as a local optimization issue without the prior involvement or notification of the BGP peers in configuration setup;
3. To allow autonomous Internet domains to establish multiple lightpaths between each other with well known quality of service characteristics for the support of advanced services such as voice and video as well as new protocols such as IPv6;
4. To allow the autonomous Internet domains to automatically transit lightpaths across their network in support of external peering relationships;
5. To provide a mechanism for the management of large single domain wavelength clouds by breaking the larger cloud into smaller clouds and using OBGP techniques to manage a much smaller subset of wavelengths between the clouds;
6. To allow the deployment of low cost terabit routers that use integrated optical switches rather than complex electrical routing engines for the forwarding of packets;
7. To reduce the number of existing multiple network layers and their supporting protocols to one simple universal protocol, OBGP, for managing the physical and routing layers;
8. To allow edge customers to manage their own optical paths across a lightpath cloud managed by another network entity;
9. To provide a mechanism for distributed Internet Exchange facilities using the exchange and trading of lightpaths between networks to minimize the need for hierarchical network architectures to interconnect peering networks; and
10. To define a mechanism for the exchange and trading of lightpaths as a commodity on an open market.

Differences between OBGP and other wavelength management protocols

It is important to stress that OBGP is not intended to replace existing proposed mechanisms for managing wavelength clouds. It is seen as an enhancement to those protocols that provides solutions not addressed by those mechanisms. The following points may clarify the differences:
1.
OBGP is not intended as a signalling protocol to support end to end establishment of lightpaths. Rather it is intended for short range establishment of lightpaths, firstly to optimize local router traffic flow and, once that requirement has been met, secondly to allow external peers to establish lightpaths to optimize their traffic flows. The end result may be an end to end lightpath but that is a coincidental benefit and not a primary objective.
2. In municipal and regional dark fiber networks where the individual participating organizations are interconnected by customer owned fiber, multilateral and transit peering is required. Many of these networks are looking at using WDM systems to support direct peers between themselves and other similar minded organizations. Without direct peers a central wavelength management authority would be required to manage the aggregation of traffic and re-routing to the edge. OBGP is intended to minimize the need for any central networking administrator and instead allow connections to be direct pair wise peerings.
3. OBGP is intended to give the customer some reasonable control of the routing of their lightpaths through another entity’s optical wavelength cloud, perhaps as an overlay to an interior wavelength management protocol. For example, a carrier may have a large managed wavelength cloud, but rather than hiding the routing of the wavelengths from the customer, the customer may be given a limited view of the network topology or a choice of possible routes which are a subset of all possible routes. In addition, the customer may have optical routes transiting two separate carrier networks and may wish to interconnect its routes through these clouds at some mid point. As a consequence the customer’s ideal optical wavelength topology may be at variance with the ideal optimized topology of the individual carrier networks. OBGP may allow the customer’s topology to take precedence over the carrier’s preferred topology.
4.
Large single domain wavelength clouds may simply become unmanageable and too difficult to optimize for traffic engineering purposes as a single domain. The common solution is to break those single domains into many smaller domains which can be optimized individually. Between the domains a more modest optimization mechanism like OBGP could be used.

Background

In the past, for regulatory and other reasons, it was very difficult for individual organizations or ISPs to acquire their own dark fiber. More recently however, municipalities, carriers and consortia of various organizations have been undertaking open access dark fiber builds in a variety of jurisdictions around the world. In addition a number of carriers are starting to sell “dim” wavelengths to individual customers and ISPs. ISPs or enterprise customers can now build their own optical network by interconnecting “dim” wavelengths and dark fibers from a number of different carriers. For example, a large enterprise customer may own several strands of dark fiber to interconnect facilities across a city as well as lease a dim wavelength to interconnect their campus networks with a distant point of presence which may be half way across the country.

The initial driver for the deployment of these customer owned dark fiber and dim wavelength networks is the significant cost savings over managed services from an incumbent carrier. But a secondary benefit is the development of new applications and services that would not be possible with a traditional carrier managed service. This model of optical networking is rapidly evolving in the university research network environment in Canada and the USA. Many of the advanced research networks in Canada which are part of the CA*net 3 [BTB00] Optical Internet program are deploying their own dark fiber and wavelength networks.

To date management and configuration of wide area optical networks have been the purview of large carriers.
Bandwidth was a scarce and expensive commodity and building wide area networks required special and highly qualified engineering and management skills. But with the standardization of Gigabit Ethernet, and soon 10 Gigabit Ethernet, as a simple and effective protocol for carrying traffic in the WAN, the management of wide area networks has become relatively trivial. With the recent availability of low cost, long reach lasers, Gigabit Ethernet segments can now easily extend 40 to 100 km. Gigabit Ethernet and 10 Gigabit Ethernet can also be directly mapped to dim wavelengths on a long haul transparent optical network.

Now that Gigabit Ethernet can be easily mapped to dark fiber strands or wavelengths it is a trivial task for a LAN manager to extend the enterprise network across the WAN. The same LAN management tools and techniques that are used to manage the complex enterprise network can now also be used to manage the relatively simple LAN extensions across the city or eventually across the country. Rather than expecting a carrier to build out a WAN optical network to the customer, customers are electing to build out their LAN to the carrier (or to the ultimate destination network, bypassing the carrier altogether) via a combination of dark fibers and dim wavelengths. Simply put, the LAN is invading the WAN.

As the result of the availability of low cost dark fiber, inexpensive long reach lasers and standardization on Ethernet as the protocol for wide area and local area networking, enterprise customers are starting to lead the drive for DWDM optical Internet architectures. While the immediate impact of DWDM and CWDM technology will be to dramatically increase bandwidth, the long term benefit of such technology will be to “empower” customers to deploy their own self managed wavelength wide area and long haul networks. These “customer empowered” networks may have fundamentally different architecture requirements from those of traditional carriers.
For example, enterprise customers understand that caching and multi-homing can provide greater reliability than fast restoral and protection on individual optical links. Interconnection and peering to many other enterprise networks may also allow the enterprise or small ISP network to bypass more traditional hierarchical carriers and Internet service providers and establish direct peering with destination ISPs.

Currently, optical networks are primarily used for the interconnection of large network domains such as enterprise networks, ISPs, GigaPOPs and so on. Most of these networks already use external routing protocols such as BGP to manage the interconnection of their respective networks. More importantly these large enterprise customers and ISPs are the ones who would likely be the first to acquire dark fiber and operate DWDM or CWDM networks. It would seem logical then that routing protocols for inter-domain networking might also be useful for interconnecting optical networks.

Advantages of using BGP for optical network configuration and routing

Many of the issues that exterior gateway protocols, such as the path vector protocol BGP, were designed to deal with are similar to those that arise in the management of multiple wavelengths in an optical network, particularly between management domains. BGP has a basic architecture and tool set that is premised on the assumption that it will be primarily used for the establishment of links between independent, autonomous management domains. Although interior protocols can also be used for the management of wavelengths, including the management of wavelengths between domains, their architectural premise is based on a single management authority, i.e. building another network cloud between independent domains.

BGP routing normally only conveys reachability information. It does not convey any information about the optimal topology, quality of service or bandwidth of a particular route.
But optical lightpaths (as opposed to SONET channels) are generally of fixed bandwidth, typically 1 and 10 Gbps for CWDM systems or 2.5 and 10 Gbps for DWDM systems. The physical characteristics of a lightpath give it an intrinsic capability of being a “poor man’s” logical switched path with a predefined Quality of Service. Sophisticated intrinsic parameters such as Quality of Service, Restoral or Protection that are commonly available on other types of circuits such as MPLS, Frame Relay, SONET or ATM may not be needed in an optical BGP network with DWDM wavelengths between all BGP nodes. An optical DWDM network can be regarded simply as a more complex inter-domain BGP environment where there exist multiple paths of fixed known bandwidth between network neighbours.

Because a path vector protocol like BGP lists the domains or Autonomous Systems (ASs) that a packet must traverse on an advertised route, the path information enables a customer’s router to perform rudimentary traffic engineering on an inter-domain basis. This form of traffic engineering is not as rigorous or complete as MPLS Traffic Engineering. However MPLS-TE, to date, only works within a single domain. Interdomain MPLS will in theory allow traffic engineered links across domains, but the negotiation and transfer of RSVP and LDP message requests across domains will be complex and will most likely be mired in non-technical peering and transit business issues.

One of the very useful features of BGP is that it uses TCP (port 179) for all communications between BGP speaking peers. This means that any type of communications channel can be established between BGP speaking peers. The BGP speaking routers or switches do not have to use the actual data forwarding lightpath for the communication of routing information. The BGP routing information can use any out of band communications channel, including the Internet itself, between peers.
Because BGP uses TCP for all communication, BGP switches and routers do not necessarily have to talk directly to each other. In a complex BGP topology all routing updates can be forwarded through a central route server, sometimes also referred to as a route arbiter or route reflector. The route arbiter eliminates the need to set up a mesh of BGP links. All updates and changes to the BGP routing can be announced to and from the route arbiter. Route servers and arbiters are quite commonly used at major Internet exchange points for this very purpose.

The other advantage of BGP is that it is designed to support unicast routes. The Internet is fundamentally a unidirectional network. Traditional telecommunication networks on the other hand assume that all links are bi-directional, and most routing and configuration protocols in use on these networks do not distinguish between forward and return paths [CCM98].

BGP is usually configured to work with single point to point connections or one common LAN segment. Optical wavelengths and lightpaths are essentially point to point links. Optical lightpaths, because they are limited by the physical properties of light, are unlikely to be as flexible and complex as electronic equivalents such as ATM or SONET circuits. Hence optical lightpaths can easily be managed by a point to point protocol such as BGP.

A simple, “good enough” and well proven exterior gateway protocol such as BGP, with perhaps some refinements for managing wavelengths and/or lightpaths, may be all that is necessary for customer controlled optical networks. As amply demonstrated in the Ethernet LAN world, simplicity and low cost tend to win over complex and costly sophisticated solutions.

Making OXCs into BGP devices

If traffic on a router is evenly distributed between input ports and output ports then an electronic forwarding engine is required to sort the packets and forward them to the appropriate destination port.
The processing power required to forward packets in such a configuration is a function of the square of the number of ports and a linear multiple of the speed of the data. However, in many router configurations it is quite common that most of the input traffic is directed to one or two output ports. Rarely is there an even distribution of data from input ports to output ports. For example, a mid level ISP would forward most of its traffic to the port that is connected to its larger upstream ISP. A significantly smaller portion of the input traffic would be forwarded to the networks that are on the subtending side of the router. A regional or municipal network that connects several universities and other institutions to a commercial ISP is a good example of this type of configuration. The regional network core routers would see most of the packets forwarded to the port that faces the upstream ISP, with significantly less data forwarded between the universities and institutions themselves.

In such a configuration it would be advantageous to use an optical cross connect to support the large data flows that are directed from a single input port to a single output port. Routing vendors already deploy this strategy in their products by fast forwarding, in the electrical domain, “flows” of packets destined for a common output port. Another technique is the use of ATM virtual circuits combined with virtual or “one-armed” routing to support large flows. Next Hop Resolution Protocol (NHRP) was one proposed technique for fast forwarding data across multiple routers, but has been largely displaced by MPLS solutions. Most of the flow based “cut thru” technologies have largely been abandoned in favour of traffic engineering approaches to managing such traffic. OBGP is a proposed alternative technique to MPLS traffic engineering for inter-domain applications.

As a first step it is proposed that optical cross connect switches be integrated into BGP routers.
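The observation that most traffic on such a router heads for one or two output ports can be illustrated with a short Python sketch. It scans a per-port traffic matrix and lists the input/output pairs that carry a dominant share of an input port's traffic, which would be the natural candidates for an optical cross connect. The function name, the matrix layout, the port names and the 80% threshold are all assumptions made for illustration; the paper does not prescribe a selection rule.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: traffic[in_port][out_port] = volume observed
# (bytes, packets, or any consistent unit).
TrafficMatrix = Dict[str, Dict[str, float]]


def cross_connect_candidates(
    traffic: TrafficMatrix, threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return (in_port, out_port, share) for every input port whose
    busiest output port receives at least `threshold` of its traffic."""
    candidates = []
    for in_port, row in traffic.items():
        total = sum(row.values())
        if total == 0:
            continue  # idle port, nothing to offload
        out_port, volume = max(row.items(), key=lambda kv: kv[1])
        share = volume / total
        if share >= threshold:
            candidates.append((in_port, out_port, share))
    return candidates
```

For a regional network in which one university sends 90% of its traffic to the upstream ISP and another splits its traffic evenly, only the first pair would be reported; the second would stay on the electrical forwarding engine.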
The electronic routing engines can be very simple devices that have a minimum of forwarding capacity in the electrical domain. The optical cross connects will, by necessity, include optical multiplexers and demultiplexers. The optical filters can be passive or active. As such, the owner of an optical router will know beforehand which wavelengths can be cross connected. External peers will not be required to know the physical port identifiers of the host (unlike ATM or SONET) or whether it can support a given set of wavelengths.

Although optical cross connects are very simple devices and can usually be managed through a simple serial interface, it would be useful if each optical cross connect could be managed as an independent IP device. As we shall see later in this paper, it will be advantageous to hand off management of one or more optical cross connects to an external router. This will allow an external customer to manage their own optical cross connect and direct the wavelengths to the peer of their choosing.

Mapping of ITU Wavelengths to IP addresses

If the wavelengths in an OBGP router are based on the ITU grid then a commonly agreed mapping can be made between wavelengths and IP port addresses. Since tuneable lasers and filters have a limited range, different prefixes can be used to indicate the appropriate wavelength range. For example, addresses in the optical “C” band may be signified with the prefix “x.x.1.x/24”. This would allow for almost 250 wavelength mappings. Addresses in the L band may be signified by the prefix “x.x.10.x/20”, which would allow for roughly 4,000 possible wavelength mappings (a /20 block spans 4,096 addresses), and so forth. It is not the intent of this paper to suggest an appropriate mapping scheme at this time, but to indicate that it would be a useful mechanism for the establishment of OBGP routers.
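As a purely illustrative sketch of such a mapping, the Python fragment below assigns one host address per ITU grid channel inside a single /24 block. The 193.1 THz anchor frequency comes from the ITU-T DWDM grid; everything else (the 100 GHz spacing, the 10.0.1.0/24 block standing in for “x.x.1.x/24”, the lowest channel index and the function names) is an assumption chosen for the example and not a scheme proposed by this paper.

```python
import ipaddress

ANCHOR_THZ = 193.1      # ITU-T grid anchor frequency
SPACING_THZ = 0.1       # assumed 100 GHz channel spacing
C_BAND_NET = ipaddress.ip_network("10.0.1.0/24")  # hypothetical block
FIRST_CHANNEL = -16     # hypothetical lowest channel, mapped to host .1


def channel_frequency_thz(n: int) -> float:
    """Centre frequency of grid channel n (n may be negative)."""
    return round(ANCHOR_THZ + n * SPACING_THZ, 4)


def channel_to_address(n: int) -> ipaddress.IPv4Address:
    """Map grid channel n to a host address inside the band's block."""
    host = n - FIRST_CHANNEL + 1
    # Skip the network (.0) and broadcast (.255) addresses.
    if not 1 <= host <= C_BAND_NET.num_addresses - 2:
        raise ValueError(f"channel {n} falls outside the address block")
    return C_BAND_NET.network_address + host


def address_to_channel(addr: str) -> int:
    """Inverse mapping: recover the grid channel from an address."""
    ip = ipaddress.ip_address(addr)
    if ip not in C_BAND_NET:
        raise ValueError(f"{addr} is not in {C_BAND_NET}")
    return int(ip) - int(C_BAND_NET.network_address) + FIRST_CHANNEL - 1
```

With these assumed constants, channel 0 (193.1 THz) maps to 10.0.1.17 and the mapping is reversible, so a peer that learns a port address can recover the wavelength without knowing anything about the physical port.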
Configuration of OBGP routers

OBGP routers will have multiple paths between each other, and any cross connected optical path will be given preference over any path that goes through an electrical forwarding engine, using standard BGP techniques for selecting shortest AS path, MEDs, local prefs, etc. In Fig 3.0 a typical BGP routing configuration is shown. Each router is connected by one pair of wavelengths. Routers A, B and C have been set up as neighbours to each other in a standard BGP configuration.

Assume now that Router B is in fact a combination of an optical switch and a traditional multiplexing router as shown in Fig 4.0. The optical switch components are shown in more detail in Fig 5.0. Optical BGP only requires a simple optical cross connect switch. Fast switching is not essential, so a number of optical switch technologies are feasible including mechanical switches, MEMS, and other devices. To multiplex and demultiplex wavelengths Router B must use optical filters that can separate out the individual wavelengths. These can be tuneable or fixed filters. With each colour of light a default port address can be assigned, or alternatively a direct mapping can be made between the ITU grid frequency and an IP address as described previously.

By using a simple optical switch the individual lightpath ports can be treated in effect as an alternate path to Router B. In this example another set of wavelengths is established from Routers A and C to B. There are 2 routes between Router A and Router B: the directly connected path through the switch or the multiplexed path through the router. Router B must now advertise to Router C the best route to Router A. In a normal routing situation only one of the routes between Routers A and B would be advertised to Router C. But because one of the paths is through the optical switch with no multiplexing, advertising only one route could inadvertently “black hole” Router A, as seen by Router C.
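The preference for a cross connected path described in this section relies on the ordinary BGP decision process. A reduced version of that process can be sketched in Python as follows: highest LOCAL_PREF first, then shortest AS path, then lowest MED. This is a deliberate simplification (a real implementation applies further tie breakers and normally compares MEDs only between routes learned from the same neighbouring AS), and the Route class, the addresses and the AS numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: Tuple[int, ...]
    local_pref: int = 100   # higher is preferred
    med: int = 0            # lower is preferred


def best_route(candidates: List[Route]) -> Route:
    """Pick one route: highest LOCAL_PREF, then shortest AS path,
    then lowest MED (simplified subset of the BGP decision process)."""
    if not candidates:
        raise ValueError("no candidate routes")
    return min(
        candidates,
        key=lambda r: (-r.local_pref, len(r.as_path), r.med),
    )
```

A router could, for instance, assign a higher LOCAL_PREF to routes learned over the direct optical path so that they win over the same prefix learned through the electrical forwarding engine, while the routed path remains available as a fallback.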
There are two ways of configuring Router B. The first, more complex, method is to use the cross connect to establish a direct path between two routers. There are well known parallel path BGP configuration techniques between two routers and most routers can support up to 6 parallel paths. But as described in the following section it is believed that this approach has many deficiencies and will not scale. The second and simpler approach is to treat each and every cross connection as an independent virtual BGP router. This is described in much further detail under the heading “Virtual BGP routers”.

Multi parallel path BGP connections

One technique for taking advantage of optical cross connects is to configure multihoming from Router B to single source Router A. There are a number of well established techniques for the establishment of parallel paths between routers for load sharing and configuring backup links [H97]. BGP will support up to 6 parallel links.

In a normal BGP configuration Router B would advertise only the “best” route for Router A to Router C, i.e. Router C sees only one path past Router B. An optical BGP router on the other hand would have to be aware that a direct optical path has been established across the internal optical switch and advertise Router A’s routes across the direct optical lightpath. Router C would then see Router A’s routes via the direct optical lightpath and also via the path that goes through the normal routing engine of Router B. Well known metrics could be used such that Router A’s routes would be preferred over the direct optical lightpath.

If the direct optical lightpath between Router B and Router A should fail it is critical that the interface on Router A signals link failure to the BGP session. Router A would then send a standard BGP update to Router B notifying Router B of the loss of that link.
In a normal BGP routing situation Router B would still assume that both optical connections from Router B to Router C were functional. In that case additional data fields would have to be incorporated into the Routing Information Base establishing links between the various paths. As well, there is no existing signalling protocol for the reestablishment of an optical cross connect should the original cross connect fail.

The other disadvantage of this approach is that the routers that want to be cross connected must have prior knowledge of each other. This is not a major issue in a small networking environment but becomes more significant in a multi-routing environment. New or existing topology discovery protocols would have to be deployed to extract topology and configuration information. For these reasons it is believed that setting up virtual routers is a more scalable solution.

Virtual BGP routers

The basic concept of virtual BGP routers is to treat each and every optical cross connect as a separate BGP router. The virtual router would have only one input port and one output port. It would also advertise itself independently of Router B with its own loopback address and its own set of IP addresses for its interfaces. Contrary to a normal BGP multi-router configuration the virtual BGP router would not establish any IBGP connectivity even though it might be within Router B’s AS. It would act and behave as an independent router carrying its own set of routes, metrics, etc.

The use of a virtual router for each cross connect allows the use of standard BGP routing with virtually no modifications necessary to support optical lightpaths. In fact the virtual BGP router could be assigned its own private (or public) AS such that AS path metrics could be used for basic traffic engineering.

By instantiating a virtual BGP router, the owner of the OXC can firstly establish optical cross connects between neighbours that reduce the load on its electrical forwarding engine.
Over time it can reconfigure the virtual BGP router to interconnect with other neighbours if traffic patterns change. More importantly, the owner of the OXC can establish optical cross connects between neighbours without involving the neighbours in the configuration or decision process.

More intriguingly, the virtual BGP router can also be easily reassigned into other routers’ AS domains. For example at an optical Internet Exchange point, each of the optical cross connects, operating as a virtual BGP router, could be assigned to the participating ISPs’ AS domains rather than being part of a centrally administered AS.

The main purpose of the BGP optical cross connect would be to announce routes, perform route filtering and classification and provide standard BGP traffic engineering capabilities to BGP peers. As there is only one input and one output port, there is no need to create a forwarding table within the OXC. In essence, with OBGP the BGP routing process is distributed around the edge of the network and is physically detached from the data forwarding process. A network’s routing processes are placed at the first optical ingress point in the network, while the packet classification and forwarding is done in the core of the network. One way of looking at this is that the wavelength “stretches” the I/O port from its physical location on the router to some ingress point on an optical switch.

The principal advantage of this approach is that, given a choice of multiple paths, the virtual BGP router will advertise only the “best” path to the following router based on well known BGP metrics. In the event of path failure, the BGP virtual router will then recalculate the next best path and advertise that new path through an NLRI UPDATE message.

Static Configuration of Virtual BGP Peers

Initially virtual OBGP routers could be established during network configuration as is commonly done today in BGP network environments.
The routers could be configured with their appropriate peers, lightpaths, etc. with no changes to the existing BGP protocols. All that would be required to create virtual BGP routers is to modify some open source BGP routing code so that it can directly reference an optical cross connect switch. Only one processor would be required but there may be several BGP processes running at the same time, each process representing a separate virtual BGP router for each optical cross connect.

It is expected that for small network configurations traditional BGP static configurations would be used. However, as the networks grow in complexity it would be advantageous to have some degree of automatic configuration of the optical cross connects as virtual BGP routers. Automatic setup and configuration of BGP will also be useful for enterprise network managers, as many of them do not have the skill sets to configure complex multi-linked, multi-homed BGP peerings. The focus of this paper, therefore, will be on the dynamic configuration of BGP peers.

Dynamic Configuration of Virtual BGP Peers

To date all BGP configuration is done manually when a network is set up. Obviously this approach will not scale to hundreds, if not thousands, of BGP sessions, real or virtual. Dynamic establishment of BGP peering and configuration is required for large scale networks and peering sessions.

Again it is important to stress that if a single management domain exists there are much better techniques, such as MPLambdaS, for the configuration and management of wavelengths. However, where there is no central network management authority, as for example between autonomous domains at an optical IX or in a municipal dark fiber network, then another approach is required.

Large scale is very much a relative term. In BGP, large scale may mean only a dozen independent connections.
OBGP networks are never expected to scale to the same size and complexity as the interior networks of large service providers with thousands of switches and routers.

BGP, as it is currently designed, does not set up or tear down lightpaths. The essence of this discussion paper is to propose that BGP peering requests be used to set up optical lightpaths, in recognition that multiple paths may exist between routers. The following scenario outlines one possible method of how a virtual BGP router would be dynamically established and configured:

1. Initially a set of “real” BGP routers is configured manually as is currently done today.
2. When the initial BGP OPEN session occurs between two manually configured routers, the optional information data field in the OPEN message would be used to exchange information about the number of lightpaths between the 2 routers, the IP address of the optical ports (or alternatively the ITU frequency mapped to a predefined IP address), the framing protocol, the preferred destination and other relevant information.
3. Once either “real” router had determined there is a valid optical path between itself and 2 other routers, it could then instantiate a “virtual” BGP router. The virtual router would then independently establish a BGP peering session between itself and the other 2 routers by sending a standard BGP OPEN message. The virtual router would have its own interface and loopback addresses so that independent BGP sessions could be established with its BGP port 179.
4. If a virtual peering session could not be established, or if there was a link failure, the virtual BGP session can be left in IDLE mode waiting for another OPEN message, or the virtual BGP router can be terminated.
5. The same external NLRI address would then be advertised by both the real router and the virtual router.
Addresses that are only advertised by the real router would not be advertised by the virtual router, because there would be no iBGP connectivity between the real and virtual router.
6. The routers on either side of the “virtual” router could then preference routes advertised by the “virtual” router, or alternatively the “virtual” router could assign higher metrics to its optical links than the optical links used by the real router.

Let us look at this dynamic configuration scenario with a real world example. Returning to the configuration in Fig 5.0, assume initially the “real” Routers A and B establish a standard BGP peering session through some commonly agreed TCP channel such as an optical signalling channel or default wavelength. Initially each router has a very simple configuration that will set up a single BGP session over the signalling channel or default wavelength as per the following setup:

Router A Configuration:
    router bgp 100
    network 170.10.10.0/24
    interface Ethernet 0/1              (Default connection to Router B)
    ip address 1.1.1.2/30               (by definition red uses suffix x.x.x.2)
    neighbor 1.1.1.1 remote-as 200      (blue uses suffix x.x.x.1)
    interface Ethernet 0/2              (Connection to OXC)
    ip address 3.3.3.3/30               (by definition green uses suffix x.x.x.3)
    neighbor 3.3.3.4 remote-as unknown  (yellow uses suffix x.x.x.4)

Router B Configuration:
    router bgp 200
    network 180.10.10.0/24
    interface Ethernet 0/1              (Default connection to Router A)
    ip address 1.1.1.1/30               (by definition blue uses suffix x.x.x.1)
    neighbor 1.1.1.2 remote-as 100      (red uses suffix x.x.x.2)
    interface Ethernet 0/2              (Default connection to Router C)
    ip address 2.2.2.2/30               (by definition red uses suffix x.x.x.2)
    neighbor 2.2.2.1 remote-as 300      (blue uses suffix x.x.x.1)

Virtual Router Initial Configuration (potential OXC for green and yellow):
    router bgp 200
    interface loopback0
    ip address 5.5.5.2/32
    interface oxc 0/1                   (green cross connect)
    ip address x.x.x.3/30               (by definition green uses suffix x.x.x.3)
    neighbor x.x.x.4 remote-as unknown
    interface oxc 0/2
    ip address y.y.y.4/30               (by definition yellow uses suffix x.x.x.4)
    neighbor y.y.y.3 remote-as unknown

Router C Configuration:
    router bgp 300
    network 190.10.10.0/24
    interface Ethernet 0/1              (Default connection to Router B)
    ip address 2.2.2.1/30               (by definition blue uses suffix x.x.x.1)
    neighbor 2.2.2.2 remote-as 200      (red uses suffix x.x.x.2)
    interface Ethernet 0/2              (Connection to OXC)
    ip address 4.4.4.4/30               (by definition yellow uses suffix x.x.x.4)
    neighbor 4.4.4.3 remote-as unknown  (green uses suffix x.x.x.3)

It is assumed that the virtual router is part of Router B’s Peer Group and therefore can share the same update and routing policies in terms of advertisement of routes and so forth. In fact, if the instantiation of the virtual router is on the same platform as the real Router B, it would make sense that both the real and virtual router share the same routing information database.

In this example, let us assume the red/blue wavelengths are the default wavelengths. In the BGP OPEN session Router A would advertise to Router B that it has an IP port with a green wavelength receiver and a yellow wavelength transmitter using a given framing protocol such as PPP over Ethernet, ATM or SONET. The BGP OPEN message from Router A may convey information along the following lines.

From Router A to Router B:
    OPEN AS 100 Loopback 6.6.6.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface Ethernet 0/2              (Connection to OXC)
    ip address 3.3.3.3/30
    neighbor 3.3.3.4 remote-as unknown

Router A does not generate an OPEN message for its unknown neighbor at this time.

Similarly Router C in its BGP OPEN message can advertise to Router B that it has an IP port with a yellow wavelength receiver and a green wavelength transmitter using a given framing protocol such as PPP over Ethernet, ATM or SONET.
From Router C to Router B:
    OPEN AS 300 Loopback 7.7.7.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface Ethernet 0/2              (Connection to OXC)
    ip address 4.4.4.4/30
    neighbor 4.4.4.3 remote-as unknown

Router C does not generate an OPEN message for its unknown neighbor at this time.

Asynchronously Router B in its OPEN messages to A and C conveys information that it is able to support optical cross connects. The message format might convey information along the following lines:

From Router B to Router A:
    OPEN AS 200 Loopback 5.5.5.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface loopback0                 (indicates the presence of a virtual router)
    ip address 5.5.5.2/32
    interface oxc 0/1
    ip address x.x.x.4                  (it can accept yellow)
    interface oxc 0/2
    neighbour x.x.x.3 update source loopback   (looking for green receiver)

From Router B to Router C:
    OPEN AS 200 Loopback 5.5.5.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface loopback0                 (indicates the presence of a virtual router)
    ip address 5.5.5.2/32
    interface oxc 0/1
    ip address x.x.x.3                  (it can accept green)
    interface oxc 0/2
    neighbour x.x.x.4 update source loopback   (looking for yellow receiver)

If the OXC supports tuneable lasers and filters then the “ip address” and “neighbour” address can be signified by using larger prefixes, as described previously in the discussion of tuneable devices in OXCs.

Upon receipt of both the BGP OPEN messages from Routers A and C, Router B can asynchronously decide to set up an optical cross connection between the 2 routers, assuming there is a match in wavelengths and framing protocols.
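The matching test that Router B applies to the two OPEN messages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the OpticalPort record and the can_cross_connect function are hypothetical names introduced here, and the fields are a simplified stand-in for whatever numeric encoding the optional OPEN data fields would eventually use. The sketch assumes fixed (non-tuneable) wavelengths and treats a cross connect as possible only when each router's transmitter colour matches the other router's receiver colour and both use the same framing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpticalPort:
    """Hypothetical summary of one router's OXC-facing port, as learned
    from the optional fields of its OPEN message (illustrative only)."""
    router: str
    rx_wavelength: str   # colour the port can receive, e.g. "green"
    tx_wavelength: str   # colour the port transmits, e.g. "yellow"
    framing: str         # e.g. "SONET", "ATM"


def can_cross_connect(a: OpticalPort, c: OpticalPort) -> bool:
    """Return True when the two ports can be joined through the OXC:
    same framing, and each side transmits what the other receives."""
    return (
        a.framing == c.framing
        and a.tx_wavelength == c.rx_wavelength
        and c.tx_wavelength == a.rx_wavelength
    )


# Values taken from the example: Router A receives green and transmits
# yellow, Router C receives yellow and transmits green.
router_a = OpticalPort("A", rx_wavelength="green", tx_wavelength="yellow", framing="SONET")
router_c = OpticalPort("C", rx_wavelength="yellow", tx_wavelength="green", framing="SONET")
print(can_cross_connect(router_a, router_c))
```

With tuneable lasers and filters the equality tests on wavelength would presumably be relaxed to a test against the tuning range of the device, which this sketch does not attempt to model.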
If the optical switch has wavelength tuneable lasers and filters then Router B can also decide to create an optical cross connect by tuning a laser or filter on the optical switch to match the appropriate wavelengths.

Rather than modifying the existing BGP code on Router B, it is envisaged that upon detecting the optional fields in the OPEN message the BGP process would spawn a new process that would instantiate the virtual router and the necessary optical cross connects. For the purposes of this paper we will call this process the Lightpath Route Arbiter (LRA).

In this example then, the LRA process running on Router B sees that Router A is able to receive a green wavelength on 3.3.3.3 and that Router A is transmitting a yellow wavelength to an unknown neighbour at 3.3.3.4. Similarly the LRA sees that Router C is able to receive a yellow wavelength on 4.4.4.4 and that Router C is transmitting a green wavelength to an unknown neighbour at 4.4.4.3.

Router B’s LRA process then instantiates an optical cross connect between Routers A and C by spawning a “virtual” BGP router process on its CPU. Router B’s LRA would then create a configuration file for the virtual router from the information it received in the OPEN messages from Routers A and C. The configuration file for the virtual router might look something like the following:

Virtual Router Configuration as created by LRA:
    interface loopback0
    ip address 5.5.5.2/32
    interface oxc 0/1
    ip address 4.4.4.3                  (green from Router C)
    neighbour 4.4.4.4 update source loopback   (yellow to Router C)
    interface oxc 0/2
    ip address 3.3.3.4                  (yellow from Router A)
    neighbour 3.3.3.3 update source loopback   (green to Router A)

While Router B is configuring its new virtual router, the LRA processes in Routers A and C are rewriting their configuration statements using the information provided in the options field of the OPEN message from Router B. The critical information that is exchanged is the “remote-as” number.
Routers A and C update their configuration data with the “remote-as” number for Router B and then perform a “soft re-boot” of the BGP process. (Many vendors’ BGP processes allow configuration information to be changed on the fly without resetting the entire BGP process.)

The virtual BGP router can then establish standard BGP peering sessions with Routers A and C. If the TCP session or the BGP session cannot be established with either Router A or Router C, Router B can decide either to leave the virtual BGP router in IDLE mode or to terminate the virtual BGP router entirely.

If the establishment of the BGP session is successful, the normal BGP UPDATE message would be used for the exchange of NLRI address updates. For example, Router C would see only Router A’s routes (170.10.10.0) advertised through the virtual router on B. Through the real Router B it would see advertised both Router A’s routes (170.10.10.0) and Router B’s routes (180.10.10.0). Router C could decide to preference routes from the virtual router, or the virtual router could advertise its routes with a higher metric than routes from the real Router B. Data traffic from Router C bound for Router A would then flow over the cross connected optical path, and data traffic bound for Router B would flow over the original optical path.

OXC Link Failure

It is important to note that the loopback address for the virtual router, which is used for BGP connectivity, is not the same as the data forwarding address of the OXC. As such, under normal circumstances the virtual router would not be aware of a failure anywhere on the optical cross connect link between Routers A and C. It is important therefore that if a link failure is detected by the interface card on either Router A or C, that router immediately terminate the BGP session with its neighbour, the virtual router. The session can be terminated by either Router A or C sending a BGP NOTIFICATION message to the virtual router.
The virtual router can then update its routing information database and send NLRI UPDATE messages to the other edge router indicating that those addresses are unreachable across the OXC.

Once the problem has cleared, usually indicated by the availability of the previously failed interface, the router can try to re-establish the link through the optical cross connect. To re-establish the link, the three routers can re-initiate the BGP sessions between the virtual router and Routers A and C. The re-initiation of the BGP sessions can start immediately after the receipt of the NOTIFICATION message, even before the link failure has been cleared. Because NLRI UPDATE messages can only originate from Routers A and C, the establishment of a link from

Using BGP Updates for Topology Information

In Fig 7.0 we see a more complex configuration. In this case the virtual router has to make a decision as to whether to cross connect Router D or C to Router A. Router B, which controls the OXC, would make the initial decision of whether to connect Router D or Router C to Router A. This decision could be based on which cross connection will reduce the load on the electrical forwarding engine on Router B. However, if all else is equal and Router B has no preference on connecting either Router C or D to Router A, then it would be ideal if Router A could signal its own preference. A complicating factor is that Router A may not even know of the existence of Router C and Router D before the initial configuration.

There are a number of possible techniques for Router A to signal to Router B its preferred connection:

1. Static configuration at setup
2. Establish at configuration knowledge of the destination router
3. Use BGP UPDATE information such that Router A can make a dynamic decision
4. Let Router A control the OXC on Router B and advertise a virtual router that is part of Router A’s domain

Option 1 is clearly straightforward and the connection can be established as part of the standard configuration.

Option 2 requires prior knowledge of the destination router by Router A but requires no further configuration setup. In this case Router A can signal to Router B in its OPEN message its desire to connect to a specific router. Router B still retains the decision authority as to whom it will cross connect to serve its own basic needs. However, everything else being equal, Router B can designate the virtual router to cross connect the routers as indicated in the OPEN message from Router A or C. The loopback address or the actual interface of the designated router can be used to indicate the required destination.

However it would be attractive to have a signalling protocol that allows Router A to indicate a cross connect preference to Router B without any prior knowledge of the other routers attached to Router B. More importantly, it would be ideal if Router A could also change its preference over time.

One possible technique is that, after initial configuration, Router A performs a route flap. If initially Router A is connected to Router C, but it would instead prefer to be connected to Router D, it can terminate the existing BGP process with the virtual router. Router A only knows Routers C and D by their ASs. Routers C and D in effect could each be an abstraction of an entire network. Router A then initiates a new BGP process by sending a BGP OPEN message to the virtual router, indicating in the options field that its preference is to connect to Router D (and any other subsequent routers further down the path).
The virtual router, in turn, would then close down its BGP session with Router C and initiate a BGP session with Router D. Recall that at the very beginning Router D did a “soft reboot” of its BGP process and since then has been trying to establish a BGP session with the virtual router controlling the OXC.

If Router D were also an integral OXC and router like Router B, it could further propagate any special routing requests from Router A using the same technique of route flaps. Router B in its OPEN message with Router D could carry a list of ASs to which Router A would prefer to have a direct optical cross connection.

Using ASs for topology information means that Router D could in fact be an abstraction of an entire cloud of routers. Router D would then carry Router A’s route preference across an iBGP cloud or MPLambdaS LSP to the egress router on the other side of the network cloud. Propagating that route preference can be done by a centralized LRA for the whole network cloud.

The fourth alternative is for Router B to assign control of the actual OXC to Router A, so that the choice of whether to connect to Router C or Router D can be made by Router A, as its network cloud now directly touches the Router C and Router D network clouds. This is described further in the following section.

Multiple OXCs and Assigning OXCs to other ASs

One of the unique features of this configuration is that one of the optical cross connects that is physically controlled by Router B is actually managed as a Virtual Router in Router C’s AS space. The instantiation of the virtual router can be run on Router B’s or on Router C’s processor. However, in the latter case a protocol would have to be established so that Router C could physically control the optical cross connect. TCP/IP management of each optical cross connect would be advantageous so that the router in control of the optical switch can configure or control the optical cross connect, or hand off its management to another party.
Router B also controls two other optical switches which may be located elsewhere in the network matrix. By prior arrangement Router B would hand over to Router C the management of the optical cross connect related to the green and yellow wavelengths on both the input and the output of the optical switch.

In this case Router C would advertise that it has yellow and green wavelengths to its virtual router VR1. The virtual router VR1 in turn would advertise that it can cross connect yellow, green, red and blue to both Router C and virtual router VR2. Similarly VR2 would advertise to VR3 that it can cross connect yellow and green. Finally VR3 would advertise to Router A that it can cross connect yellow, green, red and blue.

The initial OPEN messages might look as follows:

From Router C to Router B:
    OPEN AS 300 Loopback 5.5.5.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface loopback0                 (indicates the presence of a virtual router)
    ip address 7.7.7.3/32
    interface oxc 0/1
    ip address x.x.x.3                  (indicating that it can cross connect green)
    neighbour x.x.x.3 update source loopback
    interface oxc 0/2
    ip address x.x.x.4                  (indicating that it can cross connect yellow)
    neighbour x.x.x.4 update source loopback
    interface oxc 0/3
    ip address x.x.x.1                  (indicating that it can cross connect blue)
    neighbour x.x.x.1 update source loopback
    interface oxc 0/4
    ip address x.x.x.2                  (indicating that it can cross connect red)
    neighbour x.x.x.2 update source loopback

For Router B:
    OPEN AS 400 Loopback 12.12.12.1/32
    (optional data fields in OPEN message - illustrative purposes only)
    (actual information would be abstracted to numeric data fields)
    interface loopback0                 (indicates the presence of a virtual router)
    ip address 7.7.7.3/32
    interface oxc 0/1
    ip address x.x.x.3                  (indicating that it can cross connect green)
    neighbour x.x.x.3 update source loopback
    interface oxc 0/2
    ip address x.x.x.4                  (indicating that it can cross connect yellow)
    neighbour x.x.x.4 update source loopback
    interface oxc 0/3
    ip address x.x.x.1                  (indicating that it can cross connect blue)
    neighbour x.x.x.1 update source loopback
    interface oxc 0/4
    ip address x.x.x.2                  (indicating that it can cross connect red)
    neighbour x.x.x.2 update source loopback

In this example Router C would initiate a BGP OPEN message with VR2 and VR1. VR1 would establish a BGP OPEN message with VR2. VR2 would recognize from the BGP OPEN messages that it receives that it can only do a cross connect between Router A and VR2. VR2, therefore, would not establish a BGP peering session directly with VR3 in Router C’s AS.

Private ASs, confederations and route reflectors are a number of well known techniques that could be used for managing a large cloud of BGP speakers.

Lightpath Route Arbiter

The LRA is the process that establishes the control of virtual BGP routers and the priorities for establishing the optical cross connects. The LRA can be integral to the “real” BGP process for small network configurations or be a completely separate entity running elsewhere on the network. The LRA can also represent a multiple router cloud to any external network as a single router abstraction.

In a small network configuration all the “real” routers can control the establishment of their collocated virtual BGP routers. An LRA process would be needed to keep track of how many cross connects are made to each peer. For example, it would not make sense to have multiple cross connects between the same 2 adjacent routers. Using a set of metrics determined by the network operator, the LRA might first establish one virtual router for every pairwise set of adjacent routers. If there were additional wavelengths or optical cross connects, the LRA may establish cross connects and virtual routers for those links that required a direct multi-hop path for QoS or traffic engineering purposes.
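The allocation order just described can be illustrated with a short Python sketch. It is not a proposed LRA implementation: the function name allocate_cross_connects, its arguments, and the idea of a fixed count of available cross connects are all assumptions made for illustration. The sketch serves adjacent router pairs first, never allocates two cross connects to the same pair of routers, and spends any remaining cross connects on requested multi-hop paths.

```python
def allocate_cross_connects(adjacent_pairs, multihop_requests, available):
    """Pick which router pairs get a cross connect (and a virtual router).

    adjacent_pairs:    iterable of (router, router) tuples, highest priority.
    multihop_requests: iterable of (router, router) tuples wanting a direct
                       multi-hop lightpath for QoS or traffic engineering.
    available:         number of cross connects the OXC can still provide.

    All three inputs are hypothetical; a real LRA would rank candidates
    using operator-defined metrics rather than simple list order.
    """
    chosen = []
    seen = set()
    for pair in list(adjacent_pairs) + list(multihop_requests):
        if len(chosen) >= available:
            break                      # no wavelengths or cross connects left
        key = frozenset(pair)          # (A, B) and (B, A) are the same pair
        if len(key) != 2 or key in seen:
            continue                   # skip self-pairs and duplicate pairs
        seen.add(key)
        chosen.append(tuple(pair))
    return chosen


plan = allocate_cross_connects(
    adjacent_pairs=[("A", "B"), ("B", "A"), ("B", "C")],
    multihop_requests=[("A", "C"), ("A", "D")],
    available=3,
)
print(plan)
```

In this example the duplicate pair ("B", "A") is skipped, the two adjacent pairs are served first, and the one remaining cross connect goes to the first multi-hop request.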
In large complex networks it may be more convenient to have the establishment of the optical cross connects and the virtual BGP peering sessions under the control of a centralized Lightpath Route Arbiter (LRA). It is proposed that an LRA be established to act first in the role of a traditional route server but also to assign lightpath cross connections and support the virtual BGP peering sessions. The virtual BGP peering sessions could actually run on the LRA and remotely control the optical cross connect switch from a distance.

Initially it is postulated that individual BGP speakers will notify the LRA of their desired peering and interconnection requirements. This could be done manually. In Fig 2.0, for example, Router A will advertise that it has at least 2 routes to Router B. Router B can also advertise to the LRA that a preferred path exists across the direct optical switch connection. In this situation Router B is in control of whether the more direct connection will be advertised to the LRA.

This basic type of architecture has been used in IP over ATM network clouds to interconnect a number of BGP speaking routers. In the case of the ATM network the switched virtual circuits were not established until the routers had to forward a data packet. With an LRA and an optical network the lightpaths are set up when the initial BGP peering is established. In some cases the lightpaths can be established using Multi Protocol Lambda Switching.

Upon receipt of either the BGP “OPEN” or “UPDATE” message the LRA can establish the appropriate lightpath between the BGP peers. The LRA confirms the establishment of the connection with the BGP “OPEN CONFIRM” message.

It might also be possible for the LRA to use a dynamic host addressing mechanism to assign addresses to the various links between the BGP speaking peers.
Upon receiving notification of peering requests from the various BGP speaking peers, the LRA could establish the optical links that support those peering requests and then assign IP addresses to the end points of the link. The LRA could then also send the new BGP updates to all of the BGP peers.

The LRA does not have to exist as a single entity. It can be a distributed function, but a master-slave relationship would probably be required to ensure synchronization between distributed entities of the LRA.

OBGP for traffic engineering across large networks

Although OBGP could clearly work with the interconnection of a small number of autonomous systems, the biggest challenge that remains is the management of large optical clouds operated by carriers and Tier 1 ISPs. As with any autonomous system, it is assumed first that the operator will optimize the OBGP lightpath architecture to minimize load on their own routers and provide the most efficient routing for their own internal traffic needs.

However, if a network operator has a surplus of optical lightpaths or has established a business relationship with other service providers to provide lightpath transit, it would then be useful to have a signalling protocol that would allow the automatic establishment of lightpaths across several different management domains. These lightpaths could then be used for traffic engineering purposes or to support advanced Quality of Service delivery, because the path is made up entirely of optical cross connects.

As mentioned previously, there are several proposed techniques for the management and configuration of lightpaths within a single domain. IGP protocols are useful in gathering topology information within a single cloud. IGP protocols could be used to support OBGP within a single domain to determine the lightpath topology across a single network between all the real routers.
Given that IGP topology, an LRA could then try to establish the optical cross connects as BGP virtual routers in order to establish an all optical transit path between the egress and ingress points of the network. Alternatively the LRA could establish an MPLambdaS path between an OXC at the ingress and egress points of its network.

The virtual BGP routers at the edge established in this manner could then be assigned to an external network’s AS. The external network would then not even see the intervening network as a network at all. The external network could then directly touch the external peers of the intervening network.

Communication of Link Failure by TCP

One of the disadvantages of virtual BGP routers is the fact that routing “keep alive” messages do not transit the same path as the data link. As such, when a link failure occurs, the virtual router will have no knowledge of the failure and will continue to advertise to the rest of the world that the data forwarding path is still valid. This results in a condition called “black holing”.

To avoid “black holing” it is critically important that any link failure detection be communicated back to the virtual router so that it can quickly update the routing tables and propagate the routing changes across the network.

Most link layer devices, whether they be Gigabit Ethernet devices or SONET interfaces, only communicate link failure information to the other end of the link through a private channel or through the absence of some sort of “keep alive” messaging protocol. In the all-optical networks of the future, where the switching and routing is done in the optical domain, it will become increasingly important to communicate link failure information at the TCP layer back to the data forwarding router instead of through proprietary communication channels.

Future Optical Network Architectures based on BGP

These new customer driven architectures using OBGP can be broadly categorized into 3 separate architectures:

1. Point to point mesh overlay on a dark fiber network;
2. Optical Internet Exchange; and
3. Distributed Optical Internet Exchange.

They are described more fully in the following sections.

Point to point mesh overlay on a dark fiber network

To date, most carrier DWDM network solutions in the metro area have been intended primarily for relief of fiber congestion. However, another application for wavelength networks is not only to carry high bandwidth traffic but to establish mesh peering arrangements between consenting peers which would be impractical or impossible with other technologies.

This concept is illustrated in Fig 9.0. University X in this example has purchased a dark fiber connection to a local ISP A. The institution is also dual homed to another ISP B, which may be located in a different city altogether. In this case the dark fiber is mapped to an optical switch that then forwards the traffic over a wavelength to ISP B. However, the fiber to the ISP is shared with another institution, and a mesh of wavelengths is established between the 2 institutions and the ISP. An optical add drop mux is configured as a virtual BGP router where the interconnection is made to University Y.

This use of a virtual router technique provides all 3 organizations with alternate paths in the event of a failure in any of the directly meshed wavelengths between the institutions. The virtual router can be managed by either University X or University Y and be considered part of one of the institutions’ domains. The physical location of the switch can be at any location where the physical fibers are connected to a router or switch.

Another example of this type of architecture is where a number of institutions such as universities, hospitals and research institutions collaborate on the construction of a community based dark fiber network. Generally these institutions want to interconnect to each other for the exchange of traffic.
In the past there were two ways of doing this: the first method was to build a mesh of physical fiber connections, but this approach would not scale beyond a small number of institutions; the second approach was to have centralized routers and switches for the termination of the fiber connections, but this required a central management organization to maintain and operate the devices.

DWDM (or CWDM) provides a third option. With multiple lightpaths, institutions that are part of a community dark fiber network can set up a full or partial mesh of connections between themselves. With a small number of institutions sharing the same fiber ring, the optical switching function can be distributed as shown in Fig 10.0.

The network may start off with only a partial mesh of wavelengths, but using virtual BGP routing additional lightpaths can be set up independent of any central administrative domain. Each of the optical cross connects may be operated autonomously and independently of the others. It is assumed that the participating organizations will be motivated to establish direct optical cross connects rather than transiting another institution’s traffic through an electrical forwarding engine router. A simple low cost router with an integrated optical switch would be less costly than a high performance router that aggregates and forwards all packets.

The actual mesh architecture is indicated by the dotted lines. Note that some of the links are asymmetric in that only the forward path has a direct optical cross connection. Institutions A and D, as well as B and C, are directly cross connected. Traffic from A to C would have to go through Router B rather than being cross connected. Traffic from C to A would take a completely different route through Router D.

Optical Internet Exchange

Another variation on this type of architecture is an Optical Internet Exchange (OIX).
Currently most Internet Exchange points use combinations of either ATM virtual circuits or switched Ethernet channels for the establishment of peering connections. These connections must be implemented by a central administrative organization across a switch matrix. An OIX can be configured as a ring structure with no central or hub switch for controlling the wavelengths, as described previously, or as a star configuration with a simple central optical switch. The switch elements in this case can be either centrally managed or individually controlled by the participating peering networks of the OIX as shown in Fig 10.0. The BGP configurations in Fig 10 and Fig 11 are identical, but now the virtual BGP optical cross connect router is located on a central switch while remaining in the participating institution’s AS. The switch ports are, in effect, an extension of the participating institution’s AS and, most importantly, are under its control.

An LRA is an optional device for large optical IXs where individual peers communicate with a central LRA to find out what lightpaths might be available across the switch. The LRA would assign the appropriate cross connection and either run the virtual BGP process itself or assign it to one of the participating real routers. The LRA establishes the cross connections between the respective ASs based upon the initial BGP peering requests, established either with the BGP OPEN message or during the BGP configuration.

Distributed Optical Internet Exchange and wavelength peering

In this model a number of autonomous optical networks are interconnected as shown in Fig 12.0. Each optical switch (or group of switches) is configured as a BGP private or public autonomous system. Externally connected networks can then establish wavelength routing policy across multiple optical domains using metrics such as shortest AS path and other common BGP attributes.
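As an illustration of how an exterior network might choose among candidate lightpaths using common BGP attributes, the following minimal sketch ranks routes by LOCAL_PREF and then by AS path length. It is a simplification of the full BGP decision process and is not part of the OBGP proposal itself. The class and function names, the private AS numbers and the documentation prefix are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightpathRoute:
    """A candidate route learned over one lightpath (hypothetical model)."""
    prefix: str
    as_path: tuple          # AS numbers traversed, nearest AS first
    local_pref: int = 100   # higher values are preferred


def best_route(routes):
    """Prefer the highest LOCAL_PREF, then the shortest AS path.

    Only the first two steps of BGP route selection are modelled here;
    the remaining tie-breakers (origin, MED and so on) are omitted.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Three lightpaths to the same destination, each crossing a different
# chain of optical domains. Private AS numbers are assumed for illustration.
routes = [
    LightpathRoute("192.0.2.0/24", (64512, 64513, 64520)),
    LightpathRoute("192.0.2.0/24", (64514, 64520)),
    LightpathRoute("192.0.2.0/24", (64515, 64516, 64517, 64520)),
]

best = best_route(routes)
print(best.as_path)  # (64514, 64520)
```

Raising `local_pref` on one of the longer routes would override the AS path comparison, which is how a participating network could express a policy preference for a particular wavelength.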
In this architecture each autonomous system “exchanges” wavelengths with its neighbour and thereby extends its number of peers to many other networks, even though they may not be physically adjoining each other. This exchange of wavelengths is generally only possible with networks that control and manage their own dark fiber and have deployed CWDM (or DWDM) systems.

For example, one of the first proposed practical demonstrations of OBGP is shown in Fig 13.0. In this architecture the RISQ network would offer a transit wavelength using OBGP across its network to ONet such that ONet can directly peer with networks in New York, and ONet in turn would offer an OBGP transit wavelength across its network such that RISQ can directly peer with networks in Chicago. If RISQ could configure its optical switches such that the “virtual” BGP routers were in ONet’s AS, ONet could then end up with a direct peering relationship with networks in New York. In addition ONet may offer OBGP transit wavelengths to both RISQ and BCnet such that they can directly peer with each other. Again ONet, if it so chose, could assign the “virtual” BGP routers to RISQ’s and BCnet’s respective ASs such that it would appear to their respective networks that they have physically adjoining networks.

Fig 13.0 also shows a proposed real world example of a distributed management domain. The STAR TAP currently is a single exchange point for research networks. It has been proposed that it become a distributed facility in order to offer trans-continental interconnection for research networks between Europe and Asia. However, the STAR TAP itself does not operate any trans-continental circuits. In order to accomplish this it is proposed that CA*net 3, in partnership with its regional networks, would offer “lightpath transit” to New York and Seattle. The same STAR TAP AS 10764 would be distributed across 3 separate physical locations, and its routing ingress points would be physically distributed to the OXCs at New York and Seattle.
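The effect of assigning “virtual” BGP routers to a neighbour’s AS, as in the RISQ and ONet example above, can be pictured by looking at the AS-level path that results. The sketch below is only a toy model under stated assumptions: each hop along a lightpath is labelled with the AS it is assigned to, and consecutive hops in the same AS collapse into one AS path entry. The function name and the hop labels are hypothetical.

```python
def as_path(hops):
    """Collapse a hop-by-hop list of AS labels into an AS-level path."""
    path = []
    for asn in hops:
        # Consecutive hops inside the same AS count as one AS path entry.
        if not path or path[-1] != asn:
            path.append(asn)
    return path


# Hops from an ONet router, across two RISQ optical switches, to a New York peer.
# Conventional case: the RISQ switches belong to RISQ's own AS.
conventional = ["ONet", "RISQ", "RISQ", "NY-peer"]

# OBGP case: RISQ assigns the virtual BGP routers on its switches to ONet's AS.
virtual = ["ONet", "ONet", "ONet", "NY-peer"]

print(as_path(conventional))  # ['ONet', 'RISQ', 'NY-peer']
print(as_path(virtual))       # ['ONet', 'NY-peer']
```

In the second case the New York network appears as a direct neighbour of ONet, even though the lightpath physically crosses RISQ’s fiber.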
Wavelength Exchange, Transit and Peering

The Internet in the year 2000 is increasingly being dominated by a small number of major Tier 1 ISPs. In addition to these developments, there still remains no satisfactory solution for equitable settlement and cost sharing between interconnected ISPs, whether they are Tier 1 or not. There is a concern that the richness and creativity of the Internet, which stem from a large and varied number of participating networks, may be disappearing. The rapid evolution of the Internet from a global set of autonomous peering networks to a very small set of hierarchical Tier 1 Internet service and content providers threatens the very essence of the Internet that made it such a powerful agent of innovation and social change.

As David Farber, Chief Technologist for the FCC, described in a recent interview, there is a danger of a new digital divide, but not just the familiar kind that involves race and social class. Of equal concern is the possibility that economically disadvantaged entrepreneurs would be prevented from succeeding on the Web. As single companies begin to control both Internet content and systems for gaining access to that content, such as cable television lines, the Web could resemble a shopping mall with no prime space available for start-ups with no money.

CWDM (DWDM) networks and optical BGP (OBGP) may have the potential to circumvent these developments. Recent advances in optical technology will soon allow optical fibers to carry hundreds, if not thousands, of wavelengths. As well, extreme wide band optical technologies may enable long haul DWDM networks that require no electrical repeaters. Extending these technologies to the enterprise customer through widespread deployment of dark fiber may allow network customers to be directly interconnected and to operate their own wavelength services between themselves without any central carrier managed network services.
With OBGP, wavelength topologies are established by the customer during the configuration of BGP peers or neighbours. Wavelength networks can be established without any interior routing protocol or network management service. The optical network can be configured using the well known techniques of BGP, private AS domains, iBGP route re-distribution and so on. Rudimentary network policy and traffic engineering can be achieved with the well known use of BGP metrics. As such, traditional network carrier services such as restoral, protection, network configuration, management and provisioning may not be necessary in future optical Internet networks. Future telecommunication networks may simply require an operator to maintain the integrity of the fiber and any all optical repeaters that may exist along the path of the fiber infrastructure.

The new approaches to optical networking outlined in this paper, combined with the ready availability of dark fiber, may allow Tier 2 and Tier 3 Internet service providers to interconnect with each other and largely bypass the large, dominant Tier 1 service providers.

University research and education networks may be in the best position to be the first to exploit these new concepts in optical networking. Many research and education networks in Canada are in the process of deploying their own fiber networks and operating their own CWDM networks. Rather than deploying a traditional hierarchical network architecture to interconnect these networks (e.g. CA*net 3), they may be able to exchange wavelengths for both interconnection and mutual exchange of transit traffic to other optical networks.

The issues of equitable transit and peering charges between service providers may also be mitigated through the use of OBGP routing and the exchange of wavelengths. Because wavelengths are of fixed bandwidth, the usual difficulties of guaranteeing transit traffic across an intermediate ISP become less onerous.
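One way to picture the accounting that fixed-bandwidth wavelengths make possible is to weight each transit wavelength by its length and its bandwidth and to settle on the difference between what two ISPs provide to each other. The sketch below is only an illustration: the use of the length-bandwidth product as the unit of account, the rate and all function names are assumptions, not part of any OBGP proposal.

```python
def transit_value(length_km, bandwidth_gbps):
    """Weight one transit wavelength by its length and bandwidth (assumed unit)."""
    return length_km * bandwidth_gbps


def settlement(provided_by_a, provided_by_b, rate_per_gbps_km):
    """Net amount ISP B owes ISP A; a negative result means A owes B.

    Each argument is a list of (length_km, bandwidth_gbps) tuples, one per
    transit wavelength that the ISP carries across its own network for the peer.
    """
    value_a = sum(transit_value(km, gbps) for km, gbps in provided_by_a)
    value_b = sum(transit_value(km, gbps) for km, gbps in provided_by_b)
    return (value_a - value_b) * rate_per_gbps_km


# ISP A carries one 2.5 Gbps wavelength for 600 km on behalf of ISP B;
# ISP B carries one 2.5 Gbps wavelength for 400 km on behalf of ISP A.
owed_to_a = settlement([(600, 2.5)], [(400, 2.5)], rate_per_gbps_km=2)
print(owed_to_a)  # 1000.0
```

If both ISPs provide wavelengths of equal length and bandwidth, the result is zero and no settlement is needed.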
In fact, wavelength exchange may be a much simpler and more effective accounting mechanism at ISP interconnection and peering points. For example, peering ISPs may agree to provide each other with transit wavelengths by distributing an eBGP network across an iBGP optical cloud. The only settlement issue is the difference in length and width of the wavelengths that are transited across the respective networks. Settlement costs can be easily calculated for this type of service.

Through the large interconnection of many adjacent BGP speaking peers of “customer empowered” autonomous optical networks, the original vision of the Internet may eventually be realized. By extending this model to a much larger scale, interconnecting hundreds of Tier 2 and Tier 3 ISPs and major customer networks, a global mesh of optical Internets may be possible such that no single network has a dominant market position for the delivery of Internet services.

Conclusion

OBGP may prove to be a simple and effective way for customers to manage and operate their own optical network as an independent component of a larger carrier’s optical network infrastructure. BGP networking is more in line with the general Internet engineering principle that “good enough” is usually more economical and effective than more complex network management and configuration.

Many questions remain about the practicality and scalability of these concepts. Perhaps these concepts will only work on a small scale between trusted network entities like university and research networks. Fundamental flaws may also exist that will simply undermine any practical deployment of these networks. It is hoped that this short discussion paper will provoke debate and discussion and a more thoughtful analysis within the Internet technical community.
References:

KRA00 Kompella, K., Rekhter, Y., Awduche, D., et al., “Extensions to ISIS/OSPF and RSVP in support of MPLambdaS”, draft-kompella-mplsoptical-00.txt, http://www.ietf.org/ietf/1id-abstrsacts.txt
ARD99 Awduche, D., Rekhter, Y., Darke, J., Coltun, R., “Multi-protocol Lambda Switching: Combining MPLS Traffic Engineering Control with Optical Crossconnects”, draft-awduche-mple-te-optical-01.txt, http://www.ietf.org/ietf/1id-abstrsacts.txt
ATM96 ATM Forum, “Private Network-Network Interface Specification: Version 1.0”, March 1996, http://www.atmforum.org
BCK98 Bates, T., Chandra, R., Katz, D., Rekhter, Y., “Multiprotocol Extensions for BGP-4”, RFC 2283, February 1998.
STB00 St. Arnaud, B., Turcotte, B., Bjerring, A., “CA*net 3 Customer Empowered Networks”, LaRecherche, February 2000, www.canet3.net
CCM98 Chung, T.W., Coulter, J., Fitchett, J., Mokbel, S., St. Arnaud, B., “Architectural and Engineering Issues for Building an Optical Internet”, September 1998, http://www.canet3.net/frames/papers.html
H97 Halabi, B., Internet Routing Architectures, Cisco Press, Indianapolis, IN, 1997
HC99 Handley, M., Crowcroft, J., “Internet Multicast Today”, Internet Protocol Journal, Volume 2, Number 4, December 1999
LRA00 Luciani, J., Rajagopalan, B., Awduche, D., Cain, B., Jamoussi, B., “IP over Optical Networks – A Framework”, draft-ip-optical-framework00.txt, September 10, 2000, http://www.ietf.org/ietf/1id-abstrsacts.txt
RS98 Ramaswami, R., Sivarajan, K., Optical Networks: A Practical Perspective, Morgan Kaufmann, San Francisco, CA, 1998