Advanced Optical Technologies and Their Potential Deployment Timeframes

Peter Tomsu (1), Graca Carvalho (2)
(1) Cisco Systems Austria, Email: ptomsu@cisco.com
(2) Cisco Systems Portugal, Email: gcarvalh@cisco.com

Table of Contents

Summary
Motivations For New Advanced Optical Technologies
    Why Optical rather than IP Networks for Grids?
    Dynamic Optical Networks as a Grid Service Environment
Optical Technologies Suitable for Grid Requirements
    Dynamic Wavelength Switched Optical Infrastructures for Grid
        Optical Wavelength Switching with Overlay Control
        Optical Wavelength Switching with Extended GMPLS Control
    Active Optical Burst Switching for Grid Services Support
References

Summary

Next generations of advanced optical networks will not only see a tighter coupling of the control plane with IP, they also need to satisfy all requirements coming from bandwidth intensive users and applications, as well as distributed computing and computational Grids.
This paper discusses the transition from today’s widely used static Optical Wavelength Switching (OWS) networks to new architectures: first overlay controlled OWS networks, then peer controlled OWS networks, and finally Optical Burst Switching (OBS), which further refines the granularity of network resources. The most important developments and extensions to existing technologies are listed, and timeframes for the maturity and deployment of these new technologies are described.

Motivations For New Advanced Optical Technologies

During the past years it has become evident to the technical community worldwide that computational resources cannot keep up with the demands generated by some scientific applications. As an example, particle physics experiments [i,ii] will produce more data than can realistically be processed and stored in one location (several Petabytes/year and soon Exabytes/year); see, for example, ATLAS (http://atlas.ch), CMS (http://cmsinfo.cern.ch) and EGEE (http://public.eu-egee.org). In such situations, where intensive computation on shared large-scale data is needed, one can use accessible computing resources distributed over different locations (Grid computing). Distributed computing and the concept of a computational Grid are not a new paradigm, but until a few years ago networks were too slow to allow efficient use of remote resources. As the bandwidth and the speed of networks have increased significantly, the interest in distributed computing has been taken to a new level. The use of the available fiber and DWDM infrastructure for the global Grid network is an attractive proposition, ensuring global reach and huge amounts of cheap bandwidth [iii,iv]. It is important to understand the potential applications and the community that would make efficient use of an optical infrastructure.
Currently, Grid computing using optical network infrastructure is dedicated to a small number of well-known organizations with extremely large jobs (e.g. large data file transfers between known users or destinations [iii]). Due to the static or semi-static nature of this type of Grid, long-lived wavelength paths between clients and Grid resources with centralized job management strategies are usually deployed (e.g. Lambda Grids, OptIPuter, http://www.optiputer.net/), where the optical network acts as a backplane interconnecting computing resources. This type of Grid networking relies on carrier provision of optical network resources, while the Grid users have no visibility of the lambda infrastructure. In other words, the Grid user is not able to set up paths over the optical Grid network. As Grid applications evolve, the need for user controlled network infrastructure becomes apparent in order to support emerging dynamic and interactive services [v]. Examples of such applications are e-health, immersive interactive learning environments, high-definition interactive TV, high resolution home video editing, real-time rendering and visualization. These applications need infrastructures that make vast amounts of storage and computation resources potentially available to a large number of users. Key for the future evolution of such networks is to determine early on the technologies, protocols, and network architecture that would enable solutions to these requirements.

Why Optical rather than IP Networks for Grids?

In order to understand why optical networking is important for Grids, we also need to understand the current limitations of packet switching for Grid and data-intensive applications. The current Internet architecture is limited in its ability to support Grid computing applications and specifically to move very large data sets.
Packet switching is a proven, efficient technology for transporting bursty transmissions of relatively short data packets, e.g. for remote login, consumer oriented email and web applications. It has not been sufficiently adaptable to meet the challenge of the large-scale data that Grid applications require. Making a forwarding decision every 1500 bytes is adequate for standard applications like email or 10k-100k web pages. For typical Grid applications, whose data sizes are six to nine orders of magnitude larger, it is no longer the optimal mechanism. For example, copying 1.5 Terabytes of data using packet switching requires making the same forwarding decision about 1 billion times, at every router along the path. Setting up circuits or switching bursts over optical links is a more effective multiplexing technique. It can also be assumed that similarly large data volumes will be used by some standard business applications in the future, so the need for new technologies that forward large amounts of data efficiently will arise from different sides, not only from Grid networking.

TCP is designed for relatively small data packets and thus works well with small Round Trip Times (RTT) and small pipes. It was designed and optimized for the LAN or a narrow WAN. The limitations of TCP in big pipes with large RTT are well documented [vi,vii]. The responsiveness is the time needed to recover from a single loss: it measures how quickly TCP goes back to using a network link at full capacity after experiencing a loss. For example, 15 years ago, in a LAN environment with RTT = 2 ms and 10 Mb/s, the responsiveness was about 1.7 ms. In today’s 1 Gb/s LAN with a maximum RTT of 2 ms, the responsiveness is about 96 ms. In a WAN environment the RTT is very large: from CERN to Chicago it is 120 ms, to Sunnyvale 180 ms, and to Tokyo 300 ms. In these cases the responsiveness is over an hour.
In other words, after a single loss between CERN and Chicago on a 1 Gb/s link it would take the network about an hour to recover. Between CERN and Tokyo on a 10GE link, it would take the network about three hours to recover [viii].

In order to address some of the above packet switching limitations, new transport protocols have started to evolve. Examples are GridFTP, FAST, XCP, Parallel TCP, and Tsunami. The enhancements in these protocols rely on three mechanisms: adjusting the TCP and UDP settings, transmitting over many parallel streams, and sending the data over UDP while the control is done over TCP. Transmitting over TCP without the enhancements results in about 20 Mb/s over the Atlantic. Recent tests have seen GridFTP achieve 512 Mb/s and Tsunami 700 Mb/s, and in April 2003 FAST achieved 930 Mb/s from CERN to SLAC [ix]. None of these protocols can fully utilize an OC-192 link on its own; this can only be achieved by statistically multiplexing multiple streams that use these protocols.

GMPLS controlled optical networks, on the other hand, do not impose any limitations due to transmission speed. They can already support OC-192 directly today and can offer bandwidth services that are dynamically controlled by individual users and applications at the wavelength or sub-wavelength level. Based on this capability, it is likely that future data-intensive applications will request the optical network to provide a connection on a private network (i.e. an Optical VPN (OVPN)) and not on the public Internet. The network infrastructure will have the intelligence either to connect the applications over the IP network (packet) or to provide them with a λ (circuit). The lambda switched optical network will also offer improved security compared to the traditional IP network.
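The packet count and the throughput figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming decimal units (1 Terabyte = 10^12 bytes) and an idealized, loss-free transfer at the quoted sustained rate; the function and variable names are illustrative and do not come from any of the cited tools.

```python
# Figures quoted in the text; decimal units are an assumption of this sketch.
DATA_BYTES = 1.5e12      # 1.5 Terabytes to be copied
PACKET_BYTES = 1500      # one forwarding decision per 1500-byte packet

# Sustained rates in Mb/s as quoted in the text for transatlantic tests.
RATES_MBPS = {"plain TCP": 20, "GridFTP": 512, "Tsunami": 700, "FAST": 930}


def forwarding_decisions(data_bytes, packet_bytes):
    """Per-packet forwarding decisions each router on the path has to make."""
    return data_bytes / packet_bytes


def transfer_hours(data_bytes, rate_mbps):
    """Idealized transfer time in hours at a constant rate given in Mb/s."""
    return data_bytes * 8 / (rate_mbps * 1e6) / 3600


print(f"forwarding decisions per router: {forwarding_decisions(DATA_BYTES, PACKET_BYTES):.1e}")
for name, rate in RATES_MBPS.items():
    print(f"{name:10s} {rate:4d} Mb/s -> {transfer_hours(DATA_BYTES, rate):6.1f} h")
```

Under these assumptions the 1.5 Terabyte copy needs 10^9 forwarding decisions per router, takes roughly a week at 20 Mb/s, and still takes several hours at the best rate quoted. The rates were measured in different tests, so the comparison is indicative only.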
A λ service provided through OGSI [x] will allow Virtual Organizations (VOs) and virtual laboratories (e.g. eVLBI) to access abundant optical bandwidth by offering optical bandwidth on demand to data-intensive and compute-intensive applications. This will provide essential networking fundamentals that are presently missing from Grid Computing research and will overcome the bandwidth limitations, making VOs a reality.

Dynamic Optical Networks as a Grid Service Environment

Optical networks can be viewed as "network resources" to be offered as services to the Grid like any other resources such as processing and storage devices. The Open Grid Services Architecture (OGSA) initiative [xi] defines the semantics of Grid service instances, including service instance creation, naming, lifetime management and communication protocols (this architecture can be seen in Figure 1).

Figure 1: Grid Services as defined by OGSA

Up to now, few people in the Grid community have thought of the network as a resource in the same way as processing or storage. They are inclined either to view the network as a bottleneck or, if bandwidth resources are plentiful, to take the network for granted without the need for bandwidth reservation for their applications [iii].
If optical networks are considered as network resources to be shared among Grid users in a collaborative manner, one needs to specify exactly what is meant by optical network resources, how to encapsulate these resources into services, how to manage these services and how to share these resources among VOs. Figure 2 shows the encapsulation of network resources in different network services.

Figure 2: Encapsulation of Optical Network Resources

Currently, Grid applications can have visibility of some IP network parameters such as available bandwidth, delay, packet loss, packet reordering and jitter. An example of such an approach is the GARA broker (http://www.globus.org/retreat00/presentations/i_volker_qos_slides/tsld004.htm). In optical networks, resources may include a much broader set of parameters, such as an optical cross-connect (OXC) or another photonic switching device (e.g. OBS, OPS), a fibre, a wavelength, a waveband, a generalized label, an optical timeslot, an interface, etc. These and other choices are normally coupled tightly with the intended application. Whatever the choices, it is recognised that an optical resource (as defined) will involve two or more network entities and is not wholly contained within a single network element. This makes the situation more complex than at present, since any reservation and allocation must involve the cooperation of more than one network element. Other Grid resources, such as processors and storage devices, can simply be controlled and allocated (booked, reserved) by one network element without external constraints.

Optical Technologies Suitable for Grid Requirements

Dynamic Wavelength Switched Optical Infrastructures for Grid

The first approach to achieve the required functionality for Grids is to deploy extensions to existing commercial wavelength switched networks, aiming to facilitate user controlled bandwidth provisioning for data intensive Grid users.
This can be realised by the development and deployment of user controlled optical switching, a distributed optical control plane and dedicated user and application interfaces.

Optical Wavelength Switching with Overlay Control

One possible solution is to manage the Grid and optical layer resources separately in an overlay manner, which is the state of the art in Grid networking today. In this scenario, Grid resources are managed by Grid middleware, which is responsible for Grid resource brokering and reservation. Once the Grid resources are identified, the Grid applications request high-bandwidth connections to reach the required resources via a standard Optical UNI (OUNI) in a typical overlay implementation. The provisioning of the light path is then managed in the optical network layer by the GMPLS enabled optical control plane, which supports distributed control of the optical network and end to end dynamic bandwidth provisioning. An Application Program Interface (API) provides connectivity between the Grid middleware and the optical control plane (e.g. GMPLS). The Grid middleware acts as an intermediary between user applications and the wavelength-routed network control plane. A “traditional” Grid broker discovers and schedules resources and services without taking optical network resources into account.

As noted earlier, today’s Lambda Grids rely on carrier provisioned, long-lived wavelength paths with centralized job management, and the Grid user has no visibility of the lambda infrastructure and cannot set up paths over it. Emerging dynamic and interactive services call for user controlled network infrastructure that makes vast amounts of storage and computation resources available to a large number of users.

Figure 3: OWS OTN running GMPLS and interfacing via OUNI

In this scenario Grid resources are managed by the Grid middleware, which is responsible for Grid resource brokering and reservation. Once the Grid resources are identified, a standard OUNI is used to request high-bandwidth connections to reach the resources (see Figure 3). The provisioning of the light path is then managed in the optical network layer by the optical control plane (GMPLS), which supports end to end dynamic bandwidth provisioning. In this scenario an Application Program Interface (API) has to be deployed to provide connectivity between the Grid middleware and the optical control plane (e.g. GMPLS) through an OUNI.

Optical Wavelength Switching with Extended GMPLS Control

A further improvement of the optical infrastructure is to extend the optical control plane (GMPLS) for Grid resource provisioning. In this approach the bandwidth reservation and provisioning mechanism within the optical control plane (e.g. GMPLS) must be extended to perform both optical resource and Grid resource brokering and reservation.
In this scenario an enhanced OUNI at the network edges (called the Grid Optical User Network Interface, G-OUNI) participates on behalf of applications in the resource discovery and allocation functions supported by the control plane of the core wavelength-routed network. Thus an API is necessary for users and applications to access the resource reservation and discovery mechanism in the optical control plane. The user in this scenario can trigger a unified optical and Grid resource provisioning mechanism via the G-OUNI under control of the Grid connectivity API. The Grid information system is integrated with the GMPLS control plane (GMPLS plane extension) and the Grid broker is extended with optical network resources. The Grid broker schedules resources and services taking into account the optical network resources.

In OUNI controlled OTNs, users and applications do not have a detailed view of the optical network resources, so it is impossible to handle resource information (optical network and Grid resources) in an efficient way. The use of the G-OUNI is a natural evolution from the OUNI alone, in which optical network resources and Grid resources are provisioned together. In this scenario (see Figure 4) users are granted control of light-path provisioning. This can be achieved through the development of:
- an extended OUNI (G-OUNI, Grid-OUNI)
- an extension to GMPLS to incorporate Grid resource information
- a dedicated API between the Grid broker and the GMPLS plane

The G-OUNI participates on behalf of the application in the resource discovery and bandwidth allocation functions supported by the control plane of the core wavelength-routed network. An API is necessary for users and applications to access a unified optical and Grid resource provisioning mechanism. Resource reservation and discovery are performed by the optical control plane, while bandwidth allocation and reservation are supported through the G-OUNI.
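The difference between the overlay model and the unified provisioning described above can be illustrated with a small brokering sketch. It is only a toy model under stated assumptions: the classes `GridResource`, `LightPath` and `Offer`, the additive cost model and both functions are hypothetical and are not part of GMPLS, the OUNI or any Grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridResource:
    site: str
    free_cpus: int
    cost: float              # hypothetical cost units


@dataclass(frozen=True)
class LightPath:
    dst_site: str
    free_wavelengths: int
    cost: float              # hypothetical cost units


@dataclass(frozen=True)
class Offer:
    site: str
    total_cost: float


def overlay_choice(cpus_needed, grid, paths):
    """Overlay model: choose the Grid resource first, ask for a light path afterwards."""
    candidates = [r for r in grid if r.free_cpus >= cpus_needed]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda r: r.cost)   # network state is ignored here
    reachable = any(p.dst_site == chosen.site and p.free_wavelengths > 0 for p in paths)
    return chosen.site if reachable else None        # the path request may fail


def unified_offers(cpus_needed, grid, paths):
    """Unified model: return feasible (resource, light path) pairs, cheapest first."""
    best_path = {}
    for p in paths:
        if p.free_wavelengths > 0:
            current = best_path.get(p.dst_site)
            if current is None or p.cost < current.cost:
                best_path[p.dst_site] = p
    offers = [
        Offer(r.site, r.cost + best_path[r.site].cost)
        for r in grid
        if r.free_cpus >= cpus_needed and r.site in best_path
    ]
    return sorted(offers, key=lambda o: o.total_cost)


grid = [GridResource("A", 64, 1.0), GridResource("B", 64, 2.0)]
paths = [LightPath("A", 0, 1.0), LightPath("B", 2, 1.0)]
print(overlay_choice(32, grid, paths))   # None: site A is cheapest but has no free wavelength
print(unified_offers(32, grid, paths))   # [Offer(site='B', total_cost=3.0)]
```

In this example the overlay broker selects site A because it ignores the network, and the subsequent light-path request fails. The unified broker sees both kinds of resources and returns site B together with its combined cost, which corresponds to the set of solutions and costs the extended control plane reports to the user.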
Figure 4: OWS OTN running Extended GMPLS and interfacing via G-OUNI

For each request submitted to this network by the application or user, the extended control plane discovers optical and Grid resources and informs the user of the set of available solutions and their associated cost. The Grid user can then ask for particular resources to be provisioned by the optical network through the API and the G-OUNI.

The two aforementioned scenarios rely on a dynamic optical layer deploying OXCs that can switch wavelengths all-optically between trunk and tributary fibers. One advantage of these solutions is that much of the required hardware (lambda XC) and control (GMPLS) technology already exists today or is a low risk extension of current technologies. The drawback is that these scenarios offer bandwidth granularity and switching at the wavelength level and are thus still only suitable for data-intensive applications with long lasting bandwidth relationships. They are not suitable for interactive applications involving a large number of nodes or for applications with short-lived Grid relationships.

Active Optical Burst Switching for Grid Services Support

In order to address the scalability issues associated with lambda switched networks, optical burst switching (OBS) needs to be implemented in the Grid environment. Many experts in the networking research community believe that OBS can meet the needs of the scientific community in the near term, where these requirements will appear first (in 2-3 years). For clarification, the 2-3 year timescale applies to pre-standardization deployment by early adopters such as universities, government institutions and research networks (usually the same organizations that push the technology envelope to meet their unmet application requirements). The Grid community seems to fit this definition.
In practice, large carrier deployment for the public arena will come later, since network management and standards need to be in place prior to widespread deployment.

OBS combines the advantages of circuit and packet switching technologies [xii]. The fundamental premise of OBS is the separation of the control and data planes, and the segregation of functionality within the appropriate domain (electrical or optical). Prior to data burst transmission, a Burst Control Packet (BCP) is created and sent towards the destination by an OBS ingress node (edge router). The BCP is typically sent out of band over a separate signalling wavelength and processed at intermediate OBS routers. It informs each node of the impending data burst and initiates the setup of an optical path for its corresponding data burst. Data bursts remain in the optical plane end-to-end and are typically not buffered as they transit the network core. The bursts’ content, protocol, bit rate, modulation format and encoding are completely transparent to the intermediate routers.

The main advantages of OBS in comparison with the other optical networking schemes are:
- Unlike in optical wavelength switched networks, the optical bandwidth is reserved only for the duration of the burst.
- Unlike an optical packet switched network, it can be buffer-less.

The OBS technology has the potential to bring several advantages for Grid networking:
- Native mapping between bursts and Grid jobs: the bandwidth granularity offered by OBS networks allows efficient transmission of users’ jobs with different traffic profiles.
- Separation of control and data planes: this allows all-optical data transmission with ultra-fast user/application-initiated light-path setup.
- Electronic processing of the burst control packet at each node: this feature can enable the network infrastructure to offer Grid protocol layer functionalities (e.g. intelligent resource discovery and security).
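The reservation behaviour described above, where a control packet travels ahead of the burst and each node holds a wavelength only for the duration of the burst, can be sketched as follows. This is a simplified model, not the protocol of any specific OBS implementation: it assumes full wavelength conversion at every node, ignores propagation delay, and uses a hypothetical fixed per-hop BCP processing delay. The names `ObsNode` and `send_burst` are illustrative.

```python
class ObsNode:
    """Core OBS router with a fixed number of data wavelengths (hypothetical model)."""

    def __init__(self, wavelengths):
        # One list of reserved (start, end) intervals per wavelength, in microseconds.
        self.booked = [[] for _ in range(wavelengths)]

    def reserve(self, start, end):
        """Book the first wavelength that is free during [start, end); None if all are busy."""
        for index, intervals in enumerate(self.booked):
            if all(end <= s or start >= e for s, e in intervals):
                intervals.append((start, end))
                return index
        return None


def send_burst(path, t_bcp, offset, burst_len, proc_delay=10):
    """Send a BCP at t_bcp and the data burst `offset` microseconds later.

    Returns the wavelength used at each hop, or None if the burst is dropped.
    """
    if offset < proc_delay * len(path):
        raise ValueError("offset too small: the burst would overtake its control packet")
    start, end = t_bcp + offset, t_bcp + offset + burst_len
    hops = []
    for node in path:
        wavelength = node.reserve(start, end)   # reserved for the burst duration only
        if wavelength is None:
            # Buffer-less core: contention means the burst is lost.
            # Reservations already made upstream stay booked and are wasted.
            return None
        hops.append(wavelength)
    return hops


path = [ObsNode(wavelengths=1), ObsNode(wavelengths=1)]
first = send_burst(path, t_bcp=0, offset=50, burst_len=100)     # occupies [50, 150)
second = send_burst(path, t_bcp=20, offset=50, burst_len=100)   # [70, 170) collides
third = send_burst(path, t_bcp=100, offset=50, burst_len=100)   # [150, 250) fits
print(first, second, third)   # [0, 0] None [0, 0]
```

The second burst is dropped because the single wavelength is busy and the core cannot buffer it, while the third burst reuses the same wavelength immediately after the first one ends. That reuse is the bandwidth efficiency gain over a long-lived wavelength path.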
Both OWS with OUNI and OWS with G-OUNI rely on wavelength granularity to support user demands. As mentioned earlier, there are emerging Grid applications that could benefit from dynamic optical networking at sub-wavelength granularity. This solution utilizes optical burst switching (OBS) and active router technologies (see Figure 5). It helps to provide a physical infrastructure that is able to fulfill both existing data-intensive and emerging Grid application requirements and makes efficient use of network resources. In this network scenario the optical network topology can be programmed by Grid users and services. The architecture is based on the novel concept of using active OBS routers for resource discovery and for routing Grid jobs to the appropriate resources across the network. Further extensions to the APIs need to be developed to fit the active OBS concept. Furthermore, the unique features of an active OBS network (i.e. programmability) can be used to extend the OBS transport protocol and offer light-path and bandwidth resources with more attributes (e.g. constraint based routing, dynamic resilience) to Grid applications and services.

Figure 5: OBS Network running modified JET and interfacing via an OBS dedicated G-OUNI

Beyond the existing OBS concept, new solutions in the area of photonic Grid networking can bring extensive enhancements. These solutions utilize optical burst switching and new ways to control the OBS network via active router technologies. They allow a physical infrastructure to fulfill both existing data-intensive and future Grid application requirements and to make efficient use of network resources.

References

[i] Information about the Large Hadron Collider at CERN: lhc-new-homepage.web.cern.ch. Information about the BaBar experiment: www.slac.stanford.edu/BFROOT/
[ii] Harvey B. Newman, Mark H. Ellisman, and John A. Orcutt, “Data Intensive E-Science Frontier Research,” Communications of the ACM, Special Issue “Blueprint for the Future of High Performance Networking,” Vol. 46, No. 11, Nov. 2003, pp. 68-77
[iii] D. Simeonidou et al., “Optical Network Infrastructure for Grid,” Grid Forum Draft, GFD-I.036, Oct. 2004
[iv] Ian Foster and Carl Kesselman, Eds., “The Grid 2: Blueprint for a New Computing Infrastructure,” 2nd Edition, Morgan Kaufmann Publishers, Elsevier Press, 2004
[v] E. Van Breusegem et al., “An OBS Architecture for Pervasive Grid Computing,” IEEE Global Telecommunications Conference (Globecom) 2004, Dallas, Texas
[vi] W. Feng and P. Tinnakornsrisuphap, “The Failure of TCP in High-Performance Computational Grids,” Proceedings of SC2000: High-Performance Network and Computing Conference, Dallas, TX, November 2000
[vii] S. Floyd, “HighSpeed TCP for Large Congestion Windows,” IETF RFC 3649, December 2003
[viii] S. Ravot, Y. Xia, D. Nae, X. Su, H. Newman, J. Bunn, “A Practical Approach to TCP High Speed WAN Data Transfers,” in Proceedings of the Broadnet 2004 conference, Oct. 2004, San Jose, CA, USA
[ix] “Research & Technological Development for a Data TransAtlantic Grid,” see www.datatag.org
[x] Tuecke, S., Czajkowski, K., Foster, I., Frey, J., Graham, S., Kesselman, C. and Nick, J., “Open Grid Service Infrastructure,” see http://www.gridforum.org/ogsi-wg/
[xi] Foster, I., Kesselman, C., Nick, J., Tuecke, S., “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration,” Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002
[xii] C. Qiao, M. Yoo, “Optical Burst Switching - A New Paradigm for an Optical Internet,” Journal of High Speed Networks, Special Issue on Optical Networking, vol. 8, no. 1, Jan. 2000, pp. 36-44