10 Gigabit Ethernet, LAN, MAN, and WAN
Dr. Paul Chen, paulpchen2k@yahoo.com
NTU video students can call in for questions or live discussion at 214-768-3068.

Points of Contact
Electronic distribution and collection of homework, examinations, term project reports, and grades: Mr. Gary McCleskey (TA), 329B Caruth Hall, 214-768-3108, garym@engr.smu.edu.
Please turn off your cell phone and pager during the lecture!

Reference Books
- "Gigabit Ethernet", Jayant Kadambi, Ian Crayford, and Mohan Kalkunte, Prentice Hall
- "Ethernet-based Metro Area Networks", Daniel Minoli, Peter Johnson, and Emma Minoli, McGraw-Hill
- "Switched, Fast and Gigabit Ethernet", Robert Breyer and Sean Riley, Ziff-Davis Press

More Reference Books
- "Ethernet: The Definitive Guide", Charles Spurgeon and Chuck Toporek, O'Reilly & Associates
- "Network Troubleshooting", Othmar Kyas, Agilent Technologies, Jan. 2002
- "ATM Theory and Applications", David McDysan and Darren Spohn, McGraw-Hill
- "Introduction to DWDM Technology", Stamatios V. Kartalopoulos, IEEE Press

Course Grades
- Homework – 30%. To help you understand the subject material; it is your responsibility to do the homework assignments.
- Examinations – 40%. To verify how much you really understand.
- Research Paper – 30%. An additional opportunity to show and share what you learned from the course.

10 Gigabit Ethernet Calendar - Summer 2004
- May 29: First class, introduction
- June 12: No live class; a videotape will be sent
- June 22: Homework #1 due
- June 26: HW #1 discussion
- July 3: Holiday (no class)
- July 10: Mid-term exam
- July 17: HW #2 due and discussion
- July 24: Research paper due
- Aug 7: Last day – research paper presentations
Introduce yourself: background, major, work experience, etc.

Research Paper Subjects
- 10G Ethernet based Metropolitan Area Network
- 10G Ethernet vs. SONET vs. RPR
- 10G Ethernet based Wide Area Network
- A solution for a real-life problem in your work or environment

Research Paper Rule
No more than 10 pages, including the cover page, figures, and table of contents. Clearly state the purpose or goal of the research paper.

Reference Web Sites
- www.10gigabit-ethernet.com
- www.10GEA.org
- www.metroethernetforum.org
- www.rpralliance.org
The 10GEA site will also point to the Gigabit Ethernet Forum.

Course Introduction
This course provides technical details of several generations of Ethernet technology and their practical applications in the real world. With the increasing number of Internet users and the growth of e-commerce, Ethernet has evolved to higher speeds (10 M / 100 M / Gigabit / 10 Gigabit) and higher performance and throughput (from shared to switched). Applications of Ethernet in the LAN, MAN, and even WAN are real-life examples of how a technology can be widely deployed thanks to its simplicity and the compatibility between new and existing generations of Ethernet.

Course Introduction (continued)
Different technologies are mixed and matched to deliver end-to-end services. In other words, Ethernet, ATM, SONET, and DWDM are major components of an integrated higher-speed network that provides efficient data services around the world. We will cover subjects such as the background of Ethernet and how it evolved to become the dominant technology of choice for computer networking needs.
Use of LAN and MAN
Networks let companies provide resource sharing, high reliability, money saving, convenience, portability (wireless and wireline), and so on. Networks let people use e-mail, video conferencing, remote access, news groups, work groups, etc. With the introduction of voice over IP (packetized voice), issues such as Quality of Service (QoS), delay-sensitive vs. non-delay-sensitive services, packet loss, and service protection through failure recovery schemes need to be addressed. There is competition among leased lines, Ethernet-based optical interfaces, ATM, frame relay, etc.

LAN
A LAN is a privately owned network within a single building or campus of up to a few kilometers. Major LAN technologies include:
- Ethernet or CSMA/CD, which operates at 10 Mb/s
- Fast Ethernet, which operates at 100 Mb/s
- Gigabit Ethernet, which operates at 1000 Mb/s
- Token ring networks, which operate at 4 and 16 Mb/s
- Other variations, such as the token bus network, which is no longer in use

MAN
A MAN covers a group of corporate offices or a city and may be publicly owned. Well-known MANs include:
- DQDB (Distributed Queue Dual Bus) by IEEE
- SMDS (Switched Multi-megabit Data Service) by Bellcore (Telcordia); SMDS is based on the IEEE DQDB technology
Neither DQDB nor SMDS is in use today; DQDB is used by equipment vendors for backplane interfaces. Gigabit and 10G Ethernet are targeting this market.

WAN
A WAN covers a large geographic area (a country or a continent). A WAN usually operates in point-to-point, store-and-forward, or packet-switched mode. Examples of WAN services include X.25, ISDN, etc. X.25 service is popular in Europe and Japan but failed to catch on in the States. ISDN services did not live up to expectations. 10G Ethernet, in combination with SONET and RPR, is targeting this market. Wireless WAN is out of scope for this course.

Relationship of OSI and IEEE Reference Model
(Figure: the IEEE 802.3 CSMA/CD model mapped against the OSI 7-layer reference model. LLC and MAC correspond to the Data Link layer; Physical Signaling (PLS), the Attachment Unit Interface (AUI), and the Physical Medium Attachment (PMA) within the Medium Attachment Unit (MAU) correspond to the Physical layer. DTEs are shown both with an exposed AUI and with an embedded AUI.)

Functions of Each CSMA/CD Subsystem
The PLS and AUI subsystems support the signaling between the MAC and the MAU. The MAU (including the PMA) is responsible for the physical and electrical interface to and from the medium. PLS is implemented locally to the MAC in silicon.
The AUI defines an interface that allows a special cable and connector assembly to connect the PLS to the MAU. This allows the MAC/PLS to be located remotely from the MAU and the medium. LLC is implemented in software; the host bus interface, MAC, and PLS are implemented in a single chip.

Typical Ethernet Adapter Implementation
(Figure: a typical Ethernet adapter. The CPU I/O bus connects to buffer memory, an IEEE address EPROM, and the LAN controller (MAC+PLS); transceivers and isolation transformers feed a 15-pin D-type AUI connector, a Cheapernet BNC connector, and a UTP RJ-45 connector.)
The Ethernet controller implements the functions of MAC and PLS (e.g., the Manchester encoder/decoder for 10 M Ethernet). Note the location of the isolation transformer for UTP vs. the D-type and BNC connectors.
IEEE 802.6 MAN Reference Model
(Figure: the DQDB layer sits above the Physical layer and provides MAC service to LLC, connection-oriented data service, and isochronous service over dual buses A and B.)
DQDB offers a high-speed service (DS-1 and DS-3).

Origin and Evolution of Ethernet
ALOHA system → Xerox 3M Ethernet (R. Metcalfe and D. Boggs) → 10 M Ethernet → 100 M Fast Ethernet → 1000 M (Gigabit) Ethernet → 10 G Ethernet.

ALOHA System
It used ground-based broadcasting. The basic idea is applicable to any system in which uncoordinated users compete for the use of a single shared channel. There are two versions of the ALOHA system: pure and slotted. In slotted ALOHA, time is divided into discrete slots into which frames must fit. Pure ALOHA does not require global time synchronization, while slotted ALOHA does.

ALOHA System (continued)
Pure ALOHA is similar to Ethernet in user access and collision resolution. The maximum throughput for pure ALOHA is about 0.184; in other words, the best channel utilization is about 18%. Slotted ALOHA divides time into discrete intervals, each interval corresponding to one frame. One way to achieve synchronization is to have one station emit a pip at the start of each interval, like a clock. With slotted ALOHA, the maximum throughput is increased to 37%. (A quick numerical check of these figures appears a little further below.)

ALOHA System vs. CSMA/CD
CSMA/CD introduces two major improvements over ALOHA:
- CSMA/CD ensures that no station begins to transmit when it senses the channel busy
- Stations abort their transmission as soon as they detect a collision
The throughput of CSMA/CD is equal to or better than that of slotted ALOHA.

First Generation Ethernet
The first Ethernet is credited to Robert Metcalfe and David Boggs at Xerox PARC in 1973. A total of 5,000 computers were connected via 3 Mb/s controllers. This experience base was key to industry acceptance of 10 Mb/s Ethernet when it was developed. The initial Ethernet standard was developed by the Digital, Intel, and Xerox (DIX) consortium in 1979, and the Ethernet "Blue Book" was published in 1980.
The Ethernet standard was submitted to IEEE Project 802 under the 802.3 CSMA/CD (Ethernet) committee. Other IEEE LAN standards committees are 802.4 Token Bus and 802.5 Token Ring, in addition to 802.11 wireless LAN, which is outside the scope of this course.

Ethernet Rules the World
The 802.3 committee developed a series of specifications for 10 Mb/s Ethernet to support different kinds of media: thick and thin coaxial cable, unshielded twisted pair, and fiber optic cable. Token Ring fell behind CSMA/CD due to its relatively high license cost and late arrival to the market; today only a few sites use token ring (mostly the IBM camp). Token Bus targeted the manufacturing automation market but failed to materialize due to high cost and compatibility issues between old and new specifications.

Turning Point for 10 M Ethernet
The adoption of 10 M Ethernet over UTP, 10BASE-T, caused a massive surge of Ethernet installation due to UTP's low cost and easy cable installation. The rapid increase in bandwidth demand and low-cost silicon implementation of complex systems led to two trends in the early 1990s:
- Migration from a shared Ethernet to a switched Ethernet topology
- Development and deployment of 100 M Fast Ethernet, 100BASE-T
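The 18% and 37% ALOHA utilization figures quoted above can be checked numerically. The sketch below uses the standard throughput formulas S = G*exp(-2G) for pure ALOHA and S = G*exp(-G) for slotted ALOHA; these formulas are not spelled out on the slides, but they are consistent with the quoted maxima.

```python
import math

# Throughput S vs. offered load G for the two ALOHA variants
# (standard formulas; the slides quote only the resulting maxima).
def pure_aloha(G):     # vulnerable period = 2 frame times
    return G * math.exp(-2 * G)

def slotted_aloha(G):  # vulnerable period = 1 slot
    return G * math.exp(-G)

best_pure    = max(pure_aloha(g / 1000) for g in range(1, 5000))
best_slotted = max(slotted_aloha(g / 1000) for g in range(1, 5000))
print(f"pure ALOHA max throughput    ~ {best_pure:.3f}")    # ~0.184 (18%)
print(f"slotted ALOHA max throughput ~ {best_slotted:.3f}") # ~0.368 (37%)
```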
100 M Fast Ethernet
In 1982, proposals were made in the IEEE 802 committee for a 100 M interconnect standard. Most IEEE LAN committees were busy with existing standards work, so FDDI in ANSI took the initiative to work on a 100 M network for backbone applications. The standard for 100 M Ethernet was introduced in 1995. 10 M and 100 M Ethernet use the same frame format. With auto-negotiation to detect and select the proper speed, 100 M capable network adapters can be deployed in the vast 10 M Ethernet installed base.

100 M Fast Ethernet (continued)
Subjects to be discussed on Fast Ethernet include:
- Media type
- Full/half duplex (FDX/HDX) and flow control
- VLAN tagging
- 10/100 Mb/s (auto-negotiation) capable devices

1000 Mb/s Gigabit Ethernet
Standards work in the IEEE 802.3 committee started in late 1995 and was approved in June 1998. Gigabit Ethernet targets backbone networks and emerging bandwidth-intensive applications. Major differences between 100 M and 1000 M Ethernet, other than speed:
- Gigabit media independent interface (GMII)
- Adoption of Fibre Channel encoding
- Modified CSMA/CD operation and a preference for FDX
- Modification of auto-negotiation for fiber

Major Differences between Gigabit and Fast Ethernet
The GMII transmit and receive data paths were widened to 8 bits (from MII's 4-bit path) to allow lower clock and data-path transition frequencies. The Fibre Channel encoding scheme was adopted: Manchester encoding was used for 10 M, Non-Return-to-Zero (NRZ) was used for 100 M, and 8B/10B encoding is used for Gigabit Ethernet.
To meet the round-trip delay constraint, the slot time was changed from 512 bits (10 M and 100 M) to 512 bytes. The Fibre Channel signaling scheme was adopted to allow the exchange of FDX/HDX information prior to data transfer.

10 G Ethernet - Next Generation
10 Gigabit Ethernet targets three areas:
- Service provider data centers and enterprise LANs where high bandwidth is demanded
- Metropolitan Area Networks (MAN) and Storage Area Networks (SAN)
- Wide Area Networks (WAN), inter-operating with SONET and DWDM backbone networks

Repeater Definition
A repeater is a device that allows extension of the physical network topology beyond the normal range imposed by a single cable segment, in terms of distance and node count. An Ethernet repeater can only interconnect Ethernet segments of identical speed; to connect dissimilar-speed networks, a bridge, switch, or router is required. Data received on one port is repeated to all ports except the active receiver, with signal amplitude and timing restored on the re-transmitted (repeated) waveforms.
If the repeater detects receive activity from two or more ports, this constitutes a collision, and the repeater will send a jam pattern on all ports, including the active receive port. Repeaters introduce delay, which must be factored into the round-trip delay. The variability of the delay path through the repeater for back-to-back packets causes "inter-packet gap shrinkage".

IPG Shrinkage Example
(Figure: two back-to-back packets enter a repeater set through MAUs with a 96-bit-time IPG between them; the delay through the repeater set is 15 bit times for packet 1 and 10 bit times for packet 2, so the packets leave with only a 91-bit-time IPG.)
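The arithmetic behind the example above, using the numbers from the figure, is simply:

```python
# IPG shrinkage through a repeater set, using the numbers from the example above.
ipg_in = 96          # inter-packet gap entering the repeater, in bit times
delay_packet1 = 15   # repeater delay seen by packet 1, in bit times
delay_packet2 = 10   # repeater delay seen by packet 2, in bit times

# Packet 1 is delayed more than packet 2, so the gap between them shrinks.
ipg_out = ipg_in - (delay_packet1 - delay_packet2)
print(f"IPG after the repeater: {ipg_out} bit times")  # 91 bit times
```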
Bridge Definition
A bridge operates at the MAC sub-layer, while a repeater operates at the PHY layer. A bridge may connect identical MAC technologies (Ethernet to Ethernet) or dissimilar ones (Ethernet to Token Ring). A bridge uses the source and destination address information to make intelligent forwarding decisions.

Bridge Definition (continued)
A bridge performs Filtering, Learning, and Forwarding functions. The Spanning Tree Algorithm permits bridges to dynamically discover a bridged topology and configure the network to ensure connectivity without looping.
A bridge will flood a frame to many ports if its destination is unknown or if it is a broadcast frame. This flooding may be replicated by another bridge, which may also replicate it. If these bridges have more than one interconnection (a loop), the replication may eventually consume all available bandwidth of a LAN. Bridged networks do not contain any hop count information; routers keep track of the hop count, and a router will delete any packet once its time-to-live counter expires.
Bridges use normal frames (with special content) to exchange spanning tree configuration information. These frames are called Bridge Protocol Data Units (BPDU). Bridges connect 802.3, 802.4, and 802.5 LANs. Many reasons force a single organization to have multiple LANs.

Bridges Definition (continued)
Many university and corporate departments have their own LANs to connect their own PCs, workstations, and servers. The goals of each department differ, so each department may choose a different LAN; yet there is a need to interact with one another for whatever reason, so bridges are needed.
A corporate organization may be geographically spread over several buildings separated by considerable distances. It may be more economical to have separate LANs in each building and connect them with bridges and links rather than to run a single cable over the entire site or campus. It may also be necessary to split a logically single LAN into multiple LANs to accommodate the load; universities are typical examples.

Bridges (continued)
(Figure: multiple LANs (LAN1-LAN4) with workstations, connected via bridges to a backbone LAN, to handle a total load higher than the capacity of a single LAN.)
When the physical distance between the two most distant PCs is too great (> 2.5 km for 802.3), it violates the round-trip delay requirement; using bridges to connect separate LANs covers the physical distance needed. Placing bridges at critical locations (fire doors in a building) can prevent a single malfunctioning PC or node from bringing down the entire system (reliability). Bridges can also contribute to an organization's security.
Most LAN NICs have a promiscuous mode, in which all frames are given to the host computer, not just those addressed to it. By putting bridges at various locations and being careful not to forward sensitive traffic, we can isolate part of the network so that its traffic cannot escape and fall into the wrong hands.
(Figure: Host A on a CSMA/CD LAN communicating with Host B on a Token Ring LAN through a bridge.)
Bridges convert the frame format from that of 802.3 to 802.5.
Transparent Bridges
A transparent bridge operates in promiscuous mode, accepting every frame transmitted on all the LANs to which it is attached. When a frame arrives, the bridge must decide whether to discard or forward it, and on which LAN to forward it. The decision is made by looking up the destination address in a big hash table inside the bridge. The table lists each possible destination and tells which output LAN it belongs to.

Transparent Bridges (continued)
(Figure: four LANs (LAN 1-LAN 4) connected by two bridges, B1 and B2.)
The algorithm used by transparent bridges is backward learning. By inspecting the source address, a bridge can tell which PC is reachable on which LAN. The network topology can change as PCs and bridges are powered up or down and moved around. To handle dynamic topologies, whenever a hash table entry is made, the arrival time of the frame is noted in the entry; whenever a frame whose source is already in the table arrives, its entry is updated with the current time.
Periodically, the bridge scans its hash table and purges all entries more than a few minutes old (non-active). If a PC is quiet for a few minutes, any traffic sent to it will be flooded until it next sends a frame of its own. The procedure used by the bridge is as follows (a minimal sketch in code appears below):
- If the destination and source LANs are the same, discard the frame.
- If the destination and source LANs are different, forward the frame.
- If the destination LAN is unknown, use flooding.

Two Parallel Transparent Bridges
(Figure: bridges B1 and B2 both connect LAN 1 and LAN 2. An initial frame F with an unknown destination address on LAN 1 is copied by both bridges, producing copy F1 (from B1) and copy F2 (from B2) on LAN 2.)

Looping Caused by Two Parallel Transparent Bridges
Later, bridge 1 sees F2 (with an unknown destination) and generates F3 (not shown in the figure), while bridge 2 sees F1 (with an unknown destination) and generates F4 (not shown in the figure). Now bridge 1 forwards F4 and bridge 2 forwards F3 to LAN 1. This cycle goes on forever, and looping occurs.

Spanning Tree Bridges
The solution is for bridges to communicate with each other and overlay the actual topology with a spanning tree that reaches every LAN. Some potential connections between LANs are ignored in the interest of constructing a fictitious loop-free topology. To build the spanning tree, the bridges first have to choose one bridge as the root of the tree. The process begins by having each bridge broadcast its serial number, which is unique worldwide.

Spanning Tree Bridges (continued)
The bridge with the lowest serial number becomes the root. A tree of shortest paths from the root to every bridge and LAN is constructed; this is the spanning tree. If a bridge or LAN fails, a new spanning tree is computed. The result is that a unique path is established from every LAN to the root, and thus to every other LAN. Even though the tree spans all the LANs, not all the bridges are present in the tree (this prevents loops). The algorithm continues to run in order to auto-detect topology changes and update the tree.

Spanning Tree Example
(Figure: a subnet with each node or bridge identified by a letter (A through O), shown next to the spanning tree computed with node I as the root.)
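Before turning to the details of the Spanning Tree Protocol, here is a minimal sketch, in Python, of the backward-learning and forwarding procedure described above. The port names and aging limit are illustrative; a real bridge would also run the spanning tree protocol, which is omitted here.

```python
import time

class TransparentBridge:
    """Minimal sketch of backward learning and forwarding (no spanning tree)."""
    AGE_LIMIT = 300.0  # purge entries older than a few minutes (illustrative)

    def __init__(self, ports):
        self.ports = ports          # e.g. ["LAN1", "LAN2"]
        self.table = {}             # MAC address -> (port, last_seen)

    def receive(self, frame, in_port):
        # Backward learning: the source address tells us which LAN the sender is on.
        self.table[frame["src"]] = (in_port, time.time())
        entry = self.table.get(frame["dst"])
        if entry and entry[0] == in_port:
            return []                                   # same LAN: discard (filter)
        if entry:
            return [entry[0]]                           # known destination: forward
        return [p for p in self.ports if p != in_port]  # unknown/broadcast: flood

    def age_out(self):
        now = time.time()
        self.table = {mac: (port, seen) for mac, (port, seen) in self.table.items()
                      if now - seen < self.AGE_LIMIT}

bridge = TransparentBridge(["LAN1", "LAN2"])
print(bridge.receive({"src": "A", "dst": "B"}, "LAN1"))  # unknown B -> flood to LAN2
print(bridge.receive({"src": "B", "dst": "A"}, "LAN2"))  # A learned -> forward to LAN1
print(bridge.receive({"src": "C", "dst": "A"}, "LAN1"))  # A is on LAN1 -> discard
```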
Spanning Tree
The Spanning Tree Algorithm and Protocol configure a simply connected active topology from the arbitrarily connected components of a bridged LAN. Frames are forwarded through some of the bridge ports in the bridged LAN and not through others, which are held in a Blocking state. Bridges effectively connect just the LANs to which ports in a Forwarding state are attached.
The bridge with the highest-priority Bridge Identifier is the Root. Every bridge port in a bridged LAN has a Root Path Cost associated with it: the sum of the path costs of the bridge ports that receive frames forwarded from the Root along the least-cost path to the bridge. The Designated Port for each LAN is the bridge port for which the Root Path Cost (RPC) is the lowest; if two or more ports have the same RPC value, the Bridge ID and then the Port ID are used as tiebreakers.
Each port on a bridge is associated with a Port ID and a path cost.
(Figure: a bridge with Port 1 attached to LAN A and Port 2 attached to LAN B.)

Spanning Tree (continued)
Bridges send a type of Bridge Protocol Data Unit known as a Configuration BPDU to each other in order to communicate and compute the above information (Root, Root Path Cost, etc.).
Bridge ID format: a 2-byte Bridge Priority field (which includes the VLAN field) followed by the 6-byte MAC address.
For each bridge and bridge port, three processes are required:
- Elect one Root Bridge
- Elect one Root Port on each non-root bridge
- Elect one Designated Port, based on lowest cost
Path cost calculation: Path Cost = 1000 Mb/s / bandwidth of the path in Mb/s (e.g., for Fast Ethernet, Cost = 10). For bandwidths of 1 Gb/s and above, fixed values are used: 1 Gb/s has a cost of 4, and 10 Gb/s a cost of 2.

Rapid Spanning Tree Protocol (RSTP)
The Rapid Spanning Tree Protocol is specified in IEEE 802.1w. It is an improved version (version 2) of the original Spanning Tree Protocol (STP). The reconfiguration time (due to failed nodes in the tree) is about 50 seconds for STP vs. 10 seconds or less for RSTP. RSTP uses a new type of BPDU (type 2). Bridges using RSTP can inter-work with bridges using STP. RSTP can support multiple VLANs and fast re-routes and prevents loops, but security remains a key issue. RSTP alone cannot provide restoration of the network; it needs to work with other technologies such as RPR.

Router Definition
Routers operate at the Network layer (Layer 3). A router can connect different LAN technologies (Ethernet, Token Ring, FDDI, etc.) and different protocol types (IP, IPX, AppleTalk, etc.). Routers support complex protocols, normally executed in software by a CPU, to make the routing (forwarding) decision between ports and to maintain the current state of the routing tables, which determine the optimal path along which a packet is routed.

Router Definition (continued)
High-performance routers take advantage of special hardware to perform the routing of data packets while still relying on a CPU to process protocol and control packets and perform the routing table updates.

10 M Ethernet Physical Layer
Thick Ethernet (10BASE5) and Cheapernet or Thin-net (10BASE2) are coaxial cable based. 10BASE5 specifies a maximum cable length of 500 m, a maximum of 100 nodes, and a minimum separation distance between MAUs on the coax of 2.5 m.
The length and node count can be increased by the use of repeaters. Twisted-pair Ethernet (10BASE-T) uses standard voice-grade telephone cable (22-26 gauge) with a target cable distance of 100 m.

10 M Ethernet Physical Layer (continued)
Thin coax cable is more flexible due to its small diameter, but key electrical properties are degraded over the thin cable. 10BASE2 specifies a maximum cable length of 185 m, a maximum of 30 nodes, and a minimum separation distance between MAUs on the coax of 0.5 m. The 10BASE-T system uses a star topology with a repeater (or hub) at the center of the star; the repeater performs the signal-amplitude and timing regeneration.
The 10BASE-T system provides a low-cost and easy-to-install network solution. The point-to-point star topology eases the tasks of network management (fault isolation), cable administration, and reconfiguration due to moves, additions, deletions, or changes. 10BASE-T uses 100-ohm UTP cable and inexpensive RJ-45 telephone jack connectors; it can also operate on other unshielded or shielded cable grades (120 and 150 ohm).
The benefits of fiber optics include very high bandwidth and low attenuation. Its drawbacks: fiber optic cable and connectors are more expensive and require skilled (costly) installation personnel. The Fiber Optic Inter-Repeater Link (FOIRL) is specified for a repeater-to-repeater link at a distance of up to 1 km; it was later extended to repeater-to-DTE application. Separate TX and RX paths are used. 10BASE-FL was developed to supersede the original FOIRL; the maximum distance between MAUs is extended to 2 km, and the cheaper bayonet fiber optic plug and socket connectors are used to save cost.
10BASE-FB was designed to optimize the inter-repeater link; the 10BASE-FB MAU is embedded within a repeater. It targeted backbone applications and gained limited vendor support. 10BASE-FP uses a passive optical star approach: the star and the fiber optic cabling form the overall medium. The star has no active components and is not a repeater; it simply provides "optical mixing" of the received signals. This is used where no power is available or where electrical signals are hazardous. Only the physical layer interface changes for the various medium types, which can be mixed in a network.
Make sure that the network is NOT oversized. A collision after the slot time (512 bits, or 51.2 us) results in a "late collision". The late-collision statistic is used to indicate that the network has become oversized, i.e., the round-trip propagation delay is too large. A Link Test is provided to ensure network integrity, and Link Status allows simple diagnosis of the station or repeater port state.

Sample Problem 1
A 1-km-long, 10-Mbps LAN has a propagation speed of 200 m/usec. The data frame size is 256 bits, including 32 bits of header, checksum, and other unspecified overhead. The first time slot after a successful transmission is reserved for the receiver to capture the channel and send a 32-bit acknowledgement frame. What is the effective data rate, excluding overhead, assuming that there are no collisions?

Solution to Sample Problem 1
The round-trip propagation delay is 2 x 1000 m / (200 m/usec) = 10 usec. A complete transmission has four phases:
- transmitter seizes the cable: 10 usec
- data is transmitted: 256 / (10 x 10^6) = 25.6 usec
- receiver seizes the cable: 10 usec
- acknowledgement is sent: 32 / (10 x 10^6) = 3.2 usec
The total is 48.8 usec, during which (256 - 32) = 224 actual data bits are sent. The effective data rate is 224 bits / 48.8 usec = 4.6 Mbps.
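The same arithmetic as a small runnable check, with the values taken from the problem statement above:

```python
# Numeric check of Sample Problem 1.
prop_speed = 200        # m/us
cable_len  = 1000       # m
bit_rate   = 10         # Mb/s (bits per microsecond)
frame_bits, header_bits, ack_bits = 256, 32, 32

rtt    = 2 * cable_len / prop_speed              # 10 us to seize the cable
t_data = frame_bits / bit_rate                   # 25.6 us
t_ack  = ack_bits / bit_rate                     # 3.2 us
total  = rtt + t_data + rtt + t_ack              # 48.8 us per exchange

useful_bits = frame_bits - header_bits           # 224 data bits
print(f"effective data rate = {useful_bits / total:.2f} Mb/s")  # ~4.6 Mb/s
```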
Sample Problem 2
Consider building a CSMA/CD network running at 1 Gbps over a 1-km cable with no repeaters. The signal speed on the cable is 200,000 km/sec. What is the minimum frame size?

Solution to Sample Problem 2
For a 1-km cable, the one-way propagation time is 1 km / (200,000 km/sec) = 5 usec, so the round-trip delay is 10 usec. This is much longer than the 4.096 usec round-trip delay allowed for a standard Gigabit CSMA/CD LAN to work properly. Here the slot time must be 10 usec, so every frame must take at least 10 usec to transmit. At 1 Gbps, any frame shorter than 10,000 bits would finish in under 10 usec; thus the minimum frame size is 10,000 bits, or 1250 bytes.

Sample Problem 3
The minimum Ethernet frame size must be 64 bytes to ensure that the transmitter is still transmitting in the event of a collision at the far end of the cable. Fast Ethernet has the same 64-byte minimum frame size but can get the bits out 10 times faster than Ethernet. How is it possible to maintain the same minimum frame size?

Solution to Sample Problem 3
The maximum distance (cable length) supported by Fast Ethernet is 1/10 as long as in Ethernet, i.e., a shorter reach!

Sample Problem 4
A device accepts frames from the Ethernet to which it is attached. It removes the packets inside the frames, adds framing information around them, and transmits them over a leased telephone line (which only connects to the outside world) to an identical device at the other end. The far-end device removes the framing, inserts the packets into token ring frames, and transmits them to a local host over a token ring LAN. What do you call this device?

Solution to Sample Problem 4
Since the device connects to the Ethernet (or token ring) on one side and to the telephone leased line on the other side, there is no routing involved. The device is a half bridge (not even a full bridge). A full bridge can connect either two identical or two different LAN technologies.

Manchester Encoder
A Manchester encoder translates the physically separate clock and data signals into a single, self-synchronizing serial bit stream suitable for transmission on the cable by the transmitter. For the 10 M data rate, the bit cell time is 100 ns; the encoder output timing jitter must not exceed 0.5 ns.
(Figure: Manchester encoding of an input data stream (1 0 0 1 0 1 ...), showing the resulting high/low signal pattern.)
During the first half of the bit cell time, the serial signal transmitted is the logical complement of the bit value being encoded during that cell. During the second half of the bit cell time, the uncomplemented value of the bit being encoded is transmitted. Thus, there is always a signal transition in the center of each bit cell.
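A minimal sketch of the encoding rule just described (first half of the bit cell carries the complement of the bit, second half the bit itself):

```python
# Manchester encoding as described above: every bit cell is split into two halves,
# so there is always a transition in the middle of each cell.
def manchester_encode(bits):
    half_cells = []
    for bit in bits:
        half_cells.append(1 - bit)  # first half: logical complement of the bit
        half_cells.append(bit)      # second half: the uncomplemented bit value
    return half_cells

stream = [1, 0, 0, 1, 0, 1]
print(manchester_encode(stream))
# [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1] -> a mid-cell transition for every bit
```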
10/100 Mb/s Ethernet Layer Model
(Figure: the 10/100 Mb/s Ethernet layer model. Higher layers, LLC, and MAC sit at the top. The 10 Mb/s path goes through Physical Signaling (PLS), the AUI, the Physical Medium Attachment (MAU), and the MDI to the medium. The 100 Mb/s path goes through the Reconciliation Sublayer (RS), the MII, and a PHY composed of the Physical Coding Sublayer, Physical Medium Attachment, Physical Medium Dependent sublayer, and Auto-Negotiation, then through the MDI to the medium.)

100BASE-T (IEEE 802.3u) Standard Overview
The Reconciliation Sublayer (RS) maps the MAC behavior to the electrical signals of the MII. Specifically, it maps the new 4-bit data path and associated control signals of the MII to the original PLS service interface, which is bit-serial. The MII (18 pins) can be used as an interconnect at the chip, board, or physical device level; when used as an inter-chip connection, it is implemented as printed-circuit-board traces.
The MII management interface consists of the Management Data Clock (MDC) and Management Data Input/Output (MDIO). MDC is used to synchronize data transfer in and out of the PHY over the MDIO pin; MDIO is a bidirectional signal that allows serial data to be clocked in and out of the PHY device. The Reduced MII (RMII) cuts the number of pins from 18 to 9 to allow more ports to fit into a box or chassis; RMII uses a 2-bit data path and operates at 50 MHz vs. the 25 MHz used by MII.

Major Differences Between 10BASE-T and 100BASE-T
- MII vs. AUI: the 4-bit MII interface replaces the bit-serial AUI interface.
- Addition of the RS (Reconciliation Sublayer): the RS maps the 4-bit-wide data path and associated control signals of the MII to the original PLS service interface. In practice, the RS is implemented as an integral part of the MAC controller chip.
- Dual-speed MAC operation: to allow gradual upgrade and co-existence.
- Replacement of Manchester encoding by NRZ: to counter EMI and RFI, NRZ is more suitable for the 100 Mb/s data rate.

NRZ Encoding
Non-return-to-zero encoding is commonly used in slow-speed communications interfaces for both synchronous and asynchronous transmission. Using NRZ, a logic 1 bit is sent as a high value and a logic 0 bit is sent as a low value (the line driver chip used to connect the cable may subsequently invert these signals).
A problem arises when using NRZ to encode a synchronous link, which may have long runs of consecutive bits with the same value. The figure below illustrates the problem that would arise if NRZ encoding were used with a DPLL-recovered clock signal. In Ethernet, for example, there is no control over the number of 1's or 0's that may be sent consecutively; there could potentially be thousands of 1's or 0's in sequence. If the encoded data contains long runs of logic 1's or 0's, this does not result in any bit transitions.
The lack of transitions prevents the receiver's DPLL from reliably regenerating the clock, making it impossible to detect the boundaries of the received bits at the receiver. This is the reason why Manchester coding is used in Ethernet LANs.
(Figure: a long run of bits with the same value results in no transitions on the cable when NRZ encoding is used.)

NRZ Inverted Encoding
NRZI is a method of transmitting and recording data so that it keeps the sending and receiving clocks synchronized.
This is especially helpful in situations where bit stuffing is employed, i.e., the practice of adding bits to a data stream so that it conforms with communications protocols. These added bits can create a long string of similar bits, which register at the receiver as a single, unchanging voltage. Since clocks adjust on voltage changes, they will lag behind true time.

NRZI Encoding
NRZI ensures that after a 0 bit appears, the voltage immediately switches to the other level. These voltage changes allow the sending and receiving clocks to synchronize.

Major Differences Between 10BASE-T and 100BASE-T (continued)
- Class I and Class II repeater specifications
- Addition of auto-negotiation
- Full duplex (FDX) operation
- Different cable categories and numbers of pairs used: 100BASE-TX operates over two pairs of Cat 5 UTP (Cat 5 is better-quality cable than Cat 3); 100BASE-T4 operates over four pairs of Cat 3 UTP; 10BASE-T operates over two pairs of Cat 3 UTP
- Updated management

Gigabit Optical PHY Layer
1000BASE-SX supports a light source of 850 nm wavelength on MMF (core diameter of 50 or 62.5 um) at distances of 220 m to 550 m. 1000BASE-LX supports a light source of 1300 nm wavelength on SMF (core diameter of 2 to 10 um) at distances of > 5 km.
Gigabit Ethernet supports two types of MMF: 50 um (core diameter) and 62.5 um fiber. 62.5 um fiber has lower modal bandwidth than 50 um fiber, especially for short-wave lasers; as a result, the distance traversed over 62.5 um fiber is less than over 50 um fiber. Two types of light source (transmitter) are used to transmit light over fiber: the LED and the laser diode.
Optical transmitter parameters: wavelength, spectral width, power, rise time / fall time, extinction ratio, jitter, and relative intensity noise (RIN).
Laser diodes are faster than LEDs: typical rise times are about 1 ns for a laser vs. a few ns to 250 ns for an LED. Gigabit Ethernet requires a laser-diode type of transmitter. Signal loss in optical fiber is minimal at wavelengths of 850, 1300, and 1550 nm. For MMF using short-wave lasers, the wavelength ranges from 770 to 860 nm; for MMF and SMF using long-wave lasers, the range is 1270 to 1355 nm.
Transmitters never emit light at a single wavelength; a range of wavelengths is produced, which is called the "spectral width". Wavelength = velocity of light / frequency. The spectral width for laser-diode vs. LED transmitters is 1-5 nm vs. 20-100 nm.
Random timing errors (called jitter) build up along the length of an optical link and can cause incorrect interpretation of the signal at the receiver. The transmitter power minus the loss due to transmission on the fiber must be >= the minimum acceptable receive power. Rise time is the time needed for the output power of the transmitter to rise from 20% to 80% of its final value when the input is a step current. The transmitter takes a certain amount of time to reach maximum power (logic 1) and to fall to minimum power (logic 0); these are called the rise and fall times. Extinction ratio := average optical energy in the logic 1 value / average optical energy in the logic 0 value.
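The power-budget rule quoted above (transmit power minus link losses must meet the receiver's minimum acceptable power) can be sketched as a simple check. All of the numbers below are illustrative assumptions, not values from the slides or from any standard.

```python
# Illustrative optical link-budget check; every figure here is an assumption.
tx_power_dbm       = -4.0    # transmitter launch power (assumed)
rx_sensitivity_dbm = -17.0   # minimum acceptable receive power (assumed)
fiber_loss_db_per_km = 3.5   # attenuation on short-wavelength MMF (assumed)
connector_loss_db  = 2 * 0.5 # two connectors (assumed)
length_km = 0.3

received = tx_power_dbm - (fiber_loss_db_per_km * length_km + connector_loss_db)
margin = received - rx_sensitivity_dbm
print(f"received power {received:.1f} dBm, margin {margin:.1f} dB")
print("link budget OK" if margin > 0 else "link budget violated")
```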
Gigabit Optical PHY Layer: Relative Intensity Noise (RIN)
A laser is a highly tuned quantum-effect oscillator. When a laser source is used to transmit light through fiber, a certain amount of optical power is reflected back into the laser device by connectors and optical lens interfaces. The reflected optical power disturbs the purity of the laser oscillation, which appears as optical noise (called RIN). This is similar to thermal noise in resistors.

Class I and II Repeaters for 100BASE-T
The objective is to balance signal delay against the significant differences between the coding schemes used for the different media types. A Class I repeater allows more generous delays to accommodate conversions between two coding schemes and allows all media types to be connected to the repeater. Class II was defined with a more stringent timing specification, requiring it to be optimized for one coding scheme, meaning that it cannot support all media types.

MAC Functions
(Figure: MAC functions. Between the MAC client sublayer and physical layer signaling, the transmit and receive paths perform data encapsulation/decapsulation, media access management, and data encoding/decoding.)
Data encapsulation (transmit and receive):
- Framing (boundary delimitation, frame synchronization)
- Addressing (processing source and destination addresses)
- Error detection of physical medium transmission errors
Media access management:
- Medium allocation (collision avoidance)
- Contention resolution (collision resolution)

MAC Byte to MII Nibble Mapping
(Figure: the MAC's serial bit stream D0-D7, with D0 transmitted first, is mapped onto two MII nibbles: D0-D3 form the first nibble and D4-D7 the second.)

100BASE-T4 Cat-3 UTP
100BASE-T4 operates over four pairs of Cat 3 (or better) UTP cable; the DTE-to-repeater distance is limited to 100 m. 100BASE-T4 uses a block coding scheme (8B/6T). Three of the four cable pairs are used for data transmission by either the DTE or the repeater; the remaining pair is used to detect simultaneous activity from the device at the other end of the link, indicating a collision.

100BASE-T4 Cat-3 UTP (continued)
The PHY sublayer takes two nibbles (4 bits each) from the MII to form a byte and converts it into a group of six ternary symbols. Each code group (data byte) is sent over one of the three pairs, with successive data bytes encoded and transmitted on the pairs in a round-robin fashion.
The 8B/6T encoding scheme maps the 256 8B (binary) data bytes onto a subset of the available 6T (ternary) codes (3^6 = 729 available). Each of the six positions in the ternary code can take one of three values: +1, 0, -1. This provides good clock transition density, which leads to simple receive clock recovery, and minimizes high-energy transitions (from +1 to -1 and vice versa), which reduces EMI/RFI.
The data rate on each of the three pairs is 33 Mb/s, and the increased efficiency of 8B/6T coding allows the line frequency to operate at 25 MHz. Since 100BASE-T4 uses a new PHY-layer coding and signaling protocol, a new IC had to be developed to meet the requirement economically. This slowed its deployment relative to 100BASE-TX, which was able to leverage existing PHY ICs developed for FDDI over copper.
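The 33 Mb/s per-pair rate and the 25 MHz line frequency quoted above follow directly from splitting 100 Mb/s across three pairs and mapping 8 bits onto 6 ternary symbols. A quick sketch of the arithmetic:

```python
# Arithmetic behind the 100BASE-T4 figures quoted above.
total_rate_mbps = 100
pairs_carrying_data = 3

per_pair_mbps = total_rate_mbps / pairs_carrying_data   # ~33.3 Mb/s per pair
# 8B/6T maps 8 bits onto 6 ternary symbols, so the symbol (line) rate is lower
# than the bit rate by a factor of 6/8.
symbol_rate_mbaud = per_pair_mbps * 6 / 8                # 25 Mbaud -> 25 MHz on the line
print(f"per-pair data rate : {per_pair_mbps:.1f} Mb/s")
print(f"line symbol rate   : {symbol_rate_mbaud:.1f} Mbaud")
```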
100BASE-TX Cat-5 UTP
100BASE-TX uses the 4B/5B coding scheme and operates over two pairs of Cat 5 cable. This scheme is full-duplex capable (unlike 100BASE-T4, but identical to 10BASE-T), with one pair transmitting and the other pair receiving. The PHY takes a nibble from the MII and converts it into a 5-bit binary symbol. Because of EMI/RFI concerns at the resulting high frequency on the cable (125 MHz), additional steps are taken to reduce the spectral content of the transmission.
Scrambling helps smooth the spectral content of the resulting transmitted waveform. MLT-3 (multilevel transmit, ternary) encoding converts the binary 5B symbol to a ternary code and further reduces the spectral content. The data rate on a single pair is 100 Mb/s; the scrambling and MLT-3 coding steps reduce the frequency on the line to 31.25 MHz.

4B/5B and MLT-3 Encoding for the 100BASE-TX PHY
(Figure: the 4-bit MII data (TXD<3:0>, the current nibble) passes through 4B/5B encoding and parallel-to-serial conversion into a 125 MHz serial stream (used as NRZI for 100BASE-FX), then through the scrambler and the MLT-3 encoder for 100BASE-TX.)

100BASE-FX Fiber Optic
100BASE-FX operates over two individual (multimode) fiber optic cables. Its PHY uses the same 4B/5B block coding scheme as 100BASE-TX, allowing native full-duplex operation of the link. Due to the higher cost of fiber optics, 100BASE-FX is typically used in long-distance, high-bandwidth, or security-conscious applications. Full-duplex 100BASE-FX can support up to 2 km on multimode fiber, while single-mode fiber can extend the distance further.

100BASE-T2 Cat 3 UTP
100BASE-T2 was developed to operate over two pairs of Cat 3 cable. To provide full-duplex operation, 100BASE-T2 operates full duplex on each pair using "quinary symbol" coding, also referred to as the PAM 5x5 symbol code. The standard was developed, but vendor interest waned due to the complexity and cost of the PHY implementation; there is no installed base for 100BASE-T2.

MAC Sub-layer
The MAC sub-layer is the primary control entity for access to the network. On receipt of an MA_UNITDATA.request primitive from the LLC, the MAC formats a data frame from the information in the provided parameters, adding its own header and error-checking trailer. When the link goes quiet, it initiates and controls the transmission.

MAC Sub-layer (continued)
When it receives a frame, the MAC sublayer checks it for validity, strips the header and trailer, and, if no error was detected, generates an MA_UNITDATA.indication primitive that is passed to the LLC. The optional MAC Control sublayer allows flow control procedures and contains provisions for adding other control functions in the future; the PAUSE frame is an example of a MAC Control sublayer function.

Water Mark Flow Control
Buffers inside a switch are assigned two watermarks: high and low. The high and low watermarks are associated with large and small timer values, respectively, to be used when sending PAUSE flow-control frames; the timer values are programmable. When the frame buffers exceed the low-watermark threshold, the switch generates a PAUSE frame and sends it to the DTE, which stops sending new frames until the timer expires. If the congestion persists, the frames in the buffer will reach the high watermark, and the switch will send a PAUSE frame with the large timer value assigned to it. When the congestion eases, the switch can send a PAUSE frame with a timer value of zero.
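A minimal sketch of the watermark-based PAUSE decision just described. The thresholds and timer values below are illustrative assumptions, not values from the slides.

```python
# Watermark-based PAUSE flow control, as described above (illustrative values).
LOW_WATERMARK, HIGH_WATERMARK = 600, 900   # buffered frames (assumed)
SHORT_PAUSE, LONG_PAUSE = 0x00FF, 0xFFFF   # pause_time values (assumed)

def pause_decision(buffered_frames, congestion_cleared=False):
    """Return the pause_time to advertise to the DTE, or None for no PAUSE frame."""
    if congestion_cleared:
        return 0                      # PAUSE with timer value zero: resume at once
    if buffered_frames >= HIGH_WATERMARK:
        return LONG_PAUSE             # congestion persists: large timer value
    if buffered_frames >= LOW_WATERMARK:
        return SHORT_PAUSE            # low watermark crossed: small timer value
    return None

for level in (100, 700, 950):
    print(level, "->", pause_decision(level))
print("cleared ->", pause_decision(100, congestion_cleared=True))
```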
Credit Based Flow Control
A DTE can send a frame to the switch if it has positive credit. The switch advertises to the DTE the number of credits available for the DTE to send frames. For Ethernet, the credit needed to send one maximum-size frame is 1518 bytes. The advertising of credit to the DTE is performed through the exchange of flow-control frames.

Credit-Based Flow Control (continued)
The source can continue to transmit as long as its credit counter is greater than zero; the credit counter is initially set to zero. Each RTT, the controller sends a feedback message indicating the counter value for each source under its control. The controller dedicates a set of buffers to each connection and computes the credit as the number of remaining bits or packets in the buffer for each connection.
Credit flow control results in a bursty but regular transmission of data. This scheme operates in a region that keeps the buffer relatively full much of the time. When the RTT is small (so the dedicated buffer capacity is small), as with LAN traffic, credit flow control performs well. But the logic is complicated to implement, a large amount of storage is needed for longer propagation delays, and the per-connection messages use about 10% of the link bandwidth.

Rate Based Flow Control
The switch signals the DTE to send frames at a desired rate using control frames. The DTE can send frames faster or slower by adjusting the inter-frame gap (IPG) between frames. For 10M, 100M, and Gigabit Ethernet, the default IPG is 96 bit times (9.6, 0.96, and 0.096 usec, respectively). When there is no congestion, the switch signals the DTE to send frames at the maximum rate, i.e., the minimum IPG (96 bit times).

Rate Based Flow Control (continued)
When congestion occurs, the switch sends the DTE a flow-control frame with a new IPG value (> 96 bit times); IPGs for different rates can be predefined. Unlike PAUSE-frame flow control (XON, XOFF), rate-based flow control does not exhibit all-or-nothing behavior. There is, however, no guarantee that frames will not be lost due to buffer overflow.
This scheme is based on the transmit rate of the source, expressed in bits or packets per round trip time (RTT). Every RTT, the controller provides feedback on whether the source should increase or decrease its rate. The scheme performs better if the feedback uses the rate of buffer growth instead of an absolute threshold. The source must be able to control its transmit rate using some form of traffic shaping. The end result of this scheme is that the data transmission is more evenly spaced (based on simulation).

LLC / MAC Group Service Primitives
(Figure: (a) without the optional MAC Control sublayer, the LLC exchanges MA_UNITDATA.request and MA_UNITDATA.indication directly with the MAC; (b) with the optional MAC Control sublayer, the LLC also exchanges MA_CONTROL.request and MA_CONTROL.indication, and the MAC Control sublayer uses TransmitFrame(DA, SA, Len/type, Data) and ReceiveFrame(DA, SA, Len/type, Data) toward the MAC.)
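For reference, the default 96-bit-time IPG used by the rate-based scheme above corresponds to a different absolute time at each speed, and stretching it lowers the effective throughput. A small sketch (the 1518-byte frame size used in the throughput estimate is an assumption for illustration):

```python
# Inter-frame gap (IPG) arithmetic behind the rate-based flow control described above.
for name, rate_mbps in (("10M", 10), ("100M", 100), ("1G", 1000)):
    ipg_us = 96 / rate_mbps              # 96 bit times / (bits per microsecond)
    print(f"{name:>4}: default IPG = {ipg_us:.3f} us")

def effective_rate_mbps(rate_mbps, frame_bits=1518 * 8, ipg_bits=96):
    """Throughput after accounting for the gap (1518-byte frame assumed)."""
    return rate_mbps * frame_bits / (frame_bits + ipg_bits)

print(f"100M with  96-bit IPG: {effective_rate_mbps(100):.1f} Mb/s")
print(f"100M with 960-bit IPG: {effective_rate_mbps(100, ipg_bits=960):.1f} Mb/s")
```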
802.3 / Ethernet Frame Format
(Figure: frame layout. 7-byte Preamble, 1-byte Start of Frame Delimiter, Destination Address (2 or 6 bytes), Source Address (2 or 6 bytes), 2-byte Length/Type field, 0-1500 bytes of Data, 0-46 bytes of Pad, and a 4-byte Checksum.)
The 2-byte field is the length of the data field for 802.3 (value <= 1500) or the type of the data field for classical Ethernet (value >= 1536). DA: destination address; SA: source address. Each preamble byte is the pattern 10101010; the Start of Frame Delimiter is 10101011. A valid frame must be at least 64 bytes in length (from DA to checksum).

MAC Sub-layer (continued)
The standard allows 2- and 6-byte addresses, but CSMA/CD uses only 6-byte addresses. The high-order bit of the DA is 0 for an ordinary address and 1 for a group address; a DA of all 1's is the broadcast address. When the length of the data field is less than 46 bytes, the pad field is used to fill out the frame to the minimum size.
The minimum frame size of 64 bytes prevents a station from completing its transmission before the first bit has even reached the far end of the cable, where it could collide with another frame. The PAUSE control frame (with a timer value specified) is used by a full-duplex station to control the number of inbound packets for congestion control.

MAC Frame Address Format
(Figure: the 6-byte MAC address, transmitted left to right, high- to low-order bit. It consists of the I/G bit (0 = individual address, 1 = group address), the U/L bit (0 = universal, globally administered address; 1 = locally administered address), and the remaining 46-bit address.)

MAC Control Frame Format
(Figure: Preamble, SFD, DA, SA, a 2-octet Length/Type of 88-08 hex, a 2-octet MAC Control Opcode (00-01 hex for the PAUSE control frame), Opcode Parameters (the pause_time for a PAUSE frame, with the remaining bytes reserved and set to 0), and the FCS; DA through the parameters totals 60 octets. The DA for the PAUSE control frame is 01-80-C2-00-00-01 hex.)
Frames with the DA value 01-80-C2-00-00-01 are filtered out by the receiving MAC and are not forwarded by switch or bridge ports.

Relationship of Slot Time and Frame Size
(Figure: a time/distance diagram between DTE A and DTE B. A starts transmission at t1; B starts transmission at t2; B detects the collision and starts jamming at t3; A completes transmission of its frame at t4 and then detects what appears to be a new frame; at t5, after the end of the jam signal from B, A discards that frame due to an FCS error. The slot time is marked on A's timeline.)

Relationship of Slot Time and Frame Size (continued)
In the previous viewgraph, the frames sent by DTE A and DTE B are shorter than the slot time. As a result, neither A nor B can send its frame successfully, which would render the operation of the network infeasible. The minimum transmission time on the network must therefore be at least one slot time.

Reason for Increasing the Slot Time for Gigabit Ethernet
The minimum frame size for 10M and 100M Ethernet is 512 bits, and the minimum transmission time must be at least one slot time. Keeping the slot time at 512 bits would shrink the network size of Gigabit Ethernet to about 20 m, which would make Gigabit Ethernet impractical for real-world applications. As such, the slot time of Gigabit Ethernet is increased to 512 bytes (4096 bit times); note that this is not a simple 10x scaling of 512 bits, which would have given 5120 bits.
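As a rough check of the figures above: assuming a propagation speed of about 2 x 10^8 m/s and counting only propagation delay (real delay budgets also include repeater and PHY latencies, which is why the practical Gigabit figure quoted above is closer to 20 m), the relationship between slot time, bit rate, and collision-domain diameter can be sketched as:

```python
# Propagation-only estimate of collision-domain diameter from the slot time.
# Repeater and PHY latencies shrink the real figure further.
PROPAGATION = 2.0e8   # m/s, assumed (~2/3 of the speed of light)

def max_diameter_m(slot_bits, bit_rate_bps):
    slot_seconds = slot_bits / bit_rate_bps
    return slot_seconds * PROPAGATION / 2   # the round trip must fit in one slot time

print(f"10M,   512-bit slot: {max_diameter_m(512, 10e6):,.0f} m")   # ~5,120 m
print(f"100M,  512-bit slot: {max_diameter_m(512, 100e6):,.0f} m")  # ~512 m
print(f"1G,    512-bit slot: {max_diameter_m(512, 1e9):,.0f} m")    # ~51 m before other delays
print(f"1G,   4096-bit slot: {max_diameter_m(4096, 1e9):,.0f} m")   # ~410 m
```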
Carrier Extension for Shared Gigabit Networks
The minimum frame size for Gigabit Ethernet is still kept at 64 bytes in order to remain compatible with the slower Ethernets. Otherwise, in a bridged network, a bridge would have to pad short frames arriving from a slower network up to the larger Gigabit minimum; if a server has a Gigabit link, each acknowledgement would then be eight times longer than necessary. IEEE 802.3z therefore decided to adopt a technique called "carrier extension" to decouple the minimum frame length from the slot time for Gigabit half-duplex operation.

Carrier Extension for Shared Gigabit Networks (continued)
When a DTE transmits a frame longer than the slot time (4096 bits), the MAC returns the "transmit done" status to the upper layer as before. If the frame length is less than the slot time, the transmit status is withheld and the physical layer transmits a sequence of special "extended carrier" symbols until the end of the slot time. These symbols are transmitted after the FCS, which delimits the frame; the special symbols are not part of the frame and are handled in a different way at the receiver. If a collision occurs during the data or extended-carrier transmission, the DTE aborts the transmission and sends a jam signal (32 bits).

Frame Format with Carrier Extension
(Figure: the standard frame - Preamble, SFD, DA, SA, Length/Type, 0-1500 bytes of Data, 0-46 bytes of Pad, FCS; 64 bytes minimum, covered by the FCS - is followed by 0-448 bytes of extension, so the carrier event lasts at least 512 bytes; the extension is not covered by the FCS.)

LLC Functions
LLC (Logical Link Control) forms the upper half of the Data Link layer. It hides the differences between the various kinds of 802 networks by providing a single format and interface to the Network layer. The Network layer passes a packet to the LLC using the LLC access primitives; the LLC sublayer then adds an LLC header containing source and destination addresses and sequence and acknowledgement numbers. LLC provides three service options: unreliable datagram service, acknowledged datagram service, and reliable connection-oriented service.

802.3 Performance
Assumption: the network is under heavy and constant load, with k stations always ready to transmit. For 10 Mb/s, the time slot is set to 512 bit times, or 51.2 us; it is set to accommodate the longest path allowed by 802.3 (2.5 km and four repeaters).

802.3 Performance (continued)
If each station transmits during a contention slot with probability p, the probability A that some station acquires the channel in that slot is A = k*p*(1 - p)^(k-1). A is maximized when p = 1/k, with A -> 1/e as k -> infinity.
The probability that the contention interval has exactly j slots in it is A(1 - A)^(j-1), so the mean number of slots per contention is Sum over j from 0 to infinity of j*A(1 - A)^(j-1) = 1/A. Since each slot has a duration of 2*tau, the mean contention interval is w = 2*tau/A.
Assuming optimal p, the mean number of contention slots is never more than e, so w (the mean contention interval) is at most 2*tau*e, about 5.4*tau. If the mean frame takes P sec to transmit, then when many stations have frames to transmit, channel efficiency = P / (P + 2*tau/A). Since tau depends on the maximum cable distance between any two stations, the longer the cable, the longer the contention interval.
Substituting the frame length F, the network bandwidth B, the cable length L, and the speed of signal propagation c, for the optimal case of e contention slots per frame and with P = F/B: channel efficiency = 1 / (1 + 2BLe/cF). (Recall that the mean frame takes P sec to transmit.)
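The efficiency formula above can be evaluated directly. The sketch below uses the optimal p = 1/k, so A = (1 - 1/k)^(k-1), and a 51.2 us contention slot (2*tau at 10 Mb/s); the trend it produces matches the efficiency-vs-stations figure shown further below.

```python
# Channel efficiency of 802.3 under heavy load, using the formulas above,
# with optimal p = 1/k so that A = (1 - 1/k)**(k - 1).
SLOT = 51.2e-6          # one contention slot (2*tau) at 10 Mb/s: 512 bit times
BIT_RATE = 10e6

def efficiency(frame_bytes, k):
    A = (1 - 1 / k) ** (k - 1)           # P(some station acquires the slot)
    P = frame_bytes * 8 / BIT_RATE       # frame transmission time in seconds
    return P / (P + SLOT / A)            # P / (P + 2*tau/A)

for frame in (64, 512, 1024):
    values = [f"{efficiency(frame, k):.2f}" for k in (2, 16, 256)]
    print(f"{frame:5d}-byte frames, k = 2/16/256 stations -> {values}")
```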
802.3 Performance (continued)
Virtually all performance analyses of 802.3 assume that the traffic has a Poisson distribution. When research is done on real traffic data, the network traffic turns out to be self-similar rather than Poisson: the average number of packets in each minute of an hour has as much variance as the average number of packets in each second of a minute. We will not cover self-similar traffic here.

Efficiency of 802.3 at 10 Mb/s with 512-bit slot time
(Figure: channel efficiency vs. the number of stations trying to send (1 to 256), plotted for 64-, 128-, 256-, 512-, and 1024-byte frames. Efficiency rises with frame size and falls as the number of stations grows, from roughly 0.9 for 1024-byte frames down to about 0.3-0.4 for 64-byte frames.)

Layer 2 and Layer 3 Switch
The Layer 2 Ethernet switch appeared in 1993, when Fast Ethernet was being developed by 802.3. Fast Ethernet (802.3u), full-duplex Ethernet (802.3x), and VLAN tagging (802.3ac) were all initiated and executed as a result of the industry movement to migrate high-performance Ethernet from shared media to switching. A Layer 2 switch is functionally equivalent to a bridge: bridges perform the filtering, learning, and forwarding functions in software, while switches perform these functions in hardware to increase throughput.

Layer 2 and Layer 3 Switch (continued)
Multiple ports on a switch can be active simultaneously and can operate in full- or half-duplex mode with 10/100 Mb/s auto-sensing on a port-by-port basis. Full wire-speed forwarding and learning, and VLAN tagging, are implemented in hardware. Switches operate in either store-and-forward mode (the entire packet is received before forwarding is attempted) or cut-through mode (forwarding commences before the entire packet is received). A switch may incorporate features like forwarding based on protocol type and broadcast-domain (VLAN) filtering.
A port on an Ethernet switch that connects to a repeater can only operate in half-duplex mode, because the repeater only operates in that mode. When both 10 and 100 Mb/s ports are equipped in a switch, data packets must be transferred between the two speeds using the store-and-forward technique, because cut-through mode does not perform speed adaptation. Vendors offer 100 M and Gigabit Ethernet switches on the market today.
A Layer 2 switch operates on the MAC address (Layer 2), while a Layer 3 switch operates at the network layer (Layer 3). First-generation Layer 3 switches supported a limited number of network-layer protocols that were accelerated in hardware. IP is the primary protocol used for corporate and WWW traffic, so an IP-optimized, hardware-assisted router is used to handle the traffic aggregation.

Multi-layer Switch Routers
Multi-layer switch routers essentially combine the functions of a Layer 2 switch and a router. Since Layer 2 technology is used to forward packets between ports, combinations of ports can be treated as switched; similarly, routing can be enabled between ports to gain the advantages of Layer 3 network segmentation where appropriate. Switch routers are also far less expensive than software-based routers, because they are based on specialized hardware (ASICs), not on a complex software architecture.

Multi-layer Switch Routers (continued)
In theory, switch routers can replace both switches and routers.
However, due to a difference in price-per-port, switches will still be used in price-sensitive situations. But switch routers will dramatically replace routers, since there are very few downsides to switch routers over routers. Most switch routers support standard transparent bridging for Layer 2, and support the 802.1Q standard for private VLANs through the optical MAN. Switch routers generally have a high-speed backplane to forward traffic from one port to another at wire speed. Summer 2004 Dr. Paul Chen 152 Multi-layer Switch Routers The larger differences in switch routers are in the Layer 3 and 4 processing. Switch routers offer the ability to perform standards-based routing in order to forward IP packets. Two key differentiators are in the area of route processing: - Degree of decentralization - Performance with advanced features enabled Summer 2004 Dr. Paul Chen 153 Decentralization Degree of decentralization is important to service providers because it dictates whether or not there is a single point of failure in the network. Switch routers with highly decentralized route processing (including redundancy and hot-swap capability) have a big advantage here. Summer 2004 Dr. Paul Chen 154 Performance with Advanced Features Enabled A switch router must replace a router while offering tiered services that may be provisioned and billed. Features such as Bandwidth Provisioning, Server Load Balancing, and Access Control Lists (ACLs) to perform filtering and security functions are important in sophisticated networking environments—they should not hinder the performance of the switch router when turned on. ASICs in switch routers should perform at wire speed regardless of the features enabled. Summer 2004 Dr. Paul Chen 155 Application Awareness in Switch Routers Application awareness in switch routers means prioritization at Layers 2, 3, and 4, with the ability to perform priority-based queuing at the higher layers. Summer 2004 Dr. Paul Chen 156 Layer 2 Prioritization Most switch routers support the 802.1p standard for Layer 2 prioritization. This standard amounts to supporting additional header information in the Layer 2 packet (typically Ethernet). 802.1p specifies three bits (eight levels) of priority for Layer 2 packets. This isn't actually tied to an application, however. In fact, there are no set rules as to how the priorities are derived and assigned to Layer 2 frame headers. Summer 2004 Dr. Paul Chen 157 Layer 3 Prioritization Most switch routers have some form of Layer 3 prioritization, but this is typically in the form of a partial solution. The claim of support for Resource Reservation Protocol (RSVP) is an especially wooly area—many vendors who cannot offer bandwidth reservation from end to end over the wide area claim this support. More importantly, RSVP itself doesn't support traffic prioritization. In many situations, networking equipment will additionally need robust prioritization capabilities to provide the applicationacknowledged information transfer. Summer 2004 Dr. Paul Chen 158 Layer 3 Prioritization A second form of Layer 3 prioritization is referred to as IP flow mode: the source and destination are used in combination, which forms the basis of the prioritization. In some situations, this can provide a fairly decent match to application-based prioritization; in others, it cannot. Summer 2004 Dr. Paul Chen 159 Layer 3 Prioritization Prioritizing traffic by IP flows means that a given pair of IP addresses (source and destination) are given a certain priority. 
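A minimal sketch of this flow-mode lookup follows (the prefixes, priority values, and helper name are made up for illustration); the slides' own example then continues.

import ipaddress

# Priority by (source prefix, destination prefix) pair -- the "IP flow mode"
# described above. Prefixes and priority values are illustrative only.
FLOW_PRIORITIES = [
    (ipaddress.ip_network("10.1.0.0/16"), ipaddress.ip_network("0.0.0.0/0"), 5),  # customer A
    (ipaddress.ip_network("10.2.0.0/16"), ipaddress.ip_network("0.0.0.0/0"), 2),  # customer B
]
DEFAULT_PRIORITY = 0

def flow_priority(src, dst):
    src, dst = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for src_net, dst_net, prio in FLOW_PRIORITIES:
        if src in src_net and dst in dst_net:
            return prio
    return DEFAULT_PRIORITY

print(flow_priority("10.1.4.7", "192.0.2.10"))    # 5: falls in customer A's range
print(flow_priority("198.51.100.3", "10.9.9.9"))  # 0: no matching flow rule

Note that the lookup sees only addresses, never applications, which is the limitation the following slides address.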
For instance, the flow from a range of addresses at a specific customer can be assigned one priority, while the flow from a range of addresses associated with a different customer can be assigned a different (and higher) priority. This would not allow you to offer a customer different priority levels on traffic of differing applications. You couldn't, for example, assign a high rate to VoIP traffic and a low rate to HTTP traffic. Summer 2004 Dr. Paul Chen 160 Layer 4 Prioritization Layer 4 is the key to application-aware networking. Using IP as an example, Layer 4 is based on a transport port (often referred to as a "socket") that is generally assigned by application. There are many Layer 4 protocols used in IP, but two very common ones are TCP and UDP; TCP is connection-based, whereas UDP is connectionless. Summer 2004 Dr. Paul Chen 161 Layer 4 Prioritization Specific Layer 4 port definitions are outlined in RFC1700; for example, ports 20 and 21 for file transfer (FTP data), 25 is for e-mail (SMTP), 80 is for web browsing (HTTP), etc. Switch routers that can interrogate Layer 4 information can perform intelligent, applicationaware prioritization. They can prioritize in this way regardless of whether or not multiple applications are running on the same server. Summer 2004 Dr. Paul Chen 162 Layer 4 Prioritization Layer 4 classification can be combined with Layer 3 information and Type of Service (ToS) bits to provide granular classification of data flows to specific priority levels. This is also known as class-based queuing. These classifications are usually defined using Quality of Service Access Control Lists (QoS ACL). Summer 2004 Dr. Paul Chen 163 Layer 4 Prioritization For switch routers servicing customers accessing the Internet (edge), a large number of ACLs need be supported to allow for proper SLA (service level agreement) support—perhaps tens of thousands per switch router. For enterprise environments, a few thousand ACLs per switch router is sufficient. Summer 2004 Dr. Paul Chen 164 Layer 4 Prioritization A measure of how robust a switch router is, and its ability to perform in a carrier environment, is the number of ACL’s it can support. ACL tables are usually stored in SRAM or CAM. Summer 2004 Dr. Paul Chen 165 Queue Prioritization Two queue levels are usually enough for a wire speed switch router, but four is ideal to offer tiered service levels. Anything more than four is overkill for a wire speed device. Summer 2004 Dr. Paul Chen 166 Strict Prioritization They will always forward the highest priority packet. If a certain set of applications is assigned the highest priority, and there is traffic for those applications, then all other applications could be starved: they will never be able to transfer information. Summer 2004 Dr. Paul Chen 167 Weighted Fair Queuing Prioritization Policies are set such that given applications receive a percentage of the available bandwidth. In many situations, this actually more closely models the real world. The only downside is that this may cause output queues to become oversubscribed. Summer 2004 Dr. Paul Chen 168 Weighted Random Early Detection Once a buffer is beginning to get full, we randomly drop new packets so that specific flows are not penalized, and upper layer applications do not overreact to an excessive number of packets dropped. Summer 2004 Dr. Paul Chen 169 Intelligence in Layer 4 Switch Routers A single customer generates a stream of packets. This stream, called a flow, can be identified at Layer 2, Layer 3 or Layer 4. 
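Before looking at each layer in turn, here is a small sketch of the Layer 4 "QoS ACL" classification described on the preceding slides. The well-known ports (21 FTP, 25 SMTP, 80 HTTP) are the ones the slides quote from RFC 1700; the VoIP port, the class numbers, and the function name are assumptions made for this example.

# Illustrative QoS ACL: classify a flow on its Layer 4 protocol and destination port.
ACL = [
    (("udp", 5060), 7),  # assumed VoIP signalling port, highest class in this sketch
    (("tcp", 80),   2),  # HTTP web browsing
    (("tcp", 25),   1),  # SMTP e-mail
    (("tcp", 21),   1),  # FTP control
]

def classify(protocol, dst_port, default_class=0):
    """Return the priority class for a (protocol, destination port) flow."""
    for (proto, port), prio in ACL:
        if proto == protocol and port == dst_port:
            return prio
    return default_class

print(classify("tcp", 80))    # 2: HTTP gets a modest class
print(classify("udp", 5060))  # 7: VoIP-style traffic gets the top class

With Layer 4 fields visible, VoIP traffic can be placed in a higher class than HTTP from the same address pair, which the purely Layer 3 flow mode above could not do.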
Each layer provides more detailed information about the flow. The fundamental task in managing a network is controlling these flows of traffic through a Service Provider or MAN to the Internet. Summer 2004 Dr. Paul Chen 170 Intelligence in Layer 2 Switch Routers Each frame is identified by the MAC address of the source and destination devices. The ability to control the flow is thus limited to the broadcast domain. Products that switch traffic at Layer 2 deliver high performance but little functionality. MAC address is useful in an edge device, but once the packet has gone through a router en route to another router, the Layer 2 information loses importance. Summer 2004 Dr. Paul Chen 171 Intelligence in Layer 2 Switch Routers At Layer 3, flows are identified by source and destination network addresses. The ability to control the flow is limited to source/destination pairs. Some switch routers operate at this level of granularity. If a client is using several applications from the same server, Layer 3 information does not provide visibility into each application flow, so individual rules cannot be applied. Summer 2004 Dr. Paul Chen 172 Intelligence in Layer 4 Switch Routers Software-based routers used Layer 4 information to set security filters to control access for network traffic. But there is a penalty with software-based routers: when they read more of the packet information, performance can drop by as much as 70%, especially if security filters are enabled. Summer 2004 Dr. Paul Chen 173 Intelligence in Layer 4 Switch Routers Layer 4 coordinates communication between network source and destination systems. Each packet contains information that can be used to uniquely identify the application that generated the packet. TCP and UDP headers include "port numbers" that identify which application protocols are included in each packet. Summer 2004 Dr. Paul Chen 174 Intelligence in Layer 4 Switch Routers In combination, the port number information in the Layer 4 header and the source destination information in the Layer 3 header can be used to apply truly fine-grained control. Individual application conversation flows can be controlled between clients and servers, and if the switch router is fully functional, all this can be done at wire speed. Summer 2004 Dr. Paul Chen 175 Intelligence in Layer 4 Switch Routers In combination, the port number information in the Layer 4 header and the source destination information in the Layer 3 header can be used to apply truly fine-grained control. Individual application conversation flows can be controlled between clients and servers, and if the switch router is full function, all this can be done at wire speed even with the security feature (via ASCI) enabled. Summer 2004 Dr. Paul Chen 176 Components of a Layer 2 Switch Switching Element Control Process Switching Process Output Controller Input Controller Port 1 Summer 2004 Port 2 Dr. Paul Chen Port 3 177 Functions of Switch Components Input Controller functions include: - receive data frames - MAC layer processing - filter out invalid frames (frames that are shorter than 64 bytes or with CRC error) - switch between cut-through and store-and-forward modes - buffer incoming data while transmitting the received frame to Control Process - fragment the packet into cells if the cell switching is used by the switching element Summer 2004 Dr. 
Paul Chen 178 Functions of Switch Components (continued) Control Process functions include: - transmission process (verifies the received DA against the address table to determine the destination port. If not found, broadcast to all ports) - learning process (enter the new SA in the address table and perform aging process to remove outdated SA from the table) - forwarding process (once destination port is determined, perform the treatment for uni-cast, multicast and broadcast, forward the data to the switching element) Summer 2004 Dr. Paul Chen 179 Functions of Switch Components (continued) Output Controller functions include: - receive packet from the switching element - forward the packet to the destination port based on the header information - re-assemble the cells into packets if cell switching is used by the switching element - flow control monitors the output resources and send signal to the switching element if congestion is detected. The switching element will send a PAUSE frame to the source port to suspend the data transfer. Summer 2004 Dr. Paul Chen 180 Single Chip Layer 2 Switch System SDRAM SRAM Frame Buffer CPU 64-Bit N+1 Ethernet Switch N x 10/100 Fast Ethernet Summer 2004 Dr. Paul Chen Flash 1 GE 181 System Architecture of A Single Chip Layer 2 Switch Registers External SRAM Switch Control Memory (SRAM) Frame Buffer Memory Frame Memory Interface RISC based Switch Controller CPU Interface Search Engine Frame Engine GMAC LED Xinterf N x 10/100 MACs Summer 2004 Dr. Paul Chen GMII 182 Features of Single Chip L2 Switch Support N 10/100 Auto-sensing Fast Ethernet ports with RMII interface, a single Gigabit Ethernet port with GMII interface. Full wire speed, full duplex L2 switching Internal switch database maintains up to 2K MAC addresses Summer 2004 Dr. Paul Chen 183 Features of Single Chip L2 Switch With external buffer memory, it supports up to 16K MAC addresses Support flow control (802.3x) Support 256 port and ID tagged VLAN (802.1Q) - VLAN tag insertion and extraction Summer 2004 Dr. Paul Chen 184 Functional Description of Single Chip L2 Switch When frame data is received from a MAC port, it is temporarily stored in the MAC Rx FIFO until the Frame Engine moves it to the chip’s external memory one granule (128-byte-or-less fragment of frame data) at a time. The Frame Engine then issues the Search Engine a switching request that includes the source MAC address, the destination MAC address, and the VLAN tag. Summer 2004 Dr. Paul Chen 185 Functional Description of Single Chip L2 Switch After the Search Engine has resolved the address, it transfers the information back to the Frame Engine via a switching response that includes the destination port and frame type (e.g. uni-cast or multicast). Summer 2004 Dr. Paul Chen 186 Functional Description of Single Chip L2 Switch Switch Controller is designed to implement highly efficient management functions for the switching hardware, minimizing the management activity intervention during frame processing. There are two modes of operation: cut-through mode, store-and-forward mode. Summer 2004 Dr. Paul Chen 187 Forwarding Decision Time for Fast Ethernet For a workgroup switch (100/1000 switch) with eight 100M ports (downlink) and one 1000M port (uplink), each port can be individually configured as full or half duplex. Assume that all link are FDX, the total bandwidth supported is 8 x 2 x 100 + 1 x 2 x 1000 = 3.6 G bps (in HDX) Filtering and forwarding decision must be made in a short time for cut-through mode of operation. Summer 2004 Dr. 
Paul Chen 188 Forwarding Decision Time for Fast Ethernet Forwarding decision time is the ratio of Interarrival time for minimum frames (64 bytes each) at full load and summation of ports. Inter-arrival time between frames is the time interval between start of two frames in a back-toback transmit mode. The shortest Inter-arrival time is for minimum frame size (64 bytes). Summer 2004 Dr. Paul Chen 189 Forwarding Decision Time for Fast Ethernet One Minimum frame : 64 bytes Inter-frame Gap (IPG): 12 bytes (96 bits) Extra time (7-byte preamble + 1-byte SFD): 8 bytes Bit time: 0.01 usec Inter-arrival time between back-to-back 64-byte frames is (64 + 12 + 8) x 0.01 x 8 = 6.72 usec Summer 2004 Dr. Paul Chen 190 Forwarding Decision Time for Fast Ethernet One Gigabit link equals ten 100M ports. Equivalent number of 100 M ports is 8 + 10 = 18 ports Forwarding decision time = 6.72 usec / 18 0.37 usec or 370 nsec. There is plenty of time for a hardware based ASIC chip to accomplish. Summer 2004 Dr. Paul Chen 191 Layer 3 Switches Layer 3 switch can handle Layer 2 (Data Link) as well as Layer 3 (Network) capabilities. Layer 3 switch can switch data packets based on the destination IP address (4 octets) contained in the IP header field. Summer 2004 Dr. Paul Chen 192 Layer 3 Switches Destination IP address is matched against the network address table (routing table) to determine the output port that is associated with the next hop. Hardware based Layer 3 switch performed the address matching with CAM (Content Addressable Memory) to shorten the latency associated with CPU-based search. Summer 2004 Dr. Paul Chen 193 Layer 3 Switch (continued) Layer 3 switching provides three key benefits for applications in campus networks: - Scalability: It benefit from the integration of ATM with Layer 3 routing. Label swapping enables ATM switches to be fully integrated into IP-based core network without scalability problem of a pure Layer 2 network. Summer 2004 Dr. Paul Chen 194 Layer 3 Switch (continued) - Traffic Management: Layer 3 switching simplifies traffic management in router-based internet by integrating Layer 2 circuit capabilities. It is able to control the flow of packets across a Layer 2 infrastructure to support the load balancing. - Performance: Higher performance is achieved by simplifying the packet-forwarding and switching decision. Summer 2004 Dr. Paul Chen 195 Layer 3 Switch (continued) Through the use of dedicated hardware such as the network processor, which fully integrates MAC, framer, Classification, Traffic Management, Switch fabric control and host CPU interface into one IC, and CAM for speedy destination IP matching / filtering, Layer 3 switch can switch IP data packets and process the control packets for BGP and OSPF via CPU. This is the key in the design of terabit routers. Summer 2004 Dr. Paul Chen 196 Layer 3 Switch (continued) IP switching, Tag switching, and Aggregate Route-based IP switching lead to the MPLS (multi-protocol label switch) standardized by the IETF. This can be used in ATM as well as IP switching / routing. Summer 2004 Dr. Paul Chen 197 L3 Switch Implementation Ethernet Interface Network Processor Ethernet Interface Summer 2004 Dr. 
Paul Chen CPU CPU I/O Module 198 CPU with Memory CAM + Frame Buffer Network Processor Switch Module Silicon Switch Fabric CAM + Frame Buffer I/O Module Functions of Network Processor Packet assembly Packet recognition, L2 / L3 classification, filtering Packet queuing Traffic shaping and management Quality of Service, Class of Service processing Packet modification and segmentation Switch fabric interface CPU interface (optional) Summer 2004 Dr. Paul Chen 199 Virtual LAN (VLAN) VLAN allows network operators to configure and administer a corporate network as a single bridge-interconnected entity, while providing users the connectivity and privacy (or security) they expect from having multiple separate networks. 100BASE-TX/FX Ethernet Switch Workgroup 1 Marketing Summer 2004 Workgroup 2 Engineering Dr. Paul Chen Ethernet Switch Workgroup 3 Workgroup 1 Human Resource Sales 200 Workgroup 2 Payroll VLAN (continued) VLAN is a logical broadcast domain. Traffics sent to the broadcast address on a specific VLAN is only forwarded to the other port with membership of that VLAN such as Engineering Department. Most Ethernet switches (disregard of the speed) support multiple VLANs. Identification of the VLAN membership is provided by VLAN tagging. A VLAN tag is inside the MAC frame. Summer 2004 Dr. Paul Chen 201 VLAN (continued) VLAN association can be performed using different policies. A station can be identified as belonging to a particular VLAN by the port on a switch that it is connected to. This is “port” based VLAN. Another policy could use the station address that can be blocked from joining another VLAN or forwarding data to another member in that VLAN. Summer 2004 Dr. Paul Chen 202 VLAN-Tagged MAC Frame Format 46 ~ 1500 bytes Pre SFD DA SA Length/Type 2 bytes VLAN Tag Data Pad FCS 2 bytes 802.1Q Tag Type Tag Control Information Tag Value = 0x81-00 User Priority CFI Byte 1 (Most significant byte) VLAN ID Tag Control Information Byte 2 (Least significant byte) VLAN ID Bit 7 6 5 4 3 2 1 0 CFI: Canonical Format Indicator used by token ring This bit is not used by 802.3 device and should be sent and received as 0 Summer 2004 Dr. Paul Chen 203 VLAN-Tagged MAC Frame Format (continued) 4-byte VLAN Tag consists of Tag Type ID (2 bytes) and Tag Control Information (2 bytes). The Tag is inserted between SA and Length/Type fields of the MAC frame. CRC must be recalculated any time a VLAN Tag is inserted or removed. The legal MAC frame size is modified to allow from 64 (minimum frame size) to 1522 bytes. When the VLAN Tag is not present, the maximum MAC frame size is still 1518 bytes. Summer 2004 Dr. Paul Chen 204 VLAN-Tagged MAC Frame Format (continued) Three fields are defined for the Tag Control Information: - 3-bit user priority allows up to 8 levels of priorities (0 is the highest) to support the “class-of-service”. - 1-bit CFI is not used by 802.3 device. Instead, it is used by the token ring vendors. - 12-bit VLAN ID Summer 2004 Dr. Paul Chen 205 VLAN Administration VLAN association (or administration) can be based on port, MAC address, subnet and protocol fields. Majority of switch vendors supports port based VLAN. Separate mapping tables are maintained and updated periodically in the SRAM to support different VLAN associations. Summer 2004 Dr. Paul Chen 206 VLAN Administration Null VLAN ID (VID) indicates that the tag contains no VID information, only the priority information. This is referred as the priority tagged frame. 
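Before returning to how a priority tagged frame is forwarded, the 4-byte tag layout from the VLAN-Tagged MAC Frame Format slides can be made concrete with a small packing/unpacking sketch. The field widths (Tag Type 0x81-00, 3-bit user priority, 1-bit CFI, 12-bit VLAN ID) are as drawn on those slides; the example priority and VID values are arbitrary.

import struct

TPID = 0x8100  # 802.1Q Tag Type value (0x81-00)

def pack_vlan_tag(priority, cfi, vid):
    """Build the 4-byte VLAN tag: Tag Type plus Tag Control Information."""
    assert 0 <= priority <= 7 and cfi in (0, 1) and 0 <= vid <= 0xFFF
    tci = (priority << 13) | (cfi << 12) | vid
    return struct.pack("!HH", TPID, tci)  # network byte order

def unpack_vlan_tag(tag):
    tpid, tci = struct.unpack("!HH", tag)
    return {"tag_type": hex(tpid),
            "user_priority": tci >> 13,
            "cfi": (tci >> 12) & 0x1,
            "vid": tci & 0x0FFF}

tag = pack_vlan_tag(priority=5, cfi=0, vid=100)
print(tag.hex())             # 8100a064
print(unpack_vlan_tag(tag))  # {'tag_type': '0x8100', 'user_priority': 5, 'cfi': 0, 'vid': 100}

As the format slides note, the tag sits between the SA and Length/Type fields, and the FCS must be recalculated whenever the tag is inserted or removed.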
A VLAN-aware bridge or switch will forward a priority tagged frame only after either classifying an appropriate TCI at the output port, or stripping the VLAN tag and retransmitting the frame untagged. Summer 2004 Dr. Paul Chen 207 LAN / MAN Management The network management system consists of a "manager" which executes the managing process, an "agent" which interacts with the manager and provides an interface to the resource to be managed, and the "managed objects" which reside in a local system. A managed object is a resource, which can be a physical device or a logical construct or function. A managed object provides a means to identify, control or monitor a resource. An agent resides in a local device and collects the information from the managed objects. Summer 2004 Dr. Paul Chen 208 LAN / MAN Management Based on an external network command (from the manager), the agent can query the managed objects. A management protocol is used to monitor or control devices and to get information on the managed objects. SNMP uses a standard object definition language and encoding rules called Abstract Syntax Notation One (ASN.1). Summer 2004 Dr. Paul Chen 209 Interaction between Manager, Agent, and Objects (Figure: the manager in the management station communicates management operations, via the SNMP protocol, to the agent in the local system environment; the agent performs the operations on the managed objects and emits notifications back to the manager. The local system environment is a managed node.) Summer 2004 Dr. Paul Chen 210 SNMP ASN.1 abstract syntax is essentially a primitive data declaration language. It allows the user to define primitive objects and then combine them into more complex ones. The ASN.1 basic data types allowed in SNMP are the following: - INTEGER (code 2): arbitrary length integer - BIT STRING (code 3): a string of 0 or more bits - OCTET STRING (code 4): a string of 0 or more unsigned bytes - NULL (code 5): a placeholder - OBJECT IDENTIFIER (code 6): an officially defined data type Summer 2004 Dr. Paul Chen 211 SNMP MIB The collection of all possible objects in a network is given a data structure called the Management Information Base (MIB). A MIB specifies the different counters, status events, alarms, and notifications for each managed object. Clause 30 of IEEE 802.3z provides a standard for defining the MIB. Summer 2004 Dr. Paul Chen 212 SNMP MIB Clause 30 defines MIB objects, attributes, notifications, and behavior for: - 10 Mb/s DTE, 10 Mb/s baseband repeater and 10 Mb/s integrated MAU - 100 Mb/s DTE, 100 Mb/s baseband repeater and 100 Mb/s PHY - 1000 Mb/s DTE, 1000 Mb/s baseband repeater and 1000 Mb/s PHY Summer 2004 Dr. Paul Chen 213 SNMP Protocol The SNMP manager sends a request to an agent asking it for information or commanding it to update its state. The agent replies with the information or confirms that it has updated its state. Data are sent using the ASN.1 transfer syntax. SNMP defines 7 messages that can be sent. Six messages are listed in the following, with the 7th message being the response message: - Get-request: request the value of one or more variables - Get-next-request: request the variable following this one - Get-bulk-request: fetch a large table - Set-request: update one or more variables - Inform-request: manager-to-manager message describing the local MIB - SnmpV2-trap: agent-to-manager trap report Summer 2004 Dr. Paul Chen 214 MAN As mentioned earlier, 802.6 DQDB and SMDS (which is based on DQDB) are not in use today except in Europe. We will briefly discuss their principle of operation.
We will discuss more on the applications of Gigabit and 10G Ethernet on MAN. Summer 2004 Dr. Paul Chen 215 MAN Majority of the MAN is based on SONET (Synchronous Optical Network) which is TDM based and is not the most efficient way to carry the IP or Ethernet traffic which is asynchronous based. Especially, until recently, all SONET rates are multiple of 4 above STS-3 (STS-3, STS-12, STS48, etc.). This leads to inefficient use of SONET payloads. Summer 2004 Dr. Paul Chen 216 MAN For example, if the IP or Ethernet traffic fits into STS-20 but must be carried in STS-48 payload. This issue is being addressed with the introduction of more flexible payload with STS-n where n can be any integer up to the SONET line interface rate. Each STS-n can be transferred over a different path. Summer 2004 Dr. Paul Chen 217 DQDB Dual Bus Architecture Head of Bus A Bus A S E Node 1 Node 2 Node 3 E Node 4 S Bus B Head of Bus B S: Start of Data Flow E: End of Data Flow Both buses operate simultaneously. The aggregate capacity of the network is twice the transmission rate of one bus. Summer 2004 Dr. Paul Chen 218 DQDB Slot Format A DQDB slot is 53 octets, which is divided into - 1 byte: Access Control - 4 bytes: Segment Header - 48 bytes: Segment Payload This format is similar to that of ATM cell. Summer 2004 Dr. Paul Chen 219 Node not queued to send on Bus A Bus A 0 - Access Unit (AU) Cancel one request for each QA (queued arbitrated) slot on Bus A Request Counter (RQ) + Bus B Count requests on Bus B 1 For each REQ that passes the AU on Bus B, RQ is incremented by one. RQ counter is decremented by one for each empty QA (Queued Arbitrated) slot that passes on Bus A. Summer 2004 Dr. Paul Chen 220 Node queued to send on Bus A Bus A 0 Cancel one request for each QA (queued arbitrated) slot on Bus A - Request Counter (RQ) Dump count to Join queue Countdown Counter (CD) + Count requests on Bus B Bus B 1 Summer 2004 Dr. Paul Chen 221 Node queued to send on Bus A When the AU issues a REQ on Bus B, it transfers the content of its RQ to CD and resets RQ to zero. This initializes CD with the number of downstream segments queued ahead of this AU’s segment. Transmission is allowed when CD equals zero. For example, the value of RQ was 2. When this was transferred to CD, it forces the AU to bypass 2 empty slots, which were reserved by downstream AU’s, before this AU can transmit. Summer 2004 Dr. Paul Chen 222 IP Over SONET Packet-Over-SONET/SDH (POS) is an emerging technology for carrying IP and other data traffic over the SONET/SDH backbone. Variable length data packets are mapped directly into the SONET Synchronous Payload Envelope (SPE). It may be used in layer 2 switches or layer 3 switches/routers depending on the specific implementation. POS provides reliable, high capacity, point-to-point data links using the SONET physical layer transmission standards. Summer 2004 Dr. Paul Chen 223 IP Over SONET Mapping into SONET using the Point-to-Point Protocol (PPP) was standardized in accordance with RFC 1619. SONET is a world-wide standardized transmission protocol for implementing a robust, scalable transport mechanism with industry standardized interfaces. It provides a standard operating environment with defined protocols for operations management, “provisioning”, and performance assurance. In an IP (through PPP) over SONET infrastructure, POS links provide high bandwidth pipes that can be used to interconnect high-speed routers. Summer 2004 Dr. 
Paul Chen 224 IP Over SONET In an IP (through PPP) over SONET infrastructure, POS links provide high bandwidth pipes that can be used to interconnect high-speed routers. Access Routers POS Link Backbone Routers POS Link SONET/SDH Backbone Network Summer 2004 Dr. Paul Chen 225 IP Over SONET (continued) PPP provides a standard method for transporting multi-protocol datagrams over point-to-point links. These links provide full-duplex simultaneous bi-directional operations, and are assumed to deliver packets in order. Delineation of PPP encapsulated IP datagrams is performed using Flag Sequence recognition and byte stuffing/de-stuffing techniques. Flag 0x7E Address 0xFF Control 0x03 Protocol 8/16 bits Information Padding PPP HDLC-like Frame Format Summer 2004 Dr. Paul Chen 226 FCS 16/32 Flag 0x7E STS- 1 Frame with IP 87 Bytes 3 Bytes 3 Bytes Section Overhead Line 6 Bytes Overhead 1X9 Byte 1X9 Byte 1X9 Byte P a t h F I X F I X O v e r h e a d S t u f f 51.84 Mbps Summer 2004 Dr. Paul Chen 227 IP Payload 48.384 Mbps S t u f f STS-3c Frame Structure 9 Bytes 261 Bytes H 3 Bytes 6 Bytes Section Overhead Line Overhead P a t h O v e r h e a d 155.52 Mbps Summer 2004 Dr. Paul Chen … H 228 IP Payload 149.76 Mbps … H STS-48c Frame Structure 15 Bytes 144 Bytes 4160 Bytes H 3 Bytes 6 Bytes Section Overhead Line Overhead P a t h F i x e d O v e r h e a d IP Payload 2.39616 Gbps S t u f f … 4176 Bytes 2.48832 Gbps Summer 2004 Dr. Paul Chen … H 229 H SONET Hierarchy Summer 2004 Dr. Paul Chen 230 IP Over SONET Protocol Stack IP datagrams are encapsulated into PPP packets, which are then framed into POS Frames using HDLC-like framing according to RFC 1662, and finally, mapped byte synchronously into the SONET SPE. Network Layer IP Datagrams Protocol encapsulation Error Control Link Initialization PPP PPP Packet delineation HDLC Framing Data Link Layer SONET Byte Delineation Physical Layer IP over SONET Protocol Stack Summer 2004 Dr. Paul Chen 231 Ethernet Over SONET To connect its head office and branch offices to the same LAN, there is an interconnection problem. To interface the Ethernet to the WAN provided by the Telco/PTT, historically, has required an inter-working protocol, as Ethernet is not directly supported over the SONET/SDH network. Two HDLC-like framing format are used to encapsulate the MAC frame. One is based on ITU-T X.86 Link Access Procedure – SDH (LAPS) while the other is based on ITU-T G.7041 Generic Framing Procedure (GFP). Majority of vendor equipment adopts the GFP framing format. Both framing formats are implemented by dedicated silicon. X.86 is mainly used in Europe. Summer 2004 Dr. Paul Chen 232 Public Transport Network Infrastructure Voice Data (IP, IPX) SAN Video Ethernet* Private Lines DVI* POS FICON* ESCON* FR RPR Fiber Channel* X.86 HDLC* ATM GFP SONET / SDH Under study WDM / OTN Fiber Summer 2004 Dr. Paul Chen 233 * : These types of traffic may also run directly over fiber Ethernet over SONET Using LAPS Framing MSB Flag (0x7E) LSB 1 octet MSB Address (SAPI, 0x0C) LSB 1 octet MSB Control (0x03) LSB 1 octet Octets within frame transmitted from top to bottom Destination Address (DA) 6 octets Source Address (SA) 6 octets 2 octets Length/Type MAC Client data 46-1 500 octets PAD MSB FCS of MAC 4 octets FCS of LAPS 4 octets LSB Flag (0x7E) MSB LSB Bit8 Bit1 Bits within an octet transmitted from left to right T0733630-00 (114882) The LAPS format which encapsulates IEEE802.3 MAC frame (shown in shaded area) Summer 2004 Dr. 
Paul Chen 234 Functions Performed by the LAPS Rate Adaptation is done by sending sequence(s) of {0x7d, 0xdd} during transmit process. The receive entity will remove the Rate Adaptation octet(s) "0xdd" within the LAPS frame when detecting sequence(s) of {0x7d, 0xdd}. LAPS Transmit Processing LAPS Receive Processing Summer 2004 Dr. Paul Chen 235 Functions Performed by the LAPS Error Frame Handling supports two options for aborting an erroneous frame: - The first option is to abort a frame by inserting the abort sequence, 0x7d7e. - The second option, the LAPS entity can also abort an erroneous frame by simply inverting the FCS bytes to generate an FCS error. Summer 2004 Dr. Paul Chen 236 Ethernet Over SONET Using GFP Framing Generic Framing Procedure (GFP) is a protocol for mapping packet data into an octetsynchronous transport such as SONET. Unlike HDLC-based protocols, GFP does not use any special characters for frame delineation. Instead, it has adapted the cell delineation protocol used by ATM to encapsulate variable length packets. Summer 2004 Dr. Paul Chen 237 Ethernet Over SONET Using GFP Framing A fixed amount of overhead is required by the GFP encapsulation that is independent of the contents of the packets. In contrast to HDLC whose overhead is data dependent, the fixed amount of GFP overhead per packet allows deterministic matching of bandwidth between the Ethernet stream and the virtually concatenated SONET stream. GFP, virtual concatenation must work with LCAS (Link Capacity Adjustment Scheme) and a distributed control plane (e.g. GMPLS) to make SONET more efficient. Summer 2004 Dr. Paul Chen 238 Ethernet Over SONET Using GFP Framing The GFP overhead consists of up to 3 headers: - a Core header containing the packet length and a CRC which is used for packet delineation; - a Type header identifying the payload type; - an Extension header, which is optional. Frame delineation is performed on the core header. The core header contains the two byte packet length and a CRC. The receiver would hunt for a correct CRC and then use the received packet length to predict the location of the start of the next packet. Summer 2004 Dr. Paul Chen 239 GFP Encapsulation Format GFP Frame Ethernet MAC Frame Octets Octets 7 Preamble 2 PLI 2 cHEC 2 Type 2 tHEC 0 - 60 1 SFD 6 DA 6 SA 2 Length / Type GFP Extension Header GFP Payload MAC client Data Pad 4 Bit # Summer 2004 FCS 0 1 2 3 4 5 6 Dr. Paul Chen 7 0 240 1 2 3 4 5 6 7 Frame-Based GFP Within GFP, there are two different mapping modes defined: frame based mapping and transparent mapping. Each mode is optimized for providing different services. Frame based GFP is used for connections where efficiency and flexibility are key. In order to support the frame delineation mode utilized within GFP, the frame length must be known and prepended to the head of the packet. In many protocols, this forces a store-and-forward encapsulation architecture in order to buffer the entire frame and determine its length. Summer 2004 Dr. Paul Chen 241 Frame-Based GFP This buffering may add undesirable latency. Frame based GFP is good for sub-rate services and statistically multiplexed services as the entire overhead associated with the line coding and inter-packet gap (IPG) are discarded and not transported. Summer 2004 Dr. Paul Chen 242 Transparent GFP Transparent GFP is useful for applications that are sensitive to latency or for unknown physical layers. In this encapsulation, all code words from the physical interface are transmitted. 
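Returning for a moment to the frame-mapped delineation described above: the sketch below builds a GFP core header (2-byte PLI followed by a 2-byte cHEC computed over the PLI) and shows the receiver's CRC hunt. It assumes the common CRC-16 generator x^16 + x^12 + x^5 + 1 and omits the core-header scrambling, so treat it as an illustration of the mechanism rather than a G.7041 implementation.

def crc16(data, poly=0x1021, init=0x0000):
    """Bitwise CRC-16 (generator x^16 + x^12 + x^5 + 1), assumed here for the cHEC."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def gfp_core_header(payload_len):
    """2-byte Payload Length Indicator followed by its 2-byte cHEC."""
    pli = payload_len.to_bytes(2, "big")
    return pli + crc16(pli).to_bytes(2, "big")

def hunt(stream):
    """Delineation sketch: slide until a PLI's CRC matches the following two bytes."""
    for i in range(len(stream) - 3):
        if crc16(stream[i:i + 2]) == int.from_bytes(stream[i + 2:i + 4], "big"):
            return i, int.from_bytes(stream[i:i + 2], "big")  # frame offset, payload length
    return None

frame = gfp_core_header(64) + bytes(64)  # one GFP frame carrying 64 zero bytes
print(hunt(b"\x55\xaa" + frame))         # (2, 64): frame found after two junk bytes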
Currently, only physical layers that use 8B/10B encoding are supported. In order to increase efficiency, the 8B/10B line code are trans-coded into a 64B/65B block code and then the block codes are encapsulated into fixed sized GFP packets. Summer 2004 Dr. Paul Chen 243 Transparent GFP This coding method is primarily targeted at Storage Area Networks (SANs) where latency is very important and the delays associated with frame based GFP cannot be tolerated. Summer 2004 Dr. Paul Chen 244 Reference for GFP IEEE Communications Magazine, May 2002, Vol. 40, No. 5 “GFP and Data over SONET/SDH and OTN” Summer 2004 Dr. Paul Chen 245 IP Over Fiber (DWDM) SONET is a Physical Layer device, which schedules the IP packets to be transported by the way of time division multiplexing and provisioning. SONET is primarily designed for voice only system. IP ATM SONET DWDM Layer Protocol Stack for SONET over DWDM Summer 2004 Dr. Paul Chen 246 Problems and Overhead with SONET SONET is divided into four layers: Path, Line, Section, and Photonic. SONET relies on overhead bytes in Path, Line, Section layers to perform restoration in case of failure. The payload in currently installed SONET systems presents a low utilization / efficiency for IP traffic which is bursty. Data Bit Rate SONET Rate Effective Payload Rate Bandwidth Efficiency 10 Mbit/s Ethernet STS-1 ~48.4 Mbit/s 21% 100 Mbit/s Fast Ethernet STS-3c ~150 Mbit/s 67% 1Gbit/s Ethernet STS-48c ~ 2.4 Gbits/s 42% Summer 2004 Dr. Paul Chen 247 IP Over DWDM SONET is expensive in cost. It requires time consuming “provisioning” before it can be put into service to carry traffics. Multi-layer structure of SONET provides redundancy but presents functional overlapping in restoration. It also introduces undesired latency caused by framing and payload mapping. Transmitting IP directly over DWDM systems can increase the bandwidth and reduce the latency. Summer 2004 Dr. Paul Chen 248 IP Over DWDM DWDM system performs satisfactorily at high speeds of OC-192 (10 Gb/s). The overheads associated with ATM and SONET can be eliminated. With proper design, the new system (e.g. RPR) can facilitate faster restoration, provisioning, and path determination. So, we have an optical IP transport system. Summer 2004 Dr. Paul Chen 249 ATM Basics Asynchronous Transfer Mode (ATM) is a connection-oriented, cell-based switching technology that uses 53-byte cells to transport information. ATM does not transmit cells asynchronously, as the name suggests. ATM cells are transmitted continuously and synchronously, with no break between cells. When no user information is transmitted, empty or idle cells are sent instead. Summer 2004 Dr. Paul Chen 250 ATM Basics The asynchronous nature of ATM comes from the indeterminate time when the next information unit of a logical connection may start. Time not used by one logical connection may be given to other connections or filled with idle cells. This means that cells for any given connection arrive asynchronously. Small and fixed cell size facilitates simpler hardware implementation, efficient memory usage for buffering, efficient transport of constant, lowbit rate information such as voice. Summer 2004 Dr. Paul Chen 251 ATM and B-ISDN Relationship ATM is the foundation technology for BroadbandISDN B-ISDN is the universe of services that will be made possible by the use of ATM technology VOICE DATA VIDEO Summer 2004 Dr. Paul Chen 252 Broadband Protocol Model Signaling (VBR) CO (VBR) CBR Other VBR e.g. e.g. e.g. 
DS1 DS3 VBR Video Frame Relay X.25 Voice Other Services Upper Layer 2 La ye 2 r Co n P l t ro an l e User Plane S L a e rv ye ice r s Pr o ot r H oc ig ol he s r Management Plane AAL ATM PDH SONET/SDH Summer 2004 Dr. Paul Chen 253 r e y a L 1 Functions of ATM Layers End Station ATM Switch A A P A T H L M Y P A P H T H Y M Y End Station P A A H T A Y M L ATM Cells • ATM Adaptation Layer (AAL): Inserts/extracts information into 48 byte payload • ATM Layer: Adds/removes 5 byte header to payload • Physical Layer: Converts to appropriate electrical or optical format Summer 2004 Dr. Paul Chen 254 ATM Protocol Stack ISO Model (OSI) Layer 3 (Network) MAC •Service Access •Point (SAP) AAL - SAP (Not part of ATM) Service Specific Functions (SSCS) • Provide additional functions as required for specific services (can be null) Common Part Convergence Sublayer (CPCS) • Builds header and trailer records onto user data frame • Assures integrity at the frame level Sublayer Boundary Layer 2 (Link) Higher Layers ATM Adaptation Layer (AAL) Segmentation and Reassembly (SAR) • Converts CPCS frames into cells • Adds cell headers and trailers to provide integrity at the cell level Cell Switching Service Access Point (SAP) ATM Layer Transmission Convergence Sublayer • HEC generation and checking • Transmission frame adaptation Layer 1 (Physical) Summer 2004 • Cell delineation • Decoupling of Cell Rate (ITU systems) Physical Media Dependant Sublayer • Encoding for transmission • Timing and synchronization Dr. Paul Chen • Transmission (Electrical/Optical) 255 Physical Layer Comparison of ATM with other Technologies CONVENTIONAL LAN CONVENTIONAL TELECOM ATM TRAFFIC TYPE DATA VOICE DATA, VOICE, VIDEO TRANSMISSION UNIT VARIABLE PACKET FIXED FRAME FIXED CELL UP TO G BPS UP TO G BPS M BPS TO G BPS CONNECTION LESS CONNECTIONORIENTED BEST EFFORT GUARANTEED CONNECTIONORIENTED DEFINED CLASSES SHARED DEDICATED RATE CONNECTION TYPE DELIVERY OF TRAFFIC ACCESS Summer 2004 Dr. Paul Chen 256 DEDICATED Anatomy of an ATM Cell 8 Byte 1 Byte 2 Byte 3 Byte 4 7 6 5 4 3 2 1 VPI GFC (UNI) OR VPI (NNI) VCI VPI Header VCI VCI PTI CLP HEC Byte 5 Payload 48 Bytes Summer 2004 Dr. Paul Chen 257 Virtual Circuits First we have the cable... Next, ATM Addressing Defines Paths... • VP’s Then Channels. • VC’s Summer 2004 Dr. Paul Chen 258 SONET and ATM Channels Transport Overhead Transport Overhead Path Overhead Path Overhead STS-1 (DS3) STS-1 (DS3) VT1.5 DS1 STS-1 28 VT1.5 Summer 2004 Dr. Paul Chen 259 Virtual Paths & Virtual Channels VCs VP VCs VP Physical Transmission Link VP VCs VP VCs VPI: Virtual Path Identifier 4,096 at NNI and 256 at UNI VCI: Virtual Channel Identifier 65,536 Both used to route cells through network Unique on link-by-link basis Interpreted at each switch Summer 2004 Dr. Paul Chen 260 PVC - Manual Set Up Console or NMS GUI VPI/VCI 14/1055 14/1055 87/ 45 125 / 5 2 9/47 9/47 Summer 2004 Pre-established connections Permanent No signaling required Dr. Paul Chen 261 SVC - Automatic Set Up Connect to B OK OK Terminal B Connect to B OK Connect to B Terminal A Uses UNI 3.0/3.1 signaling OK VPI/VCI = 0/5 Automatic Transparent to User Summer 2004 Dr. Paul Chen 262 ATM - Operation and Maintenance Principles Fault Management, using AIS, RDI, continuity check and loopback OAM cells. Performance management, using forward monitoring and backward reporting OAM cells. Activation/deactivation of performance monitoring and/or continuity check, using activation/deactivation OAM cells. System management OAM cells for use by end-systems only. 
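As a companion to the Anatomy of an ATM Cell slide above, here is a minimal sketch that unpacks the 5-byte UNI cell header. The field widths (GFC 4 bits, VPI 8 bits, VCI 16 bits, PTI 3 bits, CLP 1 bit, HEC 8 bits) are as drawn on that slide; the sample header bytes are made up, and HEC verification is omitted.

def parse_uni_header(header):
    """Unpack a 5-byte ATM UNI cell header into its fields (sketch only)."""
    assert len(header) == 5
    b = int.from_bytes(header[:4], "big")  # first 32 bits: GFC/VPI/VCI/PTI/CLP
    return {
        "gfc": (b >> 28) & 0xF,    # Generic Flow Control (UNI only)
        "vpi": (b >> 20) & 0xFF,   # Virtual Path Identifier (8 bits at the UNI)
        "vci": (b >> 4) & 0xFFFF,  # Virtual Channel Identifier
        "pti": (b >> 1) & 0x7,     # Payload Type Indicator
        "clp": b & 0x1,            # Cell Loss Priority
        "hec": header[4],          # Header Error Control (not checked in this sketch)
    }

# Made-up header carrying VPI/VCI = 0/5, the signalling channel mentioned on the SVC slide.
print(parse_uni_header(bytes([0x00, 0x00, 0x00, 0x50, 0x00])))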
Summer 2004 Dr. Paul Chen 263 Concept -OAM Operations, Administration and Maintenance (OAM) ATM allows the maintenance/test operation to be performed on a VPC or VCC. These operations are performed on a selected basis; they can span segments or can be end-to-end. Types of maintenance/test operations: Performance Monitoring - a VPC or VCC is monitored to ensure the connection is not congested or has degraded (forward and backward monitoring are provided) Failure detection (AIS, RDI) PM and Failure Reporting (RDI, PM results) Facility Protection of VPCs Fault Isolation (continuity checks and loopbacks) Summer 2004 Dr. Paul Chen 264 Operation and Maintenance Flows Physical Layer Mechanism F1: SONET Section Level F2: SONET Line Level F3: SONET Path Level ATM Layer Mechanism F4: Virtual Path Level • End to end F4 flow • Segment F4 flow F5: Virtual Channel Level • End to end F5 flow • Segment F5 flow Summer 2004 Dr. Paul Chen 265 ATM Fault Management Example STE PTE LOS Terminal Repeater X F1 F2 (AIS-L) Using F1 - F5 Flows LTE ATM Switch ATM Switch ADM VP VC F3 (AIS-P) F4 (VP-AIS) F2 (RDI-L) F3 (RDI-P) F4 (VP-RDI) F5 (VC-RDI) RDI: Remote Defect Indicator Summer 2004 Dr. Paul Chen 266 F5 (VC-AIS) Example of Mechanism for OAM Flows VCC endpoint VP cross-connect VC cross-connect AAL Physical layer connecting point ATM PL PL PL VCC endpoint AAL ATM ATM ATM ATM ATM PL PL PL PL PL VCI 1 VCI 1 VCI 2 VCI 2 Virtual channel OAM cell indicated by PT identifier F5 VPI 1 VPI 1 VPI 2 VPI 2 Virtual path connection uses VCI(=3/4) for OAM F4 Transmission path F3 F1, F2 Summer 2004 F1, F2 Dr. Paul Chen Trans path F3 F1, F2 267 VPI 3 VPI 3 VPC - OAM F4 Trans path F3 F1, F2 Layered Model of AIS & RDI VC-AIS (F5) VP-AIS (F4) AIS-P (F3) AIS-L (F2) (F1) VC VC-RDI (F5) VP VP-RDI (F4) PATH RDI-P (F3) LINE RDI-L (F2) SECTION (BIP-8 PM, F1) PHYSICAL (Layer to layer indications) Summer 2004 Dr. Paul Chen (Peer to peer indications) 268 The ATM Adaptation Layer The AAL process is the most important feature of the ATM Communications process... How the Adaptation process is carried out depends on the type of service to be transported... AAL TYPE SERVICE TYPE COMMENTS AAL1 Isochronous Traffic like DS0, DS1s, DS3s to carry Voice For data services, compressed Audio / Video, etc. Bursty data over long periods AAL2 AAL3 AAL4 AAL5 Summer 2004 Constant Bit Rate CBR Variable Bit Rate VBR Connection-Oriented for VBR Data Transfer Connectionless VBR Data Transfer Simplified AAL Dr. Paul Chen For short, bursty data (SMDS…) Mainly for point-to-point 269 Classes of ATM Service CLASS A Timing Relation Between Source & Destination Bit Rate Required 1 Dr. Paul Chen CLASS D Variable CONNECTION ORIENTED AAL Types CLASS C Not Required Constant Connection Mode Summer 2004 CLASS B 2 270 CONNECTION-LESS 3/4, 5 3/4 The AAL Process AAL is divided into two sublayers: USER INFORMATION CS Process CS-PDU 1) CONVERGENCE SUBLAYER CS-PDU CS-PDU SAR Process 2) SEGMENTATION & REASSEMBLY SUBLAYER SAR-PDU SAR-PDU SAR-PDU SAR-PDU These two sublayers convert the user information into 48-byte cell payloads. Each sublayer produces a Protocol Data Unit (PDU). The CS-PDU is variable length while the SAR-PDU is always 48 bytes. Summer 2004 Dr. Paul Chen 271 AAL-1 Processing Payload Header SN Field 4 Bits 1 CSI SNP Field 4 Bits 1 2 3 2 3 4 Sequence Count CRC PDU Payload (47 Octets) 4 Parity SN: Sequence Number SNP: Sequence Number Protection CSI: Convergence Sublayer Indicator Summer 2004 Dr. 
Paul Chen 272 AAL-2 Processing CPS-Packet Header (3 octets) CPS-Packet Payload (1 to 45/64 octets) CPS-Packet Cell Header (5 octets) Start Field (1 Octet) CPS-PDU Payload( up to 47 octets and pad) CPS-PDU ATM Cell Each AAL2 user generates CPS packets with a 3-octet packet header and a variable length payload. The CPS sublayer collects CPS packets from AAL2 users multiplexed onto the same VCC over a specified interval of time, forming CPS-PDU, comprised of 48 octets worth of CPS packets. Summer 2004 Dr. Paul Chen 273 The AAL Process: AAL 3/4 CS-PDU CS-PDU CPI BTag BASize Information Pad AL ETag Length CPI: Common Point Indicator - 1 Byte BTag: Beginning Tag - 1 Byte BA Size: Buffer Allocation Size - 2 Bytes Info Payload: Length of Payload (Max: 65, 535 Bytes) Pad: Up to 3 Bytes - used to align CS-PDU length AL: Alignment - 1 Byte ETag: End Tag - 1 Byte Length: 2 Bytes Summer 2004 Dr. Paul Chen 274 AAL 3/4 CPI BTag BASize AAL SEVICE DATA UNIT AAL - SDU 44 Bytes BOM SequenceSequence Type Number 2 BITS 4 BITS MID Payload 10 BITS Length Indicator CRC 10 BITS 2 Bytes BOM: Beginning of message COM: Continuation of message EOM: End of message Summer 2004 44 Bytes 6 BITS 2 Bytes Convergence Sublayer Protocol Data Unit: CSCS-PDU Al Fill Length ETag Length 44 Bytes Payload COM Payload EOM Segmentation & Reassembly Protocol Data Unit: SARSAR-PDU MID: Message Identifier BASIZE: Buffer Allocation Size CRC: Cyclic Redundancy Check BTAG: Beginning Tag EOM: End of message ETAG: End Tag Dr. Paul Chen 275 The AAL Process: AAL5 CPCS-PDU CPCS-PDU CPCS-PDU Trailer CPCS-PDU Payload 1 - 65,535 PAD CPCS-UU CPI 0- 47 1 1 Length CRC 2 4 Unit: octets PAD: Padding UU: User-to-User Indication CPI: Common Part Indicator Summer 2004 Dr. Paul Chen LENGTH: CPCS-PDU Length CRC: Cyclic Redundancy Check CPCS: Common Part Convergence Sublayer 276 AAL-5 AAL Service Data Unit (SDU) AAL5-SDUs AAL5-SAP CPCS-PDUs octets CPCS-PDU Trailer 1-65,535 octets CPCS-PDU Payload PAD CPCS-UU CPI Length CRC 0-47 1 1 2 4 ••• SAR Payload SAR Payload Header 5 Payload 48 Summer 2004 Header Payload 5 SAR Payload ••• Payload Type= AAL_Indicate ••• 48 Dr. Paul Chen 277 Header Payload 5 48 SAR-PDUs ATM-SAP Cells Octets ATM Connections ATM is virtual connection-oriented; there must always be a virtual connection established before cells can be sent Connections can be established: ›› Administratively as PVCs – Lowest common denominator for Interoperability for devices not supporting UNI 3.x signaling ›› Dynamically as SVCs – Implies ATM signaling capability Summer 2004 Dr. Paul Chen 278 ATM Switches are easily Scaleable in Speed ATM protocol is connection-oriented once connection is set up, cells are quickly switched in hardware by using VPI/VCI at very high speeds Uses fixed cell length Allows switch hardware to be optimized around a fixed length cell Uses SONET as physical layer interface Scales to high speed and is defined and deployed at Gigabit rates Summer 2004 Dr. Paul Chen 279 Logical ATM Switch Fabric ATM Switch Ingress Path from interface PHY receive termination Connection Lookup OAM Processing Policing Buffering, Queuing & Scheduling to queue ATM Layer Processing Physical/TC Layer Processing ATM Switch Egress Path from queue Fabric receive termination Buffering, Queuing & Scheduling ATM Layer Processing Internal Loopback Summer 2004 Connection Lookup Dr. 
Paul Chen 280 OAM Policing (EFCI) to interface Concept – VPs and VCs in the Network VP2 Link 1 Link 1 NN 1 VP2 VP3 Link 2 Link 1 VP3 VC8 VC11 VC8 VC11 VC21 VC21 VC11 VC2 VC2 VC11 VP8 VP5 VP5 VP8 VP6 CPN 1 VP6 VC7 User/Network Interface (UNI) VC2 VP3 Link 3 Link 2 CPN 2 VP5 Link 4 Network Node Interface (NNI) User/Network Interface (UNI) Link 1 VP3 VP5 CPN 3 VC7 VP2 Link 2 Link 1 VP2 VC2 VC9 Routing Concept in an ATM Network Summer 2004 VC9 VP1 NN 2 Dr. Paul Chen 281 Link 3 Link 2 VP1 ATM vs.. Gigabit Ethernet Network Installed Desktops LAN protocols (IP, IPX) Scalability WAN QoS Multimedia Gigabit Ethernet Yes Yes Yes Emerging Emerging ATM Yes, but not much. With MPOS and LANE Yes, with MPOA and LANE Yes, with MPOA and LANE Yes Yes Summer 2004 Dr. Paul Chen 282 Gigabit Ethernet and ATM Feature Comparison Feature Gigabit Ethernet ATM Price/Performance/Bandwidth Low cost Moderate to high cost Quality of Service (QoS) RSVP, IEEE802.1Q/p, differential services Guaranteed QoS with traffic management User Applications High-speed data, voice/video over IP Data, video and voice Product Availability Since late 1997 Since early 1996 Network Applications Building backbone, campus backbone servers and risers WAN, building backbone, campus backbone servers and risers Summer 2004 Dr. Paul Chen 283 Bandwidth Overhead – ATM Cell Tax For a 1,500-byte IP datagram, Gigabit Ethernet adds 26 bytes of hearer, resulting in 1,526 bytes to transmit a 1500-byte IP datagram. ATM AAL5 layer adds an 8-byte trailer and a variable pad size to ensure that the AAL5 protocol data unit (PDU) is a multiple of 48 bytes. For a 1,500-byte IP datagram, this results in an AAL5 PDU equal to 1,536 bytes. AAL5 adaptation layer then segments the AAL5 PDU into 48-byte segments to be carried in 53byte ATM cells. Each cell has a 5-byte header. Total of 32 ATM cells (1696 bytes) are needed to transmit a 1500-byte IP datagram. Summer 2004 Dr. Paul Chen 284 Bandwidth Overhead – ATM Cell Tax The corresponding efficiencies are 98% for Gigabit Ethernet and 88% for ATM. Since ATM cells are carried inside the SONET payloads, additional 5% or more overhead and payload inefficiency need to be accounted for. Gigabit Ethernet over fiber is more efficient in this comparison. Summer 2004 Dr. Paul Chen 285 Optical Fiber Core Cladding Jacket Diameter of Core = d Diameter of Cladding is standardized at 125 um = D Summer 2004 Dr. Paul Chen 286 Optical Fiber (continued) Core diameter for single mode (SMF) and multimode fiber (MMF) is different. - SMF: 2 – 10 um (8.6 to 9.5 um commonly used) - MMF: 50 – 200 um SMF supports one ray (mode) due to small D/d propagate MMF supports many rays (modes) Summer 2004 Dr. Paul Chen 287 Optical Fiber (continued) MMF and SMF have different manufacturing processes, refractive index, dimensions, and therefore different transmission characteristics. So they find different applications. Summer 2004 Dr. Paul Chen 288 MMF It minimizes delay spread, although the delay is still significant. A 1% index difference between core and cladding amounts to 1 to 5 nsec/km delay spread. Easy to splice and couple light into it Bit rate is limited to 100 Mbps for up to 20km; shorter length supports higher bit rates. Fiber span without amplification is up to 20 km at 100 Mbps. Summer 2004 Dr. Paul Chen 289 SMF It almost eliminates delay spread. More difficult to splice and exactly align two fibers together. More difficult to couple all photonic energy from a source into it. 
It is suitable for transmitting modulated signals at 40 Gbps or higher and up to 200 km without amplification. Summer 2004 Dr. Paul Chen 290 DWDM Basics Two ways to increase the bandwidth in a single fiber - Increase the bit rate: transmitting a reliable signal at 40 G b/s is available today but quite expensive. - Increase the number of wavelengths in the same fiber: several wavelengths, each transporting at 10 G or 40 Gb/s will significantly increase the total bandwidth. Summer 2004 Dr. Paul Chen 291 DWDM Basics WDM and DWDM definitions - WDM couples many wavelengths in the same fiber, thus increases the aggregate bandwidth in a single fiber. - DWDM couples a larger (denser) number of (> 40) wavelengths into a fiber than WDM. However, several issues need to be addressed, such as channel width and spacing, total optical power launched in the fiber, cooling, non-linear effect, cross talk, span of fiber, amplification, etc. An early WDM with < 10 wavelengths and larger channel width and spacing is termed Course WDM (CWDM). Summer 2004 Dr. Paul Chen 292 DWDM Technology Enabler Successful deployment of DWDM is a result of several technologies: - Fiber with 1.3 and 1.55 um wavelength spectrum provides low loss and better transmission performance - Optical amplifiers with flat gains over a range of wavelengths eliminate the need for regenerators - Integrated solid-state optical filters on the same substrate with other optical components Summer 2004 Dr. Paul Chen 293 DWDM Technology Enabler (continued) - Optical MUX/DMUX is based on passive optical diffraction - Tunable filters can be used as optical add-drop MUX (OADM) - OADM components have made DWDM possible in MAN and long haul networks - OXC (optical cross-connect) made the optical switching possible. Summer 2004 Dr. Paul Chen 294 DWDM System Components DWDM technology requires specialized optical devices that are based on properties of light and on the optical, electrical, and mechanical properties of semiconductor material. These devices include: Optical transmitter, optical receiver, optical filter, optical modulator, optical amplifier, wavelength converter, OADM and OXC. Optical modulator controls the amount of continuous optical power transmitted in an optical waveguide. Summer 2004 Dr. Paul Chen 295 DWDM System Components (continued) Wavelength converter enables optical channels to be relocated OADM selectively drops a wavelength from a set of wavelengths in a fiber, thus drops the traffic on this channel. It then adds in the same direction of data flow the same wavelength, but with different data content. An OXC interconnects N optical inputs with N outputs using either hybrid or all optical approach. Each port handles a bundle of multiplexed single-wavelength signals. Summer 2004 Dr. Paul Chen 296 DWDM System Components (continued) An OXC supports network reconfiguration and allows network providers to transport and manage wavelengths efficiently at the optical layer. An OXC is most efficient when it contains bit-rate and format independent optical switch. It can perform signal monitoring, provisioning and grooming, restoration at the photonic layer itself. Loss due to fiber dispersion and non-linearity can be compensated through use of Dynamic Compensation (non-linearly chirped fiber Bragg grating). Summer 2004 Dr. Paul Chen 297 Structure of DWDM System Transmitters Receivers λ1 λ1 λ2 λ3 λ2 EDFA 48 DWD Mux Optical Fiber DWD DeMux Virtual Fibers λn λn EDFA: Erbium Doped Fiber Amplifier Summer 2004 λ3 Dr. 
Paul Chen 298 DWDM Optical Transmission The photonic layer of DWDM system is responsible for converting the electronic data to information in the light waves and sending it through the fiber. The channel spacing is bounded by the optical amplifier’s operational bandwidth and the receiver’s capability to identify two close wavelengths. ITU-T standard body specifies the a spacing of 100G Hz. Summer 2004 Dr. Paul Chen 299 DWDM Optical Transmission DWDM system can be Unidirectional or Bidirectional. The choice is based on the ability of fiber and the required bandwidth. Unidirectional DWDM requires two fibers for two-way communication while Bi-directional DWDM uses a single fiber for two-way communication. Summer 2004 Dr. Paul Chen 300 Optical Packet Switching DWDM can perform switching in the optical domain without having to convert the signal onto electrical domain. This reduces the delay at the switches and increases system throughput. Switching involves reading the packet header and altering the path of the signal (packet). In the course of altering, the switch may have to edit a part or whole of the header. All optical header replacement is the key to updating in the wavelength-based packets (e.g. modifying routing information). Summer 2004 Dr. Paul Chen 301 Optical Packet Switching SONET networks support the multiplexing of lower TDM rates onto higher rates. The ADM and transponders en route provide the much-needed synchronization to ensure quality and guarantee proper delivery of data. DWDM systems support multiplexing of wavelengths, no timing relation exists for the system. The need for a clocking system is absent. If synchronization is still needed, SONET terminals and ADM can support it by providing derived DS1 timing to customers. Summer 2004 Dr. Paul Chen 302 Optical Internet Fundamental properties of DWDM systems are exploited to form an all optical layer. Bits rate and protocol transparency enable transport of native data traffic like Gigabit Ethernet, ATM, SONET, IP etc. on different channels. The DWDM functions in the optical layer can be divided into two layers: Transport Layer and Service Layer. These two layers perform the functions of the four SONET layers. Summer 2004 Dr. Paul Chen 303 Optical Internet Transport Layer Service Layer Bandwidth, Reliability, Access speed, Usage Wavelength level traffic rates, Security, VoIP control Services, etc. DWDM Network Model Summer 2004 Dr. Paul Chen 304 DWDM Network Model An intelligent optical layer performs fast restoration and automated provisioning for end-to-end wavelength path and can appease the bandwidth demand. Restoration in the optical layer is performed rapidly and does not overlap with the service layer’s functions. Summer 2004 Dr. Paul Chen 305 DWDM Network Model Switching and bandwidth is furnished at the granularity of the wavelength. ATM’s virtual path becomes equivalent to a wavelength. MPLS divides the traffic engineering requirements between the IP layer and the Optical transport layer. Summer 2004 Dr. Paul Chen 306 DWDM Network Model In case of physical failure, the wavelength routing protocol must restore the paths across the network within a maximum of 50 ms. This is a SONET feature and it is being resolved in the standard body (IEEE and IETF). Summer 2004 Dr. Paul Chen 307 Inter-working of IP, ATM and DWDM Closed (A) Open (B) SONET Based IP/DWDM IP IP IP ATM SONET IP ATM Dense Wavelength Division Multiplexer Virtual Fiber DWDM Network Architecture Summer 2004 Dr. 
IP/DWDM Architecture
The closed (A) architecture was designed to serve the SONET system better: DWDM increases the capacity of the SONET system. IP/DWDM systems adopt the open (B) architecture, which is not tied to SONET or other TDM systems and reflects the protocol transparency of all-optical networks. The carrier is responsible for providing the actual interface to end users and for fault/failure protection. The IP bits enter the DWDM system and are transported "as is" over the high-speed optical connection.
Summer 2004 Dr. Paul Chen 309

IP/DWDM Architecture
DWDM can adopt either of the following network architectures:
- Optical mesh transport, where OXCs and MUXes provide wavelength management and restoration
- Wavelength transport, where failure detection and restoration are done at the service layer since there is no OXC or MUX; IP and ATM connect directly over the wavelength links
Summer 2004 Dr. Paul Chen 310

Issues Confronting IP/DWDM
Error Detection – SONET can detect signal errors through the overhead bytes in its frame. This feature can be carried down to DWDM when SONET is used as the higher layer. In all-optical DWDM systems, Forward Error Correction is performed instead.
Summer 2004 Dr. Paul Chen 311

Issues Confronting IP/DWDM (continued)
Fault Tolerance – 1+1 optical multiplex section protection (MSP) is the strategy supported by the WDM system. It is similar to 1+1 MSP in SDH/SONET (a small selection sketch follows this group of slides). The WADM can accommodate more advanced optical-layer protection switching.
Summer 2004 Dr. Paul Chen 312

Issues Confronting IP/DWDM (continued)
Wavelength Routing – The wavelength and the origin of the signal decide the wavelength path of the signal through the optical network.
Network Control & Management – GMPLS is being developed to perform network control and management by direct communication between the management system (in the control plane) and the DWDM equipment (in the transport plane).
Summer 2004 Dr. Paul Chen 313

Issues Confronting IP/DWDM (continued)
Service Transparency – The network does not need any extra information about the signal it transports. Jitter introduced by optoelectronic processing can be removed using a bit-rate-independent optoelectronic regenerator with retiming functionality.
Interoperability – The system must interoperate with backbone routers and facilitate multi-vendor internetworking.
Summer 2004 Dr. Paul Chen 314

Issues Confronting IP/DWDM (continued)
Quality of Service – Work is underway to add QoS measures to IP routing protocols (e.g., OSPF) so that they carry not only topology information but also loading information. A study is needed to choose between a QoS-based distributed routing scheme in the IP layer and an optical routing algorithm undertaking the IP/DWDM routing.
Summer 2004 Dr. Paul Chen 315

The Driving Force for 10G Ethernet for LAN, MAN and WAN
The need for 10G Ethernet is driven by the successful deployment of Gigabit Ethernet (cost-effective now) and the aggregation of Gigabit links. The cost saving of 10G Ethernet WAN ($40,000/port) vs. Packet over SONET ($300,000/interface) will entice the deployment of 10G Ethernet in MAN and WAN.
Summer 2004 Dr. Paul Chen 316

The Driving Force for 10G Ethernet for LAN, MAN and WAN
10G Ethernet can provide a low-cost local connection to WDM-based transponders. It leverages the huge installed base of 10M, 100M and the popular Gigabit Ethernet. Migration and inter-working from the existing installation are easy, and no new network management training is required.
Summer 2004 Dr. Paul Chen 317
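As a rough illustration of the 1+1 MSP strategy mentioned under Fault Tolerance: the head end permanently bridges the signal onto both a working and a protect section, and the tail end selects whichever copy is healthy. A minimal sketch of the selector logic, not taken from any standard's text:

def select_1_plus_1(working_ok, protect_ok):
    """Tail-end selector: choose which of the two bridged copies to use."""
    if working_ok:
        return "working"      # prefer the working section while it is healthy
    if protect_ok:
        return "protect"      # switch to the protect section on failure
    return "signal fail"      # both sections down

# Example: a fiber cut on the working section
print(select_1_plus_1(working_ok=False, protect_ok=True))   # -> protect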
10G Ethernet (802.3ae) Reference Model
[Figure: 802.3ae layer diagram. Common upper layers (Higher Layers, Logical Link Control (LLC), Media Access Control (MAC), Reconciliation Sublayer (RS)) connect through the XGMII to three PHY stacks: 10GBASE-R (LAN) with 64B/66B PCS, Physical Medium Attachment and Physical Medium Dependent sublayers; 10GBASE-W (WAN) with 64B/66B PCS, WIS, Physical Medium Attachment and Physical Medium Dependent sublayers; and 10GBASE-X (LAN over WWDM) with 8B/10B PCS, Physical Medium Attachment, Physical Medium Dependent and Auto-Negotiation. Each PHY attaches to its medium through the MDI. WIS: WAN Interface Sublayer; PCS: Physical Coding Sublayer.]
Summer 2004 Dr. Paul Chen 318

XAUI
[Figure: XAUI as an extender between MAC and PHY. Reconciliation sublayer – XGMII – XGXS – XAUI (4 serial lanes) – XGXS – XGMII – 8B/10B PCS – XSBI (16-bit parallel, OIF) – Physical Medium Attachment (retime, SerDes, CDR) – Physical Medium Dependent (E/O) – MDI – Medium. XGXS: XAUI Extender Sublayer; XAUI: 10G Attachment Unit Interface; XSBI: 10G Sixteen Bit Interface.]
XAUI functions as an extender interface between MAC and PCS. XGMII is a 74-pin interface (a 32-bit data path each for TX and RX), while XAUI is a 4-lane serial interface used chip-to-chip to save pins and board space. Each of the 4 serial lanes in XAUI carries 2.5 Gb/s of data (3.125 Gbaud on the line after 8B/10B coding).
Summer 2004 Dr. Paul Chen 319

LAN PHY vs. WAN PHY for 10G Ethernet
- 10GE LAN PHY (serial): MAC 10 Gb/s; PCS 64B/66B; PMA interface XSBI; PMD 1550 nm DFB, 1310 nm FP or 850 nm VCSEL; line rate 10.3 Gb/s.
- 10GE LAN PHY (WWDM): MAC 10 Gb/s; PCS 8B/10B; PMA interface XAUI; PMD 1310 nm WWDM; line rate 4 x 3.125 Gb/s.
- 10GE WAN PHY (serial): MAC 10 Gb/s; PCS 64B/66B plus SONET framing; PMA interface XSBI; PMD 1550 nm DFB, 1310 nm FP or 850 nm VCSEL; line rate 9.953 Gb/s.
(The line rates follow from the coding overhead; a short calculation follows this group of slides.)
CWDM: Coarse WDM; XSBI: 10G Sixteen Bit Interface; XAUI: 10G Attachment Unit Interface; VCSEL: Vertical Cavity Surface Emitting Laser; DFB: Distributed Feedback laser; FP: Fabry-Perot laser
Summer 2004 Dr. Paul Chen 320

Comparison of GE vs. 10GbE
- Physical medium: Gigabit Ethernet – optical and copper; 10Gigabit Ethernet – optical only.
- Distance: Gigabit Ethernet – LAN up to 5 km; 10Gigabit Ethernet – LAN up to 40 km, with direct attachment to SONET for WAN.
- PMD sublayer: Gigabit Ethernet – leverages the Fibre Channel PMD sublayer; 10Gigabit Ethernet – newly developed optical PMD sublayers.
- PCS: Gigabit Ethernet – reuses 8B/10B coding; 10Gigabit Ethernet – new coding schemes: 64B/66B for -W and -R, 8B/10B for -X.
- MAC protocol: Gigabit Ethernet – half and full duplex; 10Gigabit Ethernet – full duplex only.
Summer 2004 Dr. Paul Chen 321

IEEE 802.3ae Port Types
Device | Range | Media | Optics | PCS | WIS | Application
10GBase-LX4 | 300 m / 10 km | MMF/SMF | 1310 nm WWDM | 8B/10B | No | Enterprise
10GBase-SR | 33 m / 300 m | 62.5 µm / 50 µm MMF | 850 nm | 64B/66B | No | Data center
10GBase-LR | 10 km | SMF | 1310 nm | 64B/66B | No | Enterprise/Metro
10GBase-ER | 40 km | SMF | 1550 nm | 64B/66B | No | Metro
10GBase-SW | 33 m / 300 m | 62.5 µm / 50 µm MMF | 850 nm | 64B/66B | Yes | Metro/WAN
10GBase-LW | 10 km | SMF | 1310 nm | 64B/66B | Yes | Metro/WAN
10GBase-EW | 40 km | SMF | 1550 nm | 64B/66B | Yes | WAN
10GBase-CX4 | 15 m | Coaxial | - | 8B/10B | No | Data center
10GBase-T | 100 m | Twisted pair | - | 8B/10B | No | Enterprise
Summer 2004 Dr. Paul Chen 322

Emerging 10GE Applications
High-speed internet access that supports multimedia and QoS.
Corporate LAN interconnect for distributed communication, remote services, and home office access.
Back-end server connections to minimize congestion and delay.
Inter- and intra-POP connections to enhance reliability and scalability.
Real-time streaming that supports video and VoIP, etc.
Summer 2004 Dr. Paul Chen 323

Emerging 10GE Applications (continued)
Telecommuting that supports office LAN, metro and regional Ethernet connectivity.
High-speed data transport for large data transfers over the network.
Summer 2004 Dr. Paul Chen 324
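The line rates in the LAN PHY vs. WAN PHY comparison follow directly from the 10 Gb/s MAC rate and the coding overhead of each PHY. A minimal illustrative sketch; the 9.953 Gb/s WAN figure is simply the OC-192 line rate quoted in the slide, not derived here:

MAC_RATE = 10.0e9                    # bits per second delivered by the MAC

# LAN PHY (10GBASE-R): 64B/66B coding adds 2 bits per 64 data bits
lan_serial = MAC_RATE * 66 / 64      # = 10.3125 Gb/s (quoted as 10.3 Gb/s)

# LAN PHY over WWDM (10GBASE-X): 8B/10B coding, split across 4 lanes
wwdm_lane = (MAC_RATE / 4) * 10 / 8  # = 3.125 Gb/s per lane (4 x 3.125 Gb/s)

# WAN PHY (10GBASE-W): the WIS maps the coded stream into a SONET STS-192c
# frame, so the line rate is the OC-192 rate and the usable payload is
# correspondingly below 10 Gb/s.
wan_line = 9.953e9

print(f"LAN serial line rate: {lan_serial / 1e9:.4f} Gb/s")
print(f"WWDM lane rate:       {wwdm_lane / 1e9:.3f} Gb/s")
print(f"WAN line rate:        {wan_line / 1e9:.3f} Gb/s")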
10 GbE for Expanded LAN
10G Ethernet can be used in service provider data centers and in the enterprise LAN environment. It can be used for:
- Switch to switch
- Switch to server
- Data centers
- Between two buildings in a campus
Summer 2004 Dr. Paul Chen 325

10 GbE for MAN
[Figure: a 10GbE metropolitan area network interconnecting Locations A, B, and C over 10GbE links.]
Summer 2004 Dr. Paul Chen 326

10 GbE for MAN
Gigabit Ethernet is already deployed as a MAN. With 10 Gigabit Ethernet interfaces, optical transceivers, and single-mode fiber, service providers can offer links reaching 40 km or more. This can serve the range required for a MAN.
Summer 2004 Dr. Paul Chen 327

10 GbE for SAN
10G Ethernet can be used for the following applications in a SAN (Storage Area Network) environment:
- Database servers
- Technical and scientific computing
- High-resolution video transport
- Local and remote data mirroring
- Centralized backup
Summer 2004 Dr. Paul Chen 328

10 GbE for WAN
[Figure: 10GbE links from service provider points of presence through carrier central offices into a core DWDM optical network.]
10 Gigabit Ethernet is compatible with the installed base of SONET OC-192.
Summer 2004 Dr. Paul Chen 329

The Future of 10GbE
An Ethernet-optimized infrastructure build-out has already started. The metro areas are the current focus of network development to deliver optical Ethernet services to business customers. Service providers like Telseon, Cox Communications, BT, and Qwest already deploy Gigabit Ethernet services.
Summer 2004 Dr. Paul Chen 330

The Future of 10GbE
10GbE is on the roadmap of most, if not all, switch, router and metro optical system vendors to enable low-cost metro-based campus interconnection over dark fiber, and to provide end-to-end optical networks with common management systems. The IEEE 802.3ae 10GbE standard was ratified on June 12, 2002.
Summer 2004 Dr. Paul Chen 331

RPR-Enabling Technology
- Any logical topology
- Sub-50-millisecond protection
- Bandwidth management (QoS)
- Delay guarantees
- Loss guarantees
- Unicast, multicast and broadcast
- OAMP support
RPR allows true convergence of services in metro networks.
Summer 2004 Dr. Paul Chen 332

RPR Application
[Figure: end-to-end IP service delivered over 802.3 metro Ethernet access, RPR regional metro rings, and an IP/MPLS core built on RPR.]
Summer 2004 Dr. Paul Chen 333

MPLS, Ethernet and RPR
MPLS – end-to-end path set-up, service creation, adaptation layer, traffic engineering.
Ethernet – universal service interface; no service guarantees or bandwidth management.
RPR – metro network, convergence of services, service-enabling mechanisms.
Summer 2004 Dr. Paul Chen 334

RPR: Better Than Both Worlds
- Fair access to ring bandwidth: RPR
- High BW efficiency on dual-ring topology: RPR
- Full FCAPS with LAN-like economics: RPR
- Controlled latency and jitter: SONET and RPR
- 50-millisecond ring protection: SONET and RPR
- Optimized for data: Ethernet and RPR
- Cost-effective for data: Ethernet and RPR
Summer 2004 Dr. Paul Chen 335

RPR Value Proposition
- A Layer 2 technology designed for metro transport
- Shared ring technology with spatial reuse (a small ringlet-selection sketch follows this group of slides)
- Offers carrier-class ring protection and resiliency for packet-switched networks
- Multiple services over one layer with QoS
- Reduced cost of operations
- Improved service velocity
Summer 2004 Dr. Paul Chen 336

Standards Bodies
- IETF – IPORPR (IP over RPR)
- IEEE – 802.17
- ITU and ANSI – SONET, SDH, GFP, OCh
Standards impact both the data plane and the management plane.
Summer 2004 Dr. Paul Chen 337
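To make the spatial-reuse idea in the RPR value proposition concrete: a source station picks the ringlet with fewer hops to the destination, and the spans it does not use remain free for other stations. A minimal illustrative sketch with a made-up 8-node ring; node numbering and ringlet names are assumptions of the example, not terminology from these notes:

def pick_ringlet(num_nodes, src, dst):
    """Return (ringlet, hops) for the shorter direction around a dual ring."""
    cw_hops = (dst - src) % num_nodes    # hops on the clockwise ringlet
    ccw_hops = (src - dst) % num_nodes   # hops on the counter-clockwise ringlet
    if cw_hops <= ccw_hops:
        return "ringlet-0 (clockwise)", cw_hops
    return "ringlet-1 (counter-clockwise)", ccw_hops

# Node 1 -> node 3 uses only 2 spans of ringlet-0; the remaining spans of
# both ringlets stay available to other flows (spatial reuse).
print(pick_ringlet(8, 1, 3))   # -> ('ringlet-0 (clockwise)', 2)
print(pick_ringlet(8, 6, 1))   # -> ('ringlet-0 (clockwise)', 3)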
IETF
Operation of IP over 802.17:
- Representing the link layer in the routing database and multicast operation
- Link metrics
- Interaction of Layer 3 and Layer 2 resiliency
MPLS adaptation over 802.17
Management Information Base (MIB)
Summer 2004 Dr. Paul Chen 338

ITU & ANSI
SONET & SDH
Generic Framing Procedure
Summer 2004 Dr. Paul Chen 339

IEEE
- RPR MAC definition
- Transit path & fairness management
- Topology discovery
- Protection switching
- Adaptation to different physical layers
- Conformance to 802.1
- Layer management
Summer 2004 Dr. Paul Chen 340

Introducing RPR
Resilient Packet Ring (RPR) is a new Layer 2 technology:
- Optimized for MAN and WAN
- The RPR MAC is PHY-layer agnostic: it leverages the Ethernet or SONET physical layer
- Full FCAPS management
- Increases bandwidth efficiency through statistical multiplexing
Standard under development in IEEE 802.17.
Summer 2004 Dr. Paul Chen 341

RPR MAC: Key Features
- Ring protection and fast restoration (< 50 ms)
- Support for multiple classes of service
- Controlled dynamic BW sharing on the ring: no wasted BW due to pre-allocation; spatial reuse between nodes
- Controlled latency and jitter
Summer 2004 Dr. Paul Chen 342

RPR Frame Format
[Figure: RPR frame layout – RPR header (2 bytes: frame type, CoS, TTL, Ring-ID, in/out-profile indicator), Destination address (6 bytes), Source address (6 bytes), Type (2 bytes), HEC (2 bytes), Payload, FCS (4 bytes).]
Compliance with 802.1. Support for transparent bridging with broadcast of unknown unicast addresses.
Summer 2004 Dr. Paul Chen 343

Fairness
Support for 4 traffic types:
- Provisioned – no BW management
- High priority – no BW management
- Medium priority – guaranteed transmit
- Best effort – BW negotiated (equal/weighted)
Bandwidth management:
- Requires no provisioning or configuration for each node
- Provides flexible and reliable transport while protecting high-priority service requirements
- Over-subscription of bandwidth is easily supported
- All nodes can dynamically compete for spare bandwidth (a weighted-share sketch follows at the end of this section)
- Works with large topologies and designs with many nodes
Summer 2004 Dr. Paul Chen 344

Protection
- Global – Steering: nodes use the topology map to avoid sending traffic over failed spans.
- Local – Wrap: nodes adjacent to the failure redirect traffic to the alternate ring.
< 50 ms protection switch requirement!
Summer 2004 Dr. Paul Chen 345

Topology
New nodes automatically trigger advertisement. All nodes periodically update their map. A topology map is kept by all nodes, used to:
- Determine the optimal path to any destination
- Identify node capabilities (wrap/steer)
Summer 2004 Dr. Paul Chen 346
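The weighted-share sketch referenced under Fairness: spare ring bandwidth is divided among best-effort stations in proportion to configured weights, in the spirit of the equal/weighted negotiation above. This is only an illustration of the idea; the actual 802.17 fairness algorithm is per-span and feedback-based, and the numbers below are made up:

def weighted_shares(spare_bw_mbps, weights):
    """Split spare bandwidth among stations in proportion to their weights."""
    total = sum(weights.values())
    return {node: spare_bw_mbps * w / total for node, w in weights.items()}

# A 10 Gb/s ring with 4 Gb/s of provisioned/high-priority traffic leaves
# 6 Gb/s (6000 Mb/s) of spare bandwidth for the best-effort stations.
print(weighted_shares(6000.0, {"A": 1, "B": 1, "C": 2}))
# -> {'A': 1500.0, 'B': 1500.0, 'C': 3000.0}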