The Overview of Networking Technology & New Generation Processors
Boxuan Gu, Chi Chau
CS-521
2-5-2004

Part 1: Networking Technology

The lecture consists of two parts
• Network Architecture
• Ethernet technology

Network Architecture-OSI reference model

OSI
The OSI model provides a conceptual framework for communication between computers, but the model itself is not a method of communication. Actual communication is made possible by using communication protocols. In the context of data networking, a protocol is a formal set of rules and conventions that governs how computers exchange information over a network medium. A protocol implements the functions of one or more of the OSI layers.

OSI-Interaction

OSI-Encapsulation

TCP/IP

TCP/IP-IP
The Internet Protocol (IP) is a network-layer (Layer 3) protocol that contains addressing information and some control information that enables packets to be routed. IP has two primary responsibilities:
1. providing connectionless, best-effort delivery of datagrams
2. providing fragmentation and reassembly of datagrams to support data links with different MTU sizes

IP Packet Format

IP address format

IP address…

TCP/IP-TCP
Transmission Control Protocol
• TCP provides reliable transmission of data in an IP environment. TCP corresponds to the transport layer (Layer 4) of the OSI reference model. Among the services TCP provides are stream data transfer, reliability, efficient flow control, full-duplex operation, and multiplexing.
• TCP offers reliability by providing connection-oriented, end-to-end reliable packet delivery through an internetwork.

TCP/IP-UDP
User Datagram Protocol
The User Datagram Protocol (UDP) is a connectionless transport-layer protocol (Layer 4) that belongs to the Internet protocol family. UDP is basically an interface between IP and upper-layer processes. UDP protocol ports distinguish multiple applications running on a single device from one another.

UDP-packet header

IPv6
Disadvantages of IPv4:
1. The 32-bit address space is limited
2. Routing is not efficient
3. Poor support for mobile devices
4. Security needs are growing

IPv6 Packet Header Format
Field                  Size
Version                4 bits
Traffic class          8 bits
Flow label             20 bits
Payload length         16 bits
Next header            8 bits
Hop limit              8 bits
Source address         128 bits
Destination address    128 bits

IPv6
Version Number: The version is a 4-bit field as in IPv4. The field contains the number 6 for IPv6, instead of the number 4 for IPv4.
Traffic Class: The Traffic Class field is an 8-bit field similar to the type of service (ToS) field in IPv4. The Traffic Class field tags the packet with a traffic class that can be used in Differentiated Services. The functionalities are the same in IPv4 and IPv6.

IPv6
Flow Label: The Flow Label field can be used to tag packets of a specific flow to differentiate the packets at the network layer. Hence, the Flow Label field enables identification of a flow and per-flow processing by the routers in the path.
Payload Length: Similar to the Total Length field in IPv4, the Payload Length field indicates the total length of the data portion of the packet.

IPv6
Next Header: Similar to the Protocol field in the IPv4 packet header, the value of the Next Header field in IPv6 determines the type of information following the basic IPv6 header.
Hop Limit: Similar to the Time to Live field in the IPv4 packet header, the value of the Hop Limit field specifies the maximum number of routers (hops) that an IPv6 packet can pass through before the packet is considered invalid.

IPv6
Source Address: The IPv6 source address field is similar to the Source Address field in the IPv4 packet header, except that the field contains a 128-bit source address for IPv6 instead of a 32-bit source address for IPv4.
Destination Address: The IPv6 destination address field is similar to the Destination Address field in the IPv4 packet header, except that the field contains a 128-bit destination address for IPv6 instead of a 32-bit destination address for IPv4.

IPv6-extension header
1. Hop-by-Hop Options header
2. Destination Options header
3. Routing header
4. Fragment header
5. Authentication header and Encapsulating Security Payload header
6. Upper-Layer header

IPv6-Addressing scheme
IPv6 represents the 128-bit address as eight 16-bit hexadecimal fields separated by colons (:), for example:
2031:0000:130F:0000:0000:09C0:876A:130B

IPv6-Addressing scheme
IPv6 addresses consist of a prefix and a local part (as in IPv4).
Example: 3FFE:400:280:0:0:0:0:1/48
Here the first 48 bits are fixed (the prefix) and the other 80 bits are assigned in the local subnet.

IPv6-Addressing scheme
In IPv6, there are 3 types of addresses:
1. Unicast
2. Multicast
3. Anycast (new in IPv6)

IPv6-Addressing scheme-Unicast

IPv6-Addressing scheme-Multicast

IPv6-Addressing scheme-Anycast
Packets sent to an anycast address or list of addresses are delivered to the nearest interface identified by that address. Anycast is a communication between a single sender and a list of addresses.

Part 2: Ethernet

Ethernet

Ethernet MAC Data Frame Format

Ethernet-10 Gigabit Ethernet
10 Gigabit Ethernet is Ethernet. 10 Gigabit Ethernet uses the IEEE 802.3 Ethernet media access control (MAC) protocol, the IEEE 802.3 Ethernet frame format, and the IEEE 802.3 frame size. 10 Gigabit Ethernet is full duplex.

Ethernet-10 Gigabit Ethernet Technology and Standard
The IEEE 802.3ae 10 Gigabit Ethernet Task Force was chartered with developing the 10 Gigabit Ethernet Standard. This group is a subcommittee of the larger 802.3 Ethernet Working Group. In contrast to previous Ethernet standards, 10 Gigabit Ethernet targets three application spaces: LANs, MANs, and WANs.
Gigabit Ethernet is no longer a shared-domain, half-duplex technology. Because there are no packet collisions in a full-duplex link, the link distances are determined by optics and not by the diameter of an Ethernet collision domain.
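To illustrate why a shared, half-duplex design stops scaling with speed, the rough sketch below estimates the largest collision-domain diameter that still lets a sender detect a collision before it finishes a minimum-size frame. It assumes a 512-bit slot time (the classic 10/100 Mb/s Ethernet value) and a signal speed of about 2 × 10^8 m/s in the medium. Both numbers and the function name `max_diameter_m` are illustrative assumptions, not taken from the slides, and the result ignores repeater and transceiver delays, so real limits are lower.

```python
# Rough upper bound on the diameter of a half-duplex collision domain.
# Assumptions (not from the slides): 512-bit slot time, ~2e8 m/s signal speed.
SLOT_BITS = 512               # assumed slot time, in bit times
PROPAGATION_M_PER_S = 2e8     # assumed signal speed in copper/fiber

def max_diameter_m(bit_rate_bps, slot_bits=SLOT_BITS):
    """Distance a signal can cover in half a slot time (one-way trip)."""
    slot_time_s = slot_bits / bit_rate_bps
    # A collision must travel out and back within one slot time,
    # so only half of the slot time is available for the one-way distance.
    return PROPAGATION_M_PER_S * slot_time_s / 2

if __name__ == "__main__":
    rates = [("10 Mb/s", 10e6), ("100 Mb/s", 100e6),
             ("1 Gb/s", 1e9), ("10 Gb/s", 10e9)]
    for label, rate in rates:
        print(f"{label:>9}: {max_diameter_m(rate):8.2f} m")
```

Under these assumptions the bound shrinks from about 5 km at 10 Mb/s to about 5 m at 10 Gb/s, which is consistent with the point that full-duplex links are limited by optics instead.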
10 Gigabit Ethernet will also be a full-duplex, switched technology, maintaining compatibility with the 802.3 Ethernet MAC protocol and the Ethernet frame format.

10 Gigabit Ethernet Layer 1: Physical Layer Devices
Contained within the PHY are several sublayers that perform these functions, including the physical coding sublayer (PCS) and the optical transceiver or physical media dependent (PMD) sublayer for fiber media. The PCS is made up of coding (for example, 8b/10b) and serializer or multiplexing functions.

10 Gigabit Ethernet defines two kinds of PHY:
• the LAN PHY
• the WAN PHY

WAN PHY
SONET Friendly
• Enables use of SONET infrastructure for Layer 1 transport: SONET ADMs, DWDM transponders, optical regenerators
Not SONET Compliant
• Connects to SONET access devices but not directly to SONET infrastructure

WAN PHY (cont.)
SONET Friendly: requires some SONET features
• OC-192 link speed
• SONET framing
• Minimal Path/Section/Line overhead processing
Not SONET Compliant: avoids the most costly aspects of SONET
• No TDM support; concatenated OC-192c only
• Does not require meeting SONET grid laser specifications, jitter requirements, or stratum clocking
• Minimal operations, administration, maintenance, and provisioning (OAM&P)

LAN PHY
10 Gigabit Ethernet defines a LAN PHY that, with simple encoding, will transmit Ethernet packets on dark fiber and dark wavelengths. The LAN PHY is intended to support the existing Ethernet applications at ten times the bandwidth with the most cost-effective solution.

Both the LAN and WAN PHY will support each physical medium-dependent (PMD) sublayer and, therefore, support the same distances. These PHYs are distinguished solely by the PCS. The WAN PHY differs from the LAN PHY by the inclusion of a simplified SONET framer.
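As a back-of-the-envelope check on the PHY discussion above, the sketch below computes the nominal OC-192 line rate and the signaling rate implied by a block code's overhead. The STS-1 base rate of 51.84 Mb/s is the standard SONET value. The choice of 64b/66b for a serial LAN PHY and of four lanes for the 8b/10b case are assumptions that go beyond what the slides state, and the helper names `oc_rate_kbps` and `coded_rate` are hypothetical.

```python
from fractions import Fraction

STS1_KBPS = 51_840  # SONET STS-1 / OC-1 base rate in kb/s

def oc_rate_kbps(n):
    """Nominal line rate of an OC-n link in kb/s (n times the STS-1 rate)."""
    return n * STS1_KBPS

def coded_rate(data_rate, data_bits, code_bits, lanes=1):
    """Per-lane signaling rate after a block code that maps
    data_bits of payload onto code_bits on the wire."""
    return Fraction(data_rate) * code_bits / data_bits / lanes

if __name__ == "__main__":
    # OC-192 line rate, the speed the WAN PHY has to match.
    print("OC-192 line rate (kb/s):", oc_rate_kbps(192))
    # 10 Gb/s of data with 8b/10b coding, assumed to be split over 4 lanes.
    print("8b/10b, 4 lanes (Gbaud per lane):",
          float(coded_rate(10, 8, 10, lanes=4)))
    # 10 Gb/s of data with 64b/66b coding on one serial lane (assumed).
    print("64b/66b, serial (Gbaud):", float(coded_rate(10, 64, 66)))
```

The OC-192 figure comes out slightly below 10 Gb/s, which gives one way to see why the WAN PHY needs its own framing step to fit Ethernet traffic onto that link speed.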
10 Gigabit Ethernet Link Distance and Media Goals
• At least 65 meters over multimode fiber
• At least 300 meters over installed multimode fiber
• At least 2 km over single-mode fiber
• At least 10 km over single-mode fiber
• At least 40 km over single-mode fiber

Application of 10GE
10 Gigabit in the LAN

10 Gigabit Ethernet Metropolitan Network

Part 2: AMD & Intel Latest Desktop & Server Processors
AMD
• Desktop: AMD Athlon 64 FX, AMD Athlon 64
• Server: AMD Opteron
Intel
• Desktop: Intel Pentium 4 w/ HT, Intel Pentium 4 Extreme Edition
• Server: Intel Itanium 2, Xeon

Desktop Processor Pricing
Processor                                   Price
AMD Athlon 64 FX-51                         $733
AMD Athlon 64 3400+                         $417
AMD Athlon 64 3200+                         $278
AMD Athlon 64 3000+                         $218
Intel Pentium 4 Extreme Edition 3.4 GHz     $999
Intel Pentium 4 3.4 GHz w/ HT               $424
Intel Pentium 4 3.2 GHz (Prescott) w/ HT    $417

Processor Timeline
Date         Intel                                     AMD
2/2/2004     P4 3.4 GHz, P4 3.2E GHz, P4 EE 3.4 GHz
1/6/2004                                               Athlon 64 3400+
9/24/2003    P4 EE 3.2 GHz                             Athlon 64 FX-51, 3200+
6/23/2003    P4 3.2 GHz
5/13/2003                                              Athlon XP 3200+
4/14/2003    P4 3.0 GHz 800 MHz
2/10/2003                                              Athlon XP 3000+
11/14/2002   P4 HT 3.06 GHz

Traditional Intel roadmap
• Intel historically would move to a smaller process, double the cache, and increase clock speeds
• This held until the first generation of Pentium 4, while AMD was still struggling
• It is not the case for Prescott

Intel Pentium 4 (Prescott)
• Intel launched the Pentium 4 Prescott on February 2nd
• Not a P5, just the 3rd generation of the P4
• Intel president Paul Otellini has discussed 64-bit extensions on Prescott
• With sufficient cooling, Prescott can be overclocked to 5 GHz

P4 Prescott New Changes
• Uses a 90 nm process instead of a 130 nm process
• Doubles the L2 cache to 1 MB
• Expands the L1 data cache to 16 KB to improve the AGUs (address generation units)
• Adds 13 new instructions, aka SSE3
• Extends the pipeline from 20 to 31 stages
• Process and die size drop
• Increases the scheduler queue size
• Adds a dedicated integer multiplier
• Adds a new shifter/rotator logic block in the ALUs

SSE3
After great success with the P4 SSE2 instruction set (144 instructions), SSE3 adds 13 more to make the programmer's life easier:
• fisttp: FP-to-integer conversion
• addsubps, addsubpd, movsldup, movshdup, movddup: complex arithmetic
• lddqu: video encoding
• haddps, hsubps, haddpd, hsubpd: graphics (SIMD FP / AOS)
• monitor, mwait: thread synchronization

31 Pipeline Stages

Hyper-Threading Technology
• Can increase performance by up to 40%
• HT enables multi-threaded software to execute threads in parallel. It splits instructions into multiple streams so that multiple logical processors can work on them.
• The problem is that not much software takes advantage of HT
• HT is big in the graphics arena, e.g. Adobe takes great advantage of HT

Prescott Problems
• The 90 nm process is not yet mature, unlike 130 nm
• The 90 nm process has heat and power problems
• The 3.4E GHz part is being held back; Intel intends to produce it as a limited edition
• SSE3 will be useful down the road, but today's software is not ready for it
• The 31-stage pipeline slows performance on branch mispredictions

Should you get Prescott?
• The real strength of Prescott is its Hyper-Threading performance
• Great for multitasking
• In some applications Prescott beats the Extreme Edition in multitasking

Pentium 4 Extreme Edition
• Intel's top-of-the-line desktop processor
• A "Xeon" processor with a P4 Extreme Edition label
• It is more like an "Emergency Edition" than an "Extreme Edition", a response to AMD 64
• Optional 2 MB L3 cache

Intel Roadmap

AMD 64
• AMD 64 builds a bridge from the 32-bit to the 64-bit world
• Provides performance without parallel
• Simultaneous 32-bit and 64-bit computing
• Larger physical address space: 1 TB, not limited to 4 GB
• Applications can use up to 4 GB instead of 2 GB
• Worry-free on memory
• A lot less swapping to virtual memory
• A single architecture designed to fit all

AMD Athlon 64 & 64 FX
• Athlon 64 is 754-pin
• Athlon 64 FX is 940-pin
New Changes
• 1 MB L2 cache
• Integrated memory controller
• HyperTransport channel
• Lower power requirements

New AMD Core
• Doubles the registers
• Integrated DDR memory controller
• Enlarged translation look-aside buffer (TLB)
• Extends the pipeline from 10 to 12 stages

AMD 64 Processor Architecture

Integrated Memory Controller
• Provides sufficient low-latency memory bandwidth to the processor core
• The integrated memory controller changes the way the processor accesses main memory
• It greatly increases bandwidth and reduces latency, and thus speeds up processing
• Runs the memory controller at processor speed rather than FSB speed
• Boosts performance for many memory-intensive applications
• Available memory bandwidth is up to 6.4 GB/s with Opteron and FX, and 3.2 GB/s with Athlon 64

AMD 64 Core
• Enables simultaneous 32-bit and 64-bit computing
• There are 3 main operating categories in the AMD 64 core:
  1. 32-bit applications under a 32-bit OS
  2. 32-bit applications under a 64-bit OS
  3. 64-bit applications under a 64-bit OS
• Great for migration

HyperTransport
• Increases overall system performance by reducing I/O bottlenecks, increasing system bandwidth, and reducing system latency
• High-speed I/O communication
• Up to 6.4 GB/s bandwidth per link, improving interconnection with system components
• Up to 3 HyperTransport links (only on Opteron)

SSE/SSE2 Registers
• Doubles the number of registers
• Doubles the SSE registers to improve floating-point calculations

Enlarged Look-Aside Buffer (TLB)
• An enlarged TLB reduces the overhead of translating between virtual and physical memory addresses

Pipeline
• Extends the pipeline from 10 to 12 stages to increase clock speeds
• Reworks branch prediction

Problems
• AMD partners with Nvidia, but the nForce 3 chipset is not mature
• nForce 3 has a low-AGP-performance bug with the HyperTransport channel interface
• It turns out that VIA offers a better chipset for AMD 64

AMD 64 FX-51
• An "Opteron" processor with an FX label
• Slight change in DDR400 support (reduced validation)
• The major difference from Athlon 64 is the 128-bit memory controller vs 64-bit
• Works with dual-channel registered memory
• Athlon 64 works with single-channel unbuffered DDR memory

Final Word on FX-51
• The Athlon 64 3400+ brings the death of the FX-51
• According to benchmarks from different areas, the Athlon 64 3400+ comes in very closely behind the FX-51
• But its price is roughly half that of the FX-51
• Or you can wait until the FX-53 comes out

Watch Out!
• AMD is talking about a new Socket 939 around late this year

AMD Roadmap 754

AMD Roadmap 940

AMD Roadmap 939

Benchmarks - OpenGL

Benchmarks - Business App

Result Summary
• AMD is good when it comes to business, gaming, and 2D work with respect to price/performance ratio
• Intel offers the best encoding and 3D performance, as well as multitasking

Conclusion
• It is very hard to compare the new processors
• AMD 64 lacks true 64-bit applications
• Intel Prescott lacks SSE3-enhanced applications and up-to-date video drivers and DirectX
• Hardware opens the door to the future, but until software catches up we will not be able to truly experience the improvements

Sources
• Intel Corp - www.intel.com
• AMD Corp - www.amd.com
• Tom's Hardware - www.tomshardware.com
• AnandTech - www.anandtech.com
• ExtremeTech - www.extremetech.com
• Tech Report - www.techreport.com
• X-bit labs - www.xbitlabs.com
• Opteronics - www.opteronics.com