10 Gigabit Ethernet, LAN, MAN, and WAN
Dr. Paul Chen
paulpchen2k@yahoo.com
„ NTU video students can call in with questions or for live
discussion at
214-768-3068.
Summer 2004
Dr. Paul Chen
2
Points of Contact
„ Electronic distribution and collection of homework, examination,
term project report, grades:
Mr. Gary McCleskey in 329B Caruth Hall, 214-768-3108,
garym@engr.smu.edu
„ TA:
Please Turn Off Your Cell Phone and Pager During the Lecture!
Reference Books
„ “Gigabit Ethernet”, by Jayant Kadambi, Ian Crayford
and Mohan Kalkunte, Prentice Hall
„ “Ethernet-based Metro Area Networks”, Daniel Minoli,
Peter Johnson, Emma Minoli, McGraw-Hill
„ “Switched, Fast and Gigabit Ethernet”, by Robert Breyer and Sean
Riley, Ziff-Davis Press
More Reference Books
„ Ethernet: The Definitive Guide, by Charles Spurgeon, Chuck
Toporek, published by O’Reilly & Associates
„ Network Troubleshooting, by Othmar Kyas, published by
Agilent Technologies, Jan. 2002
„ ATM Theory and Applications, David McDysan, Darren
Spohn, McGraw-Hill
„ Introduction to DWDM Technology, Stamatios V.
Kartalopoulos, IEEE Press
Course Grades
„ Homework – 30%
To help you understand the subject material, it is your
responsibility to do the homework assignments.
„ Examinations – 40%
To verify how much you really understand.
„ Research Paper – 30%
Additional opportunity to show and share what you learned from
the course.
10Gigabit Ethernet Calendar - Summer, 2004
„ May 29: First class, introduction
„ June 12: No live class; a videotape will be sent
„ June 22: Homework #1 due
„ June 26: Homework #1 discussion
„ July 3: Holiday (no class)
„ July 10: Midterm exam
„ July 17: Homework #2 due and discussion
„ July 24: Research paper due
„ Aug 7: Last day; research paper presentations
Introduce yourself – background, major, work experiences, etc.
Research Paper Subjects
„ 10G Ethernet based Metropolitan Area Network
„ 10G Ethernet vs. SONET vs. RPR
„ 10G Ethernet based Wide Area Network
„ Solution to a real-life problem in your work or
environment
Research Paper Rules
„ No more than 10 pages including the cover page,
figures, and table of contents.
„ Clearly state the purpose or goal of the research
paper.
Reference Web Sites
„ www.10gigabit-ethernet.com
„ www.10GEA.org
„ www.metroethernetforum.org
„ www.rpralliance.org
„ 10GEA site will also point to Gigabit Ethernet
Forum
Course Introduction
„ This course provides technical details of several generations
of Ethernet technology and their practical applications in the
real world.
„ With the increasing number of Internet users and the growth of e-commerce, Ethernet has evolved to higher speeds (10
M/100 M/Gigabit/10 Gigabit) and higher performance /
throughput (from shared to switched).
„ Applications of Ethernet in LAN, MAN and even WAN are
real-life examples of how a technology can be widely deployed
due to its simplicity and the compatibility between new and existing
generations of Ethernet.
Course Introduction (continued)
„ Different technologies are mixed and matched to
deliver end-to-end services. In other words,
Ethernet, ATM, SONET and DWDM are major
components of an integrated higher-speed network
that provides efficient data services around the
world.
„ We will cover subjects such as the background
of Ethernet and how it evolved to
become the dominant technology of choice for
computer networks.
Use of LAN and MAN
„ Network for companies to provide resource sharing, high
reliability, money saving, convenience, portability (wireless and
wire line), etc.
„ Network for people to deliver e-mail, video conference, remote
access, news group, work group, etc.
„ With the introduction of voice over IP (packetized voice), the
issues of Quality of Service (QoS), delay-sensitive vs. non-delay-sensitive services, packet loss, and service protection
through failure recovery schemes need to be addressed.
„ Competition among leased lines, Ethernet based optical
interface, ATM, and frame relay, etc.
LAN
„ LAN is a privately owned network within a single
building or campus of up to a few kilometers.
„ Major LAN technologies include:
- Ethernet or CSMA/CD which operates at 10 M bps
- Fast Ethernet that operates at 100 M bps
- Gigabit Ethernet that operates at 1000 M bps
- Token ring networks which operate at 4 and 16 M
bps
- Other variations such as token bus network which
is no longer in use
MAN
„ A MAN covers a group of corporate offices or a city and
may be publicly owned. Well-known MAN technologies include:
- DQDB (Distributed Queue Dual Bus) by IEEE
- SMDS (Switched Multi-megabit Data Service) by
Bellcore (Telcordia). SMDS is based on IEEE DQDB
technology
Neither DQDB nor SMDS is in use as a MAN service today, although DQDB is
used by equipment vendors for backplane interfaces.
Gigabit and 10G Ethernet are targeting this market.
WAN
„ A WAN covers a large geographic area (a country or a
continent). A WAN usually operates in point-to-point, store-and-forward, or packet-switched mode.
„ Examples of WAN services include X.25 and ISDN.
„ X.25 service is popular in Europe and Japan but failed to
catch on in the United States.
„ ISDN services did not live up to expectations.
„ 10G Ethernet in combination with SONET and RPR is
targeting this market.
„ Wireless WAN is out of scope for this course.
Relationship of OSI and IEEE Reference Model
[Figure: the OSI 7-layer reference model (Application, Presentation, Session, Transport, Network, Data Link, Physical) shown beside the IEEE 802.3 CSMA/CD model (Higher Layers, Logical Link Control (LLC), Media Access Control (MAC), Physical Signaling (PLS), Attachment Unit Interface (AUI), Medium Attachment Unit (MAU) containing the Physical Medium Attachment (PMA), Medium). Two variants are drawn: a DTE with exposed AUI and a DTE with embedded AUI.]
Functions of Each CSMA/CD Subsystem
„ PLS and AUI subsystems support the signaling
between MAC and MAU.
„ MAU (including PMA) is responsible for the
physical and electrical interface to/from the
medium.
„ PLS is implemented locally to the MAC in silicon.
Functions of Each CSMA/CD Subsystem (continued)
„ AUI defines an interface to allow a special cable
and connector assembly to connect PLS to MAU.
This allows MAC/PLS to be located remotely from
the MAU and the medium.
„ LLC is implemented in software.
„ The host bus interface, MAC, and PLS are implemented
in a single chip.
Typical Ethernet Adapter Implementation
[Figure: block diagram of a typical Ethernet adapter. Blocks shown: CPU I/O bus, LAN controller (MAC+PLS), buffer memory, IEEE address EPROM, isolation transformers, transceivers, AUI connector (15-pin D-type), Cheapernet BNC connector, and UTP RJ45 connector.]
Ethernet controller implements the functions of MAC and PLS
(e.g. Manchester Encoder / Decoder for 10 M Ethernet).
Note the location of the isolation transformer for UTP vs.
D-type and BNC connectors.
IEEE 802.6 MAN Reference Model
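[Figure: IEEE 802.6 MAN reference model. The DQDB layer sits above the physical layer, which attaches to Bus A and Bus B. The DQDB layer offers three services: MAC service to LLC, connection-oriented data service, and isochronous services.]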
DQDB offers a high speed service (DS-1 and DS-3)
Origin and Evolution of Ethernet
„ ALOHA System
„ Xerox 3M Ethernet (R. Metcalfe and D. Boggs)
„ 10 M Ethernet
„ 100 M Fast Ethernet
„ 1000 M (Gigabit) Ethernet
„ 10 G Ethernet
ALOHA System
„ ALOHA used ground-based radio broadcasting. The basic
idea applies to any system in which
uncoordinated users compete for the use of a single
shared channel.
„ Two versions of ALOHA system: pure and slotted
„ In slotted ALOHA, time is divided into discrete slots
into which frames must fit.
„ Pure ALOHA does not require global time
synchronization while slotted ALOHA does.
ALOHA System (continued)
„ Pure ALOHA is similar to Ethernet in user access
and collision resolution.
„ The max throughput for pure ALOHA is about
0.184. In other words, the best channel utilization
is 18%.
„ Slotted ALOHA divides time into discrete intervals,
each interval corresponding to one frame. One way
to achieve synchronization is to have one station
emit a pip at the start of each interval, like a clock.
„ With slotted ALOHA, the max throughput is
increased to 37%.
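The 18% and 37% figures can be reproduced with the classical throughput model, which is an assumption here since the slides do not state it: with Poisson traffic and offered load G frames per frame time, pure ALOHA carries S = G * e^(-2G) and slotted ALOHA carries S = G * e^(-G). A minimal sketch (function names are illustrative):

```python
import math


def pure_aloha_throughput(g: float) -> float:
    """Throughput S = G * e^(-2G) for pure ALOHA (Poisson traffic assumed)."""
    return g * math.exp(-2.0 * g)


def slotted_aloha_throughput(g: float) -> float:
    """Throughput S = G * e^(-G) for slotted ALOHA (Poisson traffic assumed)."""
    return g * math.exp(-g)


# Pure ALOHA peaks at G = 0.5, slotted ALOHA at G = 1.0.
print(round(pure_aloha_throughput(0.5), 3))     # 0.184
print(round(slotted_aloha_throughput(1.0), 3))  # 0.368
```

Under this model the slotted version doubles the peak because a frame is only vulnerable to collisions during one slot instead of two frame times.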
ALOHA System vs. CSMA/CD
„ CSMA/CD introduces two major improvements over
ALOHA:
- CSMA/CD ensures that no station begins to
transmit when it senses the channel busy
- Stations abort their transmission as soon as
they detect a collision.
The throughput of CSMA/CD is equal to or better than that of the
slotted ALOHA system.
First Generation Ethernet
„ The first Ethernet is credited to Robert Metcalfe
and David Boggs at Xerox PARC in 1973. A total
of 5,000 computers were connected via 3 Mbps
controllers. This experience base was key to the
industry acceptance of 10 Mb/s Ethernet when it was
developed.
„ The initial Ethernet standard was developed by
Digital, Intel, and Xerox (DIX) consortium in 1979.
The Ethernet “Blue Book” was published in 1980.
First Generation Ethernet (continued)
„ The Ethernet standard was submitted to IEEE Project
802 under 802.3 CSMA/CD (Ethernet) committee.
„ Other IEEE LAN standards committees are 802.4
Token Bus and 802.5 Token Ring in addition to 802.11
wireless LAN which is outside the scope of our course.
Ethernet Rules The World
„ 802.3 committee developed a series of specifications
for 10 Mb/s Ethernet to support different kinds of media:
thick and thin coaxial cable, unshielded twisted pair,
and fiber optic cable.
„ Token Ring fell behind CSMA/CD due to relatively high
license cost and late arrival on the market. Today, only a few
sites use Token Ring (mostly in the IBM camp).
„ Token Bus was targeting the manufacturing automation
market but failed to materialize due to high cost and
compatibility issues between old and new
specifications.
Turning Point for 10 M Ethernet
„ Adoption of 10 M Ethernet over UTP, 10BASE-T,
caused a massive surge of Ethernet installations due
to UTP’s low cost and easy cable installation.
„ Rapid increase in bandwidth demand and low-cost silicon
implementation of complex systems led to two trends
in the early 1990s:
- Migration from shared Ethernet to switched Ethernet
topology
- Development and deployment of 100 M Fast Ethernet,
100BASE-T
100 M Fast Ethernet
„ In 1982, proposals were made in the IEEE 802 committee
for a 100 M interconnect standard. Most IEEE LAN
committees were busy with existing standards
work, so FDDI in ANSI took the initiative to work on a 100 M
network for backbone applications.
„ The standard for 100 M Ethernet was introduced in 1995.
„ 10 M and 100 M Ethernet use the same frame format.
„ With auto-negotiation to detect and select the proper
speed, 100 M capable network adapters can be
deployed in a vast 10 M Ethernet installed base.
100 M Fast Ethernet (continued)
„ Subjects to be discussed on Fast Ethernet include:
- Media type
- Full / Half duplex (FDX/HDX) and flow control
- VLAN tagging
- 10/100 M b/s (auto-negotiation) capable devices
1000 Mb/s Gigabit Ethernet
„ Standards work in the IEEE 802.3 committee started in late
1995, and the standard was approved in June 1998.
„ Gigabit Ethernet targets backbone networks
and emerging bandwidth-intensive applications.
„ Major differences between 100 M and 1000 M Ethernet
other than speed:
- Gigabit Media Independent Interface (GMII)
- Adoption of Fibre Channel encoding
- Modified CSMA/CD operation and preference for FDX
- Modification of auto-negotiation for fiber
Major Differences between Gigabit and Fast Ethernet
„ The GMII transmit and receive data paths were widened to 8
bits (from MII’s 4-bit path) to keep the clock and
data path transition frequencies reasonable.
„ The Fibre Channel encoding scheme was adopted.
Manchester encoding was used for 10 M, Non-Return-to-Zero (NRZ) was used on 100 M, and 8B/10B
encoding is used for Gigabit Ethernet.
Major Differences between Gigabit and Fast Ethernet (continued)
„ To meet the round-trip delay constraint, the slot
time was changed from 512 bits (10 M and 100
M) to 512 bytes.
„ The Fibre Channel signaling scheme was adopted to
allow exchange of FDX/HDX information prior to
data transfer.
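To see why the slot was lengthened, it helps to convert slot sizes into time. The sketch below is only arithmetic on the numbers given above; the helper name is illustrative. Bits divided by a rate in Mbps gives microseconds.

```python
def slot_time_us(slot_bits: int, rate_mbps: float) -> float:
    """Duration of one slot in microseconds (bits / Mbps = usec)."""
    return slot_bits / rate_mbps


print(slot_time_us(512, 10))        # 51.2  (10 M Ethernet)
print(slot_time_us(512, 100))       # 5.12  (100 M Fast Ethernet)
print(slot_time_us(512 * 8, 1000))  # 4.096 (Gigabit Ethernet, 512-byte slot)
```

Had Gigabit Ethernet kept the 512-bit slot, the slot would last only 0.512 usec, leaving almost no round-trip budget for cable and repeaters.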
10 G Ethernet - Next Generation
„ 10 Gigabit Ethernet targets three areas:
- Service provider data centers and enterprise LANs
where high bandwidth is demanded
- Metropolitan Area Networks (MAN) and Storage
Area Networks (SAN)
- Wide Area Networks (WAN), interoperating with
SONET and DWDM backbone networks
Repeater Definition
„ A device that allows extension of the physical
network topology beyond the normal range imposed
using a single cable segment in terms of distance
and node count.
„ An Ethernet repeater can only interconnect
Ethernet segments of identical speed. To connect
dissimilar-speed networks, a bridge, switch or
router is required.
„ Data received on one port is repeated to all ports
except the active receiver, with signal amplitude
and timing restored on the re-transmitted (repeated)
waveforms.
Repeater Definition (continued)
„ If the repeater detects receive activity from two or
more ports, it constitutes a collision. The repeater
will send a jam pattern on all ports, including the
active receive port.
„ Repeaters introduce delay, which must be factored
into the round-trip delay. The variability of the delay
through the repeater for back-to-back packets
causes “inter-packet gap shrinkage”.
IPG Shrinkage Example
[Figure: IPG shrinkage example. Packet 1 and Packet 2 enter a repeater set (a repeater unit with its MAUs) separated by a 96 bit-time IPG. The delay through the repeater set is 15 bit times for Packet 1 and 10 bit times for Packet 2, so the packets leave separated by a 91 bit-time IPG.]
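The shrinkage in this example is just the difference between the two delays: the first packet is held 5 bit times longer than the second, so the gap between them closes by 5 bit times. A minimal sketch (the function name is illustrative):

```python
def ipg_after_repeater(ipg_in_bits: int,
                       delay_first_bits: int,
                       delay_second_bits: int) -> int:
    """Inter-packet gap, in bit times, after a repeater set.

    The gap shrinks by however much longer the first packet is delayed
    than the second one.
    """
    return ipg_in_bits - (delay_first_bits - delay_second_bits)


print(ipg_after_repeater(96, 15, 10))  # 91
```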
Bridge Definition
„ A bridge operates at the MAC sub-layer, while a
repeater operates at the PHY layer.
„ A bridge may connect identical MAC technologies
(Ethernet to Ethernet) or dissimilar ones (Ethernet to
Token Ring).
„ A bridge uses the source and destination address
information to make intelligent forwarding decisions.
Bridge Definition (continued)
„ A bridge performs Filtering, Learning, and Forwarding
functions.
„ Spanning Tree Algorithm permits bridges to
dynamically discover a bridged topology and configure
the network to ensure connectivity without looping.
Bridge Definition (continued)
„ A bridge floods a frame to all of its other ports if the destination is
unknown or if it is a broadcast frame. The
flooded frame may reach another bridge,
which may also replicate it. If these bridges have
more than one interconnection (a loop), the
replication can eventually consume all available
bandwidth of a LAN.
„ Bridged networks do not carry any hop count
information, whereas routers keep track of the hop count:
a router deletes any packet once its time-to-live
counter expires.
Bridge Definition (continued)
„ Bridges use normal frames (with special content) to
exchange Spanning tree configuration information.
These frames are defined as Bridge Protocol Data Unit
(BPDU).
„ Bridges connect 802.3, 802.4, and 802.5 LANs.
„ Many reasons force a single organization to have
multiple LANs.
Bridge Definition (continued)
„ Many university and corporate departments have
their own LANs to connect their own PCs,
workstations and servers.
„ The goals of each department differ, so
departments may choose different LANs.
„ Departments still need to interact with each other,
so bridges are needed.
Bridge Definition (continued)
„ The corporate organization may be
geographically spread over several buildings
separated by considerable distances.
„ It may be more economical to have separate LANs
in each building and connect them with bridges
and links rather than to run a single cable over
the entire site or campus.
„ It may be necessary to split a logically single LAN
into multiple LANs to accommodate the load.
Universities are typical examples.
Bridges (continued)
[Figure: LAN1 through LAN4, each with workstations, connected through bridges (B) to a backbone LAN.]
Multiple LANs connected by a backbone via bridges to handle a total
load higher than the capacity of a single LAN.
Bridges (continued)
„ The physical distance between the two most distant PCs may be
too far (> 2.5 km for 802.3) and violate the round-trip
delay requirement. Using bridges to connect separate
LANs covers the physical distance needed.
„ Placing bridges at critical locations (like fire doors in a
building) can prevent a single malfunctioning PC or
node from bringing down the entire system, which improves
reliability.
„ Bridges can contribute to an organization’s security.
Bridges (continued)
„ Most LAN NICs have a promiscuous mode, in
which all frames are given to the host computer,
not just those addressed to it.
„ By putting bridges at various locations and being
careful not to forward sensitive traffic, we can
isolate part of the networks so its traffic cannot
escape and fall into the wrong hands.
Bridges (continued)
[Figure: Host A and Host B on different LANs, a CSMA/CD LAN and a Token Ring LAN, connected by a bridge.]
Bridges convert the frame format from that of 802.3 to 802.5.
Transparent Bridges
„ A transparent bridge operates in promiscuous mode,
accepting every frame transmitted on all the LANs to
which it is attached.
„ When a frame arrives, a bridge must decide whether
to discard or forward it, and on which LAN to forward it.
The decision is made by looking up the destination
address in a big hash table inside the bridge.
„ The table lists each possible destination and tells
which output LAN it belongs to.
Transparent Bridges (continued)
[Figure: LAN 1 through LAN 4 interconnected by bridges B1 and B2.]
Four LANs are connected by two bridges.
Transparent Bridges (continued)
„ The algorithm used by transparent bridges is
backward learning. By inspecting the source address,
bridges can tell which PC is accessible on which
LAN.
„ The network topology can change as PCs and bridges
are powered up / down and moved around. To handle
dynamic topologies, whenever a hash table entry is
made, the arrival time of the frame is noted in the
entry. When a frame whose source is already in the table
arrives, the entry is updated with the current time.
Transparent Bridges (continued)
„ Periodically, a bridge scans its hash table and purges all
entries more than a few minutes old (non-active).
„ If a PC is quiet for a few minutes, any traffic
sent to that PC will be flooded until the PC next sends a
frame.
„ The procedure used by the bridge is as follows:
- If the Destination and Source LANs are the same,
discard the frame.
- If the Destination and Source LANs are different, forward
the frame.
- If the Destination LAN is unknown, use flooding.
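The learning, aging, and forwarding rules above can be sketched as a small class. This is a toy model, not any vendor's implementation: the class name, the port numbering, and the 300-second aging limit (standing in for "a few minutes") are all assumptions made for illustration.

```python
class LearningBridge:
    """Toy transparent bridge: backward learning, aging, and forwarding."""

    def __init__(self, ports, max_age_s=300.0):
        self.ports = set(ports)
        self.max_age_s = max_age_s   # assumed value for "a few minutes"
        self.table = {}              # MAC address -> (port, last_seen_time)

    def purge(self, now):
        """Drop entries that have been inactive for longer than max_age_s."""
        self.table = {
            mac: (port, seen)
            for mac, (port, seen) in self.table.items()
            if now - seen <= self.max_age_s
        }

    def receive(self, src, dst, in_port, now):
        """Handle one frame and return the list of ports it is sent out on."""
        self.purge(now)
        # Backward learning: the source is reachable via the arrival port.
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        if entry is None:
            # Unknown destination: flood to every port except the arrival port.
            return sorted(self.ports - {in_port})
        out_port = entry[0]
        if out_port == in_port:
            # Destination is on the same LAN as the source: discard.
            return []
        return [out_port]


bridge = LearningBridge(ports=[1, 2, 3])
print(bridge.receive("A", "B", in_port=1, now=0.0))    # [2, 3] (B unknown, flood)
print(bridge.receive("B", "A", in_port=2, now=1.0))    # [1]    (A was learned)
print(bridge.receive("C", "A", in_port=1, now=2.0))    # []     (same LAN, discard)
print(bridge.receive("A", "B", in_port=1, now=400.0))  # [2, 3] (B aged out, flood)
```

The last call shows the quiet-station case: B has not sent anything for more than the aging limit, so traffic addressed to it is flooded again until B next transmits.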
Two Parallel Transparent Bridges
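[Figure: two parallel transparent bridges, B1 and B2, both connecting LAN 1 and LAN 2. An initial frame F with an unknown destination address is sent on LAN 1. B1 copies it onto LAN 2 as F1, and B2 copies it onto LAN 2 as F2.]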
Looping Caused by Two Parallel Transparent Bridges
„ Later, Bridge 1 sees F2 (with unknown
destination) and generates F3 (not shown in the
Figure).
„ Bridge 2 sees F1 (with unknown destination) and
generates F4 (not shown in the Figure).
„ Now, Bridge 1 forwards F4 and Bridge 2 forwards
F3 to LAN 1.
„ This cycle goes on forever and looping occurs.
Spanning Tree Bridges
„ The solution is for bridges to communicate with each
other and overlay the actual topology with a spanning
tree that reaches every LAN.
„ Some potential connections between LANs are
ignored in the interest of constructing a fictitious loop-free topology.
„ To build a spanning tree, the bridges have to choose
one bridge as the root of the tree.
„ The process begins by having each bridge broadcast
its serial number, which is unique worldwide.
Spanning Tree Bridges (continued)
„ The bridge with the lowest serial number becomes the
root.
„ A tree of shortest paths from the root to every bridge and
LAN is constructed. This is a spanning tree!
„ If a bridge or LAN fails, a new spanning tree is
computed.
„ The result is that a unique path is established from every
LAN to the root, and thus to every other LAN.
„ Even though the tree spans all the LANs, not all the
bridges are necessarily present in the tree; leaving some out is what prevents loops.
„ This algorithm continues to run to auto-detect topology
changes and update the tree.
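The idea of picking the lowest-numbered bridge as root and keeping only shortest paths can be illustrated with a centralized breadth-first search. This is a deliberate simplification and an assumption of this sketch: real bridges run a distributed protocol, and here every link is given equal cost and nodes stand for bridges joined by links.

```python
from collections import deque


def build_spanning_tree(links):
    """Return (root, active_links, blocked_links) for an undirected topology.

    links: list of (node_a, node_b) pairs; node IDs play the role of the
    bridge serial numbers, so the lowest ID becomes the root.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    root = min(neighbors)            # lowest serial number wins
    parent = {root: None}
    queue = deque([root])
    while queue:                     # breadth-first search = shortest hop paths
        node = queue.popleft()
        for nxt in sorted(neighbors[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    active = {frozenset((child, par))
              for child, par in parent.items() if par is not None}
    blocked = {frozenset(link) for link in links} - active
    return root, active, blocked


# Four bridges in a ring: one link must be left out to break the loop.
root, active, blocked = build_spanning_tree([(1, 2), (2, 3), (3, 4), (4, 1)])
print(root)                           # 1
print(sorted(map(sorted, active)))    # [[1, 2], [1, 4], [2, 3]]
print(sorted(map(sorted, blocked)))   # [[3, 4]]
```

Rerunning the function after removing a failed link models the recomputation step described above.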
Spanning Tree Example
[Figure: two panels. One shows a subnet with each node or bridge identified by a letter (A through O). The other shows a spanning tree of the same subnet with node I as the root.]
Spanning Tree
„ The Spanning Tree Algorithm and Protocol
configure a simply connected active topology
from the arbitrarily connected components of a
Bridged LAN. Frames are forwarded through
some of the Bridge Ports in the Bridged LAN and
not through others, which are held in a Blocking
State.
„ Bridges effectively connect just the LANs to which
Ports in a Forwarding State are attached.
Spanning Tree (continued)
„ The Bridge with the highest priority Bridge Identifier (the numerically lowest value) is
the Root. Every bridge port in a bridged LAN has a
Root Path Cost associated with it. This is the sum of the
path costs of the bridge ports that receive frames
forwarded from the root on the least-cost path to the
bridge.
„ The Designated Port for each LAN is the bridge port
for which the value of the Root Path Cost (RPC) is
the lowest.
„ If two or more ports have the same RPC value,
first the Bridge ID and then the Port ID are used as tiebreakers.
Spanning Tree (continued)
„ Each port on a bridge is associated with a Port ID and
Path cost.
[Figure: a bridge with Port 1 and Port 2 attached to LAN A and LAN B.]
Spanning Tree (continued)
„ Bridges send a type of Bridge Protocol Data Unit known
as a Configuration BPDU to each other in order to
communicate and compute the above information (Root,
Root Path Cost, etc.).
„ Bridge ID format
Bridge Priority (2 bytes) | MAC (includes VLAN field) (6 bytes)
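Because the 2-byte priority comes before the 6-byte MAC address, comparing Bridge IDs as 8-byte strings compares priority first and uses the MAC address as a tiebreaker. The sketch below assumes that the numerically lowest ID wins the root election; the bridge names, priorities, and MAC addresses are made up for illustration.

```python
def bridge_id(priority: int, mac: str) -> bytes:
    """Build an 8-byte Bridge ID: 2-byte priority followed by 6-byte MAC."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if not 0 <= priority <= 0xFFFF or len(mac_bytes) != 6:
        raise ValueError("priority must fit in 2 bytes and MAC in 6 bytes")
    return priority.to_bytes(2, "big") + mac_bytes


def elect_root(bridges: dict) -> str:
    """Return the name of the bridge with the numerically lowest Bridge ID.

    bridges maps a name to a (priority, mac) tuple.
    """
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical bridges: B3 has the lowest priority value, so it becomes root.
bridges = {
    "B1": (32768, "00:00:0c:aa:aa:aa"),
    "B2": (32768, "00:00:0c:11:11:11"),
    "B3": (4096, "00:00:0c:ff:ff:ff"),
}
print(elect_root(bridges))            # B3
del bridges["B3"]
print(elect_root(bridges))            # B2 (equal priority, lower MAC wins)
```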
Spanning Tree (continued)
„ For each bridge and bridge port, three processes
are required:
- Elect one Root Bridge
- Elect one Root Port for each non-root bridge
- Elect one Designated Port for each LAN based on lowest cost
Spanning Tree (continued)
„ Path Cost calculation
- Path Cost = 1000 Mbps / bandwidth of the path in
Mbps
e.g. for Fast Ethernet, Cost = 10
- If the bandwidth is 1 Gbps or above, a fixed cost is used:
Bandwidth    Cost
1 Gbps       4
10 Gbps      2
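The cost rule on this slide can be written as a small lookup. The sketch follows the slide's numbers only (formula below 1 Gbps, fixed values of 4 and 2 at 1 and 10 Gbps); the function name and the example path are illustrative.

```python
def path_cost(bandwidth_mbps: float) -> int:
    """Path cost as given on the slide.

    Below 1 Gbps: 1000 / bandwidth in Mbps.
    At 1 Gbps and 10 Gbps: fixed values from the slide's table.
    """
    fixed_costs = {1000: 4, 10000: 2}
    if bandwidth_mbps >= 1000:
        if bandwidth_mbps not in fixed_costs:
            raise ValueError("no cost listed for this bandwidth")
        return fixed_costs[bandwidth_mbps]
    return round(1000 / bandwidth_mbps)


print(path_cost(10))     # 100
print(path_cost(100))    # 10
print(path_cost(1000))   # 4
print(path_cost(10000))  # 2

# A Root Path Cost is the sum of the costs along the least-cost path,
# for example one Fast Ethernet hop followed by one Gigabit hop:
print(path_cost(100) + path_cost(1000))  # 14
```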
Rapid Spanning Tree Protocol (RSTP)
„ Rapid Spanning Tree Protocol is specified in IEEE 802.1w.
It is an improved version (version 2) of the original
Spanning Tree Protocol (STP).
„ The reconfiguration time after a node in the tree fails is about
50 sec for STP vs. 10 sec or less for RSTP.
„ RSTP uses a new type of BPDU (type 2).
„ Bridges using RSTP can interwork with bridges using
STP. RSTP can support multiple VLANs and fast re-routes
and prevents loops, but security remains a key issue.
„ RSTP alone cannot provide restoration of the network. It
needs to work with other technologies such as RPR.
Router Definition
„ Routers operate at the Network Layer (Layer 3).
A router can connect different LAN technologies
(Ethernet, Token Ring, FDDI, etc.) and different
protocol types (IP, IPX, AppleTalk, etc.).
„ Routers support complex protocols, normally
executed in software by a CPU, to make the
routing (forwarding) decision between ports and to
maintain the current state of the routing tables, which
determine the optimal path for a packet to be
routed.
Router Definition (continued)
„ High-performance routers take advantage of
special hardware to route data
packets while still relying on the CPU to process protocol /
control packets and perform routing table
updates.
10 M Ethernet Physical Layer
„ Thick Ethernet (10BASE5) and Cheapernet or
Thin-net (10BASE2) are coaxial cable based.
„ 10BASE5 specifies a maximum cable length of
500m, a maximum of 100 nodes, a minimum
separation distance between MAU on the coax of
2.5m. The length and node count can be increased
by the use of repeaters.
„ Twisted pair Ethernet (10BASE-T) uses standard
voice-grade telephone cable (22-26 gauge) with a
target cable distance of 100m.
10 M Ethernet Physical Layer (continued)
„ Thin coax cable is more flexible due to its small
diameter. But key electrical properties are degraded
over the thin cable. 10BASE2 specifies a maximum
cable length of 185m, a maximum of 30 nodes, a
minimum separation distance between MAU on the
coax of 0.5m.
„ The 10BASE-T system uses a star topology with a
repeater (or hub) at the center of the star. The
repeater performs the signal-amplitude and timing
regeneration.
10 M Ethernet Physical Layer (continued)
„ The 10BASE-T system provides a low cost and
easy to install network solution. The point-to-point
star topology eases the task of network
management (fault isolation), cable
administration, and reconfiguration due to moves,
additions, deletions, or changes.
„ 10BASE-T uses 100 ohm UTP cable and
inexpensive RJ-45 telephone jack connectors.
„ 10BASE-T can operate on other unshielded or
shielded cable grades (120 and 150 ohm).
10 M Ethernet Physical Layer (continued)
„ Benefits of fiber optics include very high bandwidth and
low attenuation. Drawbacks: fiber optic
cable and connectors are more expensive, and
skilled (costly) installation personnel are required.
„ Fiber Optic Inter-Repeater Link (FOIRL) is specified
for a repeater-to-repeater link for a distance of up to 1
km. It was extended for repeater-to-DTE application.
Separate TX and RX paths are used.
„ 10BASE-FL was developed to supersede the original
FOIRL. The max distance between MAUs is extended
to 2 km. The cheaper Bayonet fiber optic plug and
socket connectors are used to save cost.
10 M Ethernet Physical Layer (continued)
„ 10BASE-FB was designed to optimize the inter-repeater link. The 10BASE-FB MAU is embedded within a
repeater. It targeted backbone applications and
gained limited vendor support.
„ 10BASE-FP uses a passive optical star approach.
The star and fiber optic cabling form the overall
medium. The star has no active component, and is
not a repeater. It simply provides “optical mixing” of
received signal. It is used where no
power is available or electrical signals are hazardous.
„ Only the physical layer interface changes between
medium types, which can be mixed in a network.
10 M Ethernet Physical Layer (continued)
„ Make sure that the network is NOT oversized.
„ A collision after the slot time (512 bits or 51.2 us)
results in a “late collision”.
„ The late collision statistic indicates that
the network has become oversized: the round-trip propagation delay is too large.
„ Link Test is provided to ensure network integrity.
„ Link Status allows simple diagnosis of the station,
or repeater port state.
Sample Problem 1
„ A 1-km-long 10-Mbps LAN has propagation speed
of 200 m/usec. Data frame size is 256 bits,
including 32 bits of header, checksum, and other
unspecified overhead. The first time slot after a
successful transmission is reserved for the receiver
to capture the channel and send a 32-bit
acknowledgement frame.
„ What is the effective data rate, excluding overhead,
assuming that there are no collisions?
Solution to Sample Problem 1
„ The round trip propagation delay is 2 x 1000 m / (200
m /usec) = 10 usec.
„ A complete transmission has four phases:
transmitter seizes cable: 10 usec
transmits data: 256 / (10 x 10**6) = 25.6 usec
receiver seizes cable: 10 usec
acknowledgement sent: 32 / (10 x 10**6) = 3.2 usec
The total is 48.8 usec. Of the 256 bits sent, 256 - 32 =
224 are data bits. The effective data rate is
224 bits / 48.8 usec = about 4.6 Mbps.
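The four phases of this solution can be checked with a short calculation. The function and parameter names are illustrative; the model is exactly the one in the solution (two round-trip seizure periods, one data frame, one acknowledgement, no collisions).

```python
def effective_rate_mbps(length_m, speed_m_per_us, rate_mbps,
                        frame_bits, overhead_bits, ack_bits):
    """Effective data rate in Mbps for the four-phase exchange above."""
    round_trip_us = 2 * length_m / speed_m_per_us   # time to seize the cable
    data_us = frame_bits / rate_mbps                # bits / Mbps = usec
    ack_us = ack_bits / rate_mbps
    total_us = round_trip_us + data_us + round_trip_us + ack_us
    return (frame_bits - overhead_bits) / total_us  # bits per usec = Mbps


rate = effective_rate_mbps(length_m=1000, speed_m_per_us=200, rate_mbps=10,
                           frame_bits=256, overhead_bits=32, ack_bits=32)
print(round(rate, 2))  # 4.59
```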
Sample Problem 2
Consider building a CSMA/CD network running
at 1 Gbps over a 1-km cable with no repeaters.
The signal speed on the cable is 200,000 km/sec.
What is the minimum frame size?
Solution to Sample Problem 2
„ For a 1 km cable, the one-way propagation time
is 1 km / 200,000 km/sec = 5 usec, so the round-trip
delay is 10 usec. This is much longer than the 4.096
usec slot time of standard Gigabit Ethernet, so the slot time
here must be 10 usec: a frame must take at least 10 usec to transmit
so that the sender is still transmitting when a collision is detected.
„ At 1 Gbps, any frame shorter than 10,000 bits is
transmitted in under 10 usec. Thus, the min
frame size is 10,000 bits or 1250 bytes.
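The same reasoning works for any cable length and data rate: the minimum frame is the number of bits the sender emits during one round trip. A minimal sketch with illustrative names:

```python
def min_frame_bits(length_km, speed_km_per_s, rate_bps):
    """Bits transmitted during one round-trip propagation delay."""
    return 2 * length_km * rate_bps / speed_km_per_s


bits = min_frame_bits(length_km=1, speed_km_per_s=200_000,
                      rate_bps=1_000_000_000)
print(bits)      # 10000.0
print(bits / 8)  # 1250.0
```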
Sample Problem 3
„ The min Ethernet frame size must be 64 bytes to
ensure that the transmitter is still going in the
event of a collision at the far end of the cable.
„ Fast Ethernet has the same 64-byte min frame
size, but can get the bits out 10 times faster than
Ethernet.
„ How is it possible to maintain the same min frame
size?
Solution to Sample Problem 3
„ The max distance (cable length) supported by
Fast Ethernet is 1/10 as long as in Ethernet.
„ i.e. Shorter reach!!
Sample Problem 4
„ A device accepts frames from Ethernet to which it
is attached. It removes the packets inside the
frames, adds framing information around it, and
transmits it over a leased telephone line (which
only connects to the outside world) to an identical
device at the other end. The far-end device
removes the framing, inserts the packets into a
token ring frame, and transmits it to a local host
over a token ring LAN.
„ What do you call this device?
Solution to Sample Problem 4
„ Since the device connects to Ethernet (or Token
ring) on one side and to the telephone leased line
on the other side, there is no routing involved. The
device is a half bridge (not even a full bridge).
„ A full bridge can connect either two same or two
different LAN technologies.
Manchester Encoder
„ Manchester encoder translates physically
separate signals of clock and data into a single,
self-synchronizing serial bit stream, suitable for
transmission on the cable by the transmitter.
„ For the 10M data rate, the bit cell time is 100ns.
„ The encoder output timing jitter must not exceed 0.5
ns.
Manchester Encoder
[Figure: Manchester encoding signal pattern, drawn between a high level and a low level, for a sample input data stream (1 0 0 1 0 1 1).]
Manchester Encoder
„ During the 1st half of the bit cell time, the serial
signal transmitted is the logical complement of the
bit value being encoded during that cell.
„ During the 2nd half of the bit cell time, the uncomplemented value of the bit being encoded is
transmitted.
„ Thus, there is always a signal transition in the
center of each bit cell.
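The rule described above (complement in the first half of the cell, true value in the second half) can be sketched directly. Signal levels are modeled as 0 for low and 1 for high, and the function names are illustrative.

```python
def manchester_encode(bits):
    """Encode bits into half-cell levels: complement first, true value second."""
    halves = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        halves.extend([1 - bit, bit])
    return halves


def manchester_decode(halves):
    """Recover bits from half-cell levels, checking the mid-cell transition."""
    if len(halves) % 2 != 0:
        raise ValueError("expected two half-cells per bit")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing transition in the center of a bit cell")
        bits.append(second)
    return bits


data = [1, 0, 0, 1, 0, 1, 1]
line = manchester_encode(data)
print(line)  # [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1]
print(manchester_decode(line) == data)  # True
```

Every pair of half-cells differs, which is the guaranteed mid-cell transition that lets the receiver recover the clock from the data stream.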
10/100 M b/s Ethernet Layer Model
[Figure: 10/100 Mb/s Ethernet layer model. Both stacks share the Higher Layers, Logical Link Control (LLC), and Media Access Control (MAC). The 10 Mb/s stack continues with Physical Signaling (PLS), AUI, the MAU containing the Physical Medium Attachment (PMA), MDI, and Medium. The 100 Mb/s stack continues with the Reconciliation Sublayer (RS), MII, the PHY containing the Physical Coding Sublayer, Physical Medium Attachment, Physical Medium Dependent, and Auto Negotiation, then MDI and Medium.]
100BASE-T (IEEE802.3u) Standard Overview
„ Reconciliation Sublayer (RS) maps the MAC
behavior to electrical signals of the MII.
Specifically, it maps the new 4-bit data path and
associated control signals of MII to the original
PLS service interface, which is bit-serial.
„ MII (18-pin) can be used as an interconnect at the
chip, board, or physical device level. When used
as inter-chip connection, it is implemented as
printed-circuit board traces.
100BASE-T (IEEE802.3u) Standard Overview
„ The MII management interface consists of the Management
Data Clock (MDC) and Management Data Input/Output
(MDIO). MDC synchronizes the data transfer in and out
of the PHY over the MDIO pin. MDIO is a bidirectional
signal which allows serial data to be clocked in and out
of the PHY device.
„ Reduced MII (RMII) reduces the number of pins
from 18 to 9 to allow more ports to fit into a box or
chassis. RMII uses a 2-bit data path and operates
at 50 MHz, versus the 4-bit data path at 25 MHz used by MII.
Major Differences Between 10BASE-T and 100BASE-T
„ MII vs. AUI
- The 4-bit MII interface replaces the bit-serial AUI
interface
„ Addition of RS (Reconciliation Sublayer)
- RS maps the 4-bit wide data path and associated control
signals of MII to the original PLS service interface. In
practice, RS is implemented as an integral part of MAC
controller chip.
„ Dual-Speed MAC Operation
- To allow gradual upgrade and co-existence
„ Replacement of Manchester Encoding by NRZ
- To counter the EMI and RFI, NRZ is more suitable for
100 M b/s data rate.
NRZ Encoding
„ Non-return to zero encoding is commonly used in
slow speed communications interfaces for both
synchronous and asynchronous transmission.
Using NRZ, a logic 1 bit is sent as a high value
and a logic 0 bit is sent as a low value (the line
driver chip used to connect the cable may
subsequently invert these signals).
NRZ Encoding
„ A problem arises when using NRZ to encode a
synchronous link which may have long runs of
consecutive bits with the same value. The figure
below illustrates the problem that would arise if NRZ
encoding were used with a DPLL recovered clock
signal. In Ethernet for example, there is no control
over the number of 1's or 0's which may be sent
consecutively. There could potentially be thousands
of 1's or 0's in sequence. If the encoded data
contains long 'runs' of logic 1's or 0's, this does not
result in any bit transitions.
NRZ Encoding
„ The lack of transitions prevents the receiver DPLL
from reliably regenerating the clock making it
impossible to detect the boundaries of the
received bits at the receiver. This is the reason
why Manchester coding is used in Ethernet LANs.
„ A long run of bits with the same value results in
no transitions on the cable when NRZ encoding is
used.
NRZ Encoding
[Figure: NRZ waveform for a long run of bits with the same value]
NRZ Inverted Encoding
„ A method for transmitting and recording data so
that it keeps the sending and receiving clocks
synchronized. This is especially helpful in
situations where bit stuffing is employed -- the
practice of adding bits to a data stream so it
conforms with communications protocols.
„ These added bits can create a long string of
similar bits, which register to the receiver as a
single, unchanging voltage. Since clocks adjust
on voltage changes, they'll lag behind true time.
NRZI Encoding
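„ NRZI encodes data as the presence or absence of a transition
rather than as an absolute level. In the convention used by
100BASE-FX, a 1 bit toggles the line level and a 0 bit leaves it
unchanged (some other interfaces toggle on 0 instead). The
resulting level changes allow the sending and receiving clocks
to synchronize.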
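A minimal sketch of NRZI, assuming the "toggle on 1" convention (a 1 bit changes the line level, a 0 bit leaves it unchanged); other interfaces use the opposite convention. The function name `nrzi_encode` and the `start_level` parameter are hypothetical, chosen only for this illustration.

```python
def nrzi_encode(bits, start_level=0):
    """Encode bits as line levels: a 1 toggles the level, a 0 holds it."""
    level = start_level
    levels = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        if bit == 1:
            level ^= 1  # transition on every 1 bit
        levels.append(level)
    return levels


print(nrzi_encode([1, 1, 1, 1]))  # a run of 1s produces a transition per bit
print(nrzi_encode([0, 0, 0, 0]))  # a run of 0s produces no transitions
```

Under this convention a long run of 0s still yields a flat line, so NRZI is normally paired with a block code that limits such runs (4B/5B in the case of 100BASE-X).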
Major Differences Between 10BASE-T and
100BASE-T (continued)
„ Class I and II Repeater Specification
„ Addition of Auto-Negotiation
„ Full Duplex (FDX) Operation
„ Different Cable Categories and Numbers of Pairs Used
- 100BASE-TX operates over two pairs of Cat 5 UTP
(Cat 5 is better quality cable than Cat 3)
- 100BASE-T4 operates over four pairs of Cat 3 UTP
- 10BASE-T operates over two pairs of Cat 3 UTP
„ Updated management
Gigabit Optical PHY Layer
„ 1000Base-SX supports a light source of 850 nm
wavelength on MMF (core diameter of 50 or 62.5
um) at distance of 220 m to 550 m.
„ 1000Base-LX supports a light source of 1300 nm
wavelength on SMF (core diameter of 2 to 10 um)
at distance of > 5 km.
Gigabit Optical PHY Layer
„ Gigabit Ethernet supports two types of MMF: 50
um (core diameter) and 62.5 um fibers.
„ 62.5 um fiber has low modal bandwidth compared
with 50 um fiber, especially for short-wave lasers.
As such, the distance traversed by 62.5 um fiber
is less than in 50 um fiber.
„ Two types of light sources (transmitter) are used
to transmit light over fiber: LED and laser diode.
Gigabit Optical PHY Layer
Optical Transmitter Parameters
„ Wavelength
„ Spectral Width
„ Power
„ Rise Time / Fall Time
„ Extinction Ratio
„ Jitter
„ Relative Intensity Noise (RIN)
Gigabit Optical PHY Layer
„ Laser diodes are faster than LEDs. Typical rise
time is 1 ns for a laser vs. a few ns to
250 ns for an LED.
„ Gigabit Ethernet requires a laser-diode type of
transmitter.
„ Signal loss in the optical fiber is minimal at
wavelengths of 850, 1300 and 1550 nm.
„ For MMF using short-wave lasers, the wavelength
ranges from 770 to 860 nm.
Gigabit Optical PHY Layer
„ For MMF and SMF using long-wave lasers, the range
is 1270 to 1355 nm.
„ Transmitters never emit light at a single
wavelength. A range of wavelengths is
produced, which is called “spectral width”.
„ Wavelength = velocity of the light / the frequency
„ The spectral width is 1 – 5 nm for laser-diode
transmitters vs. 20 – 100 nm for LED transmitters.
Gigabit Optical PHY Layer
„ Random timing errors (called jitter) build up
along the length of the optical link, which can cause
incorrect interpretation of the signal at the
receiver.
„ Transmitter power minus loss due to transmission
on the fiber must be >= minimum acceptable
receive power.
„ Rise time: time needed for the output power of
the transmitter to rise from 20% to 80% of its final
value when the input is a step current.
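The power requirement above (transmitter power minus fiber loss must be at least the minimum acceptable receive power) is a simple subtraction when the quantities are expressed in dBm and dB. The sketch below is only an illustration: the function name and all three numeric values are hypothetical, not taken from any specification.

```python
def link_margin_db(tx_power_dbm, fiber_loss_db, rx_min_power_dbm):
    """Return the margin in dB; the link budget closes when it is >= 0."""
    received_dbm = tx_power_dbm - fiber_loss_db
    return received_dbm - rx_min_power_dbm


# Hypothetical example values, chosen only to show the arithmetic.
margin = link_margin_db(tx_power_dbm=-9.5, fiber_loss_db=3.5,
                        rx_min_power_dbm=-17.0)
print(f"margin = {margin:.1f} dB, link OK: {margin >= 0}")
```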
Gigabit Optical PHY Layer
„ The transmitter takes a certain amount of time to rise to
maximum power (logic 1) and to fall to
minimum power (logic 0). These are called the rise and
fall times.
„ Extinction ratio := average optical energy in Logic 1
value / average optical energy in Logic 0 value
Gigabit Optical PHY Layer : Relative Intensity Noise (RIN)
„ A laser is a highly tuned quantum-effect oscillator.
When a laser source is used to transmit light
through fiber, a certain amount of optical power is
reflected back into the laser device due to
connectors and optical lens interfaces.
„ The reflected optical power disturbs the purity of
the laser oscillation, which appears as optical
noise, (called RIN). This is similar to thermal
noise in resistors.
Class I and II Repeater for 100BASE-T
„ The two repeater classes trade off signal delay against
the significant differences between the coding schemes
used for different media types.
„ Class I allows more generous delays to
accommodate conversions between two coding
schemes, and allow all media types to be connected
to the repeater.
„ Class II was defined with more stringent timing
specification, requiring it to be optimized for one
coding scheme, meaning that it could not support all
media types.
MAC Functions
[Figure: between the MAC Client Sublayer (top) and Physical Layer Signaling
(bottom), the transmit path runs TX Data Encapsulation, TX Media Access
Management, TX Data Encoding; the receive path runs RX Data Decoding, RX Media
Access Management, RX Data Decapsulation]
MAC Functions
„ Data encapsulation (transmit and receive)
- Framing (boundary delimitation, frame sync)
- Addressing (processing source & destination
address)
- Error detection of physical medium transmission
errors
„ Media Access Management
- Medium allocation (collision avoidance)
- Contention resolution (collision resolution)
MAC Byte to MII Nibble Mapping
[Figure: the MAC's serial bit stream D0 to D7 (D0 is the 1st bit from the MAC)
is mapped onto the MII nibble stream. 1st nibble: D0 (LSB) to D3 (MSB).
2nd nibble: D4 (LSB) to D7 (MSB).]
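The byte-to-nibble mapping in the figure above can be sketched as follows, assuming D0 is the least significant bit of the byte (Ethernet sends each byte least significant bit first). The function name `byte_to_mii_nibbles` is hypothetical.

```python
def byte_to_mii_nibbles(byte):
    """Split one MAC byte into the two nibbles presented on the MII.

    The first nibble carries D0..D3 (low-order half), the second D4..D7.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    first_nibble = byte & 0x0F          # D0 (LSB) .. D3 (MSB)
    second_nibble = (byte >> 4) & 0x0F  # D4 (LSB) .. D7 (MSB)
    return first_nibble, second_nibble


first, second = byte_to_mii_nibbles(0xA5)
print(f"first nibble = {first:#x}, second nibble = {second:#x}")
```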
100BASE-T4 Cat-3 UTP
„ 100BASE-T4 operates over four pairs of Cat 3 (or
better) UTP cable. DTE-to-repeater distance is
limited to 100 m.
„ 100BASE-T4 uses a block coding scheme
(8B/6T). Three of the four cable pairs are used for
data transmission by either the DTE or the repeater; the
remaining pair is used to detect simultaneous
activity from the device at the other end of the link,
which indicates a collision.
100BASE-T4 Cat-3 UTP (continued)
„ The PHY sub-layer takes two nibbles (4 bits each) from the
MII to form a byte, and converts it to a code group of six
ternary symbols (6T). Each code group (data byte) is sent over
one of the three pairs, with successive data bytes being encoded
and transmitted on the pairs in a round-robin fashion.
100BASE-T4 Cat-3 UTP (continued)
„ The 8B/6T encoding scheme maps the 256 possible 8B (binary)
data bytes to a subset of the available 6T (ternary)
codes (3^6 = 729 available). Each of the six positions in
the ternary code can take one of three values: +1,
0, -1. This provides good clock transition density, which
leads to simple receive clock recovery, and minimizes
high-energy transitions (from +1 to -1 and vice versa),
which reduces EMI/RFI.
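The code-space figure above, and the per-pair symbol rate that follows from sending six ternary symbols for every eight bits across three pairs, can be checked with a little integer arithmetic. This is a sketch; the constant names are made up for the illustration.

```python
BINARY_VALUES = 2 ** 8   # 256 possible data bytes (8B)
TERNARY_CODES = 3 ** 6   # 729 possible six-symbol ternary codes (6T)
UNUSED_CODES = TERNARY_CODES - BINARY_VALUES

DATA_RATE_BPS = 100_000_000
DATA_PAIRS = 3

# Each pair carries one third of the data; 8 bits become 6 ternary symbols.
symbol_rate_per_pair = DATA_RATE_BPS * 6 // (8 * DATA_PAIRS)

print(f"{TERNARY_CODES} codes available, {UNUSED_CODES} left unused")
print(f"symbol rate per pair = {symbol_rate_per_pair / 1e6} Mbaud")
```

The large number of unused codes is what gives the code designer room to pick only those ternary patterns with good transition density.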
100BASE-T4 Cat-3 UTP (continued)
„ The data rate on each of the three pairs is 33
Mb/s. The increased efficiency of 8B/6T coding
allows the frequency on the line to operate at 25
MHz.
„ Since 100BASE-T4 uses a new PHY layer
coding and signaling protocol, a new IC had to be
developed to meet the requirement
economically. This slowed its deployment relative
to 100BASE-TX, which was able to leverage
existing PHY ICs developed for FDDI over copper.
100BASE-TX Cat-5 UTP
„ 100BASE-TX uses 4B/5B coding scheme and
operates over two pairs of Cat 5 cable. This coding
scheme is full duplex (unlike 100BASE-T4, but
identical to 10BASE-T), with one pair transmitting and
the other pair receiving.
„ The PHY takes a nibble from the MII and converts it
into a 5-bit binary symbol. Due to EMI/RFI concerns at
the resulting high signaling rate (125 MHz) on the
cable, additional steps are taken to reduce the
spectral content of the transmission.
100BASE-TX Cat-5 UTP
„ Scrambling helps smooth the spectral content of the
resulting transmitted waveform.
„ MLT-3 (multilevel transmit ternary) encoding converts
the binary 5B symbol to a ternary code and further
reduces the spectral content.
„ The data rate on a single pair is 100 M b/s. The scrambling
and MLT-3 coding steps reduce the frequency on the
line to 31.25 MHz.
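A minimal sketch of MLT-3 and of where the 31.25 MHz figure comes from. MLT-3 steps through the level cycle 0, +1, 0, -1 on every 1 bit and holds the level on a 0 bit, so one full cycle of the waveform needs at least four bit times. The function and constant names are assumptions for the illustration, and the scrambler is omitted.

```python
MLT3_CYCLE = (0, 1, 0, -1)  # levels visited in order, one step per 1 bit


def mlt3_encode(bits):
    """Encode a bit sequence as MLT-3 line levels (-1, 0, +1)."""
    index = 0
    levels = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        if bit == 1:
            index = (index + 1) % len(MLT3_CYCLE)  # advance on a 1
        levels.append(MLT3_CYCLE[index])           # hold on a 0
    return levels


print(mlt3_encode([1, 1, 1, 1, 1]))

line_rate_mbaud = 100 * 5 / 4                        # 4B/5B: 5 code bits per 4 data bits
max_fundamental_mhz = line_rate_mbaud / len(MLT3_CYCLE)
print(line_rate_mbaud, max_fundamental_mhz)
```

The worst case is an all-ones stream, which completes one cycle every four code bits; any 0 bits only stretch the cycle and lower the frequency.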
4B/5B and MLT-3 Encoding for 100BASE-TX PHY
[Figure: block diagram. MII TXD<3:0> (4-bit data, current nibble) -> 4B/5B
Encoding -> Parallel to Serial Conversion -> 125 MHz NRZI output for
100BASE-FX; for 100BASE-TX the stream continues through the Scrambler and the
MLT-3 Encoder]
100BASE-FX Fiber Optic
„ 100BASE-FX operates over two individual
(multimode) fiber optic cables. Its PHY uses the same
4B/5B block coding scheme as 100BASE-TX,
allowing the native full duplex operation of the link.
„ Because fiber optic cabling costs more, 100BASE-FX is
typically used in long-distance, high-bandwidth, or
security-conscious applications.
„ Full duplex 100BASE-FX can support up to 2 km on
multimode fiber, while single-mode can extend the
distance further.
100BASE-T2 Cat 3 UTP
„ 100BASE-T2 was developed to operate over two
pairs of Cat 3 cable.
„ To provide full duplex operation, 100BASE-T2
operates full duplex on each pair using
“quinary symbol” coding. This is also referred to as the
PAM 5x5 symbol code.
„ The standard was developed, but vendor
interest waned due to the complexity and cost of
the PHY implementation.
„ There is no installed base for 100BASE-T2.
MAC Sub-layer
„ MAC sub-layer is the primary control entity for
access to the network.
„ On receipt of MA_UNITDATA.request primitive
from the LLC, the MAC formats a data frame from
the information in the provided parameters,
adding its own header and error checking trailer.
„ When the link goes quiet, it initiates and controls
the transmission.
MAC Sub-layer (continued)
„ When it receives a frame, the MAC sublayer checks
it for validity and strips the header and trailer. If no error
was detected, it generates an
MA_UNITDATA.indication primitive that is passed
to the LLC.
„ The optional MAC Control sublayer allows flow
control procedures and contains provisions for
adding other control functions in the future.
PAUSE frame is an example of MAC Control sublayer function.
Water Mark Flow Control
„ Buffers inside a switch are assigned two watermarks:
high and low. The high and low watermarks are associated
with a large and a small timer value, respectively, to be
used when sending PAUSE flow control frames. The timer
values are programmable.
„ When the frame buffers exceed the low-watermark
threshold, the switch generates a PAUSE frame (carrying the
small timer value) and sends it to the DTE, which stops
sending new frames until the timer expires.
Water Mark Flow Control
„ If the congestion persists, the frames in the buffer will
reach the High watermark. The switch will send a
PAUSE frame with a large timer value assigned to it.
„ When the congestion eases, the switch can send a
PAUSE frame with timer value zero.
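The watermark scheme described above can be sketched as a small decision function. Everything numeric here is hypothetical: the two thresholds, the two timer values and the function name are chosen only to illustrate the logic, since the slides say the timer values are programmable.

```python
# Hypothetical thresholds (in buffered frames) and PAUSE timer values.
LOW_WATERMARK = 60
HIGH_WATERMARK = 90
SMALL_PAUSE_TIME = 0x0100
LARGE_PAUSE_TIME = 0xFFFF


def pause_decision(buffered_frames, paused):
    """Return the PAUSE timer value to send, or None to send nothing."""
    if buffered_frames > HIGH_WATERMARK:
        return LARGE_PAUSE_TIME   # persistent congestion: long pause
    if buffered_frames > LOW_WATERMARK:
        return SMALL_PAUSE_TIME   # early congestion: short pause
    if paused:
        return 0                  # congestion eased: timer zero resumes traffic
    return None


for depth, paused in [(70, False), (95, True), (10, True), (10, False)]:
    print(depth, paused, pause_decision(depth, paused))
```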
Credit Based Flow Control
„ A DTE can send a frame to the switch if it has
positive credit. The switch advertises to the DTE
the number of credits available for the DTE to
send frames.
„ For Ethernet, the credit needed to send one maximum-size
frame is 1518 bytes.
„ The advertising of credit to the DTE is performed
through the exchange of flow control frames.
Credit-Based Flow Control
„ The source can continue to transmit as long as its
credit counter is greater than zero.
„ The credit counter is initially set to zero.
„ Each RTT, the controller sends a feedback
message indicating the counter value for each
source under its control.
„ The controller dedicates a set of buffers to each
connection and computes the credit as the number
of remaining bits or packets in the buffer for each
connection.
Credit-Based Flow Control
„ The credit flow control results in a bursty but regular
transmission of data.
„ This scheme operates in a region that keeps the
buffer relatively full much of the time.
„ When the RTT is small (dedicated buffer capacity is
small), such as LAN traffic, credit flow control
performs well.
„ But the logic is complicated to implement, a large
storage is needed for longer propagation delay, and
the per connection message uses about 10% of the
link bandwidth.
Rate Based Flow Control
„ The switch signals the DTE to send frames at the
desired rate using control frames.
„ The DTE can send frames faster or slower by adjusting
the inter-packet gap (IPG), also called the inter-frame gap,
between frames. For 10M, 100M and Gigabit Ethernet, the
default IPG is 96 bit times (9.6, 0.96, 0.096 usec).
„ When there is no congestion, the switch signals the
DTE to send frames at the max rate, that is, with the
minimum IPG (96 bit times).
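The IPG durations quoted above, and the effect of a longer gap on the frame rate, follow from simple arithmetic. The sketch below assumes 8 bytes of preamble plus start-of-frame delimiter in front of every frame; the function names are hypothetical.

```python
DEFAULT_IPG_BITS = 96
PREAMBLE_SFD_BYTES = 8  # 7-byte preamble + 1-byte start of frame delimiter


def ipg_microseconds(rate_bps, ipg_bits=DEFAULT_IPG_BITS):
    """Duration of the inter-packet gap in microseconds."""
    return ipg_bits / rate_bps * 1e6


def max_frames_per_second(rate_bps, frame_bytes, ipg_bits=DEFAULT_IPG_BITS):
    """Back-to-back frame rate for a given frame size and gap."""
    bits_per_frame_slot = (frame_bytes + PREAMBLE_SFD_BYTES) * 8 + ipg_bits
    return rate_bps / bits_per_frame_slot


for rate in (10_000_000, 100_000_000, 1_000_000_000):
    print(rate, round(ipg_microseconds(rate), 3), "usec")

# Doubling the gap at 10 Mb/s slows a stream of 64-byte frames.
print(max_frames_per_second(10_000_000, 64))
print(max_frames_per_second(10_000_000, 64, ipg_bits=192))
```

Stretching the gap throttles the sender smoothly rather than stopping it outright, which is the point of rate-based control.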
Rate Based Flow Control
„ When the congestion occurs, the switch will send
DTE a flow control frame with a new IPG value (>
96 bit times). IPG for different rates can be
predefined.
„ Unlike PAUSE frame flow control (XON, XOFF),
rate based flow control does not exhibit all-or-nothing behavior.
„ There is no guarantee that frames will not be lost
due to buffer overflow.
Rate-Based Flow Control
„ This is based on the Transmit Rate of the source.
The unit of Transmit Rate is bits or packets per
Round Trip Time (RTT).
„ Every RTT, the controller provides feedback on
whether the source should increase or decrease
its rate.
„ This scheme performs better if the feedback uses
the rate of buffer growth instead of an absolute
threshold.
Rate-Based Flow Control
„ The source must be able to control its transmit
rate using some form of traffic shaping.
„ The end result of this scheme is that the data
transmission is more evenly spaced (based on
simulation).
LLC / MAC Group Service Primitives
(a) Without the optional MAC Control sublayer implemented
- Between LLC and Media Access Control: MA_UNITDATA.request,
MA_UNITDATA.indication
(b) With the optional MAC Control sublayer implemented
- Between LLC and the MAC Control Sub-layer (optional):
MA_UNITDATA.request, MA_CONTROL.request, MA_UNITDATA.indication,
MA_CONTROL.indication
- Between the MAC Control Sub-layer and Media Access Control:
TransmitFrame (DA, SA, Len/type, Data), ReceiveFrame (DA, SA, Len/type, Data)
802.3 / Ethernet Frame Format
Field                       Bytes
Preamble                    7
Start of Frame Delimiter    1
DA                          2 or 6
SA                          2 or 6
Length / Type               2
Data                        0 ~ 1500
Pad                         0 ~ 46
Checksum                    4
Length / Type: Length of Data Field for 802.3 (value <= 1500);
Type of Data Field for classical Ethernet (value >= 1536)
DA: Destination Address
SA: Source Address
Preamble: each byte pattern 10101010
Start of Frame Delimiter: 10101011
A valid frame must be at least 64 bytes in length (from DA to checksum)
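Two rules from the frame format above lend themselves to a short sketch: how the 2-byte Length/Type field is interpreted, and how much padding a short Data field needs to reach the 64-byte minimum. The function names are hypothetical, and 6-byte addresses are assumed.

```python
ADDRESS_BYTES = 6        # assuming 6-byte DA and SA
LENGTH_TYPE_BYTES = 2
CHECKSUM_BYTES = 4
MIN_FRAME_BYTES = 64     # measured from DA to checksum
MIN_DATA_PLUS_PAD = (MIN_FRAME_BYTES - 2 * ADDRESS_BYTES
                     - LENGTH_TYPE_BYTES - CHECKSUM_BYTES)  # 46 bytes


def interpret_length_type(value):
    """Classify the Length/Type field value."""
    if value <= 1500:
        return "length"      # 802.3: length of the Data field
    if value >= 1536:
        return "type"        # classical Ethernet: protocol type
    return "undefined"       # values between the two ranges


def pad_length(data_bytes):
    """Pad bytes needed so the frame reaches the 64-byte minimum."""
    if not 0 <= data_bytes <= 1500:
        raise ValueError("Data field must be 0..1500 bytes")
    return max(0, MIN_DATA_PLUS_PAD - data_bytes)


print(interpret_length_type(0x0800), interpret_length_type(100))
print(pad_length(0), pad_length(10), pad_length(46), pad_length(1500))
```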
MAC Sub-layer (continued)
„ The standard allows 2- and 6-byte address. But
CSMA/CD only uses 6-byte addresses.
„ The high order bit of the DA is 0 for ordinary
address and 1 for group address.
„ A DA with all 1’s is a broadcast address.
„ When the length of Data field is less than 46
bytes, the pad field is used to fill out the frame to
the minimum size.
MAC Sub-layer (continued)
„ The minimum frame size of 64 bytes is to
prevent a station from completing its
transmission before the first bit has even
reached the far end of the cable, where it
could collide with another frame.
„ PAUSE control frame (with a timer value
specified) is used by full duplex station to
control the number of inbound packets for
congestion control.
MAC Frame Address Format
Transmission order, left-to-right, high-to-low order bit
6 Bytes: I/G bit, U/L bit, 46-bit Address
I/G: 0 = Individual address, 1 = Group address
U/L: 0 = Universal (globally-administered) address, 1 = Locally administered address
MAC Control Frame Format
Field                 Size        Value for PAUSE control frame
Pre
SFD
DA                                01-80-C2-00-00-01 (Hex)
SA
Length/Type           2 octets    88-08 (Hex)
MAC Control Opcode    2 octets    00-01 (Hex)
Opcode Parameters                 Pause_time, then Reserved (Set to 0)
FCS
60 octets: DA through the Reserved field (the frame before the FCS)
Frames with the DA value of 01-80-C2-00-00-01 are
filtered out by the receiving MAC and are not forwarded
by switch or bridge ports.
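The field values above are enough to assemble a PAUSE frame in a few lines. This is a sketch under stated assumptions: the function name `build_pause_frame` is hypothetical, Pause_time is taken to be a 2-octet value, the preamble and SFD are left to the hardware, and the 4-octet FCS is not computed here.

```python
import struct

PAUSE_DA = bytes.fromhex("0180C2000001")  # reserved multicast address
MAC_CONTROL_TYPE = 0x8808                 # Length/Type for MAC Control
PAUSE_OPCODE = 0x0001                     # MAC Control opcode for PAUSE
FRAME_BYTES_BEFORE_FCS = 60


def build_pause_frame(source_mac, pause_time):
    """Return the 60 octets from DA through the reserved field (no FCS)."""
    if len(source_mac) != 6:
        raise ValueError("source MAC must be 6 bytes")
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time must fit in 16 bits")
    fields = struct.pack("!HHH", MAC_CONTROL_TYPE, PAUSE_OPCODE, pause_time)
    frame = PAUSE_DA + source_mac + fields
    # The reserved field is transmitted as zeros up to the minimum size.
    return frame.ljust(FRAME_BYTES_BEFORE_FCS, b"\x00")


frame = build_pause_frame(bytes.fromhex("020000000001"), pause_time=0x00FF)
print(len(frame), frame[:18].hex("-"))
```

Calling the same function with `pause_time=0` produces the frame a switch would send to cancel an earlier pause.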
Relationship of Slot Time and Frame Size
[Figure: space-time diagram with distance (DTE A to DTE B) on one axis and
time on the other, with the slot time marked. Events: t1, A starts
transmission; t2, B starts transmission; t3, B detects collision and starts
Jam; t4, A completes transmission of a frame; t5, end of Jam signal from B.
A detects a new frame and discards it due to FCS error.]
Relationship of Slot Time and Frame Size (continued)
„ In the previous viewgraph, the frames sent by DTE A and
B are shorter than the slot time. As such,
neither A nor B can send its frame successfully.
This would render the operation of the network
infeasible.
„ Minimum transmission time on the network must
be at least a slot time.
Reason for Increasing the Slot Time for Gigabit Ethernet
„ The minimum frame size for 10M and 100M
Ethernet is 512 bits. The minimum transmission
time must be at least a slot time.
„ This implies that the network size of Gigabit Ethernet
would shrink to about 20 m if the slot time were kept
at 512 bits.
„ A network range of 20 m would make Gigabit
Ethernet impractical for real-world applications.
„ As such, the slot time of Gigabit Ethernet is
increased to 512 bytes (4096 bit times), not a tenfold
increase to 5120 bits.
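The reasoning above is easier to follow with the slot durations written out. The sketch below converts a slot length in bit times to nanoseconds for each speed; the function name is hypothetical. The collision round trip must fit inside this window, which is why a tenfold shorter slot implies a roughly tenfold smaller network.

```python
def slot_time_ns(rate_bps, slot_bits):
    """Slot duration in nanoseconds (exact integer arithmetic)."""
    return slot_bits * 1_000_000_000 // rate_bps


print("10 Mb/s,  512 bits :", slot_time_ns(10_000_000, 512), "ns")
print("100 Mb/s, 512 bits :", slot_time_ns(100_000_000, 512), "ns")
print("1 Gb/s,   512 bits :", slot_time_ns(1_000_000_000, 512), "ns")
print("1 Gb/s,   4096 bits:", slot_time_ns(1_000_000_000, 4096), "ns")
```

With the 4096-bit slot, Gigabit Ethernet gets a collision window close to that of Fast Ethernet instead of one ten times shorter.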
Carrier Extension for Shared Gigabit Networks
„ The minimum frame size for Gigabit Ethernet is still kept
at 64 bytes in order to be compatible with other slower
Ethernet.
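„ Had the minimum frame size instead been raised to 512 bytes
to match the new slot time, a bridge in a bridged network
would have had to convert frames between the Gigabit network
and the slower networks with their 64-byte minimum.
„ Likewise, if a server had a Gigabit link, each
minimum-size acknowledgement would be eight times longer than
necessary.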
„ IEEE802.3z decided to adopt a technique called “carrier
extension” to decouple the minimum frame length from
the slot time for Gigabit half-duplex operation.
Carrier Extension for Shared Gigabit Networks (Conti)
„ When a DTE transmits a frame at least as long as the
slot time (4096 bits), the MAC returns the “transmit done”
status to the upper layer as before.
„ If the frame length is less than the slot time, the transmit
status is withheld and the physical layer transmits a
sequence of special ‘extended carrier’ symbols until the
end of slot time.
„ These symbols are transmitted after the FCS which
delimits the frame. The special symbols are not part of the
frame and are handled in a different way at the receiver.
„ If collision occurs during the data or extended carrier
transmission, DTE will abort transmission and send a jam
signal (32 bits).
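The amount of carrier extension needed for a given frame follows directly from the 512-byte slot time. A minimal sketch, with a hypothetical function name, where the frame length is counted from DA through FCS:

```python
SLOT_TIME_BYTES = 512   # Gigabit half-duplex slot time
MIN_FRAME_BYTES = 64    # minimum frame size, unchanged from slower Ethernet


def carrier_extension_bytes(frame_bytes):
    """Extension symbols (in byte times) appended after the FCS."""
    if frame_bytes < MIN_FRAME_BYTES:
        raise ValueError("frame is shorter than the 64-byte minimum")
    return max(0, SLOT_TIME_BYTES - frame_bytes)


for size in (64, 256, 512, 1518):
    print(size, carrier_extension_bytes(size))
```

A minimum-size frame needs the maximum extension of 448 byte times, and any frame of 512 bytes or more needs none.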
Frame Format with Carrier Extension
Field                       Bytes
Preamble                    7
Start of Frame Delimiter    1
DA                          2 or 6
SA                          2 or 6
Length / Type               2
Data                        0 ~ 1500
Pad                         0 ~ 46
FCS                         4
Extension                   0-448 bytes
Length / Type: Length of Data Field for 802.3 (value <= 1500);
Type of Data Field for classical Ethernet (value >= 1536)
64 bytes minimum: DA through FCS
512 bytes minimum: DA through Extension
FCS coverage: does not include the Extension
Duration of carrier event: includes the Extension
LLC Functions
„ LLC (Logical Link Control) forms the upper half of the
Data Link layer. It hides the differences between
various kinds of 802 networks by providing a single
format and interface to the Network layer.
„ Network layer passes a packet to LLC using the LLC
access primitives. The LLC sublayer then adds an LLC
header, containing source and destination addresses,
sequence and acknowledgement numbers.
„ LLC provides three service options: unreliable
datagram service, acknowledged datagram service,
and reliable connection-oriented service.
802.3 Performance
„ Assumption: it is under heavy and constant load, k
stations are always ready to transmit.
„ For 10 Mb/s, the time slot is set to 512 bit times, or
51.2 us. It is set to accommodate the longest path
allowed by 802.3 (2.5km and four repeaters).
802.3 Performance (continued)
„ If each station transmits during a contention slot with
probability p, the probability A that some station
acquires the channel in that slot is $A = kp(1 - p)^{k-1}$
„ A is maximized when p = 1/k, with $A \to 1/e$ as
$k \to \infty$.
802.3 Performance (continued)
„ The probability that the contention interval has
exactly j slots in it is $A(1 - A)^{j-1}$, so the mean number of
slots per contention is given by
$\sum_{j=0}^{\infty} jA(1 - A)^{j-1} = 1/A$
Since each slot has a duration $2\tau$, the mean
contention interval, w, is $2\tau/A$.
802.3 Performance (continued)
„ Assuming optimal p, the mean number of
contention slots is never more than e, so w (mean
contention interval) is at most 2τe ≈ 5.4τ.
„ If the mean frame takes P sec to transmit, when
many stations have frames to transmit, channel
efficiency = P / (P + 2τ/A)
„ Since τ is dependent on the maximum cable
distance between any two stations, the longer the
cable, the longer the contention interval.
802.3 Performance (continued)
„ Substituting the frame length, F, the network
bandwidth, B, the cable length, L, and the speed of signal
propagation, c, for the optimal case of e contention
slots per frame, and with P = F / B and τ = L / c: channel efficiency
= 1/(1 + 2BLe/cF)
Note: the mean frame takes P sec to transmit
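The efficiency formula can be evaluated numerically. The sketch below assumes each of the k stations uses the optimal p = 1/k and takes the slot time (2τ) to be 512 bit times at 10 Mb/s; the function names and default parameters are assumptions for the illustration.

```python
import math


def acquisition_probability(k):
    """A = k p (1 - p)^(k - 1), evaluated at the optimal p = 1/k."""
    p = 1 / k
    return k * p * (1 - p) ** (k - 1)


def channel_efficiency(frame_bits, k, rate_bps=10e6, slot_bits=512):
    """Efficiency = P / (P + 2*tau / A), with 2*tau equal to one slot time."""
    frame_time = frame_bits / rate_bps   # P
    slot_time = slot_bits / rate_bps     # 2 * tau
    return frame_time / (frame_time + slot_time / acquisition_probability(k))


print("A for k = 2, 8, 256:",
      [round(acquisition_probability(k), 4) for k in (2, 8, 256)],
      "limit 1/e =", round(1 / math.e, 4))

for frame_bytes in (64, 128, 256, 512, 1024):
    eff = channel_efficiency(frame_bytes * 8, k=256)
    print(f"{frame_bytes:5d}-byte frames: efficiency {eff:.3f}")
```

Longer frames amortize the contention interval over more payload, which is why efficiency climbs with frame size.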
802.3 Performance (continued)
„ Virtually all performance analysis on 802.3
assumes that the traffic follows a Poisson distribution.
When research is done on real traffic data,
the network traffic turns out to be self-similar rather than
Poisson. The average number of packets in each
minute of an hour has as much variance as the
average number of packets in each second of a
minute. We will not cover self-similar traffic here.
Efficiency of 802.3 at 10M b/s with 512-bit slot time
[Figure: channel efficiency (0.3 to 1.0) versus number of stations trying to
send (1 to 256), with one curve each for 1024, 512, 256, 128 and 64 byte
frames]
Layer 2 and Layer 3 Switch
„ Layer 2 Ethernet switch appeared in 1993 when Fast
Ethernet was being developed by 802.3. Fast
Ethernet (802.3u), Full Duplex Ethernet (802.3x), and
VLAN Tagging (802.3ac) were all initiated and
executed as a result of the industry movement to
migrate high performance Ethernet from shared-medium to switching.
„ Layer 2 switch is functionally equivalent to a bridge.
Bridges perform filtering, learning and forwarding
functions in software while switches perform these
functions in hardware to increase the throughput.
Layer 2 and Layer 3 Switch
„ Multiple ports on a switch can be active
simultaneously and can operate in full or half
duplex mode with 10/100 M b/s auto-sensing on a
port-by-port basis. Full wire-speed forwarding and
learning, and VLAN tagging are implemented in
hardware.
„ Switches operate in either store-and-forward
mode (entire packet is received before forwarding
is attempted), or cut-through mode (forwarding is
commenced before entire packet is received). A switch
may also incorporate features such as forwarding based
on protocol type and broadcast domain (VLAN)
filtering.
Layer 2 and Layer 3 Switch (continued)
„ A port on the Ethernet switch that connects to a
repeater can only operate in half duplex mode
because the repeater only operates in that mode.
„ When 10 and 100 M b/s ports are equipped in a
switch, the data packet must be transmitted using
store-and-forward technique between two
different speeds because cut-through mode does
not perform speed adaptation. Vendors offer 100
M or Gigabit Ethernet switches on the market
today.
Layer 2 and Layer 3 Switch (continued)
„ A Layer 2 switch operates on the MAC
address (Layer 2) while a Layer 3 switch operates
on the network layer (Layer 3). First generation
Layer 3 switches supported a limited number of
network layer protocols that were accelerated
using hardware. IP is the primary protocol
used for corporate and WWW traffic. An IP-optimized,
hardware-assisted router is used to
handle the traffic aggregation.
Multi-layer Switch Routers
„ Multi-layer switch routers essentially combine the
functions of a Layer 2 switch and a router. Since
Layer 2 technology is used to forward packets
between ports, combinations of ports can be
treated as switched. Similarly, routing can be
enabled between ports to gain the advantages of
Layer 3 network segmentation where appropriate.
„ Switch routers are also far less expensive than
software-based routers, because they are based
on specialized hardware (ASICs), not on a
complex software architecture.
Multi-layer Switch Routers
„ In theory, switch routers can replace both switches and
routers. However, due to a difference in price-per-port,
switches will still be used in price-sensitive situations.
„ Switch routers will, however, largely replace routers,
since they have very few downsides compared with
routers.
„ Most switch routers support standard transparent
bridging for Layer 2, and support the 802.1Q standard
for private VLANs through the optical MAN.
„ Switch routers generally have a high-speed backplane
to forward traffic from one port to another at wire speed.
Multi-layer Switch Routers
„ The larger differences in switch routers are in the
Layer 3 and 4 processing. Switch routers offer the
ability to perform standards-based routing in order
to forward IP packets.
„ Two key differentiators are in the area of route
processing:
- Degree of decentralization
- Performance with advanced features enabled
Decentralization
„ Degree of decentralization is important to service
providers because it dictates whether or not there
is a single point of failure in the network. Switch
routers with highly decentralized route processing
(including redundancy and hot-swap capability)
have a big advantage here.
Performance with Advanced Features Enabled
„ A switch router must replace a router while offering
tiered services that may be provisioned and billed.
Features such as Bandwidth Provisioning, Server
Load Balancing, and Access Control Lists (ACLs) to
perform filtering and security functions are
important in sophisticated networking
environments—they should not hinder the
performance of the switch router when turned on.
„ ASICs in switch routers should perform at wire
speed regardless of the features enabled.
Application Awareness in Switch Routers
„ Application awareness in switch routers means
prioritization at Layers 2, 3, and 4, with the ability to
perform priority-based queuing at the higher layers.
Layer 2 Prioritization
„ Most switch routers support the 802.1p standard
for Layer 2 prioritization. This standard amounts
to supporting additional header information in the
Layer 2 packet (typically Ethernet). 802.1p
specifies three bits (eight levels) of priority for
Layer 2 packets.
„ This isn't actually tied to an application, however.
In fact, there are no set rules as to how the
priorities are derived and assigned to Layer 2
frame headers.
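The three 802.1p priority bits travel in the 802.1Q tag, as the top three bits of its 16-bit Tag Control Information (TCI) field, with the VLAN ID in the low twelve bits. A minimal sketch of pulling them out; the function names are hypothetical.

```python
def vlan_priority(tci):
    """Return the 3-bit 802.1p priority (0..7) from a 16-bit TCI value."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("TCI must be a 16-bit value")
    return (tci >> 13) & 0x7


def vlan_id(tci):
    """Return the 12-bit VLAN identifier from a 16-bit TCI value."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("TCI must be a 16-bit value")
    return tci & 0x0FFF


tci = 0xA00A  # binary 101 0 000000001010
print(vlan_priority(tci), vlan_id(tci))
```

How a device maps these eight levels onto its queues is a local policy decision, consistent with the point above that the standard sets no rules for deriving the priorities.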
Layer 3 Prioritization
„ Most switch routers have some form of Layer 3
prioritization, but this is typically in the form of a
partial solution. The claim of support for Resource
Reservation Protocol (RSVP) is an especially woolly
area: many vendors who cannot offer bandwidth
reservation from end to end over the wide area claim
this support. More importantly, RSVP itself doesn't
support traffic prioritization. In many situations,
networking equipment will additionally need robust
prioritization capabilities to provide the
application-acknowledged information transfer.
Layer 3 Prioritization
„ A second form of Layer 3 prioritization is referred
to as IP flow mode: the source and destination
are used in combination, which forms the basis of
the prioritization. In some situations, this can
provide a fairly decent match to application-based
prioritization; in others, it cannot.
Layer 3 Prioritization
„ Prioritizing traffic by IP flows means that a given pair
of IP addresses (source and destination) are given a
certain priority. For instance, the flow from a range of
addresses at a specific customer can be assigned
one priority, while the flow from a range of addresses
associated with a different customer can be assigned
a different (and higher) priority. This would not allow
you to offer a customer different priority levels on
traffic of differing applications. You couldn't, for
example, assign a high rate to VoIP traffic and a low
rate to HTTP traffic.
Layer 4 Prioritization
„ Layer 4 is the key to application-aware
networking.
„ Using IP as an example, Layer 4 is based on a
transport port (often referred to as a "socket") that
is generally assigned by application. There are
many Layer 4 protocols used in IP, but two very
common ones are TCP and UDP; TCP is
connection-based, whereas UDP is
connectionless.
Layer 4 Prioritization
„ Specific Layer 4 port definitions are outlined in
RFC 1700; for example, ports 20 and 21 are for file
transfer (FTP data and control), 25 is for e-mail (SMTP), 80 is
for web browsing (HTTP), etc.
„ Switch routers that can interrogate Layer 4
information can perform intelligent, application-aware
prioritization. They can prioritize in this way
regardless of whether or not multiple applications
are running on the same server.
Layer 4 Prioritization
„ Layer 4 classification can be combined with
Layer 3 information and Type of Service (ToS)
bits to provide granular classification of data
flows to specific priority levels. This is also known
as class-based queuing.
„ These classifications are usually defined using
Quality of Service Access Control Lists (QoS
ACL).
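A rough illustration of this kind of classification: the Python sketch below matches a packet against an ordered list of QoS ACL entries built from Layer 3 addresses, the Layer 4 protocol and destination port, and a ToS value. The rule format, the field names, the priority numbering (0 to 3, higher is more urgent) and the example addresses are all assumptions made for this sketch, not the behaviour of any particular switch router.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    protocol: str      # "tcp" or "udp"
    dst_port: int
    tos: int = 0


@dataclass(frozen=True)
class AclRule:
    """One hypothetical QoS ACL entry; a field left as None matches anything."""
    priority: int
    src_net: str = "0.0.0.0/0"
    dst_net: str = "0.0.0.0/0"
    protocol: Optional[str] = None
    dst_port: Optional[int] = None
    tos: Optional[int] = None

    def matches(self, pkt: Packet) -> bool:
        return (
            ip_address(pkt.src) in ip_network(self.src_net)
            and ip_address(pkt.dst) in ip_network(self.dst_net)
            and self.protocol in (None, pkt.protocol)
            and self.dst_port in (None, pkt.dst_port)
            and self.tos in (None, pkt.tos)
        )


def classify(pkt: Packet, rules, default: int = 0) -> int:
    """Return the priority of the first matching rule, or the default."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.priority
    return default


# Example rule set (illustrative values only).
RULES = [
    # SIP signalling from one customer range, standing in for VoIP traffic.
    AclRule(priority=3, src_net="10.1.0.0/16", protocol="udp", dst_port=5060),
    AclRule(priority=2, protocol="tcp", dst_port=25),   # SMTP
    AclRule(priority=1, protocol="tcp", dst_port=80),   # HTTP
]

voip = Packet("10.1.2.3", "192.0.2.10", "udp", 5060)
web = Packet("10.1.2.3", "192.0.2.10", "tcp", 80)
print(classify(voip, RULES), classify(web, RULES))
```

The two example packets share the same source and destination addresses, so a pure Layer 3 flow rule could not tell them apart; the port field is what separates them here.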
Layer 4 Prioritization
„ For switch routers servicing customers accessing
the Internet (edge), a large number of ACLs need to
be supported to allow for proper SLA (service
level agreement) support, perhaps tens of
thousands per switch router.
„ For enterprise environments, a few thousand
ACLs per switch router is sufficient.
Layer 4 Prioritization
„ A measure of how robust a switch router is, and of its
ability to perform in a carrier environment, is the
number of ACLs it can support. ACL tables are usually
stored in SRAM or CAM.
Queue Prioritization
„ Two queue levels are usually enough for a wire
speed switch router, but four is ideal to offer
tiered service levels. Anything more than four is
overkill for a wire speed device.
Strict Prioritization
„ A strict-priority scheduler always forwards the
highest-priority packet that is waiting.
„ If a certain set of applications is assigned the
highest priority, and there is traffic for those
applications, then all other applications could be
starved: they will never be able to transfer
information.
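The starvation effect can be seen in a minimal sketch of a strict-priority scheduler. The class name, the four queue levels and the packet labels are assumptions for illustration only.

```python
from collections import deque


class StrictPriorityScheduler:
    """Always serves the highest-priority non-empty queue first."""

    def __init__(self, levels: int = 4):
        # Index 0 is the lowest priority, index levels-1 the highest.
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority: int, packet: str) -> None:
        self.queues[priority].append(packet)

    def dequeue(self):
        # Scan from the highest priority down; lower queues are served
        # only when every higher queue is empty.
        for queue in reversed(self.queues):
            if queue:
                return queue.popleft()
        return None


sched = StrictPriorityScheduler()
sched.enqueue(0, "http-1")          # low-priority packet arrives first
for i in range(3):
    sched.enqueue(3, f"voip-{i}")   # high-priority packets arrive later
order = [sched.dequeue() for _ in range(4)]
print(order)
```

The low-priority packet arrived first but leaves last. If high-priority traffic never stops arriving, it never leaves at all.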
Weighted Fair Queuing Prioritization
„ Policies are set such that given applications
receive a percentage of the available bandwidth.
In many situations, this actually more closely
models the real world.
„ The only downside is that this may cause output
queues to become oversubscribed.
Weighted Random Early Detection
„ Once a buffer is beginning to get full, we randomly
drop new packets so that specific flows are not
penalized, and upper layer applications do not
overreact to an excessive number of packets
dropped.
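The slide only says that packets are dropped at random once the buffer starts to fill. The sketch below assumes the commonly described RED drop curve (no drops below a minimum threshold, a linear ramp up to a maximum probability, and forced drops above the maximum threshold), with the "weighted" part modelled as per-class thresholds. All threshold values and class names are hypothetical.

```python
import random


def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Drop probability for a given average queue depth (assumed RED curve)."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical per-class profiles: (min_th, max_th, max_p) in packets.
# The low-priority class starts dropping earlier and more aggressively.
WRED_PROFILES = {
    "high": (30, 40, 0.05),
    "low": (10, 40, 0.20),
}


def should_drop(avg_queue: float, traffic_class: str, rng=random.random) -> bool:
    """Randomly decide whether to drop an arriving packet of this class."""
    min_th, max_th, max_p = WRED_PROFILES[traffic_class]
    return rng() < red_drop_probability(avg_queue, min_th, max_th, max_p)
```

Passing `rng` as a parameter makes the decision repeatable in tests; a real device would apply the same idea in hardware against a running average of queue depth.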
Intelligence in Layer 4 Switch Routers
„ A single customer generates a stream of packets.
This stream, called a flow, can be identified at
Layer 2, Layer 3 or Layer 4.
„ Each layer provides more detailed information
about the flow. The fundamental task in managing
a network is controlling these flows of traffic
through a Service Provider or MAN to the
Internet.
Intelligence in Layer 2 Switch Routers
„ Each frame is identified by the MAC address of
the source and destination devices. The ability to
control the flow is thus limited to the broadcast
domain.
„ Products that switch traffic at Layer 2 deliver high
performance but little functionality.
„ MAC address is useful in an edge device, but
once the packet has gone through a router en
route to another router, the Layer 2 information
loses importance.
Intelligence in Layer 3 Switch Routers
„ At Layer 3, flows are identified by source and
destination network addresses. The ability to
control the flow is limited to source/destination
pairs. Some switch routers operate at this level of
granularity.
„ If a client is using several applications from the
same server, Layer 3 information does not
provide visibility into each application flow, so
individual rules cannot be applied.
Intelligence in Layer 4 Switch Routers
„ Software-based routers used Layer 4 information to
set security filters to control access for network
traffic. But there is a penalty with software-based
routers: when they read more of the packet
information, performance can drop by as much as
70%, especially if security filters are enabled.
Intelligence in Layer 4 Switch Routers
„ Layer 4 coordinates communication between
network source and destination systems. Each
packet contains information that can be used to
uniquely identify the application that generated
the packet.
„ TCP and UDP headers include "port numbers"
that identify which application protocols are
included in each packet.
Intelligence in Layer 4 Switch Routers
„ In combination, the port number information in the
Layer 4 header and the source and destination
information in the Layer 3 header can be used to
apply truly fine-grained control.
„ Individual application conversation flows can be
controlled between clients and servers, and if the
switch router is fully functional, all this can be
done at wire speed.
Intelligence in Layer 4 Switch Routers
„ In combination, the port number information in the
Layer 4 header and the source and destination
information in the Layer 3 header can be used to
apply truly fine-grained control. Individual
application conversation flows can be controlled
between clients and servers, and if the switch
router is fully functional, all this can be done at wire
speed even with the security feature (via ASIC)
enabled.
Components of a Layer 2 Switch
[Figure: block diagram showing Port 1, Port 2 and Port 3, the Input
Controller, the Output Controller, the Switching Element, the Control
Process and the Switching Process.]
Functions of Switch Components
„ Input Controller functions include:
- receive data frames
- MAC layer processing
- filter out invalid frames (frames that are shorter than
64 bytes or with CRC error)
- switch between cut-through and store-and-forward
modes
- buffer incoming data while transmitting the received
frame to Control Process
- fragment the packet into cells if cell switching is
used by the switching element
Functions of Switch Components (continued)
„ Control Process functions include:
- transmission process (looks up the received DA
in the address table to determine the
destination port. If it is not found, the frame is
flooded to all ports except the one it arrived on)
- learning process (enter the new SA in the address
table and perform aging process to remove
outdated SA from the table)
- forwarding process (once destination port is
determined, perform the treatment for uni-cast,
multicast and broadcast, forward the data to the
switching element)
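The three Control Process functions can be sketched together as a small learning bridge. This is a simplified model under stated assumptions: the class and method names are invented for the sketch, the 300-second aging time is an assumed default, and multicast treatment and VLANs are left out.

```python
import time

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    def __init__(self, num_ports: int, max_age: float = 300.0,
                 clock=time.monotonic):
        self.ports = list(range(1, num_ports + 1))
        self.max_age = max_age          # seconds before an entry is aged out
        self.clock = clock              # injectable clock, handy for testing
        self.table = {}                 # MAC address -> (port, last_seen)

    def age_out(self) -> None:
        """Aging process: remove entries not refreshed within max_age."""
        now = self.clock()
        self.table = {mac: (port, seen)
                      for mac, (port, seen) in self.table.items()
                      if now - seen <= self.max_age}

    def handle_frame(self, in_port: int, src: str, dst: str) -> list:
        """Return the list of ports the frame should be sent out of."""
        self.age_out()
        # Learning process: record (or refresh) the source address.
        self.table[src] = (in_port, self.clock())
        # Transmission process: look up the destination address.
        entry = self.table.get(dst)
        if dst == BROADCAST or entry is None:
            # Unknown or broadcast destination: flood to all other ports.
            return [p for p in self.ports if p != in_port]
        out_port, _ = entry
        # Forwarding process: filter if the destination is on the same port.
        return [] if out_port == in_port else [out_port]
```

A hardware switch does the lookup in the search engine rather than a dictionary, but the learn, look up, then forward-or-flood sequence is the same.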
Functions of Switch Components (continued)
„ Output Controller functions include:
- receive packet from the switching element
- forward the packet to the destination port based
on the header information
- re-assemble the cells into packets if cell switching
is used by the switching element
- flow control monitors the output resources and
sends a signal to the switching element if congestion is
detected. The switching element will send a PAUSE
frame to the source port to suspend the data
transfer.
Single Chip Layer 2 Switch System
[Figure: block diagram of the single-chip system. Labels: 64-bit N+1
Ethernet Switch, SDRAM, SRAM, Frame Buffer, CPU, Flash, N x 10/100 Fast
Ethernet ports, 1 GE port.]
System Architecture of A Single Chip Layer 2 Switch
[Figure: internal architecture of the chip. Labels: Registers, External
SRAM, Switch Control Memory (SRAM), Frame Buffer Memory, Frame Memory
Interface, RISC-based Switch Controller, CPU Interface, Search Engine,
Frame Engine, GMAC, LED interface, N x 10/100 MACs, GMII.]
Features of Single Chip L2 Switch
„ Support N 10/100 Auto-sensing Fast Ethernet
ports with RMII interface, a single Gigabit
Ethernet port with GMII interface.
„ Full wire speed, full duplex L2 switching
„ Internal switch database maintains up to 2K MAC
addresses
Features of Single Chip L2 Switch
„ With external buffer memory, it supports up to
16K MAC addresses
„ Supports flow control (802.3x)
„ Supports 256 port-based and ID-tagged VLANs (802.1Q)
- VLAN tag insertion and extraction
Functional Description of Single Chip L2 Switch
„ When frame data is received from a MAC port, it
is temporarily stored in the MAC Rx FIFO until the
Frame Engine moves it to the chip’s external
memory one granule (128-byte-or-less fragment
of frame data) at a time.
„ The Frame Engine then issues the Search
Engine a switching request that includes the
source MAC address, the destination MAC
address, and the VLAN tag.
Functional Description of Single Chip L2 Switch
„ After the Search Engine has resolved the
address, it transfers the information back to the
Frame Engine via a switching response that
includes the destination port and frame type (e.g.
uni-cast or multicast).
Functional Description of Single Chip L2 Switch
„ The Switch Controller is designed to implement
highly efficient management functions for the
switching hardware, minimizing management
intervention during frame processing.
„ There are two modes of operation: cut-through
mode and store-and-forward mode.
Forwarding Decision Time for Fast Ethernet
„ For a workgroup switch (100/1000 switch) with
eight 100M ports (downlink) and one 1000M port
(uplink), each port can be individually configured
as full or half duplex.
„ Assuming that all links are FDX, the total bandwidth
supported is 8 x 2 x 100 + 1 x 2 x 1000 = 3,600 Mbps = 3.6 Gbps
(half of that, 1.8 Gbps, if all links are HDX)
„ Filtering and forwarding decision must be made in
a short time for cut-through mode of operation.
Forwarding Decision Time for Fast Ethernet
„ Forwarding decision time is the inter-arrival
time for minimum-size frames (64 bytes each)
at full load divided by the total number of ports.
„ Inter-arrival time between frames is the time
interval between the starts of two frames in
back-to-back transmit mode.
„ The shortest inter-arrival time is for the minimum
frame size (64 bytes).
Forwarding Decision Time for Fast Ethernet
„ One minimum frame: 64 bytes
„ Inter-packet gap (IPG): 12 bytes (96 bits)
„ Extra time (7-byte preamble + 1-byte SFD): 8
bytes
„ Bit time at 100 Mbps: 0.01 usec
„ Inter-arrival time between back-to-back 64-byte
frames is (64 + 12 + 8) x 8 x 0.01 = 6.72 usec
Forwarding Decision Time for Fast Ethernet
„ One Gigabit link equals ten 100M ports.
„ Equivalent number of 100 M ports is 8 + 10 = 18
ports
„ Forwarding decision time = 6.72 usec / 18 ≈
0.37 usec or 370 nsec.
This is plenty of time for a hardware-based ASIC
chip to make the decision.
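The arithmetic on the last few slides can be checked with a few lines of Python. The function and parameter names are chosen for this sketch; the numbers (64-byte frame, 12-byte gap, 8 bytes of preamble and SFD, one Gigabit port counted as ten 100M ports) come from the slides.

```python
def inter_arrival_time_us(link_mbps: float, frame_bytes: int = 64,
                          ipg_bytes: int = 12,
                          preamble_sfd_bytes: int = 8) -> float:
    """Time between the starts of back-to-back frames, in microseconds."""
    bits = (frame_bytes + ipg_bytes + preamble_sfd_bytes) * 8
    # A rate in Mbit/s is the same number of bits per microsecond.
    return bits / link_mbps


def forwarding_decision_time_us(fast_ports: int, gigabit_ports: int) -> float:
    """Decision budget per frame when every port runs at full load."""
    equivalent_100m_ports = fast_ports + 10 * gigabit_ports
    return inter_arrival_time_us(100) / equivalent_100m_ports


print(inter_arrival_time_us(100))            # 100 Mbps inter-arrival time
print(forwarding_decision_time_us(8, 1))     # eight 100M ports + one GE port
```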
Layer 3 Switches
„ Layer 3 switch can handle Layer 2 (Data Link) as
well as Layer 3 (Network) capabilities.
„ Layer 3 switch can switch data packets based on
the destination IP address (4 octets) contained in
the IP header field.
Layer 3 Switches
„ Destination IP address is matched against the
network address table (routing table) to determine
the output port that is associated with the next
hop.
„ A hardware-based Layer 3 switch performs the
address matching with CAM (Content
Addressable Memory) to shorten the latency
associated with a CPU-based search.
Layer 3 Switch (continued)
„ Layer 3 switching provides three key benefits for
applications in campus networks:
- Scalability: It benefits from the integration of ATM
with Layer 3 routing. Label swapping enables
ATM switches to be fully integrated into an IP-based
core network without the scalability problems of a pure
Layer 2 network.
Layer 3 Switch (continued)
- Traffic Management: Layer 3 switching simplifies
traffic management in router-based internet by
integrating Layer 2 circuit capabilities. It is able to
control the flow of packets across a Layer 2
infrastructure to support the load balancing.
- Performance: Higher performance is achieved by
simplifying the packet-forwarding and switching
decision.
Layer 3 Switch (continued)
„ Through the use of dedicated hardware such as
the network processor, which fully integrates
MAC, framer, Classification, Traffic Management,
Switch fabric control and host CPU interface into
one IC, and CAM for speedy destination IP
matching / filtering, Layer 3 switch can switch IP
data packets and process the control packets for
BGP and OSPF via CPU. This is the key in the
design of terabit routers.
Layer 3 Switch (continued)
„ IP switching, Tag switching, and Aggregate
Route-based IP switching led to MPLS
(Multiprotocol Label Switching), standardized by the
IETF. This can be used in ATM as well as IP
switching / routing.
L3 Switch Implementation
[Figure: L3 switch block diagram. Two I/O Modules, each labeled with
Ethernet Interface, Network Processor, CPU, and CAM + Frame Buffer, and a
Switch Module labeled with Silicon Switch Fabric and CPU with Memory.]
Functions of Network Processor
„ Packet assembly
„ Packet recognition, L2 / L3 classification, filtering
„ Packet queuing
„ Traffic shaping and management
„ Quality of Service, Class of Service processing
„ Packet modification and segmentation
„ Switch fabric interface
„ CPU interface (optional)
Virtual LAN (VLAN)
„ VLAN allows network operators to configure and administer a
corporate network as a single bridge-interconnected entity,
while providing users the connectivity and privacy (or security)
they expect from having multiple separate networks.
[Figure: two Ethernet Switches linked by 100BASE-TX/FX. Attached
departments: Marketing (Workgroup 1), Engineering (Workgroup 2), Human
Resource (Workgroup 3), Sales (Workgroup 1), Payroll (Workgroup 2).]
VLAN (continued)
„ A VLAN is a logical broadcast domain. Traffic sent
to the broadcast address on a specific VLAN is
forwarded only to the other ports that are members
of that VLAN, such as the Engineering Department.
„ Most Ethernet switches (regardless of speed)
support multiple VLANs.
„ Identification of the VLAN membership is
provided by VLAN tagging. A VLAN tag is inside
the MAC frame.
VLAN (continued)
„ VLAN association can be performed using
different policies. A station can be identified as
belonging to a particular VLAN by the port on a
switch that it is connected to. This is “port” based
VLAN. Another policy could use the station
address, so that a station can be blocked from joining
another VLAN or from forwarding data to a member of
that VLAN.
VLAN-Tagged MAC Frame Format
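Frame fields, in transmission order:
Pre | SFD | DA | SA | VLAN Tag (4 bytes) | Length/Type (2 bytes) | Data + Pad (46 ~ 1500 bytes) | FCS
VLAN Tag = 802.1Q Tag Type (2 bytes, Tag Value = 0x81-00) + Tag Control Information (2 bytes)
Tag Control Information:
Byte 1 (most significant byte): bits 7-5 User Priority, bit 4 CFI, bits 3-0 VLAN ID (upper 4 bits)
Byte 2 (least significant byte): bits 7-0 VLAN ID (lower 8 bits)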
CFI: Canonical Format Indicator used by token ring
This bit is not used by 802.3 device and should be sent and received as 0
VLAN-Tagged MAC Frame Format (continued)
„ 4-byte VLAN Tag consists of Tag Type ID (2 bytes)
and Tag Control Information (2 bytes). The Tag is
inserted between SA and Length/Type fields of the
MAC frame. CRC must be recalculated any time a
VLAN Tag is inserted or removed.
„ The legal MAC frame size is modified to allow from
64 (minimum frame size) to 1522 bytes. When the
VLAN Tag is not present, the maximum MAC frame
size is still 1518 bytes.
VLAN-Tagged MAC Frame Format (continued)
„ Three fields are defined for the Tag Control
Information:
- 3-bit user priority allows up to 8 levels of priority
(7 is the highest) to support “class-of-service”.
- 1-bit CFI is not used by 802.3 device. Instead, it is
used by the token ring vendors.
- 12-bit VLAN ID
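The tag layout can be made concrete with a short sketch that packs and unpacks the 4-byte tag. The helper names are invented for this example; the field widths and the 0x8100 tag type come from the slides. Inserting the tag into a frame is shown without the FCS, which has to be recalculated afterwards.

```python
import struct

TPID_8021Q = 0x8100  # 802.1Q Tag Type value


def build_vlan_tag(priority: int, vlan_id: int, cfi: int = 0) -> bytes:
    """Pack Tag Type (2 bytes) + Tag Control Information (2 bytes)."""
    if not 0 <= priority <= 7:
        raise ValueError("user priority is a 3-bit field")
    if cfi not in (0, 1):
        raise ValueError("CFI is a 1-bit field")
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID is a 12-bit field")
    tci = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> dict:
    """Unpack a 4-byte VLAN tag into its three TCI fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "cfi": (tci >> 12) & 1,
            "vlan_id": tci & 0x0FFF}


def insert_vlan_tag(frame_without_fcs: bytes, priority: int,
                    vlan_id: int) -> bytes:
    """Insert the tag after DA (6 bytes) and SA (6 bytes).

    The caller must recompute the FCS over the modified frame.
    """
    tag = build_vlan_tag(priority, vlan_id)
    return frame_without_fcs[:12] + tag + frame_without_fcs[12:]


tag = build_vlan_tag(priority=5, vlan_id=100)
print(tag.hex(), parse_vlan_tag(tag))
```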
VLAN Administration
„ VLAN association (or administration) can be
based on port, MAC address, subnet and protocol
fields. The majority of switch vendors support
port-based VLANs.
„ Separate mapping tables are maintained and
updated periodically in the SRAM to support
different VLAN associations.
VLAN Administration
„ A null VLAN ID (VID) indicates that the tag contains
no VID information, only the priority information.
This is referred to as a priority-tagged frame. A
VLAN-aware bridge or switch will forward this
frame only after either classifying an appropriate
TCI at the output port, or stripping the VLAN tag
and retransmitting the frame untagged.
LAN / MAN Management
„ The network management system consists of a
“manager” which executes the managing process,
an “agent” which interacts with the manager and
provides an interface to the resource to be
managed, and the “managed objects” which
reside in a local system.
„ A managed object is a resource, which can be a
physical device or a logical construct or function.
A managed object provides a means to identify,
control or monitor a resource. An agent resides in
a local device and collects the information from
the managed objects.
LAN / MAN Management
„ The agent, acting on an external network command
(from the manager), can query the managed
object.
„ A management protocol is used to monitor or
control devices and to get information on the
managed objects.
„ SNMP uses a standard object definition language,
Abstract Syntax Notation One (ASN.1), together with
its encoding rules.
Interaction between Manager, Agent, and Objects
[Figure: the Manager at the Management Station communicates management
operations to the Agent and receives notifications from it over the SNMP
protocol. Within the Local System Environment, the Agent performs
management operations on the Managed Objects, which emit notifications.]
Local System Environment is a Managed Node.
SNMP
„ ASN.1 abstract syntax is essentially a primitive data declaration
language. It allows the user to define primitive objects and then combine
them into more complex ones.
„ The ASN.1 basic data types allowed in SNMP are shown in the following:
Primitive Type    | Meaning                              | Code
INTEGER           | Arbitrary length integer             | 2
BIT STRING        | A string of 0 or more bits           | 3
OCTET STRING      | A string of 0 or more unsigned bytes | 4
NULL              | A place holder                       | 5
OBJECT IDENTIFIER | An officially defined data type      | 6
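The codes in the table are the tag numbers that appear on the wire. As a hedged illustration, the sketch below encodes three of the primitive types as tag-length-value triples in the style of the ASN.1 Basic Encoding Rules; the function names are invented for this example and BIT STRING and OBJECT IDENTIFIER are left out for brevity.

```python
def encode_length(n: int) -> bytes:
    """Short form for lengths below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Tag octet, length octets, then the value octets."""
    return bytes([tag]) + encode_length(len(value)) + value


def encode_integer(n: int) -> bytes:
    """INTEGER (code 2): shortest two's-complement representation."""
    magnitude = n if n >= 0 else ~n
    size = magnitude.bit_length() // 8 + 1
    return encode_tlv(2, n.to_bytes(size, "big", signed=True))


def encode_octet_string(data: bytes) -> bytes:
    """OCTET STRING (code 4): the bytes as they are."""
    return encode_tlv(4, data)


def encode_null() -> bytes:
    """NULL (code 5): a placeholder with zero-length content."""
    return encode_tlv(5, b"")


print(encode_integer(5).hex())
print(encode_octet_string(b"public").hex())
print(encode_null().hex())
```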
SNMP MIB
„ The collection of all possible objects in a network
is given a data structure called the Management
Information Base (MIB). A MIB specifies the
different counters, status events, alarms, and
notifications for each managed object. Clause 30
of IEEE 802.3z provides a standard for defining
the MIB.
SNMP MIB
„ Clause 30 defines MIB objects, attributes,
notifications, and behavior for:
- 10 Mb/s DTE, 10 Mb/s baseband repeater and 10
Mb/s integrated MAU
- 100 Mb/s DTE, 100 Mb/s baseband repeater and
100 Mb/s PHY
- 1000 Mb/s DTE, 1000 Mb/s baseband repeater
and 1000 Mb/s PHY
SNMP Protocol
„ The SNMP manager sends a request to an agent asking it for information or
commanding it to update its state. The agent just replies with information or
confirms that it has updated its state. Data are sent using the ASN.1 transfer
syntax.
„ SNMP defines 7 messages that can be sent. Six messages are listed in the
following, with the 7th message being the response message:
Message          | Description
Get-request      | Request the value of one or more variables
Get-next-request | Request the variable following this one
Get-bulk-request | Fetch a large table
Set-request      | Update one or more variables
Inform-request   | Manager-to-manager message describing local MIB
SnmpV2-trap      | Agent-to-manager trap report
MAN
„ As we mentioned earlier, 802.6 DQDB and
SMDS (which is based on DQDB) are not in use
today except in Europe. We will briefly discuss
their principle of operation. We will then discuss
the applications of Gigabit and 10G Ethernet in the
MAN in more detail.
MAN
„ The majority of MANs are based on SONET
(Synchronous Optical Network), which is TDM
based and is not the most efficient way to carry
IP or Ethernet traffic, which is asynchronous.
„ In particular, until recently, each SONET rate above
STS-3 was 4 times the previous one (STS-3, STS-12,
STS-48, etc.). This leads to inefficient use of SONET
payloads.
MAN
„ For example, IP or Ethernet traffic that would fit into
an STS-20 must be carried in an STS-48 payload.
This issue is being addressed with the
introduction of a more flexible payload, STS-n,
where n can be any integer up to the SONET line
interface rate.
„ Each STS-n can be transferred over a different
path.
DQDB Dual Bus Architecture
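[Figure: Nodes 1 to 4 are attached to two unidirectional buses. Bus A
runs from the Head of Bus A (S) to its end (E); Bus B runs in the
opposite direction from the Head of Bus B (S) to its end (E).]
S: Start of Data Flow
E: End of Data Flow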
Both buses operate simultaneously. The aggregate
capacity of the network is twice the transmission
rate of one bus.
DQDB Slot Format
„ A DQDB slot is 53 octets, which is divided into
- 1 byte: Access Control
- 4 bytes: Segment Header
- 48 bytes: Segment Payload
„ This format is similar to that of ATM cell.
Node not queued to send on Bus A
[Figure: an Access Unit (AU) sits between Bus A and Bus B and holds a
Request Counter (RQ). RQ counts requests on Bus B (+) and cancels one
request for each QA (queued arbitrated) slot on Bus A (-).]
For each REQ that passes the AU on Bus B, RQ is
incremented by one. RQ counter is decremented
by one for each empty QA (Queued Arbitrated) slot
that passes on Bus A.
Node queued to send on Bus A
[Figure: an Access Unit (AU) queued to send on Bus A. The Request Counter
(RQ) counts requests on Bus B (+). Its count is dumped to the Countdown
Counter (CD) to join the queue, and one request is cancelled for each QA
(queued arbitrated) slot on Bus A (-).]
Node queued to send on Bus A
„ When the AU issues a REQ on Bus B, it transfers
the content of its RQ to CD and resets RQ to
zero. This initializes CD with the number of
downstream segments queued ahead of this AU’s
segment. Transmission is allowed when CD
equals zero.
„ For example, suppose the value of RQ is 2. When this
is transferred to CD, it forces the AU to let 2 empty
slots pass, which were reserved by
downstream AUs, before this AU can transmit.
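The two counters can be modelled in a few lines. This is a simplified sketch of one access unit sending on Bus A only; the class and method names are invented for the example, and slot formats, priorities and bandwidth balancing are ignored.

```python
class AccessUnit:
    """Distributed-queue counters of one AU for traffic sent on Bus A."""

    def __init__(self):
        self.rq = 0          # Request Counter
        self.cd = 0          # Countdown Counter
        self.queued = False  # True while a segment is waiting to be sent

    def request_seen_on_bus_b(self) -> None:
        """A REQ from a downstream AU passes on Bus B."""
        self.rq += 1

    def queue_segment(self) -> None:
        """Join the queue: dump RQ into CD and reset RQ to zero."""
        self.cd = self.rq
        self.rq = 0
        self.queued = True

    def empty_slot_on_bus_a(self) -> bool:
        """An empty QA slot passes on Bus A; return True if this AU uses it."""
        if self.queued:
            if self.cd == 0:
                self.queued = False
                return True      # our turn: transmit in this slot
            self.cd -= 1         # slot belongs to a segment queued ahead of us
            return False
        if self.rq > 0:
            self.rq -= 1         # slot serves one outstanding downstream request
        return False


au = AccessUnit()
au.request_seen_on_bus_b()
au.request_seen_on_bus_b()       # RQ is now 2
au.queue_segment()               # CD = 2, RQ = 0
results = [au.empty_slot_on_bus_a() for _ in range(3)]
print(results)
```

With RQ at 2 when the segment is queued, the AU lets two empty slots go by and transmits in the third, which matches the example on the slide.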
IP Over SONET
„ Packet-Over-SONET/SDH (POS) is an emerging
technology for carrying IP and other data traffic over
the SONET/SDH backbone.
„ Variable length data packets are mapped directly
into the SONET Synchronous Payload Envelope
(SPE). It may be used in layer 2 switches or layer 3
switches/routers depending on the specific
implementation.
„ POS provides reliable, high capacity, point-to-point
data links using the SONET physical layer
transmission standards.
IP Over SONET
„ Mapping into SONET using the Point-to-Point
Protocol (PPP) was standardized in RFC 1619
(since superseded by RFC 2615).
„ SONET is a world-wide standardized transmission
protocol for implementing a robust, scalable transport
mechanism with industry standardized interfaces. It
provides a standard operating environment with
defined protocols for operations management,
“provisioning”, and performance assurance.
„ In an IP (through PPP) over SONET infrastructure,
POS links provide high bandwidth pipes that can be
used to interconnect high-speed routers.
IP Over SONET
[Figure: Access Routers connect over POS Links to Backbone Routers, which
connect over POS Links to the SONET/SDH Backbone Network.]
IP Over SONET (continued)
„ PPP provides a standard method for transporting multi-protocol
datagrams over point-to-point links. These links provide full-duplex
simultaneous bi-directional operations, and are assumed to deliver
packets in order.
„ Delineation of PPP encapsulated IP datagrams is performed using
Flag Sequence recognition and byte stuffing/de-stuffing techniques.
Flag 0x7E | Address 0xFF | Control 0x03 | Protocol (8/16 bits) | Information | Padding | FCS (16/32 bits) | Flag 0x7E
PPP HDLC-like Frame Format
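Byte stuffing keeps the flag value from appearing inside a frame. A minimal sketch, following the octet-stuffing rule of PPP in HDLC-like framing (escape octet 0x7D, then the original octet XORed with 0x20); the function names are invented for this example and FCS generation is omitted.

```python
FLAG = 0x7E
ESCAPE = 0x7D


def stuff(payload: bytes) -> bytes:
    """Escape every flag or escape octet inside the frame body."""
    out = bytearray()
    for byte in payload:
        if byte in (FLAG, ESCAPE):
            out += bytes([ESCAPE, byte ^ 0x20])
        else:
            out.append(byte)
    return bytes(out)


def destuff(data: bytes) -> bytes:
    """Reverse the escaping on the receive side."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            if i + 1 >= len(data):
                raise ValueError("escape octet at end of data")
            out.append(data[i + 1] ^ 0x20)
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def delimit(payload: bytes) -> bytes:
    """Surround the stuffed body with opening and closing flags."""
    return bytes([FLAG]) + stuff(payload) + bytes([FLAG])


body = bytes([0x7E, 0x01, 0x7D])
print(stuff(body).hex(), delimit(body).hex())
```

Because the amount of stuffing depends on how often 0x7E and 0x7D occur in the data, the overhead of this kind of framing varies from frame to frame.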
STS-1 Frame with IP
[Figure: STS-1 frame, 9 rows. 3 byte columns of transport overhead
(Section Overhead, 3 rows; Line Overhead, 6 rows) are followed by 87 byte
columns containing 1 column (1 x 9 bytes) of Path Overhead, 2 columns
(1 x 9 bytes each) of Fixed Stuff, and the IP Payload.
Line rate: 51.84 Mbps. IP payload rate: 48.384 Mbps.]
STS-3c Frame Structure
[Figure: STS-3c frame, 9 rows. 9 byte columns of transport overhead
(Section Overhead, 3 rows; Line Overhead, 6 rows) are followed by 261
byte columns containing the Path Overhead and the IP Payload.
Line rate: 155.52 Mbps. IP payload rate: 149.76 Mbps.]
STS-48c Frame Structure
[Figure: STS-48c frame, 9 rows. 144 byte columns of transport overhead
(Section Overhead, 3 rows; Line Overhead, 6 rows) are followed by 4176
byte columns containing the Path Overhead, 15 columns of Fixed Stuff, and
4160 columns of IP Payload.
Line rate: 2.48832 Gbps. IP payload rate: 2.39616 Gbps.]
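The rates quoted in the three frame figures follow from the column counts. The sketch below reproduces them, assuming a 9-row frame sent 8000 times per second, one Path Overhead column, 2 fixed-stuff columns for STS-1 and N/3 - 1 fixed-stuff columns for a concatenated STS-Nc. That fixed-stuff rule is an assumption of this sketch; it matches the STS-3c and STS-48c figures but is not stated on the slides.

```python
ROWS = 9
FRAMES_PER_SECOND = 8000


def column_rate_bps(columns: int) -> int:
    """Bit rate carried by a given number of byte columns."""
    return columns * ROWS * 8 * FRAMES_PER_SECOND


def sts_rates(n: int):
    """Return (line rate, IP payload rate) in bit/s for STS-1 or STS-Nc."""
    total_columns = 90 * n
    transport_overhead = 3 * n
    spe_columns = total_columns - transport_overhead
    path_overhead = 1
    # Assumed rule: 2 fixed-stuff columns for STS-1, N/3 - 1 for STS-Nc.
    fixed_stuff = 2 if n == 1 else n // 3 - 1
    payload_columns = spe_columns - path_overhead - fixed_stuff
    return column_rate_bps(total_columns), column_rate_bps(payload_columns)


for n in (1, 3, 48):
    line, payload = sts_rates(n)
    print(n, line / 1e6, payload / 1e6)   # rates in Mbit/s
```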
SONET Hierarchy
IP Over SONET Protocol Stack
„ IP datagrams are encapsulated into PPP packets, which are then
framed into POS Frames using HDLC-like framing according to RFC
1662, and finally, mapped byte synchronously into the SONET SPE.
Layer           | Protocol     | Function
Network Layer   | IP           | Datagrams
Data Link Layer | PPP          | Protocol encapsulation, error control, link initialization
Data Link Layer | HDLC framing | PPP packet delineation
Physical Layer  | SONET        | Byte delineation
IP over SONET Protocol Stack
Ethernet Over SONET
„ A company that wants to connect its head office and branch offices
to the same LAN faces an interconnection problem.
„ Interfacing Ethernet to the WAN provided by the
Telco/PTT has historically required an inter-working protocol,
as Ethernet is not directly supported over the SONET/SDH
network.
„ Two framing formats are used to encapsulate the
MAC frame. One is based on ITU-T X.86 Link Access
Procedure – SDH (LAPS), while the other is based on ITU-T
G.7041 Generic Framing Procedure (GFP). The majority of vendor
equipment adopts the GFP framing format. Both framing
formats are implemented in dedicated silicon. X.86 is mainly
used in Europe.
Public Transport Network Infrastructure
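[Figure: layered transport infrastructure. Services: Voice, Data (IP,
IPX), SAN, Video. Client signals and adaptation layers: Ethernet*,
Private Lines, DVI*, POS, FICON*, ESCON*, FR, RPR, Fiber Channel*, X.86,
HDLC*, ATM, GFP. Transport: SONET / SDH over WDM / OTN over Fiber. One
path in the figure is marked "Under study".]
* : These types of traffic may also run directly over fiber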
Ethernet over SONET Using LAPS Framing
Field                    | Size
Flag (0x7E)              | 1 octet
Address (SAPI, 0x0C)     | 1 octet
Control (0x03)           | 1 octet
Destination Address (DA) | 6 octets
Source Address (SA)      | 6 octets
Length/Type              | 2 octets
MAC Client data + PAD    | 46-1500 octets
FCS of MAC               | 4 octets
FCS of LAPS              | 4 octets
Flag (0x7E)              | 1 octet
Octets within the frame are transmitted from top to bottom. Bits within an
octet are transmitted from left to right (Bit 8, MSB, to Bit 1, LSB).
The LAPS format which encapsulates the IEEE 802.3 MAC frame (DA through FCS
of MAC, shown in the shaded area of the original figure)
Functions Performed by the LAPS
„ Rate Adaptation is done by sending sequence(s)
of {0x7d, 0xdd} during transmit process. The
receive entity will remove the Rate Adaptation
octet(s) "0xdd" within the LAPS frame when
detecting sequence(s) of {0x7d, 0xdd}.
„ LAPS Transmit Processing
„ LAPS Receive Processing
Functions Performed by the LAPS
„ Error Frame Handling supports two options for
aborting an erroneous frame:
- The first option is to abort a frame by inserting the
abort sequence, 0x7d7e.
- As a second option, the LAPS entity can abort
an erroneous frame by simply inverting the FCS
bytes to generate an FCS error.
Ethernet Over SONET Using GFP Framing
„ Generic Framing Procedure (GFP) is a protocol
for mapping packet data into an octet-synchronous
transport such as SONET.
„ Unlike HDLC-based protocols, GFP does not use
any special characters for frame delineation.
Instead, it has adapted the cell delineation
protocol used by ATM to encapsulate variable
length packets.
Ethernet Over SONET Using GFP Framing
„ A fixed amount of overhead is required by the GFP
encapsulation that is independent of the contents of
the packets.
„ In contrast to HDLC whose overhead is data
dependent, the fixed amount of GFP overhead per
packet allows deterministic matching of bandwidth
between the Ethernet stream and the virtually
concatenated SONET stream.
„ GFP and virtual concatenation must work with LCAS
(Link Capacity Adjustment Scheme) and a
distributed control plane (e.g. GMPLS) to make
SONET more efficient.
Ethernet Over SONET Using GFP Framing
„ The GFP overhead consists of up to 3 headers:
- a Core header containing the packet length and
a CRC which is used for packet delineation;
- a Type header identifying the payload type;
- an Extension header, which is optional.
„ Frame delineation is performed on the core
header. The core header contains the two byte
packet length and a CRC. The receiver would
hunt for a correct CRC and then use the received
packet length to predict the location of the start of
the next packet.
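The hunt-then-predict delineation described above can be sketched as follows. Several things here are assumptions of the sketch: the CRC-16 generator x^16 + x^12 + x^5 + 1 with a zero initial value is used for the core header check, core-header scrambling and single-bit error correction are left out, and the length field is taken to count everything that follows the 4-byte core header. The function names are invented for the example.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0x0000) -> int:
    """Bitwise CRC-16, most significant bit first."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def core_header(payload_len: int) -> bytes:
    """2-byte length followed by a 2-byte CRC over that length."""
    pli = payload_len.to_bytes(2, "big")
    return pli + crc16(pli).to_bytes(2, "big")


def header_ok(stream: bytes, pos: int) -> bool:
    """True if the 4 bytes at pos form a length with a matching CRC."""
    if pos + 4 > len(stream):
        return False
    return crc16(stream[pos:pos + 2]) == int.from_bytes(stream[pos + 2:pos + 4], "big")


def hunt(stream: bytes, start: int = 0) -> int:
    """Hunt byte by byte for a position where the header CRC checks out."""
    for pos in range(start, len(stream) - 3):
        if header_ok(stream, pos):
            return pos
    return -1


def extract_payloads(stream: bytes) -> list:
    """Delineate frames: hunt once, then follow the length fields."""
    payloads = []
    pos = hunt(stream)
    while pos != -1 and pos + 4 <= len(stream):
        if not header_ok(stream, pos):
            pos = hunt(stream, pos + 1)   # lost alignment: hunt again
            continue
        length = int.from_bytes(stream[pos:pos + 2], "big")
        payloads.append(stream[pos + 4:pos + 4 + length])
        pos += 4 + length                 # predicted start of the next frame
    return payloads


stream = b"\xff" + core_header(4) + b"ABCD" + core_header(5) + b"EFGHI"
print(hunt(stream), extract_payloads(stream))
```

No byte in the payload needs escaping, so the overhead per frame is fixed regardless of the data carried.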
GFP Encapsulation Format
Ethernet MAC Frame (octets): Preamble (7) | SFD (1) | DA (6) | SA (6) | Length / Type (2) | MAC client Data | Pad | FCS (4)
GFP Frame (octets): PLI (2) | cHEC (2) | Type (2) | tHEC (2) | GFP Extension Header (0 - 60) | GFP Payload
Bits within each octet are numbered 0 to 7.
Frame-Based GFP
„ Within GFP, there are two different mapping
modes defined: frame based mapping and
transparent mapping. Each mode is optimized for
providing different services.
„ Frame based GFP is used for connections where
efficiency and flexibility are key. In order to
support the frame delineation mode utilized within
GFP, the frame length must be known and
prepended to the head of the packet. In many
protocols, this forces a store-and-forward
encapsulation architecture in order to buffer the
entire frame and determine its length.
Frame-Based GFP
„ This buffering may add undesirable latency. Frame
based GFP is good for sub-rate services and
statistically multiplexed services, as the entire overhead
associated with the line coding and inter-packet gap
(IPG) is discarded and not transported.
Transparent GFP
„ Transparent GFP is useful for applications that
are sensitive to latency or for unknown physical
layers. In this encapsulation, all code words from
the physical interface are transmitted. Currently,
only physical layers that use 8B/10B encoding are
supported.
„ In order to increase efficiency, the 8B/10B line
code is transcoded into a 64B/65B block code
and then the block codes are encapsulated into
fixed-size GFP packets.
Transparent GFP
„ This coding method is primarily targeted at Storage
Area Networks (SANs) where latency is very important
and the delays associated with frame based GFP
cannot be tolerated.
Reference for GFP
„ IEEE Communications Magazine, May 2002, Vol.
40, No. 5
“GFP and Data over SONET/SDH and OTN”
IP Over Fiber (DWDM)
„ SONET is a Physical Layer technology, which schedules the
IP packets to be transported by way of time division
multiplexing and provisioning. SONET was primarily
designed for voice-only systems.
IP
ATM
SONET
DWDM Layer
Protocol Stack for SONET over DWDM
Problems and Overhead with SONET
„ SONET is divided into four layers: Path, Line, Section, and Photonic.
„ SONET relies on overhead bytes in the Path, Line, and Section layers to
perform restoration in case of failure. The payload in currently installed
SONET systems gives low utilization (efficiency) for IP traffic, which
is bursty.

Data Bit Rate              SONET Rate   Effective Payload Rate   Bandwidth Efficiency
10 Mbit/s Ethernet         STS-1        ~48.4 Mbit/s             21%
100 Mbit/s Fast Ethernet   STS-3c       ~150 Mbit/s              67%
1 Gbit/s Ethernet          STS-48c      ~2.4 Gbit/s              42%
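The efficiency column is simply the Ethernet data rate divided by the effective payload rate of the SONET container that carries it. A minimal sketch, using the approximate payload rates quoted on the slide (the function and variable names are illustrative assumptions):

```python
def bandwidth_efficiency(data_rate_mbps: float, payload_rate_mbps: float) -> float:
    """Fraction of the SONET payload actually used by the Ethernet stream."""
    return data_rate_mbps / payload_rate_mbps


# (service, SONET container, data rate in Mbit/s, approx. payload rate in Mbit/s)
mappings = [
    ("10 Mbit/s Ethernet", "STS-1", 10.0, 48.4),
    ("100 Mbit/s Fast Ethernet", "STS-3c", 100.0, 150.0),
    ("1 Gbit/s Ethernet", "STS-48c", 1000.0, 2400.0),
]

for service, container, data_rate, payload_rate in mappings:
    pct = 100 * bandwidth_efficiency(data_rate, payload_rate)
    print(f"{service:26s} in {container:8s}: {pct:.0f}%")
```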
IP Over DWDM
„ SONET is expensive. It requires time-consuming
“provisioning” before it can be put into
service to carry traffic.
„ Multi-layer structure of SONET provides
redundancy but presents functional overlapping in
restoration. It also introduces undesired latency
caused by framing and payload mapping.
„ Transmitting IP directly over DWDM systems can
increase the bandwidth and reduce the latency.
IP Over DWDM
„ DWDM system performs satisfactorily at high
speeds of OC-192 (10 Gb/s). The overheads
associated with ATM and SONET can be
eliminated.
„ With proper design, the new system (e.g. RPR)
can facilitate faster restoration, provisioning, and
path determination.
„ So, we have an optical IP transport system.
ATM Basics
„ Asynchronous Transfer Mode (ATM) is a
connection-oriented, cell-based switching
technology that uses 53-byte cells to transport
information.
„ ATM does not transmit cells asynchronously, as
the name suggests. ATM cells are transmitted
continuously and synchronously, with no break
between cells. When no user information is
transmitted, empty or idle cells are sent instead.
ATM Basics
„ The asynchronous nature of ATM comes from the
indeterminate time when the next information unit
of a logical connection may start. Time not used
by one logical connection may be given to other
connections or filled with idle cells. This means
that cells for any given connection arrive
asynchronously.
„ A small, fixed cell size allows simpler
hardware implementation, efficient memory usage
for buffering, and efficient transport of constant,
low-bit-rate information such as voice.
ATM and B-ISDN Relationship
„ ATM is the foundation technology for Broadband ISDN (B-ISDN)
„ B-ISDN is the universe of services that will be
made possible by the use of ATM technology
VOICE
DATA
VIDEO
Broadband Protocol Model
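[Figure: B-ISDN protocol reference model with three planes: Control Plane, User Plane, and Management Plane. From top to bottom the layers are the Higher Layer Protocols / Service Layers (upper Layer 2), the AAL and ATM layer (Layer 2), and Layer 1 (PDH, SONET/SDH). Traffic types shown above the AAL: Signaling (VBR), CO (VBR), CBR, and Other VBR. Example services named in the figure: DS1, DS3, Voice, VBR Video, Frame Relay, X.25, and Other Services.]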
Functions of ATM Layers
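[Figure: ATM cells travel from an End Station (AAL / ATM / PHY) through an ATM Switch (PHY / ATM / PHY) to another End Station (PHY / ATM / AAL). The AAL exists only in the end stations; the switch operates at the ATM and physical layers.]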
• ATM Adaptation Layer (AAL): Inserts/extracts information
into 48 byte payload
• ATM Layer: Adds/removes 5 byte header to payload
• Physical Layer: Converts to appropriate electrical
or optical format
ATM Protocol Stack
Higher Layers (ISO/OSI model: Layer 3, Network; MAC) - not part of ATM
   --- AAL-SAP (Service Access Point) ---
ATM Adaptation Layer (AAL) (OSI Layer 2, Link)
   Service Specific Functions (SSCS)
   • Provide additional functions as required for specific services (can be null)
   Common Part Convergence Sublayer (CPCS)
   • Builds header and trailer records onto user data frame
   • Assures integrity at the frame level
   --- Sublayer boundary ---
   Segmentation and Reassembly (SAR)
   • Converts CPCS frames into cells
   • Adds cell headers and trailers to provide integrity at the cell level
   --- Service Access Point (SAP) ---
ATM Layer
   • Cell switching
Physical Layer (OSI Layer 1, Physical)
   Transmission Convergence Sublayer
   • HEC generation and checking
   • Transmission frame adaptation
   • Cell delineation
   • Decoupling of cell rate (ITU systems)
   Physical Media Dependent Sublayer
   • Encoding for transmission
   • Timing and synchronization
   • Transmission (electrical/optical)
Comparison of ATM with other Technologies
                       CONVENTIONAL LAN    CONVENTIONAL TELECOM    ATM
TRAFFIC TYPE           DATA                VOICE                   DATA, VOICE, VIDEO
TRANSMISSION UNIT      VARIABLE PACKET     FIXED FRAME             FIXED CELL
RATE                   UP TO GBPS          UP TO GBPS              MBPS TO GBPS
CONNECTION TYPE        CONNECTIONLESS      CONNECTION-ORIENTED     CONNECTION-ORIENTED
DELIVERY OF TRAFFIC    BEST EFFORT         GUARANTEED              DEFINED CLASSES
ACCESS                 SHARED              DEDICATED               DEDICATED
Anatomy of an ATM Cell
Bit:          8     7     6     5  |  4     3     2     1
Byte 1:       GFC (UNI) or VPI (NNI)  |  VPI
Byte 2:       VPI                     |  VCI
Byte 3:       VCI
Byte 4:       VCI                     |  PTI  |  CLP
Byte 5:       HEC
Bytes 6-53:   Payload (48 bytes)
Bytes 1 to 5 form the cell header.
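The header layout can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, only the UNI format (4-bit GFC) is shown, and the HEC rule used here (CRC-8 with generator x^8 + x^2 + x + 1 over the first four bytes, XORed with 0x55) comes from ITU-T I.432 rather than from these slides.

```python
def hec(first4: bytes) -> int:
    """CRC-8 (polynomial 0x07) over header bytes 1-4, XORed with 0x55."""
    crc = 0
    for byte in first4:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ 0x07) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc ^ 0x55


def pack_uni_header(gfc: int, vpi: int, vci: int, pti: int, clp: int) -> bytes:
    """Pack GFC(4) VPI(8) VCI(16) PTI(3) CLP(1) and append the HEC byte."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pti << 1) | clp
    first4 = word.to_bytes(4, "big")
    return first4 + bytes([hec(first4)])


def unpack_uni_header(header: bytes) -> dict:
    """Split a 5-byte UNI header back into its fields and verify the HEC."""
    word = int.from_bytes(header[:4], "big")
    return {
        "gfc": word >> 28,
        "vpi": (word >> 20) & 0xFF,
        "vci": (word >> 4) & 0xFFFF,
        "pti": (word >> 1) & 0x7,
        "clp": word & 0x1,
        "hec_ok": hec(header[:4]) == header[4],
    }


header = pack_uni_header(gfc=0, vpi=14, vci=1055, pti=0, clp=0)
cell = header + bytes(48)          # 5-byte header + 48-byte payload = 53 bytes
print(len(cell), unpack_uni_header(header))
```

The 8-bit VPI and 16-bit VCI fields are what give 256 paths at the UNI and 65,536 channels per path.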
Virtual Circuits
First we have the cable...
Next, ATM Addressing Defines Paths...
• VP’s
Then Channels.
• VC’s
SONET and ATM Channels
[Figure: SONET channelization. Each STS-1 carries Transport Overhead and Path Overhead. An STS-1 can carry a DS3, or 28 VT1.5s, each VT1.5 carrying a DS1.]
Virtual Paths & Virtual Channels
[Figure: A physical transmission link carries several virtual paths (VPs); each VP bundles several virtual channels (VCs).]
„ VPI: Virtual Path Identifier
„ 4,096 at NNI and 256 at UNI
„ VCI: Virtual Channel Identifier
„ 65,536
„ Both used to route cells through network
„ Unique on link-by-link basis
„ Interpreted at each switch
PVC - Manual Set Up
[Figure: A PVC is configured hop by hop from a console or NMS GUI. VPI/VCI values shown along the path: 14/1055, 87/45, 125/52, 9/47.]
„ Pre-established connections
„ Permanent
„ No signaling required
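A PVC amounts to a set of pre-loaded translation entries in each switch. The sketch below assumes, for illustration only, that the VPI/VCI pairs in the figure (14/1055, 87/45, 125/52, 9/47) are the successive per-link labels of one connection crossing three switches; the switch names and the table layout are hypothetical.

```python
# Per-switch translation tables, installed manually by the operator.
# Key: incoming (VPI, VCI); value: outgoing (VPI, VCI).
pvc_tables = {
    "switch_a": {(14, 1055): (87, 45)},
    "switch_b": {(87, 45): (125, 52)},
    "switch_c": {(125, 52): (9, 47)},
}


def forward(path: list, vpi: int, vci: int) -> tuple:
    """Follow a cell through the switches, swapping VPI/VCI at each hop."""
    for switch in path:
        # A missing entry means no PVC was provisioned: KeyError (cell dropped).
        vpi, vci = pvc_tables[switch][(vpi, vci)]
    return vpi, vci


egress_label = forward(["switch_a", "switch_b", "switch_c"], 14, 1055)
print(egress_label)
```

Because the labels are rewritten at every switch, they only need to be unique on each link, not across the network.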
SVC - Automatic Set Up
[Figure: Terminal A sends "Connect to B". The request is relayed switch by switch to Terminal B, and "OK" is returned hop by hop along the same path.]
„ Uses UNI 3.0/3.1 signaling
   - VPI/VCI = 0/5
„ Automatic
„ Transparent to User
ATM - Operation and Maintenance Principles
„ Fault Management, using AIS, RDI,
continuity check and loopback OAM cells.
„ Performance management, using forward
monitoring and backward reporting OAM
cells.
„ Activation/deactivation of performance
monitoring and/or continuity check, using
activation/deactivation OAM cells.
„ System management OAM cells for use by
end-systems only.
Concept -OAM
„ Operations, Administration and Maintenance (OAM)
„ ATM allows the maintenance/test operation to be
performed on a VPC or VCC.
„ These operations are performed on a selected basis; they
can span segments or can be end-to-end.
„ Types of maintenance/test operations:
„ Performance Monitoring - a VPC or VCC is monitored
to ensure the connection is not congested or has
degraded (forward and backward monitoring are
provided)
„ Failure detection (AIS, RDI)
„ PM and Failure Reporting (RDI, PM results)
„ Facility Protection of VPCs
„ Fault Isolation (continuity checks and loopbacks)
Operation and Maintenance Flows
„ Physical Layer Mechanism
   - F1: SONET Section Level
   - F2: SONET Line Level
   - F3: SONET Path Level
„ ATM Layer Mechanism
   - F4: Virtual Path Level
      • End-to-end F4 flow
      • Segment F4 flow
   - F5: Virtual Channel Level
      • End-to-end F5 flow
      • Segment F5 flow
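ATM Fault Management Example Using F1 - F5 Flows
[Figure: A loss of signal (LOS) occurs between a terminal and a repeater on a path that runs through section, line and path terminating equipment (STE, LTE, PTE), an ADM and two ATM switches carrying a VP and a VC. Downstream of the fault the alarm is passed up through the layers: F1, F2 (AIS-L), F3 (AIS-P), F4 (VP-AIS), F5 (VC-AIS). In the reverse direction the corresponding defect indications are returned: F2 (RDI-L), F3 (RDI-P), F4 (VP-RDI), F5 (VC-RDI).]
RDI: Remote Defect Indicator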
Example of Mechanism for OAM Flows
[Figure: A VCC between two VCC endpoints (AAL / ATM / PL stacks) crosses a physical layer connecting point, a VP cross-connect and a VC cross-connect. VCI values (VCI 1, VCI 2) and VPI values (VPI 1, VPI 2, VPI 3) change along the way.
 F5: virtual channel OAM cells are indicated by the PT identifier.
 F4: the virtual path connection (VPC) uses VCI = 3/4 for OAM.
 F3: transmission path.
 F1, F2: lower physical-layer flows within each transmission path.]
Layered Model of AIS & RDI
Layer       AIS (layer-to-layer indications)   RDI (peer-to-peer indications)   Flow
VC          VC-AIS                             VC-RDI                           F5
VP          VP-AIS                             VP-RDI                           F4
PATH        AIS-P                              RDI-P                            F3
LINE        AIS-L                              RDI-L                            F2
SECTION     -                                  BIP-8 PM                         F1
PHYSICAL
The ATM Adaptation Layer
The AAL process is the most important feature of the ATM
Communications process...
How the Adaptation process is carried out depends on the
type of service to be transported...
AAL TYPE   SERVICE TYPE                                COMMENTS
AAL1       Constant Bit Rate (CBR)                     Isochronous traffic like DS0, DS1s, DS3s to carry voice
AAL2       Variable Bit Rate (VBR)                     For data services, compressed audio / video, etc.
AAL3       Connection-oriented for VBR data transfer   Bursty data over long periods
AAL4       Connectionless VBR data transfer            For short, bursty data (SMDS…)
AAL5       Simplified AAL                              Mainly for point-to-point
Classes of ATM Service
                                   CLASS A      CLASS B      CLASS C        CLASS D
Timing relation between
source & destination               Required     Required     Not required   Not required
Bit rate                           Constant     Variable     Variable       Variable
Connection mode                    Connection oriented (Classes A, B, C)    Connectionless
AAL types                          1            2            3/4, 5         3/4
The AAL Process
AAL is divided into two sublayers:
1) CONVERGENCE SUBLAYER (CS)
2) SEGMENTATION & REASSEMBLY SUBLAYER (SAR)
[Figure: USER INFORMATION passes through the CS process to form a CS-PDU; the SAR process then cuts the CS-PDU into a series of SAR-PDUs.]
These two sublayers convert the user information
into 48-byte cell payloads. Each sublayer produces
a Protocol Data Unit (PDU).
The CS-PDU is variable length while the SAR-PDU
is always 48 bytes.
AAL-1 Processing
SAR-PDU = Header (1 octet) + PDU Payload (47 octets)
Header:
   SN field (4 bits):  CSI (1 bit) + Sequence Count (3 bits)
   SNP field (4 bits): CRC (3 bits) + Parity (1 bit)
SN: Sequence Number
SNP: Sequence Number Protection
CSI: Convergence Sublayer Indicator
AAL-2 Processing
CPS-Packet = CPS-Packet Header (3 octets) + CPS-Packet Payload (1 to 45/64 octets)
CPS-PDU    = Start Field (1 octet) + CPS-PDU Payload (up to 47 octets, plus pad)
ATM Cell   = Cell Header (5 octets) + CPS-PDU
Each AAL2 user generates CPS packets with a 3-octet packet
header and a variable-length payload. The CPS sublayer collects
CPS packets from the AAL2 users multiplexed onto the same VCC
over a specified interval of time and forms a CPS-PDU: 48 octets
made up of the start field, the collected CPS packets, and any padding.
The AAL Process: AAL 3/4 CS-PDU
CS-PDU layout: CPI | BTag | BASize | Information | Pad | AL | ETag | Length
CPI: Common Part Indicator - 1 Byte
BTag: Beginning Tag - 1 Byte
BASize: Buffer Allocation Size - 2 Bytes
Information: Payload (Max: 65,535 Bytes)
Pad: Up to 3 Bytes - used to align CS-PDU length
AL: Alignment - 1 Byte
ETag: End Tag - 1 Byte
Length: 2 Bytes
AAL 3/4
Convergence Sublayer Protocol Data Unit (CS-PDU):
   CPI | BTag | BASize | AAL Service Data Unit (AAL-SDU) | Fill | AL | ETag | Length

The CS-PDU is cut into 44-byte pieces. Each piece is carried in one
Segmentation & Reassembly Protocol Data Unit (SAR-PDU):
   Header (2 Bytes):   Segment Type (2 bits) | Sequence Number (4 bits) | MID (10 bits)
   Payload:            44 Bytes
   Trailer (2 Bytes):  Length Indicator (6 bits) | CRC (10 bits)

The first SAR-PDU of a message is marked BOM, the intermediate ones COM, and the last one EOM.

BOM: Beginning of message
COM: Continuation of message
EOM: End of message
MID: Message Identifier
CRC: Cyclic Redundancy Check
BASIZE: Buffer Allocation Size
BTAG: Beginning Tag
ETAG: End Tag
The AAL Process: AAL5 CPCS-PDU
CPCS-PDU layout (unit: octets):
   CPCS-PDU Payload (1 - 65,535) | PAD (0 - 47) | CPCS-PDU Trailer
   CPCS-PDU Trailer: CPCS-UU (1) | CPI (1) | Length (2) | CRC (4)
PAD: Padding
UU: User-to-User Indication
CPI: Common Part Indicator
LENGTH: CPCS-PDU Length
CRC: Cyclic Redundancy Check
CPCS: Common Part Convergence Sublayer
AAL-5
[Figure: AAL5 processing. AAL Service Data Units (AAL5-SDUs) enter at the AAL5-SAP. The CPCS forms a CPCS-PDU from the payload (1-65,535 octets), the PAD (0-47 octets) and the trailer (CPCS-UU 1, CPI 1, Length 2, CRC 4 octets). The SAR sublayer cuts the CPCS-PDU into 48-octet SAR payloads (SAR-PDUs), which pass through the ATM-SAP and are each sent as a cell with a 5-octet header and a 48-octet payload. The Payload Type field in the cell header carries the AAL indication (Payload Type = AAL_Indicate).]
ATM Connections
„ ATM is virtual connection-oriented; there must
always be a virtual connection established before
cells can be sent
„ Connections can be established:
›› Administratively as PVCs
– Lowest common denominator for Interoperability for devices not
supporting UNI 3.x signaling
›› Dynamically as SVCs
– Implies ATM signaling capability
ATM Switches are Easily Scalable in Speed
„ ATM protocol is connection-oriented
   - Once a connection is set up, cells are switched in
     hardware at very high speeds using the VPI/VCI
„ Uses fixed cell length
   - Allows switch hardware to be optimized around a
     fixed-length cell
„ Uses SONET as physical layer interface
   - Scales to high speed and is defined and deployed at
     Gigabit rates
Logical ATM Switch Fabric
[Figure: Logical ATM switch data paths.
 ATM Switch Ingress Path: from interface, PHY receive termination (Physical/TC Layer Processing), then ATM Layer Processing (Connection Lookup, OAM Processing, Policing), then Buffering, Queuing & Scheduling, to queue.
 ATM Switch Egress Path: from queue, Fabric receive termination, Buffering, Queuing & Scheduling, then ATM Layer Processing (Connection Lookup, OAM, Policing (EFCI)), to interface.
 An Internal Loopback connects the two paths.]
Concept – VPs and VCs in the Network
[Figure: Routing Concept in an ATM Network. Three customer premises networks (CPN 1, CPN 2, CPN 3) attach over User/Network Interfaces (UNI) to two network nodes (NN 1, NN 2), which are joined by a Network Node Interface (NNI). Each link carries numbered virtual paths (for example VP1, VP2, VP3, VP5, VP6, VP8), and each virtual path carries numbered virtual channels (for example VC2, VC7, VC8, VC9, VC11, VC21). The VP and VC numbers are reassigned link by link as a connection crosses the nodes.]
ATM vs. Gigabit Ethernet
Network
Installed
Desktops
LAN protocols
(IP, IPX)
Scalability
WAN
QoS
Multimedia
Gigabit Ethernet
Yes
Yes
Yes
Emerging
Emerging
ATM
Yes, but not
much.
With MPOA and
LANE
Yes, with MPOA
and LANE
Yes, with MPOA
and LANE
Yes
Yes
Gigabit Ethernet and ATM Feature Comparison
Feature                        Gigabit Ethernet                           ATM
Price/Performance/Bandwidth    Low cost                                   Moderate to high cost
Quality of Service (QoS)       RSVP, IEEE 802.1Q/p,                       Guaranteed QoS with traffic
                               differentiated services                    management
User Applications              High-speed data, voice/video over IP       Data, video and voice
Product Availability           Since late 1997                            Since early 1996
Network Applications           Building backbone, campus backbone         WAN, building backbone, campus
                               servers and risers                         backbone servers and risers
Bandwidth Overhead – ATM Cell Tax
„ For a 1,500-byte IP datagram, Gigabit Ethernet adds
26 bytes of header, so 1,526 bytes are transmitted.
„ The ATM AAL5 layer adds an 8-byte trailer and a variable
pad to ensure that the AAL5 protocol data unit
(PDU) is a multiple of 48 bytes. For a 1,500-byte IP
datagram, this results in an AAL5 PDU of 1,536
bytes. The AAL5 layer then segments the
AAL5 PDU into 48-byte segments to be carried in
53-byte ATM cells. Each cell has a 5-byte header. A total
of 32 ATM cells (1,696 bytes) is needed to transmit a
1,500-byte IP datagram.
Bandwidth Overhead – ATM Cell Tax
„ The corresponding efficiencies are 98% for
Gigabit Ethernet and 88% for ATM.
„ Since ATM cells are carried inside the SONET
payloads, additional 5% or more overhead and
payload inefficiency need to be accounted for.
Gigabit Ethernet over fiber is more efficient in this
comparison.
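The cell tax figures can be reproduced with a short calculation. This sketch uses only the numbers given on the slides (26 bytes of Ethernet overhead, an 8-byte AAL5 trailer, 48-byte cell payloads, 53-byte cells); the function names are illustrative assumptions.

```python
import math

ETHERNET_OVERHEAD = 26   # bytes added by Gigabit Ethernet, per the slide
AAL5_TRAILER = 8         # bytes
CELL_PAYLOAD = 48        # bytes of payload per ATM cell
CELL_SIZE = 53           # bytes per ATM cell (5-byte header + payload)


def ethernet_wire_bytes(datagram: int) -> int:
    """Bytes sent on Gigabit Ethernet for one IP datagram."""
    return datagram + ETHERNET_OVERHEAD


def aal5_cells(datagram: int) -> int:
    """Cells needed once the trailer is added and the PDU is padded to 48n."""
    return math.ceil((datagram + AAL5_TRAILER) / CELL_PAYLOAD)


def atm_wire_bytes(datagram: int) -> int:
    """Bytes sent on ATM for one IP datagram carried over AAL5."""
    return aal5_cells(datagram) * CELL_SIZE


datagram = 1500
cells = aal5_cells(datagram)
pad = cells * CELL_PAYLOAD - (datagram + AAL5_TRAILER)
print(f"AAL5 PDU: {cells * CELL_PAYLOAD} bytes ({pad} bytes of pad), {cells} cells")
print(f"Gigabit Ethernet efficiency: {datagram / ethernet_wire_bytes(datagram):.1%}")
print(f"ATM efficiency:              {datagram / atm_wire_bytes(datagram):.1%}")
```

Short packets fare worse: any datagram that spills one byte into a new cell pays for a full extra 53 bytes.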
Optical Fiber
Core
Cladding
Jacket
Diameter of Core = d
Diameter of Cladding is standardized at 125 um = D
Optical Fiber (continued)
„ Core diameters for single-mode fiber (SMF) and
multimode fiber (MMF) differ.
- SMF: 2 – 10 um (8.6 to 9.5 um commonly used)
- MMF: 50 – 200 um
„ SMF lets only one ray (mode) propagate because its core
diameter d is small relative to the cladding diameter D
„ MMF supports many rays (modes)
Optical Fiber (continued)
„ MMF and SMF have different manufacturing
processes, refractive index, dimensions, and
therefore different transmission characteristics.
So they find different applications.
MMF
„ It minimizes delay spread, although the delay is
still significant.
„ A 1% index difference between core and cladding
amounts to 1 to 5 nsec/km delay spread.
„ Easy to splice and couple light into it
„ Bit rate is limited to 100 Mbps for up to 20km;
shorter length supports higher bit rates.
„ Fiber span without amplification is up to 20 km at
100 Mbps.
SMF
„ It almost eliminates delay spread.
„ More difficult to splice and exactly align two fibers
together.
„ More difficult to couple all photonic energy from a
source into it.
„ It is suitable for transmitting modulated signals at
40 Gbps or higher and up to 200 km without
amplification.
DWDM Basics
„ Two ways to increase the bandwidth in a single
fiber
- Increase the bit rate: transmitting a reliable
signal at 40 Gb/s is possible today but quite
expensive.
- Increase the number of wavelengths in the
same fiber: several wavelengths, each
carrying 10 Gb/s or 40 Gb/s, will significantly
increase the total bandwidth.
DWDM Basics
„ WDM and DWDM definitions
- WDM couples many wavelengths in the same fiber,
thus increases the aggregate bandwidth in a single
fiber.
- DWDM couples a larger (denser) number of (> 40)
wavelengths into a fiber than WDM. However, several
issues need to be addressed, such as channel width
and spacing, total optical power launched in the fiber,
cooling, non-linear effect, cross talk, span of fiber,
amplification, etc. An early WDM with < 10
wavelengths and larger channel width and spacing is
termed Coarse WDM (CWDM).
DWDM Technology Enabler
„ Successful deployment of DWDM is a result of
several technologies:
- Fiber with 1.3 and 1.55 um wavelength spectrum
provides low loss and better transmission
performance
- Optical amplifiers with flat gains over a range of
wavelengths eliminate the need for regenerators
- Integrated solid-state optical filters on the same
substrate with other optical components
DWDM Technology Enabler (continued)
- Optical MUX/DMUX is based on passive optical
diffraction
- Tunable filters can be used as optical add-drop
MUX (OADM)
- OADM components have made DWDM possible
in MAN and long haul networks
- OXC (optical cross-connect) made the optical
switching possible.
DWDM System Components
„ DWDM technology requires specialized optical
devices that are based on properties of light and on
the optical, electrical, and mechanical properties of
semiconductor material.
„ These devices include: Optical transmitter, optical
receiver, optical filter, optical modulator, optical
amplifier, wavelength converter, OADM and OXC.
„ Optical modulator controls the amount of
continuous optical power transmitted in an optical
waveguide.
DWDM System Components (continued)
„ Wavelength converter enables optical channels to be
relocated
„ OADM selectively drops a wavelength from a set of
wavelengths in a fiber, thus drops the traffic on this
channel. It then adds in the same direction of data
flow the same wavelength, but with different data
content.
„ An OXC interconnects N optical inputs with N outputs
using either hybrid or all optical approach. Each port
handles a bundle of multiplexed single-wavelength
signals.
DWDM System Components (continued)
„ An OXC supports network reconfiguration and
allows network providers to transport and manage
wavelengths efficiently at the optical layer.
„ An OXC is most efficient when it contains bit-rate
and format independent optical switch. It can
perform signal monitoring, provisioning and
grooming, restoration at the photonic layer itself.
„ Loss due to fiber dispersion and non-linearity can
be compensated through use of Dynamic
Compensation (non-linearly chirped fiber Bragg
grating).
Structure of DWDM System
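[Figure: Transmitters at wavelengths λ1, λ2, λ3, ... λn feed a DWDM mux. The combined signal travels over a single optical fiber, amplified by an EDFA, to a DWDM demux that delivers λ1, λ2, λ3, ... λn to the receivers. Each wavelength acts as a virtual fiber.]
EDFA: Erbium Doped Fiber Amplifier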
DWDM Optical Transmission
„ The photonic layer of DWDM system is
responsible for converting the electronic data to
information in the light waves and sending it
through the fiber.
„ The channel spacing is bounded by the optical
amplifier’s operational bandwidth and the
receiver’s capability to distinguish two close
wavelengths. The ITU-T standards body specifies a
spacing of 100 GHz.
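To see what a 100 GHz grid means in wavelength terms, the frequency spacing can be converted with the approximation Δλ ≈ λ² · Δf / c. The sketch below is a rough illustration; the function name and the choice of 1550 nm as the operating wavelength are assumptions made for the example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def spacing_in_nm(wavelength_nm: float, spacing_ghz: float) -> float:
    """Approximate wavelength spacing for a given frequency spacing."""
    wavelength_m = wavelength_nm * 1e-9
    delta_m = wavelength_m ** 2 * (spacing_ghz * 1e9) / SPEED_OF_LIGHT
    return delta_m * 1e9


delta = spacing_in_nm(1550.0, 100.0)
print(f"100 GHz at 1550 nm is about {delta:.2f} nm between channels")
```

Halving the frequency spacing halves the wavelength spacing, which is why denser grids put tighter demands on filters and lasers.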
DWDM Optical Transmission
„ A DWDM system can be unidirectional or
bidirectional. The choice is based on the capability of the
fiber and the required bandwidth. Unidirectional
DWDM requires two fibers for two-way
communication, while bidirectional DWDM uses a
single fiber for two-way communication.
Optical Packet Switching
„ DWDM can perform switching in the optical
domain without having to convert the signal into the
electrical domain. This reduces the delay at the
switches and increases system throughput.
„ Switching involves reading the packet header and
altering the path of the signal (packet). In the
course of altering, the switch may have to edit a
part or whole of the header. All optical header
replacement is the key to updating in the
wavelength-based packets (e.g. modifying routing
information).
Optical Packet Switching
„ SONET networks support the multiplexing of
lower TDM rates onto higher rates. The ADM and
transponders en route provide the much-needed
synchronization to ensure quality and guarantee
proper delivery of data.
„ DWDM systems multiplex wavelengths, and no
timing relation exists between them, so the
system needs no clocking.
If synchronization is still needed, SONET
terminals and ADMs can support it by providing
derived DS1 timing to customers.
Optical Internet
„ Fundamental properties of DWDM systems are exploited to form
an all-optical layer. Bit-rate and protocol transparency enable
transport of native data traffic such as Gigabit Ethernet, ATM, SONET,
IP, etc. on different channels.
„ The DWDM functions in the optical layer can be divided into
two layers: Transport Layer and Service Layer. These two layers
perform the functions of the four SONET layers.
Optical Internet
Transport Layer                                  Service Layer
Bandwidth, wavelength-level traffic control      Reliability, access speed, usage rates,
                                                 security, VoIP services, etc.
DWDM Network Model
DWDM Network Model
„ An intelligent optical layer performs fast restoration
and automated provisioning of end-to-end
wavelength paths, and can satisfy the bandwidth
demand. Restoration in the optical layer is performed
rapidly and does not overlap with the service layer’s
functions.
DWDM Network Model
„ Switching and bandwidth are furnished at the
granularity of the wavelength. ATM’s virtual path
becomes equivalent to a wavelength. MPLS divides
the traffic engineering requirements between the IP
layer and the optical transport layer.
DWDM Network Model
„ In case of physical failure, the wavelength routing
protocol must restore the paths across the network
within a maximum of 50 ms. This is a SONET feature,
and it is being addressed in the standards bodies (IEEE
and IETF).
Inter-working of IP, ATM and DWDM
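[Figure: DWDM Network Architecture. Closed (A), SONET based: protocol stacks built from IP, ATM and SONET feed the dense wavelength division multiplexer. Open (B), IP/DWDM: stacks built from IP and ATM feed the multiplexer without SONET. Each wavelength out of the multiplexer is a virtual fiber.]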
IP/DWDM Architecture
„ The closed (A) architecture was designed to serve
the SONET system better. DWDM increases the
capacity of SONET system. IP/DWDM systems adopt
the open (B) architecture, which is not tied with
SONET or other TDM systems. It reflects protocol
transparency in all optical networks. The carrier is
responsible for providing the actual interface to end
users and the fault/failure protection work. The IP bits
enter the DWDM and are transported “as is” over the
high-speed optical connection.
IP/DWDM Architecture
„ DWDM can adopt either of the following network
architecture:
- Optical mesh transport where OXC and MUX can
provide wavelength management and restoration
- Wavelength transport where detection of failure and
restoration is done at the service layer since there is
no OXC and MUX. IP and ATM connect directly over
the wavelength links.
Issues Confronting IP/DWDM
„ Error Detection – SONET can detect signal errors
through its overhead bytes in the frame. This
feature can be carried down to DWDM when
SONET is used as the higher layer. Forward Error
Correction is performed in the all-optical DWDM
systems.
Issues Confronting IP/DWDM (continued)
„ Fault Tolerance – 1+1 optical multiplex section
protection (MSP) is the strategy supported by the
WDM system. It is similar to the 1+1 MSP in SDH
/ SONET. The WADM can accommodate more
advanced optical layer protection switching.
Issues Confronting IP/DWDM (continued)
„ Wavelength Routing – The wavelength and origin of
the signal decide the wavelength path of the signal
through the optical network.
„ Network Control & Management – GMPLS is
developed to perform network control and
management by directly communicating between the
management system (in the Control plane) and the
DWDM (in the Transport plane).
Issues Confronting IP/DWDM (continued)
„ Service Transparency – The network does not
need any extra information about the signal it
transports. Jitter introduced by the optoelectronic
processing can be removed using a bit-rate
independent optoelectronic regenerator with retiming functionality.
„ Interoperability – The system must interoperate with
backbone routers and facilitate multi-vendor internetworking.
Issues Confronting IP/DWDM (continued)
„ Quality of Service - Work is underway to add
QoS measures to IP routing protocol (e.g. OSPF)
such that it carries not only the topology
information but also the loading information. A
study is needed to determine between a QoS
based distributed routing scheme in the IP layer
and an optical routing algorithm undertaking the
IP/DWDM routing.
The Driving Force for 10G Ethernet for LAN, MAN and WAN
„ The need for 10G Ethernet is driven by the
successful deployment of Gigabit Ethernet
(cost-effective now) and the aggregation of Gigabit
links.
„ The cost saving of 10G Ethernet WAN
($40,000/port) vs. Packet over SONET
($300,000/interface) will encourage the deployment of
10G Ethernet in MAN and WAN.
The Driving Force for 10G Ethernet for LAN, MAN and WAN
„ 10G Ethernet can provide a low-cost local
connection to WDM-based transponders.
„ Leverage on the huge existing installed base of
10M, 100M and popular Gigabit Ethernet.
„ Easy migration and inter-working from the
existing installation and no new network
management training is required.
10G Ethernet (802.3ae) Reference Model
[Figure: 10G Ethernet layer stack. Common upper layers: Higher Layers, Logical Link Control (LLC), Media Access Control (MAC), Reconciliation Sublayer (RS). Below the XGMII there are three PHY variants, each ending at the MDI and the medium:
 10GBASE-R (LAN): 64B/66B PCS, Physical Medium Attachment, Physical Medium Dependent
 10GBASE-W (WAN): 64B/66B PCS, WIS, Physical Medium Attachment, Physical Medium Dependent
 10GBASE-X (LAN over WWDM): 8B/10B PCS, Physical Medium Attachment, Physical Medium Dependent]
WIS: WAN Interface Sublayer
PCS: Physical Coding Sublayer
XAUI
XGXS: XAUI Extender Sublayer
XAUI:10G Attachment Unit Interface
XSBI: 10G Sixteen Bit Interface
[Figure: Reconciliation, XGMII, XGXS, XAUI, XGXS, XGMII, 8B/10B PCS, XSBI (16-bit parallel, OIF), Physical Medium Attachment (retime, SerDes, CDR), Physical Medium Dependent (E/O), MDI, Medium.]
XAUI functions as an extender interface between MAC and PCS.
XGMII is a 74-pin interface (a 32-bit data path each for TX and RX), while
XAUI is a 4-lane serial interface (4 serial lines) used chip-to-chip to save
space.
Each of the 4 serial lanes in XAUI carries 2.5 Gb/s of data, which is
3.125 Gbaud on the wire after 8B/10B encoding.
LAN PHY vs. WAN PHY for 10G Ethernet
Stack           10GE LAN PHY       10GE LAN PHY       10GE WAN PHY
                (Serial)           (WWDM)             (Serial)
MAC             10 Gb/s            10 Gb/s            10 Gb/s
PCS             64B/66B            8B/10B             64B/66B, SONET framing
PMA Interface   XSBI               XAUI               XSBI
PMD             1550 nm DFB        1310 nm WWDM       1550 nm DFB
                1310 nm FP                            1310 nm FP
                850 nm VCSEL                          850 nm VCSEL
Line Rate       10.3 Gb/s          4 x 3.125 Gb/s     9.953 Gb/s
CWDM: Coarse WDM   XSBI: 10G Sixteen Bit Interface   XAUI: 10G Attachment Unit Interface
VCSEL: Vertical Cavity Surface Emitting Laser   DFB: Distributed Feedback   FP: Fabry-Perot laser
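The LAN PHY line rates follow directly from the coding overhead applied to the 10 Gb/s MAC rate. A minimal sketch (the helper name is an assumption; the WAN PHY figure is not derived here because it is fixed by the SONET OC-192 line rate rather than by the block code alone):

```python
def line_rate_gbps(data_rate_gbps: float, data_bits: int, coded_bits: int) -> float:
    """Line rate after a block code that maps data_bits onto coded_bits."""
    return data_rate_gbps * coded_bits / data_bits


# Serial LAN PHY: 10 Gb/s MAC rate with 64B/66B coding.
serial_lan = line_rate_gbps(10.0, 64, 66)

# WWDM LAN PHY: four lanes of 2.5 Gb/s each with 8B/10B coding.
wwdm_lane = line_rate_gbps(10.0 / 4, 8, 10)

print(f"Serial LAN PHY: {serial_lan} Gb/s")
print(f"WWDM LAN PHY  : 4 x {wwdm_lane} Gb/s")
```

The table rounds the serial figure to 10.3 Gb/s.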
Comparison of GE vs. 10GbE
Characteristics    Gigabit Ethernet                 10 Gigabit Ethernet
Physical Medium    Optical and copper               Optical only
Distance           LAN up to 5 km                   LAN up to 40 km. Direct attachment to
                                                    SONET for WAN.
PMD sublayer       Leverage Fibre Channel           Developed new optical PMD sublayer
                   PMD sublayer
PCS                Reuse 8B/10B coding              New coding schemes: 64B/66B for -W and -R;
                                                    8B/10B for -X.
MAC Protocol       Half and full duplex             Full duplex only
IEEE 802.3ae Port Types
Device         Range        Media               Optics         PCS       WIS   Application
10GBase-LX4    300m/10km    MMF/SMF             1310nm/WWDM    8B/10B    No    Enterprise
10GBase-SR     33m/300m     62.5µm/50µm MMF     850nm          64B/66B   No    Data center
10GBase-LR     10km         SMF                 1310nm         64B/66B   No    Enterprise/Metro
10GBase-ER     40km         SMF                 1550nm         64B/66B   No    Metro
10GBase-SW     33m/300m     62.5µm/50µm MMF     850nm          64B/66B   Yes   Metro/WAN
10GBase-LW     10km         SMF                 1310nm         64B/66B   Yes   Metro/WAN
10GBase-EW     40km         SMF                 1550nm         64B/66B   Yes   WAN
10GBase-CX4    15m          Coaxial             -              8B/10B    No    Data center
10GBase-T      100m         Twisted pair        -              8B/10B    No    Enterprise
Emerging 10GE Applications
„ High speed internet access that supports multi-media
and QoS.
„ Corporate LAN interconnect for distributed
communication, remote services, and home office
access.
„ Back-end server connections to minimize congestion
and delay.
„ Inter- and intra-POP connections to enhance reliability
and scalability.
„ Real-time streaming that supports video and VoIP, etc.
Emerging 10GE Applications (continued)
„ Telecommuting that supports office LAN, metro and
regional Ethernet connectivity.
„ High speed data transport for large size data transfer
over the network.
10 GbE for Expanded LAN
„ 10 G Ethernet can be used in service provider
data centers and enterprise LAN environment.
„ It can be used for
- Switch to switch
- Switch to server
- Data centers
- Between two buildings in a campus
10 GbE for MAN
[Figure: Locations A, B and C are interconnected through a MAN; every link into and across the MAN is 10GbE.]
10 GbE for MAN
„ Gigabit Ethernet is already deployed as a MAN.
„ With 10 Gigabit Ethernet interfaces, optical
transceivers, and single mode fiber, service providers
can provide links reaching 40 km or more.
„ This can serve the range required for MAN.
10 GbE for SAN
„ 10 G Ethernet can be used for the following
applications in a SAN (Storage Area Network)
environment:
- Database servers
- Technical and scientific computing
- High resolution video transport
- Local and remote data mirroring
- Centralized backup
10 GbE for WAN
[Figure: Two Service Provider Points of Presence connect over 10 GbE to Carrier Central Offices, which are linked through a core DWDM optical network.]
10 Gigabit Ethernet is compatible with the installed base of
SONET OC-192.
The Future of 10GbE
„ An Ethernet-optimized infrastructure build out has
already started.
„ The metro areas are the current focus of network
development to deliver optical Ethernet service to
the business customers. Service providers like
Telseon, Cox Communications, BT, and Qwest
already deploy Gigabit Ethernet services.
The Future of 10GbE
„ 10GbE is on the roadmap of most, if not all,
switch, router and metro optical system vendors
to enable low-cost metro-based campus
interconnection over dark fibers, and to provide
end-to-end optical networks with common
management systems.
„ The IEEE 802.3ae 10GbE standard was approved on
June 12, 2002.
RPR-Enabling Technology
„ Any logical topology
„ Sub-50-millisecond protection
„ Bandwidth management (QoS)
„ Delay guarantees
„ Loss guarantees
„ Unicast, multicast and broadcast
„ OAMP support
RPR allows true convergence of service in metro networks
RPR Application
[Figure: End-to-end IP service path. 802.3 Metro Ethernet access networks attach to RPR regional metro rings, which interconnect through an IP/MPLS core.]
MPLS, Ethernet and RPR
„ MPLS
– End-to-end path set up
– Service creation
– Adaptation layer
– Traffic engineering
„ Ethernet
– Universal service interface
– No service guarantees or bandwidth management
„ RPR
– Metro network
– Convergence of services
– Service enabling mechanisms
RPR : Better Than Both Worlds
Feature                                     SONET   Ethernet   RPR
Fair access to ring bandwidth                 -        -       yes
High BW efficiency on dual ring topology      -        -       yes
Full FCAPS with LAN-like economics            -        -       yes
Controlled latency and jitter                yes       -       yes
50-millisecond ring protection               yes       -       yes
Optimized for data                            -       yes      yes
Cost effective for data                       -       yes      yes
RPR Value Proposition
„ A layer 2 technology designed for metro transport
„ Shared ring technology with spatial reuse
„ Offers carrier class ring protection and
resiliency for packet switched networks
„ Multiple services over one layer with QoS
„ Reduced cost of operations
„ Improved service velocity
Standards Bodies
„ IETF
– IPORPR – IP Over RPR
„ IEEE
– 802.17
„ ITU and ANSI
– SONET, SDH, GFP, OCh
Standards impact both data plane and
management plane
IETF
„ Operation of IP over 802.17
– Representing link layer in routing database and
multicast operation
– Link metrics
– Interaction of Layer 3 and 2 resiliency
„ MPLS adaptation over 802.17
„ Management Information Base (MIB)
ITU & ANSI
„ SONET & SDH
„ Generic Framing Procedure
IEEE
„ RPR MAC definition
„ Transit path & fairness management
„ Topology discovery
„ Protection switching
„ Adaptation to different physical layers
„ Conformance to 802.1
„ Layer management
Introducing RPR
„ Resilient Packet Ring (RPR) is a new Layer 2
technology
– optimized for MAN and WAN
– RPR MAC is PHY layer agnostic: leverages the
Ethernet or SONET physical layer
– Full FCAPS management
– increases bandwidth efficiency through statistical
multiplexing
„ Standard under development in IEEE 802.17
RPR MAC: Key Features
„ Ring protection and fast restoration (<50ms)
„ Support for multiple classes of service
„ Controlled dynamic BW sharing on the ring
– No wasted BW due to pre-allocation
– Spatial reuse between other nodes
„ Controlled latency and jitter
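The spatial reuse bullet can be illustrated with a small per-span load calculation. This is a minimal sketch under stated assumptions, not the 802.17 MAC: the function name `span_loads`, the node numbering, and the single-ringlet model are all hypothetical. It compares destination stripping, where a frame leaves the ring at its destination, with a ring on which every frame circulates all the way round.

```python
def span_loads(n, flows, destination_stripping=True):
    """Per-span load on one ringlet of an n-node ring.

    flows is a list of (src, dst, rate) tuples. Span k joins node k and
    node (k + 1) % n, and traffic travels in increasing node order.
    """
    loads = [0.0] * n
    for src, dst, rate in flows:
        # With destination stripping the frame occupies only the spans
        # between src and dst; otherwise it occupies all n spans.
        hops = (dst - src) % n if destination_stripping else n
        for i in range(hops):
            loads[(src + i) % n] += rate
    return loads


if __name__ == "__main__":
    flows = [(0, 1, 8.0), (2, 3, 8.0)]  # two 8 Gb/s flows on a 4-node ring
    print(span_loads(4, flows))          # disjoint spans carry each flow
    print(span_loads(4, flows, destination_stripping=False))
```

With destination stripping the two flows use disjoint spans, so both fit on a 10 Gb/s ring at once; without it every span would have to carry both flows.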
RPR Frame Format
Field          Size      Notes
RPR header     2 bytes   Frame type, CoS, TTL, Ring-ID, In-Out profile indicator
Destination    6 bytes
Source         6 bytes
Type           2 bytes
HEC            2 bytes
Payload
Payload FCS    4 bytes
Compliance with 802.1: support for transparent bridging with broadcast of
unknown unicast addresses.
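The field order and sizes on this slide can be sketched as a byte layout. This is a minimal sketch, not the 802.17 encoding: the function name `build_frame` is hypothetical, the 2-byte RPR header is treated as an opaque value because the slide gives no bit widths, and the two checksum algorithms are stand-ins chosen only because they produce fields of the right size.

```python
import binascii
import struct
import zlib


def build_frame(header, dst, src, frame_type, payload):
    """Pack the fields in the slide's order; dst and src are 6-byte MACs."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    # 2-byte RPR header, 6-byte destination, 6-byte source, 2-byte type,
    # all in network byte order with no padding.
    fixed = struct.pack("!H6s6sH", header, dst, src, frame_type)
    # ASSUMPTION: CRC-CCITT stands in for the 2-byte HEC; the slide does
    # not name the header check algorithm.
    hec = binascii.crc_hqx(fixed, 0xFFFF)
    # ASSUMPTION: CRC-32 over the payload stands in for the 4-byte FCS.
    fcs = zlib.crc32(payload) & 0xFFFFFFFF
    return fixed + struct.pack("!H", hec) + payload + struct.pack("!I", fcs)


if __name__ == "__main__":
    frame = build_frame(0x0000, b"\x00\x11\x22\x33\x44\x55",
                        b"\x66\x77\x88\x99\xaa\xbb", 0x0800, b"hello")
    print(len(frame), "bytes,", len(frame) - 5, "of them overhead")
```

Under this layout every frame carries 22 bytes of overhead: 18 bytes ahead of the payload and a 4-byte FCS after it.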
Fairness
Support for 4 traffic types:
„ Provisioned – no BW management
„ High Priority – no BW management
„ Medium Priority – Guaranteed Transmit
„ Best effort – BW negotiated (equal/weighted)
Bandwidth Management:
„ Requires no provisioning or configuration for each node.
„ Provides a flexible and reliable transport while protecting high
priority service requirements.
„ Over-subscription of bandwidth is easily supported.
„ All nodes can dynamically compete for spare bandwidth.
„ Works with large topologies and designs with many nodes.
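The "equal/weighted" negotiation for best-effort traffic can be pictured as a proportional split of whatever rate the higher classes leave unused. The sketch below is only that picture, not the 802.17 fairness algorithm; the function name `best_effort_shares` and its parameters are hypothetical.

```python
def best_effort_shares(link_rate, reserved, weights):
    """Split the spare rate among best-effort nodes in proportion to weight.

    link_rate and reserved use the same unit (for example Gb/s); weights
    maps a node name to its fairness weight. Equal weights give equal shares.
    """
    spare = max(link_rate - reserved, 0.0)
    total = sum(weights.values())
    if total <= 0:
        return {node: 0.0 for node in weights}
    return {node: spare * w / total for node, w in weights.items()}


if __name__ == "__main__":
    # 10 Gb/s ring with 4 Gb/s taken by provisioned and high-priority traffic.
    print(best_effort_shares(10.0, 4.0, {"A": 1, "B": 1, "C": 2}))
```

Re-running the split whenever the reserved load changes is one way to picture nodes dynamically competing for spare bandwidth.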
Protection
„ Global
– Steering – Nodes use the topology map to avoid sending
traffic over failed spans.
„ Local
– Wrap – Nodes adjacent to the failure redirect traffic onto the
alternate ring.
<50 ms protection switching requirement!
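The difference between the two protection modes can be sketched on a toy ring. Assumptions: nodes are numbered 0 to n-1 with n at least 3, span k joins node k and node (k + 1) % n, exactly one span has failed, and a wrapped frame is delivered as soon as it reaches its destination on the alternate ringlet. The function names are hypothetical and the model is a simplification, not the 802.17 procedure.

```python
def ring_path(n, src, dst, step):
    """Nodes visited going from src to dst in direction step (+1 or -1)."""
    path = [src]
    while path[-1] != dst:
        path.append((path[-1] + step) % n)
    return path


def spans_used(n, path):
    """Set of spans crossed by a path; span k joins k and (k + 1) % n."""
    return {a if (a + 1) % n == b else b for a, b in zip(path, path[1:])}


def steer(n, src, dst, failed_span):
    """Source picks the shortest direction that avoids the failed span."""
    candidates = [ring_path(n, src, dst, +1), ring_path(n, src, dst, -1)]
    usable = [p for p in candidates if failed_span not in spans_used(n, p)]
    return min(usable, key=len)


def wrap(n, src, dst, failed_span):
    """Source sends the usual shortest way; the node at the failure turns
    the frame around onto the other ringlet."""
    cw, ccw = ring_path(n, src, dst, +1), ring_path(n, src, dst, -1)
    step = +1 if len(cw) <= len(ccw) else -1
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        nxt = (here + step) % n
        span = here if step == +1 else nxt
        if span == failed_span:
            step = -step  # wrap at the node next to the failure
            continue
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(steer(6, 0, 2, failed_span=1))
    print(wrap(6, 0, 2, failed_span=1))
```

In this model steering yields the shorter path after a failure, while wrapping needs no action at the source because only the nodes at the failure react.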
Topology
„ New nodes automatically trigger advertisement.
„ All nodes periodically update their map.
„ Topology map is kept by all nodes.
– Determine optimal path to any destination
– Identify node capabilities (wrap/steer)
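A per-node topology map of this kind might be held as a small table keyed by node. The sketch below is hypothetical throughout: the class `TopologyMap`, its method names, and the advertisement contents (a ring position and a list of supported protection modes) are assumptions for illustration, not the 802.17 topology protocol.

```python
class TopologyMap:
    """One node's view of the ring, built from received advertisements."""

    def __init__(self, ring_size, my_position):
        self.ring_size = ring_size
        self.my_position = my_position
        self.nodes = {}  # name -> (position, set of protection modes)

    def advertise(self, name, position, modes):
        """Record or refresh one node's advertisement."""
        self.nodes[name] = (position, set(modes))

    def best_direction(self, name):
        """Return (direction, hops) for the shorter way round to a node."""
        position, _ = self.nodes[name]
        cw = (position - self.my_position) % self.ring_size
        ccw = (self.my_position - position) % self.ring_size
        return ("clockwise", cw) if cw <= ccw else ("counterclockwise", ccw)

    def all_support(self, mode):
        """True if every known node advertised the given protection mode."""
        return all(mode in modes for _, modes in self.nodes.values())


if __name__ == "__main__":
    topo = TopologyMap(ring_size=6, my_position=0)
    topo.advertise("B", 1, ["steer", "wrap"])
    topo.advertise("E", 4, ["steer"])
    print(topo.best_direction("E"), topo.all_support("wrap"))
```

Calling `advertise` again for a known node simply overwrites its entry, which is how the periodic updates on this slide would keep the map current in this model.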