Designing the Data Center with Nexus
- Deploying OTV in the Data Center
Presentation_ID
© 2006 Cisco Systems, Inc. All rights reserved.
Cisco Confidential
Agenda
 OTV Introduction
 Typical OTV Deployment Models
 Path Optimization
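Requirements for Data Center Layer 2 Extension
 Business requirements
 Disaster Avoidance
 Business Continuance
 Workload mobility
 Multiple data center sites
• Disaster recovery centers, e.g. three data centers across two locations
• The existing data center is constrained by its original design (floor space, power, cooling, performance capacity), so new data centers are needed for flexible expansion
• Building several physically dispersed data centers provides higher reliability and spreads user traffic across the data centers for better access performance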
Traditional Layer 2 Extension
EoMPLS
Dark Fiber
VPLS
Overlay Transport Virtualization (OTV)
OTV is a “MAC in IP” technique to extend Layer 2 domains
OVER ANY TRANSPORT
Overlay - A solution that is independent of the infrastructure technology and services, flexible over various interconnect facilities
Transport - Transporting services for Layer 2 Ethernet and IP traffic
Virtualization - Provides virtual stateless multi-access connections
OTV Control Plane
MAC Address Advertisements (Multicast-Enabled Transport)
 Every time an Edge Device learns a new MAC address, the OTV control plane will
advertise it together with its associated VLAN IDs and IP next hop.
 The IP next hops are the addresses of the Edge Devices through which these
MAC addresses are reachable in the core.
 A single OTV update can contain multiple MAC addresses for different VLANs.
 A single update reaches all neighbors, as it is encapsulated in the same ASM
multicast group used for the neighbor discovery.
[Figure: the Edge Device at site West (IP A) learns three new MACs on VLAN 100 (MAC A, MAC B, MAC C). The OTV update is replicated by the core, and the Edge Devices at sites East and South-East each install the entries below.]

VLAN  MAC    IF
100   MAC A  IP A
100   MAC B  IP A
100   MAC C  IP A
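The advertisement behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not OTV's actual IS-IS based implementation: the `EdgeDevice` class, the `replicate` helper, the addresses of the East and South-East devices and the local port name are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeDevice:
    ip: str
    # (vlan, mac) -> local interface or remote next-hop IP
    mac_table: dict = field(default_factory=dict)

    def learn_local(self, vlan, mac, interface):
        """Record a MAC learned on a local port."""
        self.mac_table[(vlan, mac)] = interface

    def build_update(self, entries):
        """One update may carry several (vlan, mac) pairs; this ED is the next hop."""
        return {"next_hop": self.ip, "entries": list(entries)}

    def receive_update(self, update):
        """Install every advertised MAC as reachable through the sender's IP."""
        for vlan, mac in update["entries"]:
            self.mac_table[(vlan, mac)] = update["next_hop"]


def replicate(update, neighbors):
    """Stands in for the core replicating the multicast update to all neighbors."""
    for ed in neighbors:
        ed.receive_update(update)


west = EdgeDevice("IP A")
east = EdgeDevice("IP B")        # address is an assumption
south_east = EdgeDevice("IP C")  # address is an assumption
new_macs = [(100, "MAC A"), (100, "MAC B"), (100, "MAC C")]
for vlan, mac in new_macs:
    west.learn_local(vlan, mac, "Eth 1")  # local port name is an assumption
replicate(west.build_update(new_macs), [east, south_east])
```

After the single update, both remote devices hold the three VLAN 100 entries with IP A as the interface, which mirrors the tables in the figure.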
OTV Data Plane: Inter-Site Packet Flow
1. Layer 2 lookup on the destination MAC. MAC 3 is reachable through IP B.
2. The Edge Device encapsulates the frame.
3. The transport delivers the packet to the Edge Device on site East.
4. The Edge Device on site East receives and decapsulates the packet.
5. Layer 2 lookup on the original frame. MAC 3 is a local MAC.
6. The frame is delivered to the destination.
[Figure: a frame from MAC 1 (West site, Edge Device IP A) to MAC 3 (East site, Edge Device IP B) crosses the transport infrastructure. The West Edge Device encapsulates the frame MAC 1 → MAC 3 inside an IP packet IP A → IP B; the East Edge Device decapsulates it and delivers the original frame to MAC 3.]

MAC TABLE (West Site)
VLAN  MAC    IF
100   MAC 1  Eth 2
100   MAC 2  Eth 1
100   MAC 3  IP B
100   MAC 4  IP B

MAC TABLE (East Site)
VLAN  MAC    IF
100   MAC 1  IP A
100   MAC 2  IP A
100   MAC 3  Eth 3
100   MAC 4  Eth 4
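As a rough illustration of the six steps, the following Python sketch replays the lookup, encapsulation and decapsulation using the two MAC tables from the figure. The dictionary layout and the function names are assumptions made for the example; a real Edge Device performs these steps in hardware.

```python
# MAC tables as shown for the West and East sites: (vlan, mac) -> interface
WEST_TABLE = {(100, "MAC 1"): "Eth 2", (100, "MAC 2"): "Eth 1",
              (100, "MAC 3"): "IP B", (100, "MAC 4"): "IP B"}
EAST_TABLE = {(100, "MAC 1"): "IP A", (100, "MAC 2"): "IP A",
              (100, "MAC 3"): "Eth 3", (100, "MAC 4"): "Eth 4"}


def is_remote(next_hop):
    """In this sketch a remote entry is any interface that is an IP address label."""
    return next_hop.startswith("IP ")


def encapsulate(frame, local_ip, table):
    """Steps 1 and 2: Layer 2 lookup, then wrap the frame in an IP packet."""
    next_hop = table[(frame["vlan"], frame["dst"])]
    if not is_remote(next_hop):
        return None  # destination is local, nothing to send over the overlay
    return {"src_ip": local_ip, "dst_ip": next_hop, "inner": frame}


def decapsulate(packet, table):
    """Steps 4 and 5: strip the IP header, then look up the original frame."""
    frame = packet["inner"]
    return frame, table[(frame["vlan"], frame["dst"])]


frame = {"vlan": 100, "src": "MAC 1", "dst": "MAC 3"}
packet = encapsulate(frame, "IP A", WEST_TABLE)      # West Edge Device
delivered, out_if = decapsulate(packet, EAST_TABLE)  # East Edge Device
```

Here `packet` is addressed from IP A to IP B, and `out_if` is the local East port on which the unchanged frame is delivered (step 6).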
OTV Data Plane: Multicast Data
Multicast State Creation
1. The multicast receivers for the multicast group “Gs” on the East site send IGMP
reports to join the multicast group.
2. The Edge Device (ED) snoops these IGMP reports, but it doesn’t forward them.
3. Upon snooping the IGMP reports, the ED does two things:
   3.1 Announces the receivers in a Group-Membership Update (GM-Update) to all EDs.
   3.2 Sends an IGMPv3 report to join the (IP A, Gd) group in the core.
4. On reception of the GM-Update, the source ED will add the overlay interface to the
appropriate multicast Outbound Interface List (OIL).
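[Figure, read from right to left: (1) the receiver at site East sends a client IGMP report to join Gs; (2) the East ED (IP B) snoops the client IGMP report; (3.1) the East ED sends a GM-Update; (3.2) the East ED sends an IGMPv3 report to join (IP A, Gd), the SSM group in the core; (4) the West ED (IP A), in front of the source, receives the GM-Update and updates its OIL. The multicast-enabled transport builds the SSM tree for Gd.]

OIL-List (West ED)
Group     IF
Gs → Gd   Overlay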
It is important to clarify that the edge devices join the core multicast groups as hosts, not as
routers!
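The state-creation steps can be modelled with a small Python sketch. It assumes, purely for illustration, that each ED already knows the Gs to Gd mapping and which groups have a local source; the class and method names are hypothetical and do not correspond to any NX-OS interface.

```python
class EdgeDevice:
    def __init__(self, ip, local_sources=()):
        self.ip = ip
        self.local_sources = set(local_sources)  # groups with a source in this site
        self.oil = {}            # site group -> set of outbound interfaces
        self.core_joins = set()  # (source ED IP, core group) joined as a host

    def snoop_igmp_report(self, gs, gd, source_ed_ip, all_eds):
        # Step 2: the report is snooped and not forwarded.
        # Step 3.1: announce the receivers to all other EDs (GM-Update).
        for ed in all_eds:
            if ed is not self:
                ed.receive_gm_update(gs)
        # Step 3.2: join (source ED, Gd) in the core, as a host.
        self.core_joins.add((source_ed_ip, gd))

    def receive_gm_update(self, gs):
        # Step 4: only the ED in front of the source adds the overlay to its OIL.
        if gs in self.local_sources:
            self.oil.setdefault(gs, set()).add("Overlay")


west = EdgeDevice("IP A", local_sources={"Gs"})  # source side
east = EdgeDevice("IP B")                        # receiver side
east.snoop_igmp_report("Gs", "Gd", west.ip, [west, east])
```

After the call, the West ED has the overlay interface in its OIL for Gs, and the East ED has joined (IP A, Gd) toward the core.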
OTV Data Plane: Multicast Data
Multicast Packet Flow
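[Figure: multicast packet flow from the source at site West (ED IP A) to receivers at sites East (ED IP B) and South (ED IP C) over a multicast-enabled transport.]
1. Lookup: the West ED consults its OIF-List (group Gs → Gd, interface Overlay).
2. Encap: the original packet (IPs → Gs) is encapsulated in an outer packet (IP A → Gd).
3. Transport replication: the transport replicates the encapsulated packet.
4. The encapsulated packet (IP A → Gd) reaches the EDs at East and South.
5. Decap: each ED decapsulates the packet and forwards the original (IPs → Gs) to its receiver.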
OTV Control Plane
Neighbor Discovery (Unicast-Only Transport)
1. One of the OTV Edge Devices (ED) is configured as an Adjacency Server (AS)*.
2. All EDs are configured to register to the AS: send their site-id and IP address.
3. The AS builds a list of neighbor IP addresses: overlay Neighbor List (oNL).
4. The AS unicasts the oNL to every neighbor.
5. Each node unicasts hellos and updates to every neighbor in the oNL.
[Figure: five sites (Site 1 to Site 5, Edge Device addresses IP A to IP E) connected over a unicast-only transport; one Edge Device operates in Adjacency Server mode.]

oNL
Site 1, IP A
Site 2, IP B
Site 3, IP C
Site 4, IP D
Site 5, IP E
* A redundant pair may be configured
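A minimal Python sketch of the registration and oNL distribution steps follows. The `AdjacencyServer` class and `hello_targets` helper are hypothetical names used only to illustrate the idea; timers, redundancy and the real message formats are left out.

```python
class AdjacencyServer:
    """Keeps the overlay Neighbor List (oNL): site-id -> Edge Device IP."""

    def __init__(self):
        self.onl = {}

    def register(self, site_id, ip):
        """Step 2: an ED registers its site-id and IP address."""
        self.onl[site_id] = ip

    def neighbor_list(self):
        """Steps 3 and 4: the list the AS unicasts to every neighbor."""
        return dict(self.onl)


def hello_targets(own_ip, onl):
    """Step 5: each ED unicasts hellos and updates to every other oNL entry."""
    return sorted(ip for ip in onl.values() if ip != own_ip)


adjacency_server = AdjacencyServer()
for site_id, ip in [("Site 1", "IP A"), ("Site 2", "IP B"), ("Site 3", "IP C"),
                    ("Site 4", "IP D"), ("Site 5", "IP E")]:
    adjacency_server.register(site_id, ip)

onl = adjacency_server.neighbor_list()
targets_for_b = hello_targets("IP B", onl)
```

The sketch makes the cost of the unicast-only mode visible: every ED sends one copy of each hello or update per neighbor, instead of one copy to a multicast group.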
OTV Encapsulation
Consideration
 OTV adds a 42 Byte IP encapsulation
 The OTV shim header contains VLAN ID, Overlay number and CoS
 The OTV Edge Devices do NOT perform packet fragmenting and
reassembling. A packet failing the MTU is dropped by the Forwarding
Engine
 Make sure that [xB + 42B] < DCI MTU… where x = Size of
original packet
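[Figure: layout of the encapsulated frame]
DMAC (6B) | SMAC (6B) | EtherType (2B) | IP Header (20B) | OTV Shim (8B) | Original Frame (DMAC, SMAC, 802.1Q, EtherType, Payload) | CRC (4B)
The VLAN and CoS fields of the original 802.1Q tag are carried in the OTV shim, and CoS maps to ToS in the IP header.
42 Byte encapsulation: 6B + 6B + 2B + 20B + 8B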
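The MTU rule can be checked with simple arithmetic. The Python sketch below applies the inequality exactly as the slide states it, with a strict "<"; the function names are made up for the example.

```python
OTV_OVERHEAD_BYTES = 42  # 14B outer Ethernet + 20B IP header + 8B OTV shim


def fits_dci_mtu(original_size, dci_mtu):
    """True if [x + 42] < DCI MTU, following the slide's strict inequality."""
    return original_size + OTV_OVERHEAD_BYTES < dci_mtu


def max_original_size(dci_mtu):
    """Largest original packet size x that still satisfies the rule above."""
    return dci_mtu - OTV_OVERHEAD_BYTES - 1
```

For example, `fits_dci_mtu(1500, 1500)` is false, so a 1500-byte packet is dropped on a 1500-byte DCI path because Edge Devices do not fragment, while `fits_dci_mtu(1500, 9216)` is true.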
OTV Automated Multi-homing
Per-VLAN Load Balancing
 Multi-homing detection is fully automated and requires no additional
protocols or configuration
 The Edge Devices within a site discover each other over the “otv site vlan”.
 In each site OTV elects one of the Edge Devices to be the Authoritative
Edge Device (AED) for a subset of the extended VLANs
 In a dual-homed site the VLANs are split into odd and even VLANs
 The AED:
forwards traffic to and from the overlay
advertises MAC addresses for any given site/VLAN
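[Figure: two dual-homed sites connected over the transport; in each site the two Edge Devices share the AED role, each being AED for part of the VLANs.]

MAC TABLE
VLAN  MAC    IF
100   MAC 1  IP A
101   MAC 2  IP B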
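The odd/even split in a dual-homed site can be illustrated with a short Python sketch. Which device takes the even VLANs, and the modulo generalisation to more than two devices, are assumptions made for this example; the actual election is performed by OTV itself.

```python
def elect_aed(vlan, edge_devices):
    """Pick the AED for a VLAN by ordering the site's EDs and using VLAN parity.

    Assumption: the EDs are ordered by their identifier and the VLAN number
    modulo the number of EDs selects the AED. With two EDs this gives the
    odd/even split described above.
    """
    ordered = sorted(edge_devices)
    return ordered[vlan % len(ordered)]


site_eds = ["IP B", "IP A"]
aed_by_vlan = {vlan: elect_aed(vlan, site_eds) for vlan in (100, 101, 102, 103)}
```

With these inputs VLANs 100 and 102 map to IP A and VLANs 101 and 103 map to IP B, which matches the MAC table in the figure, where VLAN 100 is reached through IP A and VLAN 101 through IP B.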
OTV Layer 2 Fault Isolation
 STP isolation – No configuration required
• No BPDUs forwarded across the overlay
• STP remains local to each site
• Edge device internal interfaces behave as any other switchport
 Unknown unicast isolation – No configuration required
• No unknown unicast frames flooded onto the overlay
• Assumption is that end stations are not silent
• Option for selective unknown unicast flooding (for certain applications)
 Proxy ARP cache for remote-site hosts – On by default
• On ARP request for remote host, request forwarded through OTV and initial
ARP reply generated by that host
• OTV edge device snoops ARP replies and caches data
• Subsequent ARP replies proxied by local OTV edge device using ARP cache
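The proxy ARP behaviour can be sketched as a small cache in Python. The `ArpProxy` class and its return values are hypothetical and only illustrate the order of events; cache ageing is not modelled.

```python
class ArpProxy:
    """ARP cache kept by the local OTV edge device for remote-site hosts."""

    def __init__(self):
        self.cache = {}  # remote host IP -> MAC

    def snoop_reply(self, ip, mac):
        """Cache the data of an ARP reply seen coming from the overlay."""
        self.cache[ip] = mac

    def handle_request(self, target_ip):
        """Proxy the reply from the cache, else forward the request over OTV."""
        mac = self.cache.get(target_ip)
        if mac is None:
            return ("forward_over_overlay", None)
        return ("proxy_reply", mac)


proxy = ArpProxy()
first = proxy.handle_request("10.0.0.20")    # example address, not from the slides
proxy.snoop_reply("10.0.0.20", "MAC 3")      # initial reply generated by the host
second = proxy.handle_request("10.0.0.20")
```

The first request crosses the overlay and is answered by the remote host; the second is answered locally from the cache, which keeps repeated ARP traffic off the DCI link.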
MAC Mobility
[Figure legend: local MAC = blue, remote MAC = red. Each site, West and East, has two Edge Devices, one of which is the AED.]
1. The server with MAC X moves from site West to site East.
2. The AED in site East detects that MAC X is now local and advertises MAC X with a metric of zero.
3. The EDs in site West see the MAC X advertisement with a better metric from site East and change MAC X to a remote MAC address.
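A minimal Python sketch of the mobility sequence follows. The slides only state that the new AED advertises MAC X with a metric of zero and that this metric is better; the default metric of 1 for an ordinary local entry, and the rule that a lower metric wins, are assumptions made for illustration.

```python
LOCAL_METRIC = 1  # assumed metric of a normal local advertisement


class EdgeDevice:
    def __init__(self, site):
        self.site = site
        self.table = {}  # mac -> (kind, site, metric)

    def learn_local(self, mac, metric=LOCAL_METRIC):
        """Install a local entry and return the advertisement to send."""
        self.table[mac] = ("local", self.site, metric)
        return {"mac": mac, "site": self.site, "metric": metric}

    def receive(self, adv):
        """Switch to a remote entry if the advertisement has a better (lower) metric."""
        current = self.table.get(adv["mac"])
        if current is None or adv["metric"] < current[2]:
            self.table[adv["mac"]] = ("remote", adv["site"], adv["metric"])


west = EdgeDevice("West")
east = EdgeDevice("East")
east.receive(west.learn_local("MAC X"))   # before the move: local in West
adv = east.learn_local("MAC X", metric=0)  # after the move: East AED detects MAC X
west.receive(adv)
```

After the move, West holds MAC X as a remote entry pointing at East, and East holds it as local.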
OTV VDC
OTV VDC Models
 Two different deployment models are considered for the OTV VDC:
 OTV Appliance on a Stick
 Inline OTV Appliance
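[Figure: the two models side by side. OTV Appliance on a Stick: common uplinks to the transport for Layer 3 and DCI. Inline OTV Appliance: a dedicated uplink for DCI, separate from the uplinks to the Layer 3 transport. In both models the OTV VDC has a Join Interface and an Internal Interface, and the SVIs sit at the L2/L3 boundary.]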
 No difference in OTV functionality between the two models
 The Inline OTV Appliance requires availability of Core downstream
links
OTV Edge Device at the Aggregation
OTV at the Aggregation w/ L2-L3 Boundary
 DC Core performs only Layer 3 role
 ARP, STP and unknown unicast domains isolated between PODs
 Inter or Intra-DC LAN extension provided by OTV
 Ideal for single aggregation block topology
Recommended for Greenfield
[Figure: Core, Aggregation and Access layers. Each aggregation block has a pair of devices hosting the SVIs and an OTV VDC, with vPC toward the access layer. Legend: Join Interface, Internal Interface, Virtual Overlay Interface.]
OTV Edge Device at the Core
OTV at the DC Core with L2–L3 boundary at the Aggregation
Option 1 – Dedicated devices to perform OTV
 Physical devices or VDCs carved out from the Nexus 7000 deployed in the core
 Separated infrastructure to provide Layer 2 extension and Layer 3 connectivity services
 VLANs extended from Agg Layer
Easy deployment for Brownfield
Recommended to use separate physical links for L2 & L3 traffic
[Figure: dedicated OTV devices with dedicated uplinks for DCI, alongside dedicated uplinks for Layer 3. A loop-free hub-and-spoke Layer 2 topology connects them to the aggregation layer (vPC/VSS, L2/L3 boundary), with vPC toward the access layer.]
OTV Edge Device at the Core
OTV at the DC Core with L2–L3 boundary at the Aggregation
Option 2 – Common Devices for DCI and Layer 3
 Easy deployment for Brownfield
 DC Core devices perform Layer 3 and OTV functionalities
 HSRP Localization at each POD
 VLANs extended from Agg Layer
Recommended to use separate physical links for L2 & L3 traffic
STP and L2 broadcast domains not isolated between PODs
[Figure: core devices running OTV with common uplinks for DCI and Layer 3. A loop-free hub-and-spoke Layer 2 topology connects them to the aggregation layer (vPC/VSS, L2/L3 boundary); the Layer 2 links toward the core carry only the OTV extended VLANs. vPC toward the access layer.]
Deploy OTV at the Core
OTV at the DC Core with L2–L3 boundary at the Core
 Easy deployment for Brownfield
 L2-L3 boundary in the DC core
 DC Core devices perform L2, L3 and OTV functionalities
 Requires a dedicated OTV VDC in the core Nexus
 OTV deployed in the DC core to provide LAN extension services to remote sites
 Intra-DC LAN extension provided by bridging through the Core
 VSS/vPC recommended to create a loop-free STP topology
 Storm-control between PODs
OTV VDC Two possible approaches
[Figure: two topologies, each with a DCI Edge layer (N7K1-VDCB, N7K2-VDCB) above an Aggregation layer (N7K1-VDCA, N7K2-VDCA).]

Warning
 Only AED forwards the traffic to and from OTV Overlay
 DCI traffic hashed to OTV Edge (non-AED) device will have to traverse the vPC Peer-Link between the two DCI Edge switches

 Single vPC Layer at the Aggregation.
 Provides good level of resiliency with the minimum amount of ports.
 DCI traffic is always forwarded directly to the OTV AED device (mac-address-table)
Path Optimization
Egress Routing Localization – OTV Solution
 The approach is to use the same HSRP group in all sites and therefore
provide the same default gateway MAC address.
 Each site pretends that it is the sole existing one, and provides optimal egress
routing of traffic locally.
 OTV achieves Egress Routing Localization by filtering the HSRP hello
messages between the sites, therefore limiting the “view” of what other
routers are present within the VLAN.
 ARP requests are intercepted at the OTV edge to ensure the replies are from
the local active gateway (GWY).
[Figure: sites West and East each have their own active gateway (Active GWY Site 1, Active GWY Site 2) at the L2/L3 boundary. FHRP hellos and ARP traffic are kept local to each site.]
Filtering Configuration for HSRP Localization
To be applied in the OTV VDC

Step 1: Filter HSRP packets in the OTV VDC (VACL option or Port ACL option)

VACL option:
ip access-list hsrp
  10 permit udp any 224.0.0.2/32 eq 1985
  20 permit udp any 224.0.0.102/32 eq 1985
ip access-list all-ips
  10 permit ip any any
vlan access-map hsrp-localize 10
  match ip address hsrp
  action drop
vlan access-map hsrp-localize 20
  match ip address all-ips
  action forward
vlan filter hsrp-localize vlan-list <OTV-VLANs>

Port ACL option:
ip access-list otv-hsrp-filter
  10 deny udp any 224.0.0.2/32 eq 1985
  20 deny udp any 224.0.0.102/32 eq 1985
  30 permit ip any any
interface x/y
  description [OTV internal interfaces]
  ip port access-group otv-hsrp-filter in

In both options the 224.0.0.102 entry covers HSRPv2.

Step 2: Filter VIP MAC advertisements in OTV

mac-list hsrp-vmac seq 10 deny 0000.0c07.ac00 ffff.ffff.ff00
mac-list hsrp-vmac seq 20 deny 0000.0c9f.f000 ffff.ffff.f000
mac-list hsrp-vmac seq 30 permit 0000.0000.0000 0000.0000.0000
route-map hsrp-filter permit 10
  match mac-list hsrp-vmac
otv-isis default
  vpn overlay<#>
    redistribute filter route-map hsrp-filter

The 0000.0c9f.f000 entry covers the HSRPv2 virtual MAC range.
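To see which addresses the Step 2 mac-list entries catch, the mask matching can be reproduced in Python. This is a sketch of the matching logic only, not of how NX-OS evaluates the list internally; the final permit entry is numbered 30 here so that the sequence numbers are unique, and the implicit deny at the end of the list is an assumption.

```python
def mac_to_int(mac):
    """Convert a dotted MAC such as 0000.0c07.ac01 to an integer."""
    return int(mac.replace(".", ""), 16)


# (sequence, action, pattern, mask) taken from the mac-list entries
MAC_LIST = [
    (10, "deny", "0000.0c07.ac00", "ffff.ffff.ff00"),    # HSRPv1 virtual MACs
    (20, "deny", "0000.0c9f.f000", "ffff.ffff.f000"),    # HSRPv2 virtual MACs
    (30, "permit", "0000.0000.0000", "0000.0000.0000"),  # everything else
]


def evaluate(mac, mac_list=MAC_LIST):
    """Return the action of the first entry whose masked pattern matches."""
    value = mac_to_int(mac)
    for _seq, action, pattern, mask in sorted(mac_list):
        m = mac_to_int(mask)
        if (value & m) == (mac_to_int(pattern) & m):
            return action
    return "deny"  # assumed implicit deny when nothing matches
```

For example, `evaluate("0000.0c07.ac0a")` and `evaluate("0000.0c9f.f123")` both return "deny", so those virtual MACs are not advertised, while an ordinary host MAC such as `0050.56aa.bbcc` returns "permit".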
Distributed Workload Mobility
Outbound Traffic with Services
 FHRP localization is not possible, because request and reply need to pass through the same service device pair
 Source NAT for symmetric flow
 Traffic incurs DCI latency
[Figure: N7K1-VDCA to N7K4-VDCA connected across the DCI, with firewall and load balancer (LB, SNAT) pairs. State is created before vMotion; the traffic path is shown before and after the LD vMotion.]
Distributed Workload Mobility
Inbound Traffic using RHI
 Route Health Injection (RHI) uses the ACE load balancer to inject a /32 host
route once the virtual machine moves
[Figure: N7K1-VDCA to N7K4-VDCA connected across the DCI, with load balancers; RHI injects the /32 route. The traffic path is shown before and after the LD vMotion.]
Path Optimization
Ingress Routing Optimization with LISP

Terms: End-point host ID (EID), Route Locator (RLOC), Ingress Tunnel Router (ITR), Egress Tunnel Router (ETR)

Prefix        Route Locator (RLOC)
10.10.10.1    A, B
10.10.10.2    A, B
…             …
10.10.10.5    C, D
10.10.10.6    C, D

1) ITR consults directory to get Route Locator (RLOC) for the destination End-point ID (EID)
2) ITR IPinIP encapsulates traffic to send it to the RLOC address
3) ETRs receive and decapsulate traffic

 Granular reachability information for hosts in extended subnet
 If a host moves, its mapping is updated
 No end-host state in routing tables

[Figure: a packet with IP_DA = 10.10.10.1 reaches the ITR, is encapsulated with IP_DA = A, crosses a core that holds RLOC routes only, and is decapsulated by the ETR. RLOCs A and B serve Pod A, RLOCs C and D serve Pod N. The EIDs 10.10.10.1 to 10.10.10.8 belong to the extended subnet 10.10.10.0/24, which is stretched between the pods with OTV.]
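The three LISP steps can be sketched with a dictionary acting as the mapping directory. The function names are hypothetical, and choosing the first listed RLOC is an assumption made to keep the example short; real ITRs select RLOCs by priority and weight.

```python
# EID -> RLOCs, taken from the mapping table above
MAPPING = {
    "10.10.10.1": ["A", "B"],
    "10.10.10.2": ["A", "B"],
    "10.10.10.5": ["C", "D"],
    "10.10.10.6": ["C", "D"],
}


def itr_encapsulate(packet, mapping):
    """Steps 1 and 2: look up the RLOC for the destination EID and encapsulate."""
    rlocs = mapping[packet["ip_da"]]
    return {"ip_da": rlocs[0], "inner": packet}  # first RLOC chosen for simplicity


def etr_decapsulate(outer):
    """Step 3: the ETR strips the outer header and recovers the original packet."""
    return outer["inner"]


def move_host(mapping, eid, new_rlocs):
    """If a host moves, only its mapping entry is updated."""
    mapping[eid] = list(new_rlocs)


packet = {"ip_da": "10.10.10.1"}
before = itr_encapsulate(packet, MAPPING)
move_host(MAPPING, "10.10.10.1", ["C", "D"])
after = itr_encapsulate(packet, MAPPING)
```

Before the move the outer destination is RLOC A; after the mapping update the same EID is sent toward RLOC C, with no host route added to the core routing tables.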
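OTV Applications in the Enterprise Network
Departments are physically dispersed, and VLANs must be assigned per department
Mobile working across the campus
Network migration
A group's backbone network provides Layer 2 channels for its subsidiary units
And so on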
Challenges with LAN Extensions
Real Problems Solved by OTV
 Extensions over any transport (IP, MPLS)
 Failure boundary preservation
 Site independence / isolation
 Optimal BW utilization (no head-end replication)
 Resiliency/multihoming
 Built-in end-to-end loop prevention
 Multisite connectivity (inter and intra DC)
 Scalability
   VLANs, sites, MACs
   ARP, broadcasts/floods
 Operations simplicity (only 5 CLI commands)
[Figure: a North Data Center and a South Data Center joined by a LAN extension, each containing its own fault domains.]
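Current Limitations of OTV
IETF draft; not yet a formal standard
Convergence time (3 s to 30 s)
Only a small number of sites is currently supported, which makes it unsuitable for aggregation-layer deployment
SVI limitation
Traffic load-balancing issues with the current per-VLAN AED
The backbone must currently support multicast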