Efficient Restoration Capacity Design in MPLS Networks
Guangzhi Li, Dongmei Wang, Jennifer Yates, Chuck Kalmanek, Robert Doverspike
AT&T Labs (Research), NJ
Extended Abstract: With the development of Multi-Protocol Label Switching (MPLS)
technology, increasing numbers of Internet Service Providers (ISPs) are evolving their networks
into a common MPLS core. In MPLS core network design, rapid service restoration after a
network failure is critical to meet the strict requirements of packet delay/loss sensitive
applications such as voice over IP.
There has been a great deal of recent work on restoration schemes in MPLS networks,
including those based on routing protocol reconvergence and path-based restoration. In routing
protocol convergence schemes, new label switched paths (LSPs) will be automatically established
along the shortest paths selected as the routing protocol converges upon network failure. In
path-based restoration, two LSPs are typically pre-established: a primary LSP and a restoration LSP.
Traffic is switched to the restoration LSP on failure of the primary LSP. Since restoration LSPs
do not consume bandwidth before their primary LSPs fail, the restoration LSPs are able to share
restoration capacity on common links as long as their primary LSPs do not fail together. Thus,
optimization algorithms can maximize the restoration LSP sharing to reduce the bandwidth
capacity requirements. Efficient restoration capacity sharing strongly relies on effective selection
of LSP routes, especially restoration LSP routes.
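The sharing argument can be illustrated with a small sketch. It is illustrative only: the function name `restoration_capacity`, the data layout, and the example demands are assumptions made here, not part of the scheme described in this paper. For a single link, the shared requirement is the largest total bandwidth that any one failure moves onto the link, whereas a design without sharing would reserve the sum of all restoration LSP bandwidths.

```python
from collections import defaultdict

def restoration_capacity(restoration_lsps):
    """Return (shared, dedicated) restoration capacity needed on one link.

    restoration_lsps: list of (bandwidth, failures) pairs, one per restoration
    LSP crossing the link. `failures` is the set of failure scenarios that take
    down the matching primary LSP (a hypothetical representation).
    """
    per_failure = defaultdict(float)
    for bandwidth, failures in restoration_lsps:
        for s in failures:
            per_failure[s] += bandwidth
    # Sharing: only LSPs whose primaries fail together add up.
    shared = max(per_failure.values(), default=0.0)
    # No sharing: every restoration LSP gets its own reservation.
    dedicated = sum(bandwidth for bandwidth, _ in restoration_lsps)
    return shared, dedicated

# Three restoration LSPs on the same link; two primaries share failure "span-A".
lsps = [(10.0, {"span-A"}), (5.0, {"span-B"}), (4.0, {"span-A", "span-C"})]
shared, dedicated = restoration_capacity(lsps)
print(shared, dedicated)
```

In this made-up example only the two LSPs affected by "span-A" are ever active together, so the shared reservation is smaller than the dedicated one.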
There has been little work to date investigating efficient restoration path selection in MPLS
(packet) networks. However, there has been significant work on a related problem, namely
restoration path selection in reconfigurable optical (circuit) networks. In optical networks, the
LSPs are bi-directional and LSP bandwidth is granular (e.g., taking on only fixed, defined
SONET/SDH rates). In contrast, in MPLS networks, the LSPs are uni-directional and they can
have any bandwidth. Furthermore, once the application traffic is switched to the restoration LSP,
the bandwidth along the primary LSP is released automatically in packet networks, whereas
in optical networks this bandwidth is either not released or must be released through explicit
signaling.
In this paper, we extend our proposed restoration path selection scheme for optical networks [1]
to provide an efficient restoration LSP route selection algorithm for MPLS networks. We
compare the proposed scheme’s bandwidth efficiency and restoration time with other well-known
algorithms.
To design a restorable MPLS network, we assume that the network operator has full knowledge
of the network topology and the traffic demands, including the physical fiberspan information.
This information is used to design and dimension the network under particular failure scenarios.
To do this, we specify the set of failure scenarios to be considered (e.g., all single router and
fiberspan failures) and then enumerate each failure scenario in turn, routing the traffic in response
to the failure. The capacity dimensioned for each link is the maximum required across all failure
scenarios.
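The dimensioning procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: traffic is rerouted on the shortest surviving path (one possible response to a failure), each failure scenario is represented as a set of failed undirected links, and the names `make_adj`, `shortest_path`, and `dimension` as well as the three-node topology are hypothetical.

```python
import heapq
from collections import defaultdict

def make_adj(links):
    """Build a symmetric adjacency map from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in links:
        adj[u][v] = w
        adj[v][u] = w
    return adj

def shortest_path(adj, src, dst, failed=frozenset()):
    """Dijkstra that skips failed links; returns directed hops or None."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if frozenset((u, v)) in failed:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    hops, node = [], dst
    while node != src:
        hops.append((prev[node], node))
        node = prev[node]
    return hops[::-1]

def dimension(adj, demands, failure_scenarios):
    """Capacity per directed link: the maximum load over all scenarios."""
    capacity = defaultdict(float)
    # The failure-free case is evaluated first, then each failure in turn.
    for failed in [frozenset()] + list(failure_scenarios):
        load = defaultdict(float)
        for src, dst, bw in demands:
            for hop in shortest_path(adj, src, dst, failed) or []:
                load[hop] += bw  # unroutable demands are skipped
        for hop, used in load.items():
            capacity[hop] = max(capacity[hop], used)
    return dict(capacity)

links = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 1.0)]
adj = make_adj(links)
demands = [("A", "B", 10.0), ("A", "C", 5.0)]
# All single-link failures; a router failure could be modeled as the set of
# its incident links.
failures = [frozenset({frozenset((u, v))}) for u, v, _ in links]
capacity = dimension(adj, demands, failures)
print(capacity)
```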
Routing protocols (such as OSPF and IS-IS) have been used for many years in packet networks
and are thus widely deployed and understood. Routing protocol convergence methods in MPLS
networks establish new routes once the routing tables have been updated after a network topology
change, such as a network failure or repair. Their disadvantage for restoration of
network failures is that some implementations can take tens of seconds or more to settle.
This restoration time comprises the failure detection time, the routing protocol convergence
time (i.e., the time to update the next-hop tables), and the time to create the new routes.
Networks that utilize hop-by-hop packet forwarding make routing decisions at each network
node before forwarding a packet. These decisions must be consistent at each node to avoid
routing loops, and thus the route selection algorithm must be identical at each network node.
Route selection algorithms in routing convergence protocols are standardized. However, in
path-based restoration schemes, LSPs are signaled along a selected path, and the selection is made only by the
source node of each LSP. Thus, the route selection algorithm is not standardized and, hence, open
to innovation. Disjoint shortest path selection is a commonly used algorithm: for each demand,
the shortest route is selected as the service LSP (since the service LSPs cannot be shared and the
shortest path is locally optimal), and the restoration LSP is then selected as the disjoint shortest
path. In path-based restoration, the restoration time includes the failure detection and notification
times, along with the time required to switch the traffic to the restoration LSP. Overall, this is
typically faster than the routing convergence schemes, taking less than a second to a few seconds
depending on failure detection mechanisms. However, additional complexity is introduced as
enhanced MPLS signaling protocols (MPLS-TE) must be deployed within networks
implementing path-based restoration schemes and additional intelligence is required in the source
LSR to compute the LSP routes.
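A minimal sketch of the disjoint shortest path selection described above is given below, assuming link-disjointness on an undirected graph. The names `dijkstra` and `disjoint_shortest_pair` and the four-node topology are hypothetical, and shared fiberspans between different links are not modeled.

```python
import heapq

def dijkstra(adj, src, dst, banned=frozenset()):
    """Shortest node path from src to dst avoiding undirected links in `banned`."""
    heap = [(0.0, src, [src])]
    seen = set()
    while heap:
        cost, u, path = heapq.heappop(heap)
        if u == dst:
            return path
        if u in seen:
            continue
        seen.add(u)
        for v, w in adj.get(u, {}).items():
            if v not in seen and frozenset((u, v)) not in banned:
                heapq.heappush(heap, (cost + w, v, path + [v]))
    return None

def disjoint_shortest_pair(adj, src, dst):
    """Service LSP on the shortest path, restoration LSP on the shortest
    path that avoids every link of the service LSP."""
    service = dijkstra(adj, src, dst)
    if service is None:
        return None, None
    used = frozenset(frozenset(hop) for hop in zip(service, service[1:]))
    restoration = dijkstra(adj, src, dst, banned=used)
    return service, restoration

adj = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
service, restoration = disjoint_shortest_pair(adj, "A", "D")
print(service, restoration)
```

Because the service path is fixed before the restoration path is searched, this two-step approach can return no restoration path on some topologies even when a disjoint pair exists.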
The disjoint shortest path selection algorithm is simple, but inefficient because the LSP route
selection algorithm does not consider restoration bandwidth sharing. We instead propose a more
efficient algorithm for restoration LSP route selection in MPLS networks. For simplicity, we
discuss the proposed algorithm here in the context of a centralized server implementation,
although the algorithm can also be implemented in a distributed way. We start with a set of
demands, order them in some fashion, and then route each one in sequence. For each demand d,
the service LSP is always routed along the shortest route P_s. To select the restoration route P_r,
we define the matrix failneed(s,k_j) to be the amount of restoration capacity required on link k in
direction j to restore all failed service LSPs when failure s occurs, where j=1,2 indexes the two
directions, k_1 and k_2. The bandwidth required for service LSPs on a given uni-directional link k_j is
denoted as S(k_j). Since MPLS links usually have the same bandwidth in both directions, if we
denote total(s,k) as the total bandwidth required on link k when failure s occurs, then total(s,k) =
max(failneed(s,k_1)+S(k_1), failneed(s,k_2)+S(k_2)). Similarly, we define failrelease(s,k_j) and
failreroute(s,k_j) to be the capacity released and rerouted, respectively, on link k in direction j after
service traffic reroutes upon failure s. Then, failneed(s,k_j) = max(0, failreroute(s,k_j)-failrelease(s,k_j)).
The total unused capacity (including both spare and restoration capacity) on a link in direction j is
then calculated as the maximum capacity required across all failure scenarios, i.e., U(k_j) = max_s
(total(s,k)-S(k_j)). We also define M(k_j) as the maximum required capacity on uni-directional link
k_j upon P_s failure, which is calculated as max_s failneed(s,k_j) over the set of possible failure
scenarios along P_s. The routing weights on link k are denoted as w(k). We then select the
restoration route as the shortest path defined by the following link weights:
(b  C ( k j )) / b  w( k )

v( k j )  


if b  C (k j )  0 and k  Ps
if b  C (k j )  0 and k  Ps
if k share at least one failure with links in Ps
where C(k_j) = U(k_j)-M(k_j) represents the existing restoration capacity on uni-directional link k_j,
b is the capacity requirement of demand d, and ε is a very small positive value that is much less
than the weight of a link. The link weights are selected so as to route restoration paths along links
that need the least additional capacity: if the restoration LSP for demand d is routed
over uni-directional link k_j and the bandwidth requirement of demand d does not exceed C(k_j),
then the total unused capacity required on uni-directional link k_j does not increase.
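The weight computation can be sketched as below. This is a hedged illustration rather than the authors' implementation: the function name `restoration_weights`, the dictionary layout, the value of `EPSILON`, and the example numbers are assumptions, and links that share a failure with the service path are assumed to receive an infinite weight so that they are never selected. The restoration route would then be the shortest path under the returned weights.

```python
import math

EPSILON = 1e-6  # assumed value; it only needs to be far below any link weight

def restoration_weights(links, weight, S, failneed, link_failures,
                        ps_failures, scenarios, b):
    """Return v[(k, j)] for each uni-directional link (k, j), j in (1, 2).

    weight[k]              routing weight w(k)
    S[(k, j)]              service bandwidth on direction j of link k
    failneed[(s, (k, j))]  restoration capacity needed on (k, j) under failure s
    link_failures[k]       failure scenarios that take link k down
    ps_failures            failure scenarios along the service path
    scenarios              all failure scenarios considered
    b                      bandwidth of the demand being routed
    """
    v = {}
    for k in links:
        # total(s, k): both directions are dimensioned to the larger need.
        total = {
            s: max(failneed.get((s, (k, j)), 0.0) + S.get((k, j), 0.0)
                   for j in (1, 2))
            for s in scenarios
        }
        for j in (1, 2):
            kj = (k, j)
            if link_failures.get(k, set()) & ps_failures:
                v[kj] = math.inf  # assumed: never reuse a co-failing link
                continue
            U = max((total[s] - S.get(kj, 0.0) for s in scenarios), default=0.0)
            M = max((failneed.get((s, kj), 0.0) for s in ps_failures), default=0.0)
            C = U - M  # restoration capacity this demand can share
            v[kj] = EPSILON if b - C <= 0 else (b - C) / b * weight[k]
    return v

example = dict(
    links=["x", "y"],
    weight={"x": 10.0, "y": 10.0},
    S={("x", 1): 20.0, ("x", 2): 20.0},
    failneed={("s1", ("x", 1)): 2.0, ("s2", ("x", 1)): 8.0},
    link_failures={"y": {"s1"}},
    ps_failures={"s1"},
    scenarios={"s1", "s2"},
)
small = restoration_weights(b=4.0, **example)  # fits in shared capacity
large = restoration_weights(b=9.0, **example)  # needs extra capacity
print(small[("x", 1)], large[("x", 1)], large[("y", 1)])
```

In this example a demand of 4 units fits entirely within the sharable capacity of link x and receives the negligible weight, while a demand of 9 units is charged in proportion to the extra capacity it would require.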
To compare the bandwidth efficiency of the three restoration methods, we simulated them on a
simplified US backbone MPLS network with 18 nodes, 32 OC-192 links, and full-mesh
uni-directional demands. The failure set included all of the fiberspans and backbone LSRs. Compared to
the shortest-path based (reconvergence) scheme, our results show a 6.8% bandwidth reduction using
disjoint shortest path routing and a 12.6% bandwidth reduction using our proposed restoration
path selection algorithm. These results demonstrate that path-based
restoration schemes outperform the shortest-path based (reconvergence) schemes, and that our
proposed route selection algorithm significantly outperforms the simpler disjoint shortest-path
algorithm for path-based restoration.
[1] Guangzhi Li, Dongmei Wang, Charles Kalmanek, and Robert Doverspike, "Efficient
Distributed Path Selection for Shared Restoration Connections," IEEE Infocom, New York, Vol.
1, pp. 140-149, 2002.