Implementing An Differentiated Services Architecture on IXP1200 18-884 Network Design and Evaluation Design Document Tung-fai Chan, Jiann-min Ho, Jia-cheng Hu, Igor Pevac {tungfai, jiannmin, jiacheng, ipevac}@andrew.cmu.edu Abstract This project will use Intel's IXP 1200 packet processor to implement Differentiated Services Architecture base on the paper propose by K. Nichols, V. Jacobson and L. Zhang [1]. We will examine the effectiveness of our implementation to see how the service is improved and what side effect is incurred. Meanwhile, we will also put emphasis on the bandwidth broker and the signaling protocol. 1) Introduction The basic idea of Differentiated Service (DiffServ) is to define the architecture and a set of forwarding behaviors so that the service providers can identify and implement end-to-end services on top of this architecture. This framework will push complexity to the edge routers to the extent possible and keep the core router simple. The edge routers will check the traffic flow coming into the switch, classify it, mark appropriate bits on it, shape the traffic flow or drop the packet if necessary, and forward to the core network. The core routers will just examine the bits pattern and transmit the packet as appropriate. The forwarding behavior is predefined in profile through the negotiation between service providers. In this project, we will define three different types of service, 'Assured', 'Premium' and 'Best effort'. Each type will provide different forwarding behavior and bandwidth is the resource being requested and allocated. We will briefly explain these terms in the following. Assured service will follow "expected capacity" usage profiles that are statistically provisioned. The assurance that user of such service receives is that such traffic is 'unlikely' to be dropped as long as it observes the expected capacity profile. An Assured service traffic flow may exceed its profile, but the excess traffic is not given the same assurance level. On the other hand, a Premium service traffic flow is hard-limited to its provisioned peak rate. Traffic flows within the predefined bandwidth will be guaranteed, but bursts over the profile limit maybe be queued at edge router and may not be injected into the network promptly. Finally, the 'Best effort' service will not be given any assurance, but they will be sent as long as there is bandwidth available. In our implementation, we will focus on the interaction between routers. We will incorporate the concept of 'Bandwidth Broker' and regulate forwarding behaviors based on the result of negotiation between bandwidth brokers. The core router will maintain the minimal knowledge of its neighboring routers' profile. This design report is organized into sections. Section 2 will describe design and implementation by flowbasis; an evaluation plan is propose to examine this project in section 3; and the last section will present our implementation plan. 2) Design and Implementation 2.1) Admission Control Admission control is a critical part of any Quality of Service scheme. It is very important to know the limitations at which a particular domain can be reserved in order to assure that the reservations made 1 receive the service promised to them. As part of our design we need to examine admission control for both Premium and Assured classes of service. Premium class is very strict in the way traffic is sent. It is regulated by bandwidth and shaped to eliminate any bursts. Being such it receives absolute priority in the network. As such it is crucial that there is not an over reservation of this type of traffic, since it will starve the remaining traffic in the network. Assured service is a little less strict in its requirements for flow allowing periods of bursts. However, it also makes fewer promises about the priority that this traffic will receive over the best effort traffic. Thus the admission control for this type of service need not be a stringent in its allocation. The approach that we are taking in our design is the admission control through use of a central authority. This is described as a Bandwidth Broker in [1]. The information that the Bandwidth Broker keeps is the allocation on the leaf nodes of the network (edge routers). By knowing how much of the resources are already reserved it can answer requests form incoming reservations. Once it is determined that a reservation can be made, the Bandwidth Broker sends the specifics of the flow to the corresponding edge router to allocate the resources towards classifying the flow and providing the level of service requested. The state kept is a soft state and is refreshed by the Bandwidth Broker for the specified duration of the reservation. D Edge Edge Core BB S to D P@256kb/s 2am-4am S to D P@256kb/s Edge Classifier S to D Rate 256kb/s Type P S Figure 1 Admission Control for local domain The problem of admitting a flow becomes a bit trickier when multiple domains are involved. In this case their needs to be an agreement between the domains on the link shared exactly how to treat the traffic sent. Once the agreement is made the information need to be entered into the Bandwidth Broker. It should be in the form Domain A agrees to receive X amount of specific class of traffic from Domain B, and likewise Domain be makes a similar agreement. So a reservation across domain starts in the same way a reservation in a singe domain. The sender requests the reservation to its Bandwidth Broker, it in turn makes sure the domain is able to support it, and forwards the reservation to the next domain. The Bandwidth Broker of the second domain look to see if the flow will exceed the amount of traffic specified between the 2 domains, as well as to see if second domain is able to accommodate the flow. If it is satisfactory it sends the flow information to the edge router and a replay to the Bandwidth Broker of the first domain. The first domains Bandwidth Broker send the flow information to the edge node and a confirmation to the host requesting the flow. 2 128kbs to D CMU PITT BB 128kbs to D BB Peer CMU OK Total 256 Used 128 OK S D Figure 2 Admission Control for inter-domain There is a lot of signaling information that is passed between the Bandwidth Brokers and the edge routers. It is important to establish a protocol that is robust against failures in the reservation. We plan to implement an RSVP like signaling scheme for this purpose. Also, keeping a soft state at the edge routers also prevents stale information to interfere with incoming reservation requests. In much the same way teardown of a connection is made simple. After the time for a connection expires the Bandwidth Broker simply stops refreshing the state at the edge routers. 2.2) Data Transmission 2.2.1) Edge Router After the connection setup phase, a traffic profile has been contracted between source and the ingress router. For subsequent traffic entering the domain, the ingress performs following actions before the packets are passed to the forwarding engine at the ingress router as shown in Figure 3. flow Arriving Packets Packet Classifier 1 … Forwarding Engine … flow Profile Marker n Figure 3 Diagram for Packet Flow in Edge Router The framework adopts the idea from [1] that designating two bits patterns from the IP header precedence field to indicate the corresponding service type: premium (P-bit), assured (A-bit), and best effort. For every arriving packet, it must have both the P-bit and A-bit cleared before processed by Packet Classifier. Packets are classified into different flows in which will pass through individual Profile Marker or passed to the Forwarding Engine according to service type. Each marker has been configured from the usage profile for its corresponding flow: service class, rate, and burst size. Packets coming out from markers are injected into the Forwarding Engine in which they are transmitted according to the bit-pattern in the header. 3 2.2.1.2) Packet Classifier This component is basically a general classifier, which performs a transport level flow matching, based on the information within the packet header. Packets that match to one of the configured flow profile are forwarding to corresponding Profile Marker, otherwise, to the Forwarding Engine directly. This may be a time-consuming process and keeping it at the ingress router where fewer flow can reduce its workload. The classifier will be implemented as a data plane component written in micro-code. The functionalities can be achieved by extracting field values from the packet header and compare with contracted profile. 2.2.1.2) Profile Marker The marker, according to the flow profile, carries out both policing and marking responsibilities. Assured flow packets emerge from the marker with the A-bit set when the flow is conformance to its profile, whereas for premium traffic, the marker will hold the packets whenever necessary to enforce the contracted rate. Token Bucket Queue Set P-bit Wait for token Packet Input Packet Output Token Bucket Wait for token Without Token Set A-bit With Token Figure 4 Implement Premium and Assured Services by Token Buckers Figure 4 shows the details of how the marker should do to different service types. The marker is implemented by using token bucket in micro-code fills at the flow rate in number of bytes specified in the usage profile. For Assured service, the token bucket depth represents the contracted burst size. When a token is present, packet has its A-bit set; otherwise, it is passed to the engine without any bit set. For Premium service, the incoming packet will have P-bit set and forwarded if a token is present; otherwise, the packets are held in the queue till token arrive. This is one of the new core components that added into the data plane. The token bucket can be simulated by following implementation: setup a longword variable to represent the bucket and pre-set a maximum value as the bucket depth according to the contracted burst size; increment the variable at each clock interrupt at the rate obtained from the profile; for every arriving packet, if the packet size is equal or smaller than the number of bytes represented in the bucket, the value of the variable will be decremented by the packet size and the packet will be marked and forwarded; otherwise, it will be held in the queue until the bucket has enough ‘bytes’ for premium traffic. Note that incrementing the variable (filling the bucket) must be designed and implemented carefully in order to avoid race condition since the micro-code is a multi-threaded environment. 4 2.2.1.3) Priority Queues There are two priority queues being maintained: high and low priority. Premium traffics are enqueued into high priority queue while others, including both Assured and Best-Effort traffic, are put into low priority queue. Since Premium traffic will hardly conform to the contract profile and so there should not be queue overflow problem; whereas in the low priority queue, RIO mechanism from RED is employed to manage. This algorithm uses two thresholds to determine when to drop packets. The lower one based on number of Best-Effort packets and another one based on number of packets from Assured traffic. When the queue size is smaller than the lower threshold, no packet will drop; if the size exceeds the lower threshold but below the upper threshold, only unmarked packets will be dropped according to the probability calculated following the RED mechanism. In this case, no Assured traffic will be penalized. Keeping track of accurate Assured and Best-Effort packets count can do the implementation of RIO queue management, and then make decision based on these two counters. 2.2.1.4) Forwarding Engine This basically employs absolute priority over the two queues. High priority queue with Premium packets must be serviced first before any other traffic, including both Assured and Best-Effort traffic, in another queue. 2.2.2) Core Router Following the framework described in [2], simplicity is the keep design rationale of the core router for data transmission. During this phase, the core routers carry no per-flow information. Instead, they perform traffic aggregations according to the different service types: Premium, Assured and Best-Effort. All incoming packets into the core router are classified according to the bit-pattern in the packet header. Then, they are enqueued into different priority queue in the Forwarding Engine as shown in Figure 5. Arriving Packets Packet Classifier Queue 1 Queue 2 Forwarding Engine Figure 5 Diagram of Packet Flow in Core Router 2.2.2.1) Packet Classifier This classifier, unlike the general classifier employed by edges, is in fact a simple bit-pattern classifier. It comprises a simple binary decision based on a particular bit-pattern in the IP header is set or not. This classification is done without look into other header field and this greatly enhances the efficiency and reduces packet overhead at the core router. Since core routers are usually required to handle rather heavy aggregated traffic, this simplicity is essential. 2.2.2.2) Forwarding Engine The mechanism of the Forwarding Engine at the core routers is essentially the same as edge routers. Packets are transmitted following to the priority level mentioned in previous section. 5 3) Evaluation Plan In order to evaluate the framework composed of its components (edge router, core routers and bandwidth brokers) to provide differentiated services, we need to construct and simulate several simple ISP provider domains to deploy the edge routers and the bandwidth brokers we implement. The topology inside should be generic and pre-defined such that we can easily setup reserved paths or use existing resource allocation semantics like RSVP or SNMP without complexity for analysis. We will construct several sets of traffic flows (video streams or other data streams) with different types of services (assured, premium and best effort) as testing traffic. Then, send one set of them each time from one or more hosts into the first immediate simulated domain and obtain evaluation metric information at the target. We have primarily three dimensions of evaluation: low-level, high-level, and inter-domain. In low-level, the evaluation metrics we will evaluate the performance of our implementation in the architecture are “average throughput” and “transmission latency”. The average throughput of each traffic flow of different service type is defined by the ratio of the number of sent packets and the number of received packets, while the transmission latency is fined by the average delay of all packets sent from source to destination. We can verify and analyze the streams with different service types based on above metrics. In addition, through these analyses, we can also find out the overhead our implementation added into the code. In high-level, we evaluate the design and implementation of our differentiated service framework by comparing the performance resulted from differentiated traffic flows and best effort traffic flows in single domain. The metrics being used to assess the performance will be latency, drop rate, and throughput. For inter-domain issues, it would be the same as high-level evaluation but in multi-domain environment. The major component to be demonstrated is bandwidth broker. Its performance will be evaluated by: analyzing the relation between the signaling functionalities and the resulted latencies; and then comparing with besteffort traffic without broker involvement. 4) Timeline and Planning Week of 9/24 10/1 10/8 10/15 10/22 10/29 11/5 11/12 11/19 11/26 12/3 Design Refinement Go through MicroCode in detail Data – Control Plane Interaction Design Data Plane: Packet Classifier and Token Bucket Control Plane: Bandwidth Broker and Admission Control within local domain Data Plane: Queue Management and Forwarding Engine Control Plane: Bandwidth Broker and Admission Control for interdomain Implementation Consolidation and Testing Simulation and Refinement Simulation and Evaluation Report Write-up Presentation Preparation Buffer 6 Group Member Tung-fai Chan Jiann-min Ho Jia-cheng Hu Igor Pevac Responsibilities Data Plane Data Plane and Control Plane Data Plane Control Plane References [1] Nichols, K., Jacobson, V., and Zhang, L., “A Two-bit Differentiated Services Architecture for the Internet”, November 1997. [2] Blake, S. et al, “An Architecture for Differentiated Services”, RFC 2475, December 1998. [3] Jacobson, V., “Differentiated Services Architecture”, August 1997. [4] Clark, D., and Wroclawski, J., “An Approach to Service Allocation in the Internet”, August 1997. 7