CSE5803 Advanced Internet Protocols and Applications (8)

8.1 Introduction
Some basic concepts of routing inside an autonomous system were introduced in chapter 6. This chapter discusses routing between autonomous systems.

8.2 Core Router, Peers and Default Route
• The routing table of a router inside an autonomous system (AS) contains a default route to limit the size of the table. Routing with partial information (a default route) also gives the AS relative autonomy in making local routing changes.
• Core routers were used to connect all local area networks together and knew routes to all possible destinations. They are supposed to use few or no default routes; otherwise routing becomes inefficient.
• It is also impossible for every core router to hold all global routing information, so protocols have been designed for these routers to exchange routing information.
• Peer routers are core or gateway routers that each hold part of the routing information and communicate with each other using these protocols.
• With the rapid development of the Internet, the concept of core and local networks is becoming obsolete. Protocols (BGP4) have been developed to adapt to the changes.

8.3 Interior and Exterior Gateway Protocols (IGPs & EGPs)
• IGPs operate inside an AS. These include static routing, RIP (versions 1 and 2) and OSPF (discussed later).
• IGPs are mainly concerned with hop count or link cost. An IGP reacts to a change in the topology by automatically trying to find a new best (shortest) path.
• EGPs are used to route traffic between ASes. These include EGP (becoming obsolete) and BGP (version 4 in use).
• EGPs are mainly concerned with network reachability. EGPs generally do not pass link cost information from one AS to another; even when they do, its meaning is not the same as in an IGP. The next hop in an EGP may not be on an immediately reachable link.
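The default-route idea from Section 8.2 can be illustrated with a small lookup sketch. It is not taken from the course material: the function name `lookup`, the table contents and all addresses are hypothetical, and the table is searched linearly for clarity rather than speed. A router inside an AS keeps a few specific routes plus `0.0.0.0/0`, and the most specific matching prefix wins.

```python
import ipaddress

def lookup(table, destination):
    """Return the next hop for destination using longest-prefix match.

    table maps prefix strings to next-hop strings (hypothetical layout).
    Returns None when no entry matches, i.e. there is no default route.
    """
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching entry with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

# Hypothetical table of a router inside an AS: two local routes plus a default.
local_table = {
    "10.1.0.0/16": "10.1.0.1",
    "10.2.0.0/16": "10.2.0.1",
    "0.0.0.0/0": "192.0.2.254",  # default route towards the AS border
}

print(lookup(local_table, "10.2.3.4"))  # 10.2.0.1
print(lookup(local_table, "8.8.8.8"))   # 192.0.2.254 (falls through to the default)
```

Without the `0.0.0.0/0` entry the second lookup returns `None`, which is the position of a router that uses no default route: it needs an explicit route for every destination.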
8.4 The Exterior Gateway Protocol (EGP)
• An AS needs to pass reachability information to Internet core routers and to other ASes. One or more routers in the AS are designated for this purpose, and EGP is one of the protocols they can use.
• Two routers that exchange routing information and belong to two different ASes are referred to as exterior neighbors (interior neighbors if they are inside the same AS).
• Main features of EGP
– Neighbor Acquisition: a router acquires peer routers (neighbors) so that they can communicate reachability information. Geographic proximity is irrelevant.
– Neighbor State: a router continually tests whether its EGP neighbors are responding.
– Information Exchange: EGP peers periodically exchange reachability information by passing routing update messages.

8.4.1 EGP message types and the fixed header
• EGP has ten message types to support the main features
– Acquisition Request: requests that a router become a neighbor (peer)
– Acquisition Confirm: positive response to an acquisition request
– Acquisition Refuse: negative response to an acquisition request
– Cease Request: requests termination of the neighbor relationship
– Cease Confirm: confirmation response to a cease request
– Hello: requests the neighbor to respond if alive
– I Heard You: response to a hello message
– Poll Request: requests a network routing update
– Routing Update: network reachability information
– Error: response to an incorrect message
• Common header for all EGP messages
– Version: version of EGP
– Type: type of message
– Code: used to distinguish subtypes
– Status: message dependent
– Checksum: same algorithm as the IP checksum
– Autonomous system number: the number of the sender's AS
– Sequence number: a number that the sender uses to associate replies with messages
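The common header listed above can be sketched in Python as follows. The notes do not give field widths, so the layout used here (four 1-byte fields followed by three 2-byte fields, 10 bytes in total) is an assumption taken from RFC 904. The function names, the AS number 65000 and the version value 2 are illustrative only. The checksum is the usual 16-bit ones' complement sum used by IP, computed with the checksum field set to zero.

```python
import struct

# Assumed layout (RFC 904): version, type, code, status (1 byte each),
# then checksum, AS number, sequence number (2 bytes each), network byte order.
EGP_HEADER = struct.Struct("!BBBBHHH")

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (IP-style checksum)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(version, msg_type, code, status, as_number, sequence):
    """Pack a header-only EGP message with a valid checksum."""
    unchecked = EGP_HEADER.pack(version, msg_type, code, status, 0, as_number, sequence)
    checksum = internet_checksum(unchecked)
    return EGP_HEADER.pack(version, msg_type, code, status, checksum, as_number, sequence)

def parse_header(data: bytes) -> dict:
    """Unpack the first 10 bytes of an EGP message into named fields."""
    names = ("version", "type", "code", "status", "checksum", "as_number", "sequence")
    return dict(zip(names, EGP_HEADER.unpack_from(data)))

# Hypothetical Hello message (Type=5, Code=0) from AS 65000, sequence number 1.
hello = build_header(version=2, msg_type=5, code=0, status=0, as_number=65000, sequence=1)
print(parse_header(hello))
```

A receiver can verify a message by running `internet_checksum` over the bytes as received: a correct message yields 0.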
8.4.2 EGP neighbor acquisition messages
• EGP does not specify why one router chooses another router as a neighbor.
• Message format (fields following the common header):
– Hello interval: time interval for testing whether the neighbor is alive
– Polling interval: controls the maximum frequency of routing updates
• Code:
– 0: acquisition request
– 1: acquisition confirm
– 2: acquisition refuse
– 3: cease request
– 4: cease confirm

8.4.3 EGP reachability messages
• EGP routers send hello and poll messages to their neighbors in active mode, or just listen to other routers in passive mode.
• A hello message tests the reachability of a neighbor (is it alive?). It has the same length as the common header, with Type=5 and Code=0 (request) or 1 (response).
• A poll message requests network reachability information. Compared with a hello message it has one additional field, labeled IP SOURCE NETWORK, which specifies the common network to which both neighbors are attached.
[Figure: two EGP neighbors attached to a common network]
• With the source network:
– The router identifies the port over which EGP is running.
– The entry point of the AS is identified.
• Hello and poll messages are shorter than routing updates and are sent more frequently.

8.4.4 EGP routing update messages
• EGP restricts a noncore router to advertising only networks reachable within its own AS, not networks it knows about outside its AS.
• The format of the routing update message is shown in the figure.
[Figure: EGP routing update message format]
• EGP routes are given relative to a specified network. The update message lists routers on that network and the distances to the destinations reachable through them. A network address occupies 1-3 bytes (classful).
• # INT. GWYS and # EXT. GWYS give the number of interior and exterior routers that appear in the message.
• IP SOURCE NETWORK gives the network from which all reachability is measured.
• The reachable networks are grouped by router IP address and distance value. The distance value is not interpreted by EGP.

8.5 EGP Limitations
• EGP was devised for the old Internet architecture, in which the core served as the interconnection for all ASes.
• The Internet today has grown considerably and is no longer a two-tiered, hierarchical structure in which non-core gateways connect to core gateways only.
• EGP has no facility for mesh-type connections or for loop detection when non-core gateways are connected to core and non-core gateways at the same time.
• EGP runs directly over IP and can therefore be unreliable.
• EGP updates routes about every 3 minutes, which is too slow by today's standards. The update message is also large.
• EGP reports destinations as classful addresses only. It carries no network or subnet mask information.
• EGP interprets the distance metric (1-255) in only two ways: 255 means unreachable, and any other number means reachable.

8.6 Modern Internet Structure and Autonomous Systems (AS)
• The Internet today consists of major backbones, regional networks and campus or corporate networks.
• ISPs play an important role in today's Internet. One ISP can consist of one or more ASes; a corporate network can be part of an ISP or an independent AS.
• The Internet is now an arbitrary collection of interconnected ASes.
• Routing protocols need to prevent loops, advertise thousands of destinations, and give the AS administrator flexibility in determining routing policy.
• Global ASes are assigned 2-byte numbers for identification. ISPs can issue private AS numbers from 64512 to 65535.
[Figure: example Internet structure showing ISP A (AS 1), ISP B (AS 2), ISP D (AS 3) and ISP C (AS 4), with Subscriber 1 (AS 1.1), Subscriber 2 (AS 2.1), Subscriber 3 (AS 3.1), Subscriber 4 (AS 4.1) and Subscriber 5 (AS 4.2)]
• AS types
– Stub AS: only one entry/exit point to the AS; all traffic enters or leaves through this point. One BGP peer is configured.
– Multihomed transit AS: multiple connections to other ASes. Traffic in this AS may have originated in another AS and be destined for a third AS, or it may originate or terminate locally. Multiple BGP peers are configured.
– Multihomed nontransit AS: multiple connections to other ASes, but the AS does not carry transit traffic. All traffic originates in or is destined for the AS. Again, multiple BGP peers are configured.

8.7 Border Gateway Protocol (BGP)
• BGP-4 is the version currently in use (RFC 1771, 1995). Many amendments and improvements have been proposed since.
• BGP uses TCP, so route updates are transmitted reliably. This eliminates the need for periodic updates. After the initial exchange of complete BGP routing information, only keep-alive messages and incremental network changes are exchanged, saving bandwidth and processing time.
• Loop detection. Each BGP routing update contains Network Layer Reachability Information (NLRI), i.e. the network address, and a list of the AS numbers that were traversed (AS_PATH), as well as other path attributes. Because the AS path is carried in the update, loops can be detected: if a received route already contains the local AS number, that route is not used.
• Support for CIDR. Network updates contain the network IP address and the number of significant bits (the mask length).

8.7.1 BGP concepts
• Initial updates between BGP peers. All known reachable networks are exchanged in the form (length, prefix).
For example: (19, 198.24.160.0), i.e. the prefix 198.24.160.0/19.
• Incremental updates are sent as network information changes (networks become unreachable or a better path appears).
• KEEPALIVE messages are sent periodically between BGP neighbors to ensure the connection stays up. These messages are only 19 bytes long.
[Figure: BGP peers exchanging routes for networks N1, N2, N3 and N4]

8.7.2 BGP message header and Keepalive message
• Header = 16-byte marker + 2-byte length + 1-byte type
• The marker field is used for authentication computation. If the message type is OPEN, the marker field must be all ones.
• The length field is the total message length including the header. A BGP message is between 19 and 4096 bytes long.
• There are four types of BGP message:
– OPEN
– UPDATE
– NOTIFICATION
– KEEPALIVE
• A KEEPALIVE message consists only of the header and is sent periodically to ensure that the hold time (explained later) does not expire. The interval between KEEPALIVE messages is 1/3 of the hold time. If the hold time is zero, KEEPALIVE messages are not sent.

8.7.3 BGP OPEN message
Message format (after the BGP header):
• Version: 1-byte integer for the version of BGP (4).
• AS number: 2-byte AS number of this BGP router.
• Hold time: 2-byte value, the maximum amount of time (in seconds) between successive KEEPALIVE or UPDATE messages. The timer is reset when one of those messages is received. A timeout means the BGP neighbor is considered dead.
– BGP routers negotiate the value with their neighbors. Each router uses the lower of its own value and its neighbor's. If the value is 0, the timer never expires and the connection is considered to be always up. If it is not 0, the minimum recommended value is 3 seconds.
• BGP Identifier: 4-byte Router ID. This can be the IP address of the first link to come up, the highest IP address on the router, or a specifically assigned, "always on" IP address. (How many IP addresses does a router have?)
• Optional Parameter Length: 1-byte length, in bytes, of the optional parameter field that follows.
• Optional Parameter: each parameter consists of a 1-byte type, a 1-byte length and a variable-length value, e.g. Type 1 for BGP peer authentication.
The OPEN message is used to establish BGP peer connections. A six-state FSM describes the events of the establishment.

8.7.4 BGP NOTIFICATION message
BGP sends this message when an error is detected, e.g. within the six-state FSM for connection establishment. It tells the network administrator the nature of the error.
• This message consists of a 1-byte error code, a 1-byte error subcode and a variable-length data field.
• There are six possible BGP error codes:
– 1: Message Header Error
– 2: OPEN Message Error
– 3: UPDATE Message Error
– 4: Hold Timer Expired
– 5: FSM Error (errors detected by the FSM)
– 6: Cease (other fatal errors)
• There are 11 error subcodes, which include Bad Peer AS, Authentication Failure, Unacceptable Hold Time, etc.

8.7.5 BGP UPDATE message and routing information
• This message is essential to BGP. The basic blocks of an UPDATE message are:
– NLRI
– Path attributes
– Unreachable routes
• UPDATE message format:
– Withdrawn Route Length (2 bytes)
– Withdrawn Route(s) (variable)
– Path Attribute Length (2 bytes)
– Path Attribute(s) (variable)
– NLRI List (variable)
• NLRI: a list of reachable networks in the format (Length, Prefix). The prefix is padded to a byte boundary.
• Withdrawn Routes: a list of routes that need to be removed from the BGP routing table (these networks are no longer reachable). The format is the same as for NLRI.
• Path Attributes: some are compulsory and some optional. Each attribute starts with a 2-byte field consisting of the attribute flags (1 byte) and the attribute type code (1 byte).
– Well-known mandatory: these attributes must be present whenever an UPDATE message carries NLRI. A notification error is generated if one of them is missing. They are ORIGIN (type code 1), AS_PATH (2) and NEXT_HOP (3).
– Well-known discretionary: recognised by all BGP implementations, but they do not have to be sent in every UPDATE message.
– Optional transitive: optional in a BGP implementation. If a BGP router does not understand the attribute but the transitive flag is set, the attribute is passed on to other BGP speakers.
– Optional nontransitive: when an optional attribute is not understood and the transitive flag is not set, the attribute is ignored and not passed on.
• If the Path Attribute Length is zero, the NLRI field is empty as well and the UPDATE message is used only to withdraw routes.
• An UPDATE message can advertise only one route/path, which may be described by several path attributes. All path attributes contained in a given UPDATE message apply to the destinations carried in the Network Layer Reachability Information field.
[Figure: separate UPDATE messages carrying the attributes for AS10 and the attributes for AS20; ASes shown: AS10, AS20, AS30]

8.8 BGP Attributes
The following classification of BGP connections is necessary before the discussion.
• EBGP connection (exterior BGP): these BGP connections exist between BGP peers in different ASes. The peers are typically directly connected on the same IP subnet. BGP routes exchanged over an EBGP connection have their AS path and next-hop information updated.
• IBGP connection (interior BGP): these connections exist between BGP peers inside the same AS. The peers need not be directly connected on the same subnet; they reach each other over IP using routes provided by an IGP. When BGP routes (external routes) are exchanged across an IBGP connection, their AS path and next-hop information is not updated.
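The difference between the two connection types, together with the AS-path loop check from Section 8.7, can be sketched as below. This is a simplified illustration rather than a BGP implementation: the `Route` class, the functions `advertise` and `accept`, the AS numbers and the addresses are all hypothetical, and the only attributes modelled are the AS path and the next hop.

```python
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]  # most recently traversed AS first
    next_hop: str

def advertise(route: Route, local_as: int, local_ip: str, peer_as: int) -> Route:
    """Return the route as it would be sent to a peer in peer_as."""
    if peer_as == local_as:
        # IBGP: AS path and next hop are passed on unchanged.
        return route
    # EBGP: prepend our own AS number and announce ourselves as the next hop.
    return replace(route, as_path=(local_as,) + route.as_path, next_hop=local_ip)

def accept(route: Route, local_as: int) -> bool:
    """Loop detection: reject a route whose AS path already contains our AS."""
    return local_as not in route.as_path

# Hypothetical walk: AS 10 originates a prefix, AS 20 carries it to AS 30.
origin = Route("198.24.160.0/19", (), "10.0.0.1")
to_as20 = advertise(origin, local_as=10, local_ip="192.0.2.1", peer_as=20)
inside_as20 = advertise(to_as20, local_as=20, local_ip="172.16.0.1", peer_as=20)
to_as30 = advertise(inside_as20, local_as=20, local_ip="192.0.2.5", peer_as=30)

print(to_as30.as_path)       # (20, 10)
print(accept(to_as30, 30))   # True: AS 30 is not yet in the path
print(accept(to_as30, 10))   # False: the route would loop back into AS 10
```

The IBGP hop leaves the next hop pointing at the external address 192.0.2.1, so routers inside AS 20 depend on the IGP to reach that address.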
8.8.1 BGP well-known mandatory attributes
• NEXT_HOP (type code 3)
– For EBGP sessions, the next hop is the IP address of the peer that announced the route.
– For IBGP, for routes originated inside the AS, the next hop is the IP address of the peer that announced the route, as in EBGP. For routes injected into the AS via EBGP, the next hop is the IP address of the EBGP neighbor from which the route was learnt.
• AS_PATH (type code 2)
– The AS that sends the route adds its own AS number before forwarding the route to its external BGP peers. It does so by prepending its own AS number to the list, forming the AS_SEQUENCE.
– EBGP route with the receiver's own AS in the list: the route is rejected (possible loop).
– IBGP route: no AS number is prepended.
– Private AS: customers connected to a single ISP (single-homed or multihomed) do not get a global AS number but a private one. The ISP must strip it off before propagating the route to other BGP peers.
– Route aggregation issues: ranges of routes or CIDR blocks are summarised to minimise the number of routes in the global routing tables. A side effect is the loss of some information about the specific routes. BGP addresses the loss of AS_PATH information by introducing an optional AS-SET parameter (to avoid loops). AS numbers appear in an AS-SET in no particular order.
• ORIGIN (type code 1)
This indicates the origin of the route (how it was learnt) with respect to the originating AS. There are three types of origin:
– IGP: the NLRI is internal to the AS
– EGP: the NLRI was learnt via EGP (the Exterior Gateway Protocol of Section 8.4)
– INCOMPLETE: the NLRI was learnt by some other means

8.8.2 BGP well-known discretionary attributes
• LOCAL_PREF (type code 5): The local preference attribute is a degree of preference given to a route to compare it with other routes for the same destination.
A higher value means a more preferred route (for example, one with a faster link). The attribute is local to the AS: it is exchanged between IBGP peers and not passed on to EBGP peers.
• ATOMIC_AGGREGATE (type code 6): an indication that is set when route aggregation causes information loss.

8.8.3 BGP other attributes
• MULTI_EXIT_DISC (MED) (type code 4): optional nontransitive. It is a hint to external peers about the preferred path into an AS that has more than one entry point. Lower values are preferred over higher ones. MED attributes are passed to the neighboring AS once, but not passed on again.
• COMMUNITY (type code 8): optional transitive. It is used to indicate a group of destinations that share some common logical property, such as .edu, .gov, etc.
• AGGREGATOR (type code 7): optional transitive. It identifies the BGP peer that generated a route aggregate, using its AS number and router ID (RID). Example: AGGREGATOR=(800, 193.160.54.2).

8.9 BGP4 Route Aggregation
• Since BGP4 handles CIDR, route aggregation takes place in the routing table. It follows common principles and practices rather than a standard (or an RFC).
• Simple aggregation, suppressing the more specific routes. For example, if a range of subnets from 130.192.0.0/24 to 130.192.15.0/24 exists in Monash, they can be aggregated to 130.192.0.0/20, which is sent to other BGP peers.
• Aggregation plus more specific routes. This is useful for an AS multihomed to one ISP. The ISP then knows which connection to choose for the sake of load balancing. The specific routes do not go beyond the ISP.
• Loss of information inside the aggregate (AS-SET). AS-SET was introduced to overcome this problem. However, a change in the attributes of a single specific route is then reflected in the aggregate, which can make the aggregated route unstable.
• Change of attributes. Attributes inherited from the specific routes may no longer be desirable for the aggregated route. A way to change them manually has to be provided.
• Forming the aggregate based on a subset of the more specific routes. The AS-SET can be manipulated so that an aggregated route is based on certain specific routes only. This lets an AS advertise as few different aggregated routes as possible.
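The simple aggregation example of this section (sixteen /24 subnets summarised as 130.192.0.0/20) can be reproduced with Python's standard `ipaddress` module. The sketch below makes several assumptions: the helper names `aggregate` and `encode_nlri` are hypothetical, the AS numbers 100 and 200 attached to the specific routes are invented for illustration, and the AS-SET is modelled simply as the unordered union of the AS numbers seen on the specific routes. `encode_nlri` shows the (length, prefix) form from Section 8.7.5, with the prefix padded to a byte boundary.

```python
import ipaddress

def aggregate(routes):
    """Summarise specific routes.

    routes maps prefix strings to AS-path tuples (hypothetical layout).
    Returns (list of aggregated networks, AS-SET as a frozenset).
    """
    nets = [ipaddress.ip_network(p) for p in routes]
    summary = list(ipaddress.collapse_addresses(nets))
    # AS-SET: every AS number from the specific routes, in no particular order.
    as_set = frozenset(asn for path in routes.values() for asn in path)
    return summary, as_set

def encode_nlri(net):
    """Encode a prefix as 1 length byte plus the prefix padded to whole bytes."""
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]

# Sixteen specific routes 130.192.0.0/24 ... 130.192.15.0/24,
# alternately learnt via the hypothetical ASes 100 and 200.
specifics = {f"130.192.{i}.0/24": ((100,) if i % 2 == 0 else (200,)) for i in range(16)}

summary, as_set = aggregate(specifics)
print(summary)                        # [IPv4Network('130.192.0.0/20')]
print(sorted(as_set))                 # [100, 200]
print(encode_nlri(summary[0]).hex())  # 1482c000
```

If one of the sixteen subnets is removed from the input, `collapse_addresses` no longer returns a single /20, which matches the point that an aggregate should only be advertised for address space the AS actually covers.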