VPNs and Network Management EECS 489 Computer Networks http://www.eecs.umich.edu/courses/eecs489/w07 Z. Morley Mao Monday, April 2, 2007 Acknowledgement: Some slides taken from Kurose&Ross VPN Taxonomy VPN Network End-to-end Provider-based Provider-based Customer-based Customer-based L3 L2 ATM Frame Relay LAN What is a VPN? Making a shared network look like a private network Why do this? Private networks have all kinds of advantages • (we’ll get to that) But building a private network is expensive • (cheaper to have shared resources rather than dedicated) History of VPNs Originally a telephone network concept Separated offices could have a phone system that looked like one internal phone system Benefits? Fewer digits to dial Could have different tariffs • Company didn’t have to pay for individual long distance calls Came with own blocking probabilities, etc. • Service guarantees better (or worse) than public phone service Original data VPNs Lots of different network technologies in those days Decnet, Appletalk, SNA, XNS, IPX, … None of these were meant to scale to global proportions Virtually always used in corporate settings Providers offer virtual circuits between customer sites Frame Relay or ATM A lot cheaper than dedicated leased lines Customer runs whatever network technology over these These still exist (but being replaced by IP VPNs) Advantages of original data VPNs Repeat: a lot cheaper than dedicated leased lines Corporate users had no other choice This was the whole business behind frame-relay and ATM services Fine-grained bandwidth tariffs Bandwidth guarantees Service Level Agreements (SLA) “Multi-protocol” How has the world changed? Everything is IP now Some old stuff still around, but most data networks are just IP So, why do we still care about VPNs??? IP VPN benefits IP not really global (private addresses) VPN makes separated IP sites look like one private IP network Security Bandwidth guarantees across ISP QoS, SLAs Simplified network operation ISP can do the routing for you End-to-end VPNs Solves problem of how to connect remote hosts to a firewalled network Security and private addresses benefits only Not simplicity or QoS benefits End-to-end VPNs Solves problem of how to connect remote hosts to a firewalled network FW/ VPN Internet IPsec Tunnels Remote Host Remote Host Site (private network) Site Host Site Host Provider-based end-to-end VPNs Used for instance when enterprise pays for employee access, wants it to go through enterprise network I know Cisco did this But never used that much • Business model didn’t take off Used even less now • In part because VPN client comes with windows OS? The tunneling technology commonly used for roaming dialup though Customer-based Network VPNs Site Site CE CE Internet CE CE Site Site Customer buys own equipment, configures IPsec tunnels over the global internet, manages addressing and routing. ISP plays no role. Customer-based Network VPNs Great for enterprises that have the resources and skills to do it Large companies More control, better security model Doesn’t require trust in ISP ability and intentions Can use different ISPs at different sites But not all enterprises have this skill Provider-based Network VPNs Site Site CE CE PE PE ISP PE PE CE CE Site Site Provider manages all the complexity of the VPN. Customer simply connects to the provider equipment. Model for customer Attach to ISP router (PE) as though it was one of your routers Run routing algorithm with it OSPF, RIP, BGP PE will advertise prefixes from other sites of same customer Various PPVPN issues Provider provisioned VPNs Tunnel type? IPsec (more secure, more expensive) GRE etc. How to discover which customer is at which PE? Don’t want PEs without given customer to participate in routing for that customer How to distinguish overlapping private address spaces Chapter 9: Network Management Chapter goals: introduction to network management motivation major components Internet network management framework MIB: management information base SMI: data definition language SNMP: protocol for network management security and administration presentation services: ASN.1 Chapter 9 outline What is network management? Internet-standard management framework Structure of Management Information: SMI Management Information Base: MIB SNMP Protocol Operations and Transport Mappings Security and Administration ASN.1 What is network management? autonomous systems (aka “network”): 100s or 1000s of interacting hardware/software components other complex systems requiring monitoring, control: jet airplane nuclear power plant others? "Network management includes the deployment, integration and coordination of the hardware, software, and human elements to monitor, test, poll, configure, analyze, evaluate, and control the network and element resources to meet the real-time, operational performance, and Quality of Service requirements at a reasonable cost." Infrastructure for network management definitions: managing entity managing entity agent data data network management managed devices contain managed device managed objects whose data is gathered into a Management Information agent data Base (MIB) managed device protocol agent data agent data managed device managed device Network Management standards OSI CMIP Common Management Information Protocol designed 1980’s: the unifying net management standard too slowly standardized SNMP: Simple Network Management Protocol Internet roots (SGMP) started simple deployed, adopted rapidly growth: size, complexity currently: SNMP V3 de facto network management standard Chapter 9 outline What is network management? Internet-standard management framework Structure of Management Information: SMI Management Information Base: MIB SNMP Protocol Operations and Transport Mappings Security and Administration ASN.1 SNMP overview: 4 key parts Management information base (MIB): distributed information store of network management data Structure of Management Information (SMI): data definition language for MIB objects SNMP protocol convey manager<->managed object info, commands security, administration capabilities major addition in SNMPv3 SMI: data definition language Purpose: syntax, semantics of management data welldefined, unambiguous base data types: straightforward, boring OBJECT-TYPE data type, status, semantics of managed object MODULE-IDENTITY groups related objects into MIB module Basic Data Types INTEGER Integer32 Unsigned32 OCTET STRING OBJECT IDENTIFIED IPaddress Counter32 Counter64 Guage32 Time Ticks Opaque SNMP MIB MIB module specified via SMI MODULE-IDENTITY (100 standardized MIBs, more vendor-specific) MODULE OBJECT TYPE: OBJECT TYPE:OBJECT TYPE: objects specified via SMI OBJECT-TYPE construct SMI: Object, module examples OBJECT-TYPE: ipInDelivers ipInDelivers OBJECT TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION “The total number of input datagrams successfully delivered to IP userprotocols (including ICMP)” ::= { ip 9} MODULE-IDENTITY: ipMIB ipMIB MODULE-IDENTITY LAST-UPDATED “941101000Z” ORGANZATION “IETF SNPv2 Working Group” CONTACT-INFO “ Keith McCloghrie ……” DESCRIPTION “The MIB module for managing IP and ICMP implementations, but excluding their management of IP routes.” REVISION “019331000Z” ……… ::= {mib-2 48} MIB example: UDP module Object ID Name Type Comments 1.3.6.1.2.1.7.1 UDPInDatagrams Counter32 total # datagrams delivered at this node 1.3.6.1.2.1.7.2 UDPNoPorts Counter32 # underliverable datagrams no app at portl 1.3.6.1.2.1.7.3 UDInErrors Counter32 # undeliverable datagrams all other reasons 1.3.6.1.2.1.7.4 1.3.6.1.2.1.7.5 UDPOutDatagrams Counter32 # datagrams sent udpTable SEQUENCE one entry for each port in use by app, gives port # and IP address SNMP Naming question: how to name every possible standard object (protocol, data, more..) in every possible network standard?? answer: ISO Object Identifier tree: hierarchical naming of all objects each branchpoint has name, number 1.3.6.1.2.1.7.1 ISO ISO-ident. Org. US DoD Internet udpInDatagrams UDP MIB2 management OSI Object Identifier Tree Check out www.alvestrand.no/harald/objectid/top.html SNMP protocol Two ways to convey MIB info, commands: managing managing entity entity request response agent data Managed device request/response mode trap msg agent data Managed device trap mode SNMP protocol: message types Message type GetRequest GetNextRequest GetBulkRequest InformRequest SetRequest Response Trap Function Mgr-to-agent: “get me data” (instance,next in list, block) Mgr-to-Mgr: here’s MIB value Mgr-to-agent: set MIB value Agent-to-mgr: value, response to Request Agent-to-mgr: inform manager of exceptional event SNMP protocol: message formats SNMP security and administration encryption: DES-encrypt SNMP message authentication: compute, send MIC(m,k): compute hash (MIC) over message (m), secret shared key (k) protection against playback: use nonce view-based access control SNMP entity maintains database of access rights, policies for various users database itself accessible as managed object! Chapter 9 outline What is network management? Internet-standard management framework Structure of Management Information: SMI Management Information Base: MIB SNMP Protocol Operations and Transport Mappings Security and Administration The presentation problem: ASN.1 The presentation problem Q: does perfect memory-to-memory copy solve “the communication problem”? A: not always! struct { char code; int x; } test; test.x = 256; test.code=‘a’ test.code test.x a 00000001 00000011 host 1 format test.code test.x a 00000011 00000001 host 2 format problem: different data format, storage conventions A real-life presentation problem: grandma 2004 teenager aging 60’s hippie Presentation problem: potential solutions 1. Sender learns receiver’s format. Sender translates into receiver’s format. Sender sends. 2. Sender sends. Receiver learns sender’s format. Receiver translate into receiver-local format 3. Sender translates host-independent format. Sends. Receiver translates to receiver-local format. Solving the presentation problem 1. Translate local-host format to host-independent format 2. Transmit data in host-independent format 3. Translate host-independent format to remote-host format grandma aging 60’s hippie 2004 teenager ASN.1: Abstract Syntax Notation 1 ISO standard X.680 used extensively in Internet like eating vegetables, knowing this “good for you”! defined data types, object constructors like SMI BER: Basic Encoding Rules specify how ASN.1-defined data objects to be transmitted each transmitted object has Type, Length, Value (TLV) encoding TLV Encoding Idea: transmitted data is self-identifying T: data type, one of ASN.1-defined types L: length of data in bytes V: value of data, encoded according to ASN.1 standard Tag Value 1 2 3 4 5 6 9 Type Boolean Integer Bitstring Octet string Null Object Identifier Real TLV encoding: example Value, 259 Length, 2 bytes Type=2, integer Value, 5 octets (chars) Length, 5 bytes Type=4, octet string Network Management: summary network management extremely important: 80% of network “cost” ASN.1 for data description SNMP protocol as a tool for conveying information Network management: more art than science what to measure/monitor how to respond to failures? alarm correlation/filtering? Examples of network management Challenges Reacting quickly to alleviate congestion Avoiding over-reacting and causing oscillations Limiting bandwidth & CPU overhead on routers Load-sensitive routing Routers adapt to link load in a distributed fashion At the packet level, or on “group of packets” Traffic engineering Centralized computation of routing parameters Network-wide measurements of offered traffic Do IP Networks Manage Themselves? TCP congestion control Senders react to congestion Decrease sending rate But… the TCP sessions receive lower throughput IP routing protocols Routers react to failures Compute new paths But… the new paths may be congested 2 2 4 2 1 1 4 1 3 1 5 4 3 2 1 3 1 5 4 3 Do IP Networks Manage Themselves? In some sense, yes: TCP senders send less traffic during congestion Routing protocols adapt to topology changes But, does the network run efficiently? Congested link when idle paths exist? High-delay path when a low-delay path exists? 2 2 1 4 4 1 1 3 3 2 2 1 1 5 5 4 3 4 3 1 Adapting the Routing to the Traffic Goal: modify the routes to steer traffic through the network in most effective way Approach #1: load-sensitive protocols Distribute traffic & performance measurements Routers compute paths based on load Approach #2: adaptive management system Collect measurements of traffic and topology Management system optimizes the parameters Debates still today about the right answer Load-Sensitive Routing Protocols Advantages Efficient use of network resources Satisfying the performance needs of end users Self-managing network takes care of itself Disadvantages Higher overhead on the routers Long alternate paths consume extra resources Instability from out-of-date feedback information Packet-Based Load-Sensitive Routing Packet-based routing Forward packets based on forwarding table Load-sensitive Compute table entries based on load or delay Questions What link metrics to use? How frequently to update the metrics? How to propagate the metrics? How to compute the paths based on metrics? Original ARPANET Algorithm (1969) Routing algorithm Shortest-path routing based on link metrics Instantaneous queue length plus a constant Distributed shortest-path algorithm (BellmanFord) 2 3 2 1 1 3 5 1 20 congested link Performance of ARPANET Algorithm Light load Delay dominated by transmission & propagation So, link metrics don’t fluctuate much Medium load Queuing delay is no longer negligible Moderate traffic shifts to avoid congestion Heavy load Very high metrics on congested links Busy links look bad to all of the routers All routers avoid the busy links Routers may send packets on longer paths Problem: Out-of-Date Information Lincoln Tunnel NJ NYC Holland Tunnel “Backup at Lincoln” on radio triggers congestion at Holland Routers make decisions based on old information Propagation delay in flooding link metrics Thresholds applied to limit number of updates Old information leads to bad decisions All routers avoid the congested links … leading to congestion on other links … and the whole things repeats Problem: Frequent Updates Update messages Link keeps track of its metric (e.g., queuing delay) Link transmits updates when the metric changes Frequency of updates Frequent changes to the metric lead to frequent updates Significantly increases the overhead of the protocol Oscillation makes the problem worse Oscillation leads to wild swings in the link metrics Forcing very frequent update messages … that add to the load on the links in the network Second ARPANET Algorithm (1979) Link-state protocol Old: Distributed path computation leads to loops New: Better to flood metrics and have each router compute the shortest paths Averaging of the link metric over time Old: Instantaneous delay fluctuates a lot New: Averaging reduces the fluctuations Reduce frequency of updates Old: Sending updates on each change is too much New: Send updates if change passes a threshold Problem of Long Alternate Paths Picking alternate paths Long path chosen by one router consumes resource that other packets could have used Leads other routers to pick other alternate paths Solution: limit path length Bound the value of the link metric “This link is busy enough to go two extra hops” Extreme case Limit path selection to the shortest paths Pick least-loaded shortest path in the network Load-Sensitive Routing Timescales What timescale of routing decisions? What timescale of feedback about link loads? Load-sensitive routing at packet level Routers receive feedback on load and delay Routers re-compute their forwarding tables Fundamental problems with oscillation Load-sensitive routing for groups of packets Routers receive feedback on load and delay Router compute a path for the next flow or circuit Less oscillation, as long as circuits last for a while Reducing Effects of Out-of-Date Info Send link metrics more often But, leads to higher overhead But, propagation delay is a fundamental limit Make the traffic last longer Route on groups of packets, rather than packets Fewer routing decisions, and more accurate feedback Groups of packets Telephone network: phone call (3-minutes long) Internet: TCP connection (10-packets long) Internet: all traffic between a pair of hosts, or routers, … Traffic Engineering as a Network-Management Problem: Case Study Using Traditional Routing Protocols Routers flood information to learn topology Determine “next hop” to reach other routers… Compute shortest paths based on link weights Link weights configured by network operator 2 3 2 1 1 1 3 5 4 3 Approaches for Setting the Link Weights Conventional static heuristics Proportional to physical distance • Cross-country links have higher weights • Minimizes end-to-end propagation delay Inversely proportional to link capacity • Smaller weights for higher-bandwidth links • Attracts more traffic to links with more capacity Tune the weights based on the offered traffic Network-wide optimization of the link weights Directly minimize metrics like max link utilization Example of Tuning the Link Weights Problem: congestion along the pink path Second or third link on the path is overloaded Solution: move some traffic to the bottom path E.g., by increasing the weight of the second link 2 3 2 1 1 31 3 5 4 3 Measure, Model, and Control Network-wide “what if” model Offered Topology/ traffic Configuration measure Changes to the network control Operational network Traffic Engineering Problem Topology Connectivity and capacity of routers and links Traffic matrix Offered load between points in the network Link weights Configurable parameters for routing protocol Performance objective Balanced load, low latency, service level agreements … Question: Given the topology and traffic matrix, which link weights should be used? Key Ingredients of the Approach Instrumentation Topology: monitoring of the routing protocols Traffic matrix: fine-grained traffic measurement Network-wide models Representations of topology and traffic “What-if” models of shortest-path routing Network optimization Efficient algorithms to find good configurations Operational experience to identify key constraints Formalizing the Optimization Problem Input: graph G(R,L) R is the set of routers L is the set of unidirectional links i cl is the capacity of link l Input: traffic matrix Mi,j is load from router i to j Output: setting of the link weights wl is weight on unidirectional link l Pi,j,l is fraction of traffic from i to j traversing link l j Multiple Shortest Paths: Even Splitting 0.25 0.25 0.5 1.0 0.25 0.5 1.0 0.25 0.5 0.5 Values of Pi,j,l Defining the Objective Function Computing the link utilization ul = Si,j Mi,j Pi,j,l Utilization: ul/cl Link load: Objective functions min (maxl(ul/cl)) min(Sl f(ul/cl)) f(x) x 1 Complexity of the Optimization Problem Computationally intractable problem No efficient algorithm to find the link weights Even for simple objective functions What are the implications? Must resort to searching through weight settings Optimization Based on Local Search Start with an initial setting of the link weights E.g., same integer weight on every link E.g., weights inversely proportional to capacity E.g., existing weights in the operational network Compute the objective function Compute the all-pairs shortest paths to get Pi,j,l Apply the traffic matrix Mi,j to get link loads ul Evaluate the objective function from the ul/cl Generate a new setting of the link weights repeat Making the Search Efficient Avoid repeating the same weight setting Keep track of past values of the weight setting … or keep a small signature of past values Do not evaluate setting if signatures match Avoid computing shortest paths from scratch Explore settings that changes just one weight Apply fast incremental shortest-path algorithms Limit number of unique link-weight values Don’t explore 216 possible values for each weight Stop early, before exploring all settings Incorporating Operational Realities Minimize number of changes to the network Changing just 1 or 2 link weights is often enough Tolerate failure of network equipment Weights usually remain good after failure … or can be fixed by changing 1-2 weights Limit effects of measurement accuracy Good weights remain good, despite noise Limit frequency of changes to the weights Joint optimization for day & night traffic matrices Application to AT&T’s Backbone Performance of the optimized weights Search finds a good solution within a few minutes Much better than link capacity or physical distance Competitive with multi-commodity flow solution How AT&T changes the link weights Maintenance every night from midnight to 6am Predict effects of removing link(s) from network Reoptimize the link weights to avoid congestion Configure new weights before disabling equipment Example from AT&T’s Operations Center Amtrak repairing/moving part of train track Need to move some of the fiber optic cables Or, heightened risk of the cables being cut Amtrak notifies AT&T the timework will be done AT&T engineers model the effects Determine which IP links go over affected fiber Pretend the network no longer has these links Evaluate the new shortest paths and traffic flow Identify whether link loads will be too high Example Continued If load will be too high Reoptimize the weights on the remaining links Schedule time for new weights to be configured Roll back to old weights when Amtrak is done Same process applied to other cases Assessing the network’s risk to possible failures Planning for maintenance of existing equipment Adapting link weights to installation of new links Adapting link weights in response to traffic shifts What About Interdomain Routing? Border Gateway Protocol Announcements carry very limited information E.g., AS path, but nothing about delay, loss, etc. Challenging to make load-sensitive protocol Hard to agree upon a common metric Hard to scale to such a large network Hard to prevent ASes from gaming the system Instead, individual ASes act alone Change routing policies based on link load E.g., moving some traffic to another provider Interdomain Traffic Engineering Predict effects of changes to import policies Inputs: routing, traffic, and configuration data Outputs: flow of traffic through the network Topology Externally learned routes BGP policy configuration BGP routing model Offered traffic Flow of traffic through the network Outbound Traffic: Pick a BGP Route Easier to control than inbound traffic IP routing is destination based Sender determines where the packets go Control only by selecting the next hop Border router can pick the next-hop AS Cannot control selection of the entire path Provider 1 “(1, 3, 4)” Provider 2 “(2, 7, 8, 4)” Outbound Traffic: Shortest AS Path No import policy on border router Pick route with shortest AS path Arbitrary tie break (e.g., smallest router-id) Performance? Shortest AS path is not necessarily best Could have high delays or congestion Load balancing? Could lead to uneven split in traffic E.g., one provider with shorter paths E.g., too many ties with skewed tie-break Outbound Traffic: Load Balancing Selectively use each provider Assign local-pref across destination prefixes Change the local-pref assignments over time Useful inputs to load balancing End-to-end path performance data • E.g., active measurements along each path Outbound traffic statistics per destination prefix • E.g., packet monitors or router-level support Link capacity to each provider Billing model of each provider Balancing Load, Performance, and Cost Balance traffic based on link capacity Measure outbound traffic per prefix Select provider per prefix for even load splitting But, might lead to poor performance and high bill Balance traffic based on performance Select provider with best performance per prefix But, might lead to congestion and a high bill Balance traffic based on financial cost Select provider per prefix over time to minimize the total financial cost But, might lead to bad performance Conclusions Adapting routing to the traffic To alleviate congestion To minimize propagation delay To be robust to future failures Two main approaches Load-sensitive routing protocol Optimization of configurable parameters