Review Slides
Theophilus Benson

How Does the Internet Work?
• Context: you are trying to visit facebook.com
  – What are the different protocols that are used?
  – How does this class's content fit in?
• Browser decides:
  – What version of HTTP to use
  – And uses TCP

Anatomy of a Web Page
• HTML content
• A number of additional resources
  – Images
  – Scripts
  – Frames
• All of these are web objects
• Browser makes one HTTP request for each object
  – Course web page: 14 objects
  – My Facebook page this morning: 100 objects

Step-0: Open your browser

What Version of HTTP
• Versions vary in terms of performance
• Cause of performance problems
  – For small objects:
    • Latency matters (RTT dominates)
  – For large objects:
    • Throughput matters
• Major causes of latency problems:
  – Opening a TCP connection
  – Actually sending the request and receiving the response
  – And a third one: DNS lookup!

HTTP Timeline
• HTTP/1.0: no keep-alive (a new TCP connection for each object)
• HTTP/1.1: keep-alive (one TCP connection reused across requests)
[Figure: TCP timeline of Get index.html / Response, Get img1 / Response, Get img2 / Response, Get img3 / Response. Green lines: TCP handshake. Black lines: HTTP requests. Blue lines: HTTP responses.]

Browser Request
GET / HTTP/1.1
Host: localhost:8000
User-Agent: Mozilla/5.0 (Macinto ...
Accept: text/xml,application/xm ...
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

Step-1: Name Resolution
• Packets are sent using IP addresses
  – You don't know the IP address, you only know the URL.
• So you need to figure out the IP address for facebook.com
  – Domain name resolution.
• Converts names to IP addresses

Basic Domain Name Resolution
• Every host knows a local DNS server
  – Sends all queries to the local DNS server
• If the local DNS server can answer the query, then you're done
  1. Local server is also the authoritative server for that name
  2.
Local server has cached the record for that name
• Otherwise, go down the hierarchy and search for the authoritative name server
  – Every local DNS server knows the root servers
  – How is caching used by the resolver? What are the implications?
  – Iterative versus recursive queries

Local Name Servers
[Figure: a host at Northeastern asks "Where is google.com?"]
• Each ISP/company has a local, default name server
• Often configured via DHCP
• Hosts begin DNS queries by contacting the local name server
• Frequently cache query results

Authoritative Name Servers
[Figure: query "Where is www.neu.edu?" from Northeastern; hierarchy Root, edu (authority for 'edu'), neu (authority for 'neu.edu'), www.neu.edu; answer www.neu.edu = 155.33.17.68]
• Stores the name → IP mapping for a given host

Step-2: Transport
• TCP?
  – Reliable, in-order delivery
  – Congestion control + flow control
• UDP?
  – Low overhead
• Websites use TCP, some interesting questions:
  – How does a connection start up?
  – What is Flow-Control? (helps avoid receiver problems)
  – Congestion-control? (helps avoid network problems)

Establishing a Connection
[Figure: three-way handshake; one side calls Connect, the other calls Listen, Accept…, then Accept returns]
• Three-way handshake
  – Two sides agree on respective initial sequence numbers
• If no one is listening on the port: server sends RST
• If the server is overloaded: it ignores the SYN
• If no SYN-ACK arrives: retry, then time out

Step-2: Transport
• TCP?
  – Reliable, in-order delivery
  – Congestion control + flow control
• UDP?
  – Low overhead
• Websites use TCP, some interesting questions:
  – How does a connection start up?
  – What is Flow-Control? (helps avoid receiver problems)
  – Congestion-control? (helps avoid network problems)
  – How to set buffers

Flow Control
• We should not send more data than the receiver can take.
• Receiver uses the window header field to tell the sender how much space it has

Step-2: Transport
• TCP?
  – Reliable, in-order delivery
  – Congestion control + flow control
• UDP?
  – Low overhead
• Websites use TCP, some interesting questions:
  – How does a connection start up?
  – What is Flow-Control?
(helps avoid receiver problems)
  – Congestion-control? (helps avoid network problems)
  – How to set buffers

Congestion Control
[Figure: congestion window (cwnd) versus time. Slow start runs up to Init_ssthresh, then AIMD; each timeout lowers ssthresh and restarts slow start, followed by AIMD again.]

Congestion-Control
• TCP has two states:
  – Slow Start (SS)
  – Congestion Avoidance (CA)
• A window size threshold governs the state transition
  – Window <= threshold (ssthresh): slow start
  – Window > threshold (ssthresh): congestion avoidance
  – Threshold magically defined
• States differ in how they respond to ACKs
  – Slow start: w = w + MSS (per ACK, so the window doubles every RTT)
  – Congestion avoidance: w = w + MSS²/w (per ACK, so about 1 MSS per RTT)

Duplicate ACK example
• Each segment contains 1460 bytes
• Receiver sends ACKs for the last in-order accepted packet.
• Seg2 is re-transmitted after 3 dup-ACKs
• ACK_5 after the re-transmission acknowledges all packets
[Figure: a dropped packet followed by three Dup-Acks]

Timeout (RTO) example
• Wait for ACK … if no ACK then packet is lost
• How long to wait? Some function of RTT
[Figure: sender transmits 1K segments with SeqNo=0, 1024, 2048, 3072, 4096; receiver returns AckNo=1024 and then three duplicates of AckNo=1024; sender re-transmits SeqNo=1024, then sends SeqNo=5120]

TCP Response to Loss
Slow Start
• Triggered by a timeout
• W = 1
• ssthresh = W/2 (W is the window before the loss)
• Switch to slow start (SS)
Fast Recovery
• Triggered by 3 dup-ACKs
• W = W/2
• ssthresh = W/2 (W is the window before the loss)
• Stay in congestion avoidance (CA)

Step-3: IP Routing
• How to get traffic from your browser to Facebook's (FB) server?
  – Determine the network of FB's IP.
    • In my local-area network? Or in a different network?
    • Use the netmask!
  – If in a different network, route to it
    • Use an IGP to route within an ISP
      – IGP = Distance Vector (RIP), Link-State (OSPF)
    • Use an EGP to route between ISPs
      – EGP = BGP
      – Valley-free routing

Compare your IP address with destination IP address
• Source IP: 128.35.7.2
• Your netmask is 128.35.7.*/24, so your network has: 128.35.7.0-128.35.7.255
• Dest IP: 128.44.7.5
• The destination is not in your network range, so you need to use your gateway router.
• Gateway == first router that I'm connected to.
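The comparison on this slide can be sketched in a few lines of Python. This is a minimal illustration using the standard `ipaddress` module and the slide's own addresses; the function name `needs_gateway` is hypothetical and chosen only for this example.

```python
import ipaddress


def needs_gateway(source_ip: str, prefix_len: int, dest_ip: str) -> bool:
    """Return True when dest_ip lies outside the source host's own network.

    needs_gateway is an illustrative helper, not part of any library.
    """
    # strict=False lets us pass a host address; the host bits are masked off,
    # so 128.35.7.2/24 becomes the network 128.35.7.0/24.
    network = ipaddress.ip_network(f"{source_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(dest_ip) not in network


# Addresses from the slide: the destination is outside 128.35.7.0/24.
print(needs_gateway("128.35.7.2", 24, "128.44.7.5"))    # True
# A hypothetical neighbor on the same /24 is reachable without the gateway.
print(needs_gateway("128.35.7.2", 24, "128.35.7.200"))  # False
```

When the function returns True, the host hands the packet to its gateway router; otherwise it can deliver on the local network.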
Gateway: responds to DHCP and gives you an IP address and netmask
[Figure: topology with Router B, Router C, and hosts G and H]

IGP Protocols
Link-State
• Flood messages from one neighbor to other neighbors
• Each router has the whole topology (scaling issues)
• E.g. OSPF
Distance Vector
• Send forwarding table to neighbors
• Each router has a local view of the network (loop issues)
• E.g. RIP

BGP = Distance Vector + Path information
• Distance vector algorithm with extra information
  – For each route, store the complete path (ASs)
  – No extra computation, just extra storage (and traffic)
• Advantages
  – Can make policy choices based on the set of ASs in the path
  – Can easily avoid loops
• Challenges:
  – Convergence
  – Traffic engineering: load balancing
  – Scaling (route reflectors)
  – Security

Recall BGP: Bad Policies can be costly
[Figure: ISP hierarchy with Tier 1 ISPs, Tier 2 (regional) ISPs, and Tier 3 (local) ISPs, links marked $$. Tier 1 ISPs are default-free and have information on every prefix. Default: provider.]

Recall BGP: Realistic Example
• "Best route" is not the shortest route
[Figure: Tier 1 ISP, Tier 2 (regional) ISPs, and Tier 3 (local) ISPs, with links marked $$, $10, and $20]

BGP Policies
• Two mechanisms
  – Route export filters
    • Control what routes you send to neighbors
  – Route import ranking
    • Controls which route you prefer of those you hear.
• The resulting paths must be valley-free
  – Number links as (+1, 0, -1) for provider, peer and customer
  – Any valid path should only show a sequence of +1, followed by at most one 0, followed by a sequence of -1

IGP+EGP: Two types of BGP sessions
[Figure: AT&T, Sprint, and AS23 connected by eBGP and iBGP sessions; 128.112.0.0/16 with Next Hop = 192.0.2.1]
Forwarding Table
  destination      next hop
  192.0.2.0/30     10.10.10.10
+ BGP (iBGP)
  destination      next hop
  128.112.0.0/16   192.0.2.1
Forwarding Table
  destination      next hop
  128.112.0.0/16   10.10.10.10
  192.0.2.0/30     10.10.10.10

Step-3: Switching
• How do you get packets to that first router?
  – Layer 2 switching: each switch makes a local decision
[Figure: topology with Router B, Router C, and hosts G and H]

What Limitations Does Ethernet Have?
• Switches use a very simple forwarding policy
  – At start-up: flood the traffic on all interfaces
  – Traffic will go to all switches
• Learning == loop problems when there's a cycle
  – Spanning tree used to eliminate loops
• Minimum packet size
  – 64 bytes (512 bits): to ensure that collisions are detected!
  – Bandwidth-delay product (for a link)
• Maximum Ethernet LAN size
  – 2500 meters: due to signal decay, any longer and packets would not be delivered

Router Versus Switches
Router
• Runs multiple switching protocols: Ethernet, ATM
  – Switches between techs
• Runs routing protocols
• Runs DHCP
• Needs a common address across techs: IP address
  – E.g. Ethernet addresses make no sense to ATM hosts
Switches
• Runs one switching protocol
  – Can only work with the same tech
• Dictates how bits become signals
• Dictates how bits become a packet/frame
• Needs hardware addresses to identify hosts/switches

Step-4: Link Layer and Framing Traffic
• Framing = act of putting bits on the link as a packet (frame)
  – Collision detection
  – Collision avoidance

Layers, Services, Protocols
Application
  Service: user-facing application.
  Functions: application specific
Transport
  Service: multiplexing applications
  Functions: connection establishment/termination, error control, flow control
Network
  Service: move packets to any other node in the network
  Functions: routing, addressing
Link
  Service: move frames to another node across a link.
  Functions: framing, media access control, error checking
Physical
  Service: move bits to another node across a link
  Functions: convert bits to signal
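
As a review exercise, the valley-free rule from the BGP Policies slide (links numbered +1, 0, -1 for provider, peer, and customer) can be checked mechanically. The sketch below is one possible reading of that rule; the names `PROVIDER`, `PEER`, `CUSTOMER`, and `is_valley_free` are hypothetical and introduced only for illustration.

```python
# Link labels as numbered on the BGP Policies slide.
PROVIDER, PEER, CUSTOMER = +1, 0, -1


def is_valley_free(links):
    """Check a path given as a list of link labels, in travel order.

    Valid shape: a run of +1, then at most one 0, then a run of -1.
    is_valley_free is an illustrative helper, not part of any library.
    """
    climbing = True  # True while we have only seen provider (+1) links
    for link in links:
        if link == PROVIDER:
            if not climbing:
                return False  # going up again after a peer or customer link
        elif link == PEER:
            if not climbing:
                return False  # second peer link, or a peer link after descending
            climbing = False
        elif link == CUSTOMER:
            climbing = False
        else:
            raise ValueError(f"unknown link label: {link!r}")
    return True


print(is_valley_free([+1, +1, 0, -1, -1]))  # True
print(is_valley_free([+1, -1, +1]))         # False: a valley
print(is_valley_free([0, 0]))               # False: two peer links
```

A single flag is enough here because once the path stops climbing, the only links still allowed are customer links.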