ECE/CS 372 – introduction to computer networks Lecture 5 Announcements: Assign 1 is due today Credit for lecture slides to Professor Bechir Hamdaoui Adapted from Jim Kurose & Keith Ross (original copyright) Chapter 3, slide: 1 Chapter 3: Transport Layer Our goals: understand principles behind transport layer services: reliable data transfer flow control congestion control learn about transport layer protocols in the Internet: UDP: connectionless transport TCP: connectionoriented transport TCP congestion control Chapter 3, slide: 2 Transport services and protocols provide logical communication between app processes running on different hosts transport protocols run in end systems send side: breaks app messages into segments, passes to network layer rcv side: reassembles segments into messages, passes to app layer more than one transport protocol available to apps Internet: TCP and UDP application transport network data link physical application transport network data link physical Chapter 3, slide: 3 What transport service does an app need? Data loss/reliability some apps (e.g., audio) can tolerate some loss other apps (e.g., file transfer, telnet) require 100% reliable data transfer Timing some apps (e.g., Internet telephony, interactive games) require low delay to be “effective” Bandwidth some apps (e.g., multimedia) require minimum amount of bandwidth to be “effective” other apps (“elastic apps”) make use of whatever bandwidth they get Security Privacy, authentication, integrity Chapter 3, slide: 4 Transport service requirements of common apps Application file transfer e-mail Web documents real-time audio/video Data loss Bandwidth Time Sensitive no loss no loss no loss loss-tolerant elastic elastic elastic audio: 5kbps-1Mbps video:10kbps-5Mbps same as above few kbps up elastic no no no yes, 100’s msec (telephone, video conf.) stored audio/video interactive games instant messaging loss-tolerant loss-tolerant no loss yes, few secs yes, 100’s msec yes and no Chapter 3, slide: 5 Internet transport-layer protocols TCP: reliable, in-order delivery congestion control flow control connection setup UDP: unreliable, unordered delivery extension of “besteffort” IP application transport network data link physical network data link physical network data link physical network data link physicalnetwork network data link physical data link physical network data link physical application transport network data link physical Chapter 3, slide: 6 What services do Internet transport protocols provide? TCP service: connection-oriented: setup required between client and server processes reliable transport between sending and receiving process flow control: sender won’t overwhelm receiver congestion control: throttle sender when network overloaded does not provide: timing, minimum bandwidth guarantees UDP service: unreliable data transfer between sending and receiving process does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee Q: why bother? Why is there a UDP? Chapter 3, slide: 7 UDP: User Datagram Protocol [RFC 768] “best effort” service, UDP segments may be: lost delivered out of order to app connectionless: no handshaking between UDP sender, receiver each UDP segment handled independently of others Why is there a UDP? less delay: no connection establishment (which can add delay) simple: no connection state at sender, receiver less traffic: small segment header no congestion control: UDP can blast away as fast as desired Chapter 3, slide: 8 Transport vs. network layer network layer: logical communication between hosts transport layer: logical communication between processes Household case: 12 kids (East coast house) sending letters to 12 kids (West coast house) Ann is responsible for the house at East coast Bill is responsible for the house at West coast Postal service is responsible for between houses Household analogy: kids = processes/apps letters = messages houses = hosts home address = IP address kid names = port numbers Ann and Bill = transport protocol postal service = networklayer protocol Chapter 3, slide: 9 Chapter 3 outline Principles of reliable data transfer Connection-oriented Principles of congestion control TCP congestion control transport: TCP Chapter 3, slide: 10 Principles of reliable data transfer important in app., transport, link layers top-10 list of important networking topics! characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt) Chapter 3, slide: 11 Principles of reliable data transfer important in app., transport, link layers top-10 list of important networking topics! characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt) Chapter 3, slide: 12 Principles of reliable data transfer important in app., transport, link layers top-10 list of important networking topics! characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt) Chapter 3, slide: 13 Reliable data transfer: getting started rdt_send(): called from above, (e.g., by app.). Passed data to deliver to receiver upper layer send side udt_send(): called by rdt, to transfer packet over unreliable channel to receiver deliver_data(): called by rdt to deliver data to upper receive side rdt_rcv(): called when packet arrives on rcv-side of channel Chapter 3, slide: 14 Reliable data transfer: getting started We will: incrementally develop sender, receiver sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer use finite state machines (FSM) to specify sender, receiver event causing state transition actions taken on state transition state: when in this “state” next state uniquely determined by next event state 1 event actions state 2 Chapter 3, slide: 15 rdt1.0: reliable transfer over a reliable channel underlying channel perfectly reliable no bit errors no loss of packets separate FSMs for sender, receiver: sender sends data into underlying channel receiver read data from underlying channel Wait for call from above rdt_send(data) packet = make_pkt(data) udt_send(packet) sender Wait for call from below rdt_rcv(packet) extract (packet,data) deliver_data(data) receiver Chapter 3, slide: 16 rdt2.0: channel with bit errors underlying channel may flip bits in packet Receiver can detect bit errors (e.g., use error detection) But still no packet loss questions: (1) how does sender know there is an error, and (2) what does it do when there is an error: acknowledgements: • positive ack (ACK): receiver tells sender that pkt received OK • negative ack (NAK): receiver tells sender that pkt had erros retransmission: sender retransmits pkt on receipt of NAK new mechanisms in rdt2.0 (beyond rdt1.0): error detection receiver feedback: control msgs (ACK,NAK) rcvr->sender assume ACK/NAK are error free Chapter 3, slide: 17 rdt2.0: operation with no errors rdt_send(data) snkpkt = make_pkt(data, checksum) udt_send(sndpkt) rdt_rcv(rcvpkt) && isNAK(rcvpkt) Wait for Wait for call from ACK or udt_send(sndpkt) above NAK rdt_rcv(rcvpkt) && isACK(rcvpkt) L sender receiver rdt_rcv(rcvpkt) && corrupt(rcvpkt) udt_send(NAK) Wait for call from below rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) extract(rcvpkt,data) deliver_data(data) udt_send(ACK) Chapter 3, slide: 19 rdt2.0: error scenario rdt_send(data) snkpkt = make_pkt(data, checksum) udt_send(sndpkt) rdt_rcv(rcvpkt) && isNAK(rcvpkt) Wait for Wait for call from ACK or udt_send(sndpkt) above NAK rdt_rcv(rcvpkt) && isACK(rcvpkt) L sender receiver rdt_rcv(rcvpkt) && corrupt(rcvpkt) udt_send(NAK) Wait for call from below rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) extract(rcvpkt,data) deliver_data(data) udt_send(ACK) Chapter 3, slide: 20 rdt2.0 has a fatal flaw! What happens if ACK/NAK is corrupted? That is, sender receives garbled ACK/NAK sender doesn’t know what happened at receiver! can’t sender just retransmit? Sure: sender retransmits current pkt if ACK/NAK garbled Any problem with this ?? Problem: duplicate Receiver doesn’t know whether received pkt is a retransmit or a new pkt Handling duplicates: sender adds sequence number to each pkt receiver discards (doesn’t deliver up) duplicate pkt stop and wait Sender sends one packet, then waits for receiver response Chapter 3, slide: 21 rdt2.1: sender, handles garbled ACK/NAKs rdt_send(data) sndpkt = make_pkt(0, data, checksum) udt_send(sndpkt) rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || Wait for Wait for isNAK(rcvpkt) ) ACK or call 0 from udt_send(sndpkt) NAK 0 above rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) L rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) udt_send(sndpkt) L Wait for ACK or NAK 1 Wait for call 1 from above rdt_send(data) sndpkt = make_pkt(1, data, checksum) udt_send(sndpkt) Chapter 3, slide: 22 rdt2.1: receiver, handles garbled ACK/NAKs rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) rdt_rcv(rcvpkt) && (corrupt(rcvpkt) extract(rcvpkt,data) deliver_data(data) sndpkt = make_pkt(ACK, chksum) udt_send(sndpkt) rdt_rcv(rcvpkt) && (corrupt(rcvpkt) sndpkt = make_pkt(NAK, chksum) udt_send(sndpkt) rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq1(rcvpkt) sndpkt = make_pkt(ACK, chksum) udt_send(sndpkt) sndpkt = make_pkt(NAK, chksum) udt_send(sndpkt) Wait for 0 from below Wait for 1 from below rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) rdt_rcv(rcvpkt) && not corrupt(rcvpkt) && has_seq0(rcvpkt) sndpkt = make_pkt(ACK, chksum) udt_send(sndpkt) extract(rcvpkt,data) deliver_data(data) sndpkt = make_pkt(ACK, chksum) udt_send(sndpkt) Chapter 3, slide: 23 rdt2.1: discussion Sender: seq # added to pkt two seq. #’s (0,1) will suffice. Why? must check if received ACK/NAK corrupted twice as many states state must “remember” whether “current” pkt has 0 or 1 seq. # Receiver: must check if received packet is duplicate state indicates whether 0 or 1 is expected pkt seq # note: receiver can not know if its last ACK/NAK received OK at sender Chapter 3, slide: 24 rdt2.2: a NAK-free protocol do we really need NAKs?? instead of NAK, receiver sends ACK for last pkt received OK receiver must explicitly include seq # of pkt being ACKed duplicate ACK at sender results in same action as NAK: retransmit current pkt rdt2.2: same functionality as rdt2.1, using ACKs only Chapter 3, slide: 25 rdt3.0: channels with errors and loss New assumption: packet may be lost: underlying channel can also lose packets (data or ACKs) checksum, seq. #, ACKs, retransmissions will be of help, but not enough What else is needed? Approach: timeout policy: sender waits “reasonable” amount of time for ACK retransmits if no ACK received in this time if pkt (or ACK) just delayed (not lost): retransmission will be duplicate, but use of seq. #’s already handles this receiver must specify seq # of pkt being ACKed requires countdown timer Chapter 3, slide: 26 rdt3.0 in action (still stop-n-wait w/ (0,1) sn) Chapter 3, slide: 27 rdt3.0 in action (still stop-n-wait w/ (0,1) sn) Chapter 3, slide: 28 Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance stinks example: R=1 Gbps, RTT=30 ms, L=1000Byte packet: sender receiver first packet bit transmitted, t = 0 last packet bit transmitted, t = L / R RTT first packet bit arrives last packet bit arrives, send ACK ACK arrives, send next packet, t = RTT + L / R Ttransmit = L (packet length in bits) 8.103 b/pkt = R (transmission rate, bps) 109 b/sec = 8 microsec Chapter 3, slide: 29 Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance stinks example: R=1 Gbps, RTT=30 ms, L=1000Byte packet: sender receiver first packet bit transmitted, t = 0 last packet bit transmitted, t = L / R first packet bit arrives last packet bit arrives, send ACK RTT ACK arrives, send next packet, t = RTT + L / R U sender: utilization – fraction of time sender busy sending U sender = L/R RTT + L / R = .008 30.008 = 0.00027 microsec Chapter 3, slide: 30 onds Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance stinks example: R=1 Gbps, RTT=30 ms, L=1000Byte packet: sender receiver first packet bit transmitted, t = 0 last packet bit transmitted, t = L / R RTT first packet bit arrives last packet bit arrives, send ACK ACK arrives, send next packet, t = RTT + L / R 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link network protocol limits use of physical resources! Chapter 3, slide: 31 Pipelining: increased utilization sender receiver first packet bit transmitted, t = 0 last bit transmitted, t = L / R first packet bit arrives last packet bit arrives, send ACK last bit of 2nd packet arrives, send ACK last bit of 3rd packet arrives, send ACK RTT ACK arrives, send next packet, t = RTT + L / R Increase utilization by a factor of 3! U sender = 3*L/R RTT + L / R = .024 30.008 = 0.0008 microsecon ds Question: What is the link utilization Usender Chapter 3, slide: 32 Pipelined protocols Pipelining: sender allows multiple, “in-flight”, yet-to-be-ACK’ed pkts What about the range of sequence numbers then?? What about buffering at receiver?? Chapter 3, slide: 33 ECE/CS 372 – introduction to computer networks Lecture 6 Announcements: Assignment 1 Solutions Posted Credit for lecture slides to Professor Bechir Hamdaoui Adapted from Jim Kurose & Keith Ross (original copyright) Chapter 3, slide: 34 So far: rdt3.0 Acknowledgment and retransmission Reliable delivery Sequence numbers Duplication detection Timeout and retransmit Deal with packet losses Stop-n-wait one packet at a time Problem: efficiency Chapter 3, slide: 35 Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance is terribad example: R=1 Gbps, RTT=30 ms, L=1000Byte packet: sender receiver first packet bit transmitted, t = 0 last packet bit transmitted, t = L / R first packet bit arrives last packet bit arrives, send ACK RTT ACK arrives, send next packet, t = RTT + L / R U sender: utilization – fraction of time sender busy sending U sender = L/R RTT + L / R = .008 30.008 = 0.00027 microsec Chapter 3, slide: 36 onds Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance stinks example: R=1 Gbps, RTT=30 ms, L=1000Byte packet: sender receiver first packet bit transmitted, t = 0 last packet bit transmitted, t = L / R RTT first packet bit arrives last packet bit arrives, send ACK ACK arrives, send next packet, t = RTT + L / R 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link network protocol limits use of physical resources! Chapter 3, slide: 37 Pipelined protocols Pipelining: sender allows multiple, “in-flight”, yet-to-be-ACK’ed pkts What about the range of sequence numbers then?? What about buffering at receiver?? Two generic forms of pipelined protocols: go-Back-N and selective repeat Chapter 3, slide: 38 Go-Back-N Sender side ‘N’ (or W) packets can be sent w/o being ACK’ed Fills “sliding window” Single timer: oldest non-ACK’ed packet At timeout(n), retransmit packet n and all successive packets in window Receiver side Cumulative ACK Discard out-of-order packets, no buffering Chapter 3, slide: 39 Selective Repeat Sender side ‘N’ (or W) packets can be sent w/o being ACK’ed, Fills “sliding window” ‘N’ (or W) timers: every packet in window At timeout(n), retransmit packet n ACK for bottom of window? Advance window to the next unACK’ed packet Receiver side Selective ACK ACK and Buffer out-of-order segments (if in window) Discard out-of-order packets (if ahead of window) ACK and Discard out-of-order packets (if behind window) In-order packet? ACK and deliver, then deliver all sequential buffered packets and advance window! Chapter 3, slide: 40 Selective repeat in action GBN? GBN? GBN? Chapter 3, slide: 41 GBN/SR comparison Go-Back-N Selective Repeat Sliding window Sliding window W packets in the pipe Cumulative ACK Single timer (oldest unACK’ed packet) Discard out-of-order packets Retransmit all successive packets at timeout W packets in the pipe Selective ACK One timer for each packet Buffer out-of-order packets (within window) Retransmit only lost or delayed packet at timeout Chapter 3, slide: 42 Selective repeat: dilemma Example: Seq #’s: 0, 1, 2, 3 Window Size = 3 Receiver sees no difference in two scenarios! Even though, (a) is a retransmitted packet (b) is a new packet in (a), receiver incorrectly passes old data as new Q: What relationship between sequence # size and window size to avoid duplication problem?? Chapter 3, slide: 43 Duplication problem: illustration Let’s assume window size W = 2 for illustration Consider a scenario where: Only ACK 1 is lost Next, assume the Sequence Number (SN) space n = W = 2 That is, SN pattern is 0,1,0,1,0,1,… Chapter 3, slide: 44 Duplication problem: illustration Scenario: n=W=2; only ACK 1 is lost Receiver thinks of retransmission of 2nd Packet (Pkt1, SN=1) as a 4th/new Pkt (Pkt3,SN=1) Duplication detection problem! Chapter 3, slide: 45 Duplication problem: illustration Recap: When window size (W) = Sequence Number space (n), there is a duplication problem receiver can’t tell a new transmission from a retransmission of a lost packet Let’s now see what happens if we increase n by 1: Now n=3 and SN pattern is: 0,1,2,0,1,2,0… Let’s revisit the scenario again Chapter 3, slide: 46 Duplication problem: illustration Scenario: n=3;W=2; ACK 1 is lost Receiver still thinks of retransmission of 2nd Packet (Pkt1,SN=1) as a new Pkt (Pkt3,SN=1) Increasing n by 1 didn’t solve the duplication detection problem this time!! Chapter 3, slide: 47 Duplication problem: illustration Recap: Increasing SN space n from 2 to 3 didn’t solve the duplication problem yet! Let’s now see what happens if we again increase n by 1: Now n=4 and SN pattern is: 0,1,2,3,0,1,2,3,0… Let’s revisit the scenario with n=4 and see Chapter 3, slide: 48 Duplication problem: illustration Scenario: n=4;W=2; ACK 1 is lost Receiver now drops this retransmission of 2nd Packet (Pkt1, SN=1) since its SN falls outside the window Duplication detection problem is now solved when n=2W=4 Chapter 3, slide: 49 Duplication problem: illustration Recap: Increasing SN space n from 2 to 4 did solve the duplication problem That is, when SN space = twice window size (n = 2W) the duplication problem was solved Hence, in this case: when W ≤ n/2, the problem is solved Can we generalize? Yes, W ≤ n/2 must hold in order to avoid duplication problem Chapter 3, slide: 50 Sequence Number space n & window size W relationship: general case seqNbr space n-1 Sender: Suppose ½ n < W < n W segments with SN in [0, W-1] are sent, but their ACKs are not received yet 0 1 W-1 w pkts already sent, but ACKs not received yet = sender’s sliding window Sequence Number space n & window size W relationship: general case Receiver: Now assume: seqNbr expected by receiver receiver received all W segments (SN in [0,W-1]) n-1 0 1 receiver sent all W ACKs, one for each received segment receiver slides its window: Sliding window (SN of expected segments) from: W to x = 2W-n-1 x=2W-n-1 W Hint: (n-1-W+1) + (x-0+1)=W x=2W–n-1 Chapter 3, slide: 52 Sequence Number space n & window size W relationship: general case Consider worst case scenario: all W ACKs are lost Sender will re-transmit all the first W SN expected by receiver n-1 segments; i.e., with SN in [0,W-1] SN overlap 0 But receiver is expecting segments with SN in [W,n-1] U [0,2W-n-1] Retransmissions with SN falling within [0,2W-n-1] will then be interpreted by receiver as new transmissions ( dup problem) To avoid dup problem, we must then have: 2W-n-1 ≤ -1 W ≤ ½ n x=2W-n-1 W SN retransmitted by sender ECE/CS 372 – introduction to computer networks Lecture 7 Announcements: Assignment 2 and Lab 2 are posted Credit for lecture slides to Professor Bechir Hamdaoui Adapted from Jim Kurose & Keith Ross (original copyright) Chapter 3, slide: 54 Chapter 3 outline Principles of reliable data transfer Connection-oriented Principles of congestion control TCP congestion control transport: TCP Chapter 3, slide: 55 TCP Round Trip Time (RTT) and Timeout Why need to estimate RTT? “timeout” and “retransmit” needed to address packet loss need to know when to timeout and retransmit Ideal world: exact RTT is needed Real world: RTTs change over time: Some intuition What happens if too short? Premature timeout Unnecessary retransmissions What happens if too long? Slow reaction to segment loss packets may take different paths network load changes over time RTTs can only be estimated Chapter 3, slide: 56 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-)*EstimatedRTT(previous) + *SampleRTT(recent) 0 < < 1; typical value: = 0.125 SampleRTT: measured time from segment transmission until ACK receipt current value of RTT Ignore retransmission EstimatedRTT: estimated based on past & present; smoother than SampleRTT to be used to set timeout period Chapter 3, slide: 57 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-)*EstimatedRTT(previous) + *SampleRTT(recent) 0 < < 1; typical value: = 0.125 Illustration: Assume: we received n RTT samples so far: Let’s order them as: n is the most recent one, n-1 is the 2nd most recent, etc. EstimatedRTT(n) = (1-)*EstimatedRTT(n-1) + *SampleRTT(n) (EstimatedRTT(n) : estimated RTT after receiving ACK of nth Packet.) Example: Supppse 3 ACKs returned with SampleRTT(1), SampleRTT(2), SampleRTT(3) Question: What would be EstimatedRTT after receiving the 3 ACKs ? Assume: EstimatedRTT(0) = 2 seconds (initial/default value) Chapter 3, slide: 58 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-)*EstimatedRTT(previous) + *SampleRTT(recent) 0 < < 1; typical value: = 0.125 What happens if is too small (say very close 0): A sudden, real change in network load does not get reflected in EstimatedRTT fast enough May lead to under- or overestimation of RTT for a long time What happens if is too large(say very close 1): Transient fluctuations/changes in network load affects EstimatedRTT and makes it unstable when it should not Also leads to under- or overestimation of RTT Chapter 3, slide: 59 Example RTT estimation: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr 350 RTT (milliseconds) 300 250 200 150 100 1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106 time (seconnds) SampleRTT Estimated RTT Chapter 3, slide: 60 TCP Round Trip Time (RTT) and Timeout Setting the timeout timeout = EstimtedRTT, any problem with this??? add a “safety margin” to EstimtedRTT large variation in EstimatedRTT -> larger safety margin see how much SampleRTT deviates from EstimatedRTT: DevRTT = (1-)*DevRTT + *|SampleRTT-EstimatedRTT| (typically, = 0.25) Then set timeout interval: TimeoutInterval = EstimatedRTT + 4*DevRTT Chapter 3, slide: 61 TCP: Overview point-to-point: one sender, one receiver reliable, in-order byte stream: no “message boundaries” pipelined: TCP congestion and flow control set window size send & receive buffers socket door application writes data application reads data TCP send buffer TCP receive buffer RFCs: 793, 1122, 1323, 2018, 2581 full duplex data: bi-directional data flow in same connection MSS: maximum segment size connection-oriented: handshaking (exchange of control msgs) init’s sender, receiver state before data exchange flow controlled: sender will not socket door overwhelm receiver segment Chapter 3, slide: 62 TCP: a reliable data transfer TCP creates rdt service on top of IP’s unreliable service Pipelined segments Cumulative acks TCP uses single retransmission timer Retransmissions are triggered by: timeout events duplicate acks Initially consider simplified TCP sender: ignore duplicate acks ignore flow control, congestion control Chapter 3, slide: 63 TCP sender events: Data received from app: Create segment with Seq # Seq # is byte-stream number of first data byte in segment start timer if not already running (think of timer as for “oldest unACK’ed segment”) expiration interval: TimeOutInterval Timeout: retransmit segment that caused timeout restart timer Ack received: If it acknowledges previously unACK’ed segments update what is known to be ACK’ed start timer if there are outstanding segments Chapter 3, slide: 64 TCP Seq. #’s and ACKs Seq. #’s: byte stream “number” of first byte in segment’s data ACKs: Seq # of next byte expected from other side cumulative ACK Host A User types ‘C’ Host B host ACKs receipt of ‘C’, echoes back ‘C’ host ACKs receipt of echoed ‘C’ simple telnet scenario time Chapter 3, slide: 65 TCP: retransmission scenarios Host A X loss Host B Seq=92 timeout Host B Seq=92 timeout timeout Host A time lost ACK scenario time premature timeout Chapter 3, slide: 66 TCP retransmission scenarios (more) timeout Host A Host B X loss time Cumulative ACK scenario Chapter 3, slide: 67 TCP ACK generation [RFC 1122, RFC 2581] Event at Receiver TCP Receiver action Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK Arrival of in-order segment with expected seq #. One other segment has ACK pending Immediately send single cumulative ACK, ACKing both in-order segments Arrival of out-of-order segment higher-than-expect seq. # . Gap detected Immediately send duplicate ACK, indicating seq. # of next expected byte Arrival of segment that partially or completely fills gap Immediate send ACK, provided that segment starts at lower end of gap Chapter 3, slide: 68 Fast Retransmit client Suppose: Packet 0 gets lost Q: When will retransmission of Packet 0 happen? Why does it happen at that time? A: typically at t1 ; we think it is lost when timer expires Can we do better?? server Timer is set at t0 Think of what means to receive many duplicate ACKs: it means Packet 0 is lost Why wait until timeout when we know packet 0 is lost? => Fast retransmit => better performance Why several dupl. ACKs, not 1 or 2? Think of what happens when pkt0 arrives after pkt1 (delayed, not lost) Think of what happens when pkt0 arrives after pkt1 & 2, etc. Timer expires at t1 Chapter 3, slide: 69 Fast Retransmit: recap Receipt of duplicate ACKs indicate loss of segments Sender often sends many segments back-toback If segment is lost, there will likely be many duplicate ACKs. This is how TCP works: If sender receives 3 ACKs for the same data, it supposes that segment after ACK’ed data was lost: Fast Retransmit: resend segment before timer expires better performance Chapter 3, slide: 70 Review questions Problem: Host A Host B TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 40 and 60 bytes B ACKs every packet it receives Q1: What is Seq# in 1st and 2nd segments from A to B? Q2: Suppose: 1st segment gets to B first. What is ACK # in 1st ACK? Chapter 3, slide: 71 Review questions Problem: Host A Host B TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 40 and 60 bytes B ACKs every packet it receives Q1: What is Seq# in 1st and 2nd segments from A to B? Q2: Suppose: 1st segment gets to B first. What is ACK # in 1st ACK? Q3: Suppose: 2nd segment gets to B first. What is ACK # in 1st ACK? What is ACK # in 2nd ACK? Chapter 3, slide: 72 Review questions Problem: Host A Host B TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 40 and 60 bytes B ACKs every packet it receives Q1: What is Seq# in 1st and 2nd segments from A to B? Q2: Suppose: 1st segment gets to B first. What is ACK # in 1st ACK? Q3: Suppose: 2nd segment gets to B first. What is ACK # in 1st ACK? What is ACK # in 2nd ACK? Chapter 3, slide: 73 Review questions Problem: TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 12 and 20 bytes, respectively B ACKs every packet it receives Now suppose: - 2 segments get to B in order. - 1st ACK is lost - 2nd ACK arrives after timeout Question: Fill out all packet Seq #’s, and ACKs’ ACK #’s in the timing diagram. Chapter 3, slide: 74 ECE/CS 372 – introduction to computer networks Lecture 8 Announcements: Assignment 2 due next Tuesday Lab 2 due next Tuesday Midterm exam: next Thursday Credit for lecture slides to Professor Bechir Hamdaoui Adapted from Jim Kurose & Keith Ross (original copyright) Chapter 3, slide: 75 Chapter 3 outline Principles of reliable data transfer Connection-oriented Principles of congestion control TCP congestion control transport: TCP Chapter 3, slide: 76 Principles of Congestion Control Oct. ’86: LBL (Lawrence Berkeley Lab) to UC-Berkeley: drop from 32kbps to 40bps (400 yards, two IMP hops) Congestion Collapse! Cause: end systems are sending too much data too fast for network/routers to handle Manifestations: lost/dropped packets (buffer overflow at routers) long delays (queueing in router buffers) a top-10 problem! Chapter 3, slide: 77 Approaches towards congestion control Two broad approaches toward congestion control: End-end congestion control: no explicit feedback from network congestion inferred from end-system observed loss, delay approach taken by TCP Network-assisted congestion control: routers provide feedback to end systems single bit indicating congestion (DECbit, TCP/IP ECN) explicit rate sender should send at One restriction on sliding window: CongWin Chapter 3, slide: 78 TCP congestion control Keep in mind: Too slow: under-utilization => waste of network resources by not using them! Too fast: over-utilization => waste of network resources by congesting them! Challenge is then: Not too slow, nor too fast!! Approach: Increase slowly the sending rates to probe for usable bandwidth Decrease the sending rates when congestion is observed => Additive-increase, multiplicative decrease (AIMD) Chapter 3, slide: 79 TCP congestion control: AIMD Additive-increase, multiplicative decrease (AIMD) (also called “congestion avoidance”) additive increase: increase CongWin by 1 MSS every RTT until loss detected (MSS = Max Segment Size) multiplicative decrease: cut CongWin in half after loss congestion window size congestion window 24 Kbytes 16 Kbytes Saw tooth behavior: probing for bandwidth 8 Kbytes time time Chapter 3, slide: 80 TCP Congestion Control: details sender limits transmission: LastByteSent-LastByteAcked CongWin Roughly, rate = How does sender perceive congestion? loss event timeout or 3 duplicate ACKs TCP sender reduces rate CongWin Bytes/sec RTT CongWin is dynamic, function of perceived network congestion (CongWin) after loss event AIMD increases window by 1 MSS every RTT Improvements: AIMD, any problem?? Think of the start of connections Solution: start a little faster, and then slow down =>“slow-start” Chapter 3, slide: 81 TCP Slow Start When connection begins, TCP addresses this via CongWin = 1 MSS Slow-Start mechanism Example: MSS = 500 Bytes When connection begins, RTT = 200 msec increase rate initial rate = 20 kbps exponentially fast available bandwidth may be >> MSS/RTT desirable to quickly ramp up to respectable rate When loss of packet occurs (indicates congestion), then slow down Chapter 3, slide: 82 TCP Slow Start (more) How it is done Host A begins, increase rate exponentially until first loss event: RTT When connection Host B double CongWin every RTT done by incrementing CongWin for every ACK received Summary: initial rate is slow but ramps up exponentially fast time Chapter 3, slide: 83 Refinement: TCP Tahoe Question: When should exponential increase (Slow-Start) switch to linear (AIMD)? Here is how it works: Define a variable, called Threshold Start “Slow-Start” (i.e., CongWin =1 and double CongWin every RTT) At 1st loss event (timeout or 3 Dup ACKs), Set Threshold = ½ current CongWin Start “Slow-Start” (i.e., CongWin = 1 and double CongWin every RTT) When CongWin =Threshold: Switch to AIMD (linear) At any loss event (timeout or 3 Dup ACKs): Set Threshold = 1/2 CongWin Start “Slow-Start” (i.e. start over with CongWin = 1) Chapter 3, slide: 84 Refinement: TCP Tahoe Question: When should exponential increase (Slow-Start) switch to linear (AIMD)? CongWin Chapter 3, slide: 85 More refinement: TCP Reno Loss event: Timeout vs. dup ACKs 3 dup ACKs: fast retransmit client server Timer is set at t0 Timer expires at t1 Chapter 3, slide: 86 More refinement: TCP Reno Loss event: Timeout vs. dup ACKs 3 dup ACKs: fast retransmit Timeout: retransmit client server Timer is set at t0 Any difference (think congestion) ?? 3 dup ACKs indicate network still capable of delivering some segments after a loss Timeout indicates a “more” alarming congestion scenario Timer expires at t1 Chapter 3, slide: 87 More refinement: TCP Reno TCP Reno treats “3 dup ACKs” different from “timeout” How does TCP Reno work? After 3 dup ACKs: CongWin is cut in half congestion avoidance (window grows linearly) But after timeout event: CongWin instead set to 1 MSS; Slow-Start (window grows exponentially) Chapter 3, slide: 88 Summary: TCP Congestion Control When CongWin is below Threshold, sender in slow-start phase, window grows exponentially. When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly. When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold. When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS. Chapter 3, slide: 89 TCP Flow Control receive side of TCP connection has a receive buffer: flow control sender won’t overflow receiver’s buffer by transmitting too much, too fast speed-matching app process may be slow at reading from buffer service: matching the send rate to the receiving app’s drain rate One more restriction on sliding window: RcvWindow Chapter 3, slide: 90 TCP Flow control: how it works Rcvr advertises spare room by including value of RcvWindow in segments spare room in buffer = RcvWindow = RcvBuffer-[LastByteRcvd LastByteRead] Sender limits unACKed data to RcvWindow guarantees receive buffer doesn’t overflow Chapter 3, slide: 91 Average throughput of TCP Avg. throughout as a function of window size W and RTT? Ignore Slow-Start Let W, the window size when loss occurs, be constant When window is W, throughput is ?? throughput(high) = W/RTT Just after loss, window drops to W/2, throughput is ?? throughput(low) = W/(2RTT). Throughput then increases linearly from W/(2RTT) to W/RTT Hence, average throughout = 0.75 W/RTT Chapter 3, slide: 92 TCP Fairness Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K TCP connection 1 TCP connection 2 bottleneck router capacity R Chapter 3, slide: 93 Fairness (more) Fairness and parallel TCP connections nothing prevents app from opening parallel conections between 2 hosts. Web browsers do this Example: link of rate R supporting 9 connections; 9 TCPs belong to same host, each getting R/9 new app asks for 1 TCP, gets rate R/10 One host gets 9/10 of R and one gets 1/10 of R! Chapter 3, slide: 94 Question: is TCP fair? Again assume two competing sessions only, and consider AIMD: Additive increase gives slope of 1, as throughout increases Multiplicative decrease halves throughput proportionally, when congestion occurs TCP connection 1 TCP connection 2 bottleneck router capacity R Chapter 3, slide: 95 Question: is TCP fair? Again assume two competing sessions only, and consider AIMD: Additive increase gives slope of 1, as throughout increases Multiplicative decrease halves throughput proportionally, when congestion occurs Question: How would (R1, R2) vary R equal bandwidth share when AIMD is used? Does it converge to equal share? loss: decrease window by factor of 2 congestion avoidance: additive increase loss: decrease window by factor of 2 congestion avoidance: additive increase Connection 1 throughput R Chapter 3, slide: 96 Question? Again assume two competing sessions only: Additive increase gives slope of 1, as throughout increases Constant/equal decrease instead of multiplicative decrease!!! (Call this scheme: AIED) R equal bandwidth share Question: How would (R1, R2) vary when AIED is used instead of AIMD? Is it fair? Does it converge to equal share? (R1, R2) Connection 1 throughput R Chapter 3, slide: 97