ECE/CS 372 – introduction to computer networks Lecture 7 Announcements: HW1 is due today LAB2 is due tomorrow Acknowledgement: slides drawn heavily from Kurose & Ross Chapter 3, slide: 1 Chapter 3: Transport Layer Our goals: understand principles behind transport layer services: reliable data transfer flow control congestion control learn about transport layer protocols in the Internet: UDP: connectionless transport TCP: connection-oriented transport TCP congestion control Chapter 3, slide: 2 Transport services and protocols provide logical communication between app processes running on different hosts transport protocols run in end systems send side: breaks app messages into segments, passes to network layer rcv side: reassembles segments into messages, passes to app layer more than one transport protocol available to apps Internet: TCP and UDP Chapter 3, slide: 3 Transport vs. network layer network layer: logical communication between hosts transport layer: logical communication between processes Household case: 12 kids (East coast house) sending letters to 12 kids (West coast house) Ann is responsible for the house at East coast Bill is responsible for the house at West coast Postal service is responsible for between houses Household analogy: kids = processes letters = messages houses = hosts home address = IP address kid names = port numbers Ann and Bill = transport protocol postal service = network-layer protocol Chapter 3, slide: 4 Internet transport-layer protocols reliable, in-order delivery (TCP) congestion control flow control connection setup unreliable, unordered delivery: UDP no-frills extension of “best-effort” IP services not available: delay guarantees bandwidth guarantees
Chapter 3, slide: 5 UDP: User Datagram Protocol [RFC 768] “best effort” service, UDP segments may be: lost delivered out of order to app connectionless: no handshaking between UDP sender, receiver each UDP segment handled independently of others Why is there a UDP? less delay: no connection establishment (which can add delay) simple: no connection state at sender, receiver less traffic: small segment header no congestion control: UDP can blast away as fast as desired Chapter 3, slide: 6 UDP: more often used for streaming multimedia apps loss tolerant rate sensitive other UDP uses DNS SNMP reliable transfer over UDP: add reliability at application layer application-specific error recovery! UDP segment format (32 bits wide): source port #, dest port #, length (in bytes of UDP segment, including header), checksum, then application data (message) Chapter 3, slide: 7 UDP checksum Goal: detect “errors” (e.g., flipped bits) in transmitted segment Sender: treat segment contents as sequence of 16-bit integers checksum: addition (1’s complement sum) of segment contents sender puts checksum value into UDP checksum field Receiver: compute checksum of received segment check if computed checksum equals checksum field value: NO - error detected YES - no error detected. But maybe errors nonetheless? ….
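The sender/receiver procedure just described can be sketched in Python (a sketch only; real UDP also covers a pseudo-header, which is omitted here):

```python
def internet_checksum(data: bytes) -> int:
    """1's-complement sum of 16-bit words, then complement (RFC 1071 style)."""
    if len(data) % 2:                 # pad odd-length data with a zero byte
        data += b'\x00'
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # each pair = one 16-bit integer
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry-out back in
    return ~total & 0xFFFF            # 1's complement = value of checksum field

def verify(data: bytes, checksum: int) -> bool:
    """Receiver side: recompute over the same contents and compare."""
    return internet_checksum(data) == checksum
```

Because the checksum is the complement of the wrapped sum, an error-free segment's words plus its checksum field add (with wraparound) to all ones; as the slide notes, a matching checksum still does not rule out every error pattern.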
Chapter 3, slide: 8 Internet Checksum Example Note: when adding numbers, a carryout from the most significant bit needs to be added back into the result Example: add two 16-bit integers:
1110011001100110
1101010101010101
The addition produces a 17-bit result: 11011101110111011
wraparound (add the carry back in): 1011101110111100 (sum)
checksum (1’s complement of sum): 0100010001000011
Chapter 3, slide: 9 Chapter 3 outline Principles of reliable data transfer Connection-oriented transport: TCP Principles of congestion control TCP congestion control Chapter 3, slide: 10 Principles of Reliable data transfer important in app., transport, link layers top-10 list of important networking topics! characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt) Chapter 3, slide: 11 Reliable data transfer: getting started rdt_send(): called from above, (e.g., by app.).
Passed data to deliver to receiver upper layer send side: udt_send(): called by rdt, to transfer packet over unreliable channel to receiver receive side: deliver_data(): called by rdt to deliver data to upper layer rdt_rcv(): called when packet arrives on rcv-side of channel Chapter 3, slide: 14 Reliable data transfer: getting started We will: incrementally develop sender, receiver sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer use finite state machines (FSM) to specify sender, receiver state: when in this “state”, next state uniquely determined by next event (state 1 --event/actions--> state 2) Chapter 3, slide: 15 rdt1.0: reliable transfer over a reliable channel underlying channel perfectly reliable no bit errors no loss of packets separate FSMs for sender, receiver: sender sends data into underlying channel receiver reads data from underlying channel sender FSM: Wait for call from above; on rdt_send(data): packet = make_pkt(data); udt_send(packet) receiver FSM: Wait for call from below; on rdt_rcv(packet): extract(packet,data); deliver_data(data) Chapter 3, slide: 16 rdt2.0: channel with bit errors underlying channel may flip bits in packet Receiver can detect bit errors (e.g., use checksum) But still no packet loss questions: (1) how does sender know and (2) what does it do when packet is erroneous: acknowledgements: • positive ack (ACK): receiver tells sender that pkt received OK • negative ack (NAK): receiver tells sender that pkt had errors retransmission: sender retransmits pkt on receipt of NAK new mechanisms in rdt2.0 (beyond rdt1.0): error detection receiver feedback: control msgs (ACK,NAK) rcvr->sender assume ACK/NAK are error free Chapter 3, slide: 17 rdt2.0: operation with no errors sender FSM: Wait for call from above; on rdt_send(data): sndpkt = make_pkt(data, checksum); udt_send(sndpkt) Wait for ACK or NAK; on rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt); on rdt_rcv(rcvpkt) && isACK(rcvpkt): Λ (no action)
receiver FSM: Wait for call from below; on rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK); on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt,data); deliver_data(data); udt_send(ACK) Chapter 3, slide: 19 rdt2.0: error scenario (same sender and receiver FSMs as above; a corrupted packet triggers a NAK, and the sender retransmits sndpkt) Chapter 3, slide: 20 rdt2.0 has a fatal flaw! What happens if ACK/NAK is corrupted? That is, sender receives garbled ACK/NAK sender doesn’t know what happened at receiver! can’t sender just retransmit? Sure: sender retransmits current pkt if ACK/NAK garbled Any problem with this ?? Problem: duplicates Receiver doesn’t know whether received pkt is a retransmit or a new pkt Handling duplicates: sender adds sequence number to each pkt receiver discards (doesn’t deliver up) duplicate pkt stop and wait Sender sends one packet, then waits for receiver response Chapter 3, slide: 21 rdt2.1: sender, handles garbled ACK/NAKs sender FSM (four states): Wait for call 0 from above; on rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) Wait for ACK or NAK 0; on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ): udt_send(sndpkt); on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): Λ (no action) Wait for call 1 from above; on rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) Wait for ACK or NAK 1; on rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ): udt_send(sndpkt); on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): Λ (no action) Chapter 3, slide: 22 rdt2.1: receiver, handles garbled ACK/NAKs receiver FSM (two states): Wait for 0 from below; on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); on rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) (duplicate): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) Wait for 1 from below: symmetric, with seq 0 and 1 swapped Chapter 3, slide: 23 rdt2.1: discussion Sender: seq # added to pkt two seq. #’s (0,1) will suffice. Why? must check if received ACK/NAK corrupted twice as many states state must “remember” whether “current” pkt has 0 or 1 seq. # Receiver: must check if received packet is duplicate state indicates whether 0 or 1 is expected pkt seq # note: receiver can not know if its last ACK/NAK received OK at sender Chapter 3, slide: 24 rdt2.2: a NAK-free protocol do we really need NAKs?? instead of NAK, receiver sends ACK for last pkt received OK receiver must explicitly include seq # of pkt being ACKed duplicate ACK at sender results in same action as NAK: retransmit current pkt rdt2.2: same functionality as rdt2.1, using ACKs only Chapter 3, slide: 25 rdt3.0: channels with errors and loss New assumption: packet may be lost: underlying channel can also lose packets (data or ACKs) checksum, seq. #, ACKs, retransmissions will be of help, but not enough What else is needed? Approach: timeout policy: sender waits “reasonable” amount of time for ACK retransmits if no ACK received in this time if pkt (or ACK) just delayed (not lost): retransmission will be duplicate, but use of seq.
#’s already handles this receiver must specify seq # of pkt being ACKed requires countdown timer Chapter 3, slide: 26 ECE/CS 372 – introduction to computer networks Lecture 8 Announcements: LAB2 is due today HW2 is posted and is due Tuesday next week LAB3 is posted and is due Tuesday next week Acknowledgement: slides drawn heavily from Kurose & Ross Chapter 3, slide: 27 rdt3.0 in action (still stop-n-wait w/ (0,1) sn) Chapter 3, slide: 28 rdt3.0 in action (still stop-n-wait w/ (0,1) sn) Chapter 3, slide: 29 Performance of rdt3.0: stop-n-wait rdt3.0 works, but performance stinks example: R = 1 Gbps, RTT = 30 ms, L = 1000-byte packet: first packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, receiver sends ACK; ACK arrives, sender sends next packet at t = RTT + L/R Ttransmit = L/R = (8·10^3 bits/pkt) / (10^9 b/sec) = 8 microsec Chapter 3, slide: 30 Performance of rdt3.0: stop-n-wait U sender: utilization – fraction of time sender busy sending U sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027 (times in msec) Chapter 3, slide: 31 Performance of rdt3.0: stop-n-wait 1 KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link network protocol limits use of physical resources!
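The stop-and-wait numbers above can be reproduced directly (a small sketch using the slide's values):

```python
# Stop-and-wait: sender transmits for L/R seconds out of every RTT + L/R
R = 1e9           # link rate: 1 Gbps
RTT = 0.030       # 30 ms round-trip time
L = 1000 * 8      # 1000-byte packet, in bits

t_transmit = L / R                       # 8 microseconds on the wire
U = t_transmit / (RTT + t_transmit)      # fraction of time sender is busy
throughput = L / (RTT + t_transmit)      # bits/sec actually delivered

print(f"U_sender = {U:.5f}, throughput = {throughput / 8 / 1000:.1f} kB/s")
```

The utilization comes out to about 0.00027 and the throughput to about 33 kB/s, matching the slide: the protocol, not the 1 Gbps link, is the bottleneck.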
Chapter 3, slide: 32 Pipelining: increased utilization first packet bit transmitted at t = 0; last bit transmitted at t = L/R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L/R Increase utilization by a factor of 3! U sender = (3·L/R) / (RTT + L/R) = 0.024 / 30.008 = 0.0008 Chapter 3, slide: 33 So far: rdt3.0 Acknowledgment and retransmission Reliable delivery Sequence numbers Duplication detection Timeout and retransmit Deal with packet losses Stop-n-wait one packet at a time Problem: efficiency Chapter 3, slide: 34 Pipelined protocols Pipelining: sender allows multiple, “in-flight”, yet-to-be-ACK’ed pkts What about the range of sequence numbers then?? What about buffering at receiver?? Two generic forms of pipelined protocols: go-Back-N and selective repeat Chapter 3, slide: 35 Go-Back-N Sender side N pkts can be sent w/o being ACK’ed, “sliding window” Single timer, oldest non-ACK’ed pkt At timeout(n), retransmit pkt n and all successive pkts Receiver side Cumulative ACK Discard out-of-order packets, no buffering Chapter 3, slide: 36 Selective repeat in action Chapter 3, slide: 37 Selective Repeat via comparison Go-Back-N: Sliding window N pkts on the fly Cumulative ACK Single timer Discard out-of-order pkts Retransmit all successive pkts at timeout Selective Repeat: Sliding window N pkts on the fly Selective ACK One timer for each pkt Buffer out-of-order pkts Retransmit only lost or delayed pkt at timeout Chapter 3, slide: 38 Selective repeat: dilemma Example: seq #’s: 0, 1, 2, 3 window size=3 Chapter 3, slide: 39 Selective repeat: dilemma Example: seq #’s: 0, 1, 2, 3 window size=3 receiver sees no difference in two scenarios!
Even though (a) is a retransmit pkt and (b) is a new pkt, in (a) the receiver incorrectly passes old data as new Q: what relationship between seq # size and window size avoids the duplication problem?? Chapter 3, slide: 40 Duplication problem: illustration Let’s assume window size W = 2 for illustration Consider a scenario where: Only ACK 1 is lost Now let’s assume that seqNbr space n = W = 2 That is, seqNbr pattern is 0,1,0,1,0,1,… Chapter 3, slide: 41 Duplication problem: illustration Scenario: n=W=2; only ACK 1 is lost Receiver thinks of retransmission of 2nd Pkt (Pkt1, SN=1) as a 4th/new Pkt (Pkt3, SN=1) Duplication detection problem! Chapter 3, slide: 42 Duplication problem: illustration Recap: When window size W = seqNbr space n, there is a duplication problem receiver can’t tell a new transmission from a retransmission of a lost pkt Let’s now see what happens if we increase n by 1: Now n=3 and seqNbr pattern is: 0,1,2,0,1,2,0… Let’s revisit the scenario again Chapter 3, slide: 43 Duplication problem: illustration Scenario: n=3; W=2; ACK 1 is lost Receiver still thinks of retrans. of 2nd Pkt (Pkt1, SN=1) as a new Pkt (4th w/ SN=1) Increasing n by 1 didn’t solve the duplication detection problem this time!! Chapter 3, slide: 44 Duplication problem: illustration Recap: Increasing seqNbr space n from 2 to 3 didn’t solve the duplication problem yet!
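The lost-ACK scenarios on these slides can be checked mechanically. In the worst case the receiver got all W packets but every ACK was lost: the sender retransmits seq numbers [0, W-1] while the receiver now expects the next W numbers mod n, and duplicates go undetected exactly when the two sets overlap. A sketch under those assumptions:

```python
def duplicates_undetected(n: int, W: int) -> bool:
    """True if a retransmission can be mistaken for a new packet
    (worst case: receiver got all W packets, every ACK was lost)."""
    retransmitted = {s % n for s in range(W)}        # sender resends 0..W-1
    expected = {s % n for s in range(W, 2 * W)}      # receiver's new window, mod n
    return bool(retransmitted & expected)            # overlap => ambiguity

for n in (2, 3, 4):
    print(f"n={n}, W=2: problem={duplicates_undetected(n, 2)}")
```

With W = 2 this reports a problem for n = 2 and n = 3 but not for n = 4, matching the slides and the general rule W ≤ n/2.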
Let’s now see what happens if we again increase n by 1: Now n=4 and seqNbr pattern is: 0,1,2,3,0,1,2,3,0… Let’s revisit the scenario with n=4 and see Chapter 3, slide: 45 Duplication problem: illustration Scenario: n=4; W=2; ACK 1 is lost Receiver now drops this retransmission of 2nd Pkt (Pkt1, SN=1) since its SN falls outside the expected window Duplication detection problem is now solved when n=4, W=2 Chapter 3, slide: 46 Duplication problem: illustration Recap: Increasing seqNbr space n from 2 to 4 did solve the duplication problem That is, when seqNbr space = twice window size (n = 2W) the duplication problem was solved Hence, in this case: when W ≤ n/2, the problem is solved Can we generalize? Yes, W ≤ n/2 must hold in order to avoid the dup problem Chapter 3, slide: 47 SeqNbr space n & window size W relationship: general case Sender: Suppose ½n < W < n W segments with seqNbr in [0, W-1] are sent, but their ACKs are not received yet (these W pkts form the sender’s sliding window) SeqNbr space n & window size W relationship: general case Receiver: Now assume: receiver received all W segments (SeqNbr in [0, W-1]) receiver sent all W ACKs, one for each received segment receiver slides its window (SeqNbr of expected segments) from [0, W-1] to [W, n-1] ∪ [0, x] with x = 2W-n-1 Hint: the window holds W seqNbrs, so (n-1-W+1) + (x-0+1) = W, giving x = 2W-n-1 Chapter 3, slide: 49 SeqNbr space n & window size W relationship: general case Consider worst case scenario: all W ACKs are lost Sender will re-transmit all the first W segments, i.e., with SeqNbr in [0, W-1] But receiver is expecting segments with SeqNbr in [W, n-1] ∪ [0, 2W-n-1] Retransmissions with seqNbr falling within [0, 2W-n-1] will then be interpreted by receiver as new transmissions (dup problem) To avoid the dup problem, we must then have: 2W-n-1 ≤ -1, i.e., W ≤ ½n ECE/CS 372 – introduction to computer networks
Lecture 9 Announcements: HW2 and LAB3 are due Tuesday 4th week. A possible room change (will send email with change if it happens) Midterm July 19th, 2012 (Thursday 4th week) Ch1, Ch2, and Ch3. Make sure you review the material, HW, labs Acknowledgement: slides drawn heavily from Kurose & Ross Chapter 3, slide: 51 Chapter 3 outline Principles of reliable data transfer Connection-oriented transport: TCP Principles of congestion control TCP congestion control Chapter 3, slide: 52 TCP Round Trip Time (RTT) and Timeout Why need to estimate RTT? “timeout” and “retransmit” needed to address pkt loss need to know when to timeout and retransmit Ideal world: exact RTT is needed Some intuition What happens if too short: premature timeout unnecessary retransmissions What happens if too long: slow reaction to segment loss Real world: RTTs change over time because pkts may take different paths network load changes over time RTTs can only be estimated Chapter 3, slide: 53 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent) 0 < α < 1; typical value: α = 0.125 SampleRTT: measured time from segment transmission until ACK receipt current value of RTT Ignore retransmissions EstimatedRTT: estimated based on past & present; smoother than SampleRTT to be used to set timeout period Chapter 3, slide: 54 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent) 0 < α < 1; typical value: α = 0.125 Illustration: Assume: we received n RTT samples so far: Let’s order them as: 1,2,3, … ,n (1 is the most recent one, 2 is the 2nd most recent, etc.) EstimatedRTT(n) = (1-α)*EstimatedRTT(n-1) + α*SampleRTT(1) (EstimatedRTT(n): estimated RTT after receiving ACK of nth Pkt.)
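The EWMA recursion above can be sketched as follows (α = 0.125 as on the slide; seeding the estimate with the first sample is an assumption, a common convention):

```python
def ewma_rtt(samples, alpha=0.125):
    """EstimatedRTT after each ACK: (1-alpha)*previous + alpha*sample.
    Seeded with the first SampleRTT (assumed convention)."""
    est = None
    history = []
    for s in samples:
        est = s if est is None else (1 - alpha) * est + alpha * s
        history.append(est)
    return history

# Three hypothetical SampleRTT values in ms
print(ewma_rtt([100, 140, 120]))  # estimate moves only 1/8 of the way per sample
```

The small α makes EstimatedRTT a smoothed version of SampleRTT: a single outlier sample shifts the estimate by only one eighth of the deviation.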
Example: Suppose 3 ACKs returned w/ SampleRTT(1), SampleRTT(2), SampleRTT(3) Question: What would be EstimatedRTT after receiving the 3 ACKs? Assume that EstimatedRTT(1) = SampleRTT(3) after receiving 1st ACK Chapter 3, slide: 55 TCP Round Trip Time (RTT) and Timeout Technique: Exponential Weighted Moving Average (EWMA) EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent) 0 < α < 1; typical value: α = 0.125 What happens if α is too small (say very close to 0): A sudden, real change in network load does not get reflected in EstimatedRTT fast enough May lead to under- or overestimation of RTT for a long time What happens if α is too large (say very close to 1): Transient fluctuations/changes in network load affect EstimatedRTT and make it unstable when it should not Also leads to under- or overestimation of RTT Chapter 3, slide: 56 Example RTT estimation: [plot: SampleRTT and EstimatedRTT in milliseconds over time, gaia.cs.umass.edu to fantasia.eurecom.fr; EstimatedRTT is visibly smoother than SampleRTT] Chapter 3, slide: 57 TCP Round Trip Time (RTT) and Timeout Setting the timeout timeout = EstimatedRTT, any problem with this???
add a “safety margin” to EstimatedRTT large variation in EstimatedRTT -> larger safety margin see how much SampleRTT deviates from EstimatedRTT: DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT| (typically, β = 0.25) Then set timeout interval: TimeoutInterval = EstimatedRTT + 4*DevRTT Chapter 3, slide: 58 TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581 point-to-point: one sender, one receiver reliable, in-order byte stream: no “message boundaries” pipelined: TCP congestion and flow control set window size send & receive buffers: application writes data into TCP send buffer, application reads data from TCP receive buffer full duplex data: bi-directional data flow in same connection MSS: maximum segment size connection-oriented: handshaking (exchange of control msgs) init’s sender, receiver state before data exchange flow controlled: sender will not overwhelm receiver Chapter 3, slide: 59 TCP: a reliable data transfer TCP creates rdt service on top of IP’s unreliable service Pipelined segments Cumulative acks TCP uses single retransmission timer Retransmissions are triggered by: timeout events duplicate acks Initially consider simplified TCP sender: ignore duplicate acks ignore flow control, congestion control Chapter 3, slide: 60 TCP sender events: data rcvd from app: Create segment with seq # seq # is byte-stream number of first data byte in segment start timer if not already running (think of timer as for oldest unACK’ed segment) expiration interval: TimeOutInterval timeout: retransmit segment that caused timeout restart timer Ack rcvd: If acknowledges previously unACK’ed segments update what is known to be ACK’ed start timer if there are outstanding segments Chapter 3, slide: 61 TCP seq. #’s and ACKs Seq.
#’s: byte stream “number” of first byte in segment’s data ACKs: seq # of next byte expected from other side cumulative ACK Simple telnet scenario: Host A: user types ‘C’ Host B: host ACKs receipt of ‘C’, echoes back ‘C’ Host A: host ACKs receipt of echoed ‘C’ Chapter 3, slide: 62 TCP: retransmission scenarios [timing diagrams: lost ACK scenario, Seq=92 retransmitted after timeout; premature timeout scenario] Chapter 3, slide: 63 TCP retransmission scenarios (more) [timing diagram: cumulative ACK scenario, a lost ACK covered by a later cumulative ACK] Chapter 3, slide: 64 TCP ACK generation [RFC 1122, RFC 2581] Event at Receiver -> TCP Receiver action: Arrival of in-order segment with expected seq #, all data up to expected seq # already ACKed -> Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK Arrival of in-order segment with expected seq #, one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments Arrival of out-of-order segment with higher-than-expected seq. #, gap detected -> Immediately send duplicate ACK, indicating seq. # of next expected byte Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap Chapter 3, slide: 65 ECE/CS 372 – introduction to computer networks Lecture 10 Announcements: Midterm July 19th, 2012 (Thursday 4th week) Ch1, Ch2, and Ch3. Make sure you review the material, HW, labs Closed notes, book. Allowed only your pen/pencil and a calculator. Acknowledgement: slides drawn heavily from Kurose & Ross Chapter 3, slide: 66 Fast Retransmit Suppose: Packet 0 gets lost Q: when will retrans. of Packet 0 happen?? Why does it happen at that time? A: typically at t1; we think it is lost when the timer (set at t0) expires Can we do better??
Think of what it means to receive many duplicate ACKs: it means Packet 0 is lost Why wait till timeout since we know packet 0 is lost => Fast retransmit => better performance Why 3 dup ACKs, not just 1 or 2? Think of what happens when pkt0 arrives after pkt1 (delayed, not lost) Think of what happens when pkt0 arrives after pkt1 & 2, etc. Chapter 3, slide: 67 Fast Retransmit: recap Receipt of duplicate ACKs indicates loss of segments Sender often sends many segments back-to-back If a segment is lost, there will likely be many duplicate ACKs. This is how TCP works: If sender receives 3 ACKs for the same data, it supposes that segment after ACK’ed data was lost: fast retransmit: resend segment before timer expires better performance Chapter 3, slide: 68 TCP Flow Control receive side of TCP connection has a receive buffer flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast speed-matching service: matching the send rate to the receiving app’s drain rate (app process may be slow at reading from buffer) Chapter 3, slide: 69 TCP Flow control: how it works Rcvr advertises spare room by including value of RcvWindow in segments spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead] Sender limits unACKed data to RcvWindow guarantees receive buffer doesn’t overflow Chapter 3, slide: 70 Review questions Problem: TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 40 and 60 bytes B ACKs every pkt it receives Q1: Seq# in 1st and 2nd seg. from A to B? Q2: Suppose 1st seg. gets to B first. What is seq# in 1st ACK? Chapter 3, slide: 71 Review questions (continued) Q3: Suppose 2nd seg.
gets to B first. What is seq# in 1st ACK? And in 2nd ACK? Chapter 3, slide: 72 Review questions Problem: TCP connection between A and B B received up to 248 bytes A sends back-to-back 2 segments to B with 5 and 10 bytes B ACKs every pkt it receives Now suppose: - 2 segs get to B in order. - 1st ACK is lost - 2nd ACK arrives after timeout Question: fill out all pkt seq numbers, and ACK seq numbers in the timing diagram Chapter 3, slide: 74 Chapter 3 outline Principles of reliable data transfer Connection-oriented transport: TCP Principles of congestion control TCP congestion control Chapter 3, slide: 75 Principles of Congestion Control Oct. ’86: throughput between LBL (Lawrence Berkeley Lab) and UC-Berkeley dropped from 32 kbps to 40 bps cause: end systems are sending too much data too fast for network/routers to handle manifestations: lost/dropped packets (buffer overflow at routers) long delays (queueing in router buffers) different from flow control! a top-10 problem! Chapter 3, slide: 76 Approaches towards congestion control Two broad approaches towards congestion control: End-end congestion control: no explicit feedback from network congestion inferred from end-system observed loss, delay approach taken by TCP Network-assisted congestion control: routers provide feedback to end systems single bit indicating congestion (DECbit, TCP/IP ECN) explicit rate sender should send at One more restriction on sliding window: CongWin Chapter 3, slide: 77 TCP congestion control Keep in mind: Too slow: under-utilization => waste of network resources by not using them! Too fast: over-utilization => waste of network resources by congesting them!
Challenge is then: Not too slow, nor too fast!! Approach: Increase the sending rate slowly to probe for usable bandwidth Decrease the sending rate when congestion is observed => Additive-increase, multiplicative decrease (AIMD) Chapter 3, slide: 78 TCP congestion control: AIMD Additive-increase, multiplicative decrease (AIMD) (also called “congestion avoidance”) additive increase: increase CongWin by 1 MSS every RTT until loss detected (MSS = Max Segment Size) multiplicative decrease: cut CongWin in half after loss [plot: congestion window size (8, 16, 24 Kbytes) vs. time; saw-tooth behavior: probing for bandwidth] Chapter 3, slide: 79 TCP Congestion Control: details sender limits transmission: LastByteSent - LastByteAcked ≤ CongWin Roughly, rate = CongWin/RTT Bytes/sec How does sender perceive congestion? loss event: timeout or 3 duplicate ACKs TCP sender reduces rate (CongWin) after loss event CongWin is dynamic, function of perceived network congestion AIMD increases window by 1 MSS every RTT Improvements: AIMD, any problem??
Think of the start of connections Solution: start a little faster, and then slow down => “slow-start” Chapter 3, slide: 80 TCP Slow Start When connection begins, CongWin = 1 MSS Example: MSS = 500 Bytes, RTT = 200 msec initial rate = MSS/RTT = 20 kbps available bandwidth may be >> MSS/RTT desirable to quickly ramp up to respectable rate TCP addresses this via the Slow-Start mechanism: when connection begins, increase rate exponentially fast When loss of packet occurs (indicating that the connection has reached up there), then slow down Chapter 3, slide: 81 TCP Slow Start (more) How it is done When connection begins, increase rate exponentially until first loss event: double CongWin every RTT done by incrementing CongWin for every ACK received Summary: initial rate is slow but ramps up exponentially fast Chapter 3, slide: 82 Refinement: TCP Tahoe Question: When should exponential increase (Slow-Start) switch to linear (AIMD)? Here is how it works: Define a variable, called Threshold Start “Slow-Start” (i.e., CongWin = 1 and double CongWin every RTT) When CongWin = Threshold: Switch to AIMD (linear) At any loss event (timeout or 3 Dup ACKs): Set Threshold = ½ current CongWin Start “Slow-Start” (i.e. start over with CongWin = 1) Chapter 3, slide: 83 Refinement: TCP Tahoe Question: When should exponential increase (Slow-Start) switch to linear (AIMD)? [plot: CongWin vs. time] Chapter 3, slide: 84 More refinement: TCP Reno Loss event: Timeout vs. dup ACKs 3 dup ACKs: fast retransmit [timing diagram: client/server, timer set at t0, timer expires at t1] Chapter 3, slide: 85 More refinement: TCP Reno Loss event: Timeout vs. dup ACKs 3 dup ACKs: fast retransmit Timeout: retransmit Any difference (think congestion) ??
3 dup ACKs indicate network still capable of delivering some segments after a loss Timeout indicates a “more” alarming congestion scenario Chapter 3, slide: 86 More refinement: TCP Reno TCP Reno treats “3 dup ACKs” differently from “timeout” How does TCP Reno work? After 3 dup ACKs: CongWin is cut in half congestion avoidance (window grows linearly) But after timeout event: CongWin instead set to 1 MSS; Slow-Start (window grows exponentially) Chapter 3, slide: 87 Summary: TCP Congestion Control When CongWin is below Threshold, sender in slow-start phase, window grows exponentially. When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly. When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold. When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS. Chapter 3, slide: 88 Average throughput of TCP Avg. throughput as a function of window size W and RTT? Ignore Slow-Start Let W, the window size when loss occurs, be constant When window is W, throughput is ?? throughput(high) = W/RTT Just after loss, window drops to W/2, throughput is ?? throughput(low) = W/(2RTT) Throughput then increases linearly from W/(2RTT) to W/RTT Hence, average throughput = 0.75 W/RTT Chapter 3, slide: 89 TCP Fairness Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K [diagram: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R] Chapter 3, slide: 90 Fairness (more) Fairness and parallel TCP connections nothing prevents app from opening parallel connections between 2 hosts. Web browsers do this Example: link of rate R supporting 9 connections; 9 TCPs belong to same host, each getting R/9 new app asks for 1 TCP, gets rate R/10 One host gets 9/10 of R and one gets 1/10 of R! Chapter 3, slide: 91 Question: is TCP fair?
Again assume two competing sessions only, and consider AIMD: Additive increase gives slope of 1, as throughput increases Multiplicative decrease halves throughput proportionally, when congestion occurs [diagram: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R] Chapter 3, slide: 92 Question: is TCP fair? Again assume two competing sessions only, and consider AIMD: Additive increase gives slope of 1, as throughput increases Multiplicative decrease halves throughput proportionally, when congestion occurs Question: How would (R1, R2) vary when AIMD is used? Does it converge to equal share? [phase plot: connection 2 throughput vs. connection 1 throughput; the trajectory alternates congestion avoidance (additive increase) with loss (decrease window by factor of 2), moving toward the equal-bandwidth-share line] Chapter 3, slide: 93 Question? Again assume two competing sessions only: Additive increase gives slope of 1, as throughput increases Constant/equal decrease instead of multiplicative decrease!!! (Call this scheme: AIED) Question: How would (R1, R2) vary when AIED is used instead of AIMD? Is it fair? Does it converge to equal share? Chapter 3, slide: 94
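The AIMD convergence question can be explored with a toy simulation (a sketch under strong simplifying assumptions: equal RTTs, synchronized loss for both flows whenever the combined rate exceeds R, additive increase of 1 unit per RTT otherwise):

```python
def aimd(r1, r2, R=100.0, rounds=200):
    """Two AIMD flows sharing a bottleneck of capacity R.
    Both halve on a (shared) loss, else each adds 1 per RTT."""
    for _ in range(rounds):
        if r1 + r2 > R:              # congestion: multiplicative decrease
            r1, r2 = r1 / 2, r2 / 2
        else:                        # additive increase
            r1, r2 = r1 + 1, r2 + 1
    return r1, r2

# start far from the fair share; AIMD drives the rates together
r1, r2 = aimd(80.0, 10.0)
print(r1, r2, abs(r1 - r2))
```

Additive increase keeps the gap between the flows constant while multiplicative decrease halves it, so the rates converge toward the equal-share line, which is exactly why AIED (equal decrease) would not converge: it never shrinks the gap.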