CPS 114: Introduction to Computer Networks Lecture 4: Reliable Transmission Xiaowei Yang xwy@cs.duke.edu Overview • Link layer functions – – – – Encoding Framing Error detection Reliable transmission Clock-based Framing •STS-1/OC-1 frame •51.840Mbps •The slowest SONET link • Synchronous Optical Network (SONET) – A complex protocol • Each frame is 125 us long, 810bytes = 125 us * 51.84Mbps – 9 rows of 90 bytes each – First 3 bytes are overhead – First two bytes of each frame has a special pattern marking the start of a frame • When the special pattern turns up in the right place enough times (every 810B), a receive concludes it’s in sync. Synchronized timeslots as placeholder • Real frame data may float inside Overview • Link layer functions – – – – Encoding Framing Error detection Reliable transmission Error detection • Error detection code adds redundant information to detect errors – – – – Analogy: sending two copies of the same message Parity Checksum CRC • Error correcting code: more sophisticated code that can correct errors Two-dimensional parity A sample frame of six bytes • Even parity bit – Make an even number of 1s in each row and column • Detect all 1,2,3-bit errors, and most 4-bit errors Internet checksum algorithm • Basic idea – Add all the words transmitted and then send the sum. – Receiver does the same computation and compares the sums • IP checksum – Adding 16-bit short integers using 1’s complement arithmetic – Take 1’s complement of the result 1’s complement • -x is each bit of x inverted • If there is a carry bit, add 1 to the sum • Example: 4-bit integer – -3: 1100 (invert of 0011) – -4: 1011 (invert of 0100) – -3 + -4 = 0111 + 1 = 1000 (invert of 0111 (-7)) IP checksum implementation (used in labs) • uint16_t cksum (const void *_data, int len) { const uint8_t *data = _data; uint32_t sum; for (sum = 0;len >= 2; data += 2, len -= 2) sum += data[0] << 8 | data[1]; if (len > 0) sum += data[0] << 8; while (sum > 0xffff) sum = (sum >> 16) + (sum & 0xffff); sum = htons (~sum); return sum ? sum : 0xffff; } Remarks • Can detect 1 bit error • But not all two-bits – One increases the sum, and one decreases it • Efficient for software implementation – Needs to be done for every packet inside a router! Cyclic Redundancy Check • A branch of finite fields • High-level idea: – Represent an n+1-bit message with an n degree polynomial M(x) – Divide the polynomial by a k-bit divisor C(x) – k-bit CRC: remainder after divided by a degree-k divisor polynomial – Send Message + CRC that is dividable by C(x) Polynomial arithmetic modulo 2 – B(x) can be divided by C(x) if B(x) has higher degree – B(x) can be divided once by C(x) if of same degree – Remainder of B(x)/C(x) = B(x) – C(x) – Substraction is done by XOR each pair of matching coefficients CRC algorithm 1. Multiply M(x) by x^k. Add k zeros to Message. Call it T(x) 2. Divide T(x) by C(x) and find the remainder 3. Send P(x) = T(x) – remainder • • Append remainder to T(x) P(x) dividable by C(x) An example • 8-bit msg – 10011010 • Divisor (3bit CRC) – 1101 How to choose a divisor • Complicated • Intuition: unlikely to be divided evenly by an error • Find C(x) by looking it up in a book Hardware implementation • Very efficient: XOR operations • 0 to k-1 registers (k-bit shift registers) • If nth (n < k) term is not zero, places an XOR gate • x3 + x2 + 1 Overview • Link layer functions – – – – Encoding Framing Error detection Reliable transmission Reliable transmission • What to do if a receiver detects bit errors? • Two high-level approaches – Forward error correction (FEC): the correction of errors handled in advance by sending – Retransmission (lab 1) • Acknowledgements: a small piece of control information sent back to its peer acknowledging the receipt of a frame – Can be “piggybacked” on data packets • Timeouts: if a sender does not receive an ack after some time, retransmit the original frame • Also called Automatic repeat request (ARQ) Stop-and-wait • Send one frame, wait for an ack, and send the next • Retransmit if times out • Note in the last figure (d), there might be confusion: a new frame, or a duplicate? Sequence number • Add a sequence number to each frame to avoid the ambiguity Stop-and-wait drawback • Revisiting bandwidth-delay product – Total delay/latency = transmission delay + propagation delay + queuing • Queuing is the time packet sent waiting at a router’s buffer • Will revisit later (no sweat if you don’t get it now) Delay * bandwidth product • For a 1Mbps pipe, it takes 8 seconds to transmit 1MB. If the link latency is less than 8 seconds, the pipe is full before all data are pumped into the pipe • For a 1Gbps pipe, it takes 8 ms to transmit 1MB. Stop-and-wait drawback • A 1Mbps link with a 100ms two-way delay (round trip time, RTT) • 1KB frame size • Throughput = 1KB/ (1KB/1Mbps + 100ms) = 74Kbps << 1Mbps • Delay x bandwidth = 100Kb • So we could send ~12 frames before the pipe is full! • Throughput = 100Kb/(1KB/1Mbps + 100ms) = 926Kbps Sliding window • Key idea: allowing multiple outstanding (unacked) frames to keep the pipe full Sliding window on sender • Assign a sequence number (SeqNum) to each frame • Maintains three variables – Send window size (SWS) that bounds the number of unacked frames the sender can transmit – Last ack received (LAR) that denotes the sequence number of the last ACK received – Last frame sent (LFS) that denotes the sequence number of the last frame sent • Invariant: LFS – LAR ≤ SWS Slide window this way when an ACK arrives • Sender actions – When an ACK arrives, moves LAR to the right, opening the window to allow the sender to send more frames – If a frame times out before an ACK arrives, retransmit Sliding window on receiver • Maintains three window variables – Receive window size (RWS) that bounds the number of out-of-order frames it’s willing to receive – Largest acceptable frame (LAF) denotes the sequence number of that frame – Last frame received (LFR) denotes the sequence number of last frame received • Invariant – LAF – LFR ≤ RWS • When a frame with SeqNum arrives – Discards it if out of window • Seq ≤ LFR or Seq > LAF – If in window, decides what to ACK • Cumulative ack: a variable SeqNumToAck denotes the largest sequence number not acked but all frames with a sequence number smaller than it have been acked • Acks SeqNumToAck even if higher-numbered packets have been received • Sets LFR = SeqNumToAck, LAF = LFR + RWS • Updates SeqNumToAck Finite sequence numbers • Things may go wrong when SWS=RWS, SWS too large • Example – 3-bit sequence number, SWS=RWS=7 – Sender sends 0, …, 6; receiver acks, expects (7,0, …, 5), but all acks lost – Sender retransmits 0,…,6; receiver thinks they are new • SWS < (MaxSeqNum+1)/2 – Alternates between first half and second half of sequence number space as stop-and-wait alternates between 0 and 1 Sliding window protocol (SWP) implementation typedef u_char SwpSeqno; typedef struct { SwpSeqno SeqNum; /* sequence number of this frame */ SwpSeqno AckNum; /* ack of received frame */ u_char Flags; /* up to 8 bits' worth of flags */ } SwpHdr; Your code will look very different! typedef struct { /* sender side state: */ SwpSeqno LAR; /* seqno of last ACK received */ SwpSeqno LFS; /* last frame sent */ Semaphore sendWindowNotFull; SwpHdr hdr; /* preinitialized header */ struct sendQ_slot { Event timeout; /* event associated with send-timeout */ Msg msg; } sendQ[SWS]; /* receiver side state: */ SwpSeqno NFE; /* seqno of next frame expected */ struct recvQ_slot { int received; /* is msg valid? */ Msg msg; } recvQ[RWS]; } SwpState; static int sendSWP(SwpState *state, Msg *frame) { struct sendQ_slot *slot; hbuf[HLEN]; /* wait for send window to open */ semWait(&state->sendWindowNotFull); state->hdr.SeqNum = ++state->LFS; slot = &state->sendQ[state->hdr.SeqNum % SWS]; store_swp_hdr(state->hdr, hbuf); msgAddHdr(frame, hbuf, HLEN); msgSaveCopy(&slot->msg, frame); slot->timeout = evSchedule(swpTimeout, slot, SWP_SEND_TIMEOUT); return sendLINK(frame); } static int deliverSWP(SwpState state, Msg *frame) { SwpHdr hdr; char *hbuf; hbuf = msgStripHdr(frame, HLEN); load_swp_hdr(&hdr, hbuf) if (hdr->Flags & FLAG_ACK_VALID) { /* received an acknowledgment---do SENDER side */ if (swpInWindow(hdr.AckNum, state->LAR + 1, state->LFS)) { do { struct sendQ_slot *slot; slot = &state->sendQ[++state->LAR % SWS]; evCancel(slot->timeout); msgDestroy(&slot->msg); semSignal(&state->sendWindowNotFull); } while (state->LAR != hdr.AckNum); } } if (hdr.Flags & FLAG_HAS_DATA) { struct recvQ_slot *slot; /* received data packet---do RECEIVER side */ slot = &state->recvQ[hdr.SeqNum % RWS]; if (!swpInWindow(hdr.SeqNum, state->NFE, state->NFE + RWS - 1)) { /* drop the message */ return SUCCESS; } msgSaveCopy(&slot->msg, frame); slot->received = TRUE; if (hdr.SeqNum == state->NFE) { Msg m; while (slot->received) { deliverHLP(&slot->msg); msgDestroy(&slot->msg); slot->received = FALSE; slot = &state->recvQ[++state->NFE % RWS]; } /* send ACK: */ prepare_ack(&m, state->NFE - 1); sendLINK(&m); msgDestroy(&m); }} return SUCCESS; } static bool swpInWindow(SwpSeqno seqno, SwpSeqno min, SwpSeqno max) { SwpSeqno pos, maxpos; pos = seqno - min; /* pos *should* be in range [0..MAX)*/ maxpos = max - min + 1; /* maxpos is in range [0..MAX]*/ return pos < maxpos; } Flow control with sliding window • Remark: perhaps the best-known algorithm in computer networking • Multiple functions – Reliable deliver frames over a link – In-order delivery to upper layer protocol – Flow control • Not to overun a slow slower – Congestion control (later) • Not to congest the network Other ACK mechanisms • NACK: negative acks for packets not received – unnecessary, as sender timeouts would catch this information • SACK: selective ACK the received frames – + No need to send duplicate packets – - more complicated to implement – Newer version of TCP has SACK Conclusion • A lot for today – CRC – Reliability • FEC • Stop-and-Wait • Sliding window