Network Protocols Mark Stanovich Operating Systems COP 4610 Protocol • An agreement between two parties as to how information is to be transmitted • A network protocol abstracts packets into messages Physical Reality vs. Abstraction Physical reality: packets Abstraction: messages Limited size Arbitrary size Unordered Ordered Unreliable Reliable Machine-to-machine Process-to-process Only on local area network Routed anywhere Asynchronous Synchronous Insecure Secure Physical Reality vs. Abstraction Physical reality: packets Abstraction: messages Limited size Arbitrary size Unordered Ordered Unreliable Reliable Machine-to-machine Process-to-process Only on local area network Routed anywhere Asynchronous Synchronous Insecure Secure Arbitrary-Size Messages • Can be built on top of limited-size ones – By splitting a message into fix-sized packets • Checksum can be computed on each fragment or the whole message Internet Protocol (IP) • Provides unreliable, unordered, machine-tomachine transmission of arbitrary-size messages Process-to-Process Communications • Built on top of machine-to-machine communications through the use of addresses • Each message contains the destination address to talk to the correct machine User Datagram Protocol (UDP) • Provides unreliable, unordered, user-to-user communication • Built on the top of IP • Generally lower latency at the cost of reliability • Sometimes referred to as Unreliable Datagram Protocol Ordered Messages • Built on top of unordered ones • Use sequence numbers to indicate the order of arrival – Specific to a connection • If packet 3 arrives before packet 2, wait for packet 2. • Always deliver packets in order, to user applications Reliable Message Delivery • Built on top of unreliable delivery • Problem: Network infrastructure can garble messages – Packets can be dropped if network buffers are full Solution • Checksum each message • At a receiver, discard messages with mismatching checksums • A receiver acknowledges if a packet is received properly • A sender resends the same message after not hearing the acknowledgment for some time (a timeout period) A Minor Problem • A sender may send twice, if the first acknowledge is lost • The receiver needs to discard duplicate packets Implications • A sender needs to buffer messages that are not yet acknowledged • The receiver must track messages that could be duplicates Transmission Control Protocol (TCP) • Provides a reliable byte stream between two processes on different machines over the Internet sequence number: 1 checksum: fa73cd10 Transmission Control Protocol • Fragments the byte stream into packets and hands them to IP TCP Message Categories • Sender – Sent and acknowledged – Sent and not acknowledged – Not yet sent • Receiver – Forwarded to application – Received and buffered – Not yet received More on the Sequence Number • Need a way to recycle sequence numbers – Each TCP packet has a time-to-live field • If the packet is not delivered in X seconds – The packet is dropped – Sequence numbers can be reused – An epoch number used to identify which set of sequence numbers is being used • Incremented at each boot • Stored on disk Congestion • Implications of timeout period at a sender – Too long unnecessary waiting – Too short a message is transmitted when an acknowledgement is in transit • Network congestion delayed acknowledgement timeout data retransmission more congestion TCP Solution • Slow start: TCP starts by sending a small amount of data – If no timeout, more data is sent – If timeout, TCP reduces the amount of data being sent The Two Generals’ Problem • Two generals are on the tops of two mountains… – They communicate only through messengers… • They need to coordinate the attack… – If they attack at the same time, they win… – If they attack at different times, they will…die… The Two Generals’ Problem • Question: can they guarantee a synchronized attack? The Two Generals’ Problem Illustrated General X 11am OK? General Y 11am sounds good So, 11am it is. Yeah, what if you don’t get this ack? The Two Generals’ Problem Over an unreliable network, we cannot guarantee that two computers will coordinate an action Distributed Transaction • Multiple machines agree to do something atomically, but not necessarily at exactly the same time • Mechanism: two-phase commit Two-Phase Commit Account X Account Y Phase 1: ask if each can commit 1. Begin transaction Ask Y for $1 Enough cash 2. Write “Y = Y - $1” Ready to commit Phase 2: commit 3. Write “X = X + $1” 4. Commit Ask Y to commit 5. Commit Scenarios • If Y crashes between 1 and 2 – Y will wake up and do nothing – X will timeout and abort the transaction • If X crashes before step 4 – X will wake up and abort the transaction • If X crashes between 4 and 5 – Y will timeout and ask X for the transaction status Scenarios • If Y crashes between 2 and 5 – Y will wake up and check the log – When X sends Y the commit message, Y will commit – Y can also timeout and ask X the current status