Basic Networking & Interprocess Communication
  Vivek Pai
  Nov 27, 2001

Everyone’s Getting Sick
  Including me
    – Didn’t assign reading in syllabus
    – Did get rough grades computed
    – Still have one feedback missing
    – Putting off new project this week

Mechanics
  Next project assigned next Tuesday
  Only 5 projects instead of 6
  Target: use threads, synchronization
    – Probably using webserver again
    – Not building on filesystem
    – Suggestions welcome
  We’ve completely neglected the dining philosophers problem!

Dining Philosophers

Possible Solutions
  Philosophers go in order
  Place numbers on forks
  If stuck, drop fork, try again
  Defer by age or some other quantity
  Stab each other

Deadlock “Solutions”
  Eliminate parallelism
  Order resources, grab in order
  Determine priorities on contention
  Restart & randomize
  Deadlock performance: cost of avoiding/detecting deadlock versus work frequency & work thrown away

Original Lecture Goals
  Basic networking - Introduce the basics of networking, the new semantics versus standard file system calls, and how this affects the programming model. Discuss network basics such as naming, ports, connections, protocols, etc.
  Interprocess communication - Show how “networking” is useful within a single machine to communicate data. Give examples of different domains, how they are implemented, and the effects within the kernel. Show how networking and interprocess communication can be used to allow easy distribution of applications.

Communication
  You’ve already seen some of it
    – Web server project(s)
  Machines have “names”
    – Human-readable names are convenience
    – “Actual” name is IP (Internet Protocol) address
    – For example, 127.0.0.1 means “this machine”
    – nslookup www.cs.princeton.edu gives 128.112.136.11

Names & Services
  Multiple protocols
    – ssh, ftp, telnet, http, etc.
    – How do we know how to connect?
  Machines also have port numbers
    – 16 bit quantity (0-65535)
    – Protocols have default port #
    – Can still do “telnet 128.112.136.11 80”

But The Web Is Massive
  Possible names >> possible IP addresses
    – World population > possible IP addresses
    – Many names map to same IP addr
    – Use extra information to disambiguate
    – In HTTP, request contains “Host: name” header
  Many connections to same (machine, port #)
    – Use (src addr, src port, dst addr, dst port) to identify connection

Circuit Switching versus Packet Switching
  Circuit: reserve resources in advance
    – Hold resources for entire communication
    – Example: phone line
  Packet: break data into small pieces
    – Pieces identify themselves, “share” links
    – Individual pieces routed to destination
    – Example: internet
    – Problem: no guarantee pieces reach the destination

Who Got Rich By Packet Switching?

The “End To End” Argument
  Don’t rely on lower layers of the system to ensure something happens
  If it needs to occur, build the logic into the endpoints
  Implications:
    – Intermediate components simplified
    – Repetition possible at endpoints (use OS)
  What is reliability?

Do Applications Care?
  Some do
  Most don’t
    – Use whatever OS provides
    – Good enough for most purposes
  What do applications want?
    – Performance
    – Simplicity

Reading & Writing
  A file:
    – Is made into a “descriptor” via some call
    – Is an unstructured stream of bytes
    – Can be read/written
    – OS provides low-level interaction
    – Applications use read/write calls
  Sounds workable?

Network Connections As FDs
  Network connection usually called “socket”
  Interesting new system calls
    – socket( ) – creates an fd for networking use
    – connect( ) – connects to the specified machine
    – bind( ) – specifies port # for this socket
    – listen( ) – marks the socket as ready to receive incoming connections
    – accept( ) – waits for and returns a connection from another machine
  And, of course, read( ) and write( )

New Semantics
  Doing a write( )
    – What’s the latency/bandwidth of a disk?
    – When does a write( ) “complete”?
    – Where did data actually go before?
    – Can we do something similar now?
  What about read( )
    – When should a read return?
    – What should a read return?

Buffering
  Provided by OS
    – Memory on both sender and receiver sides
    – Sender: enables reliability, quick writes
    – Receiver: allows incoming data before read
  Example: assume slow network
    – write(fd, buf, size);
    – memset(buf, 0, size);
    – write(fd, buf, size);

Interprocess Communications
  Shared memory
    – Threads sharing address space
    – Processes memory-mapping the same file
    – Processes using shared memory system calls
  Sockets and read/write
    – Nothing prohibits connection to same machine
    – Even faster mechanism: different “domain”
    – Unix domain (local) versus Internet (either)

Sockets vs Shared Memory
  Sockets
    – Higher overhead
    – No common parent/file needed
    – Synchronous operation
  Shared memory
    – Locking due to synchronous operation
    – Fast reads/writes: no OS intervention
    – Harder to use multiple machines

Even More Semantics
  How do you express the following:
    – Do (task) until (message received)
    – Do (this task) until (receiver not ready)
    – Do (task) until (no more data)
  Problem: implies knowing system behavior
  Related: what happens when buffer fills/empties? (hint: think of filesystem)

Synchronous vs Asynchronous
  Synchronous: do it now, wait until over
  Asynchronous: start it now, check later
  Somewhat related:
  Blocking: wait until it’s all done
  Nonblocking: only do what can be done without blocking

Transferring Large Files
  OS buffers are 16-64KB
  Large files are >> buffer size
  Assume two clients
    – Each requests a different large file
    – Both are on slow networks
  How do you design your server?
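
The buffering example and the nonblocking question above can be tried on one machine. What follows is a minimal sketch, not a definitive implementation: it assumes a POSIX system and uses a Unix domain socketpair( ) in place of a real network connection. The function names write_copies_into_os_buffer and bytes_until_buffer_full are hypothetical, made up for this illustration. The first function shows that once write( ) returns, the data lives in an OS buffer, so zeroing buf does not change what the receiver sees. The second shows that a nonblocking write( ) stops accepting data once that buffer is full, which is the situation a server hits when sending a large file to a slow client.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the receiver still reads the original
 * bytes after the sender zeroed its buffer, 0 if not, -1 on error. */
int write_copies_into_os_buffer(void)
{
    int sv[2];
    char buf[64], got[64];
    size_t n = 0;
    int same = 1;

    assert(sizeof buf == sizeof got);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    memset(buf, 'A', sizeof buf);
    if (write(sv[0], buf, sizeof buf) != (ssize_t)sizeof buf) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    memset(buf, 0, sizeof buf);   /* the OS already holds its own copy */

    while (n < sizeof got) {      /* a read may return fewer bytes than asked */
        ssize_t r = read(sv[1], got + n, sizeof got - n);
        if (r <= 0)
            break;
        n += (size_t)r;
    }
    close(sv[0]);
    close(sv[1]);
    if (n != sizeof got)
        return -1;
    for (size_t i = 0; i < sizeof got; i++)
        if (got[i] != 'A')
            same = 0;
    return same;
}

/* Hypothetical helper: with nobody reading, write in nonblocking mode until
 * the OS buffer is full. Returns the bytes accepted, or -1 on error. */
long bytes_until_buffer_full(void)
{
    int sv[2], flags;
    char chunk[4096];
    long total = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t w = write(sv[0], chunk, sizeof chunk);
        if (w > 0) {
            total += w;           /* may be a partial write */
            continue;
        }
        if (w < 0 && errno == EINTR)
            continue;
        if (w < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                /* buffer full: a blocking write would wait here */
        total = -1;
        break;
    }
    close(sv[0]);
    close(sv[1]);
    return total;
}
```

The byte count returned by bytes_until_buffer_full depends on the system's buffer sizes, so it is best treated as "some finite amount" rather than a fixed number.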

Server Design Choices
  Processes
    – Each client handled by a different process
  Threads
    – Each client handled by a different thread
  Single process
    – Use nonblocking operations, multiplex

Processing Steps
  Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data -> end

Blocking Steps
  Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data -> end
  (diagram labels: Disk Blocking, Network Blocking)

Concurrency Architecture
  Overlap disk, network, & application-level processing
  Architecture: how steps are overlapped
  Note: implications for performance

Multiple Processes (MP)
  Process 1: Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data
  Process N: Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data
  Pro: simple programming (rely on OS)
  Cons: too many processes, caching harder

Multiple Threads (MT)
  Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data
  Pro: shared address space, lower “context switch” overhead
  Cons: many threads, requires kernel thread support, synchronization needed

Single Process Event Driven (SPED)
  Event Dispatcher
  Accept Conn -> Read Request -> Find File -> Send Header -> Read File -> Send Data
  Pro: single address space, no synchronization
  Cons: must explicitly handle pieces; in practice, disk reads still block

Homework
  Read about the select( ) system call
  Reading from syllabus
  I may send an e-mail with papers
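
As a starting point for the select( ) reading, the event dispatcher at the center of the SPED design can be sketched in a few lines of C. This is a rough sketch under stated assumptions, not the course's reference server: it assumes a POSIX system, uses socketpair( ) in place of accepted network connections, and skips the file steps entirely by sending a fixed "OK" reply. The names dispatch_once, sped_demo, and NCLIENTS are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define NCLIENTS 2   /* hypothetical number of simulated clients */

/* One pass of the event dispatcher: sleep in select() until at least one
 * descriptor is readable, then serve every ready descriptor in turn.
 * Returns the number of requests answered, or -1 on error. */
static int dispatch_once(const int *fds, int nfds)
{
    fd_set readable;
    int maxfd = -1, handled = 0;

    assert(nfds > 0);
    FD_ZERO(&readable);
    for (int i = 0; i < nfds; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        char req[128];
        if (!FD_ISSET(fds[i], &readable))
            continue;             /* not ready: do not touch it */
        if (read(fds[i], req, sizeof req) <= 0)
            continue;
        /* Stand-in for Find File / Send Header / Read File / Send Data. */
        if (write(fds[i], "OK\n", 3) == 3)
            handled++;
    }
    return handled;
}

/* Simulates NCLIENTS clients that each send one request, runs the
 * dispatcher in a single process, and returns how many clients received
 * the expected reply (or -1 if the sockets could not be created). */
int sped_demo(void)
{
    int server_side[NCLIENTS], client_side[NCLIENTS];
    int sent = 0, served = 0, answered = 0;

    for (int i = 0; i < NCLIENTS; i++) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
            while (i-- > 0) {
                close(server_side[i]);
                close(client_side[i]);
            }
            return -1;
        }
        server_side[i] = sv[0];
        client_side[i] = sv[1];
    }
    for (int i = 0; i < NCLIENTS; i++)
        if (write(client_side[i], "GET /file\n", 10) == 10)
            sent++;

    /* Only dispatch when every request was sent, so select() cannot hang. */
    while (sent == NCLIENTS && served < NCLIENTS) {
        int n = dispatch_once(server_side, NCLIENTS);
        if (n <= 0)
            break;
        served += n;
    }
    for (int i = 0; served == NCLIENTS && i < NCLIENTS; i++) {
        char reply[3];
        if (read(client_side[i], reply, sizeof reply) == 3 &&
            memcmp(reply, "OK\n", 3) == 0)
            answered++;
    }
    for (int i = 0; i < NCLIENTS; i++) {
        close(server_side[i]);
        close(client_side[i]);
    }
    return answered;
}
```

A real SPED server would also put the listening socket in the read set, use nonblocking descriptors, and watch for writability before sending large replies. The sketch leaves those out to keep the dispatcher loop visible.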