CPS110: More Networks
Landon Cox
March 30, 2009
Virtual/physical interfaces
[Layer diagram: applications see ordered messages, reliable messages, and byte streams; the OS builds these on top of the hardware's unordered, unreliable, distinct messages.]
Ordered messages
 Networks can re-order IP messages
 E.g. Send: A, B. Arrive: B, A
 How should we fix this?
 Assign sequence numbers (0, 1, 2, 3, 4, …)
Ordered messages
 Do what for a message that arrives out of order?
 (0, 1, 3, 2, 4)
a. Save #3 and deliver after #2 is delivered
 (this is what TCP does)
b. Drop #3, deliver #2, deliver #4
c. Deliver #3, drop #2, deliver #4
b. and c. are ordered, but not reliable (messages are dropped); they rely on the reliability layer to handle lost messages.
Ordered messages
 For a notion of order, first need “connections”
 Why?
 Must know which messages are related to each other
 Idea in TCP
 Open a connection
 Send a sequence of messages
 Close the connection
 Opening a connection ties two sockets together
 Connection is socket-to-socket unique: only these sockets can use it
 Sequence numbers are connection specific
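As a concrete sketch of "connection-specific sequence numbers," a sender might tag every message with a small header like the one below. This is a hypothetical layout for illustration only (the field names are made up), not TCP's actual header format.

#include <stdint.h>

/* Hypothetical per-message header: each message carries the id of the
 * connection it belongs to and a sequence number that starts at 0 when
 * the connection is opened. */
struct msg_header {
    uint32_t conn_id;   /* which connection this message belongs to */
    uint32_t seq;       /* position of this message within the connection */
    uint32_t length;    /* number of payload bytes that follow */
};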
Virtual/physical interfaces
[Layer diagram repeated; next up: reliable messages.]
Reliable messages
 Usually paired with ordering
 TCP provides both ordering and reliability
 Hardware interface
 Network drops messages
 Network duplicates messages
 Network corrupts messages
 Application interface
 Every message is delivered exactly once
Detecting and fixing drops
 How to fix a dropped message?
 Have sender re-send it
 How does sender know it’s been dropped?
 Have receiver tell the sender
 Receiver may not know it’s been sent
 Like asking in the car,
 “If we left you at the theater, speak up.”
Detecting and fixing drops
 Have receiver acknowledge each message
 Called an “ACK”
 If sender doesn’t get an ACK
 Assume message has been dropped
 Resend original message
 Is this ok for the sender to assume?
 No. ACKs can be dropped too (or delayed)
Detecting and fixing drops
 Possible outcomes
 Message is delayed or dropped
 ACK is delayed or dropped
 Strategy
 Deal with all as though message was dropped
 Worst case if message wasn’t dropped after all?
 Need to deal with duplicate messages
 How to detect and fix duplicate messages?
 Easy. Just use the sequence number and drop duplicate.
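A minimal sketch of that receiver-side bookkeeping, assuming each message carries a sequence number: track the next sequence number expected and drop anything already delivered. This is illustrative logic in C, not TCP's actual algorithm.

#include <stdbool.h>
#include <stdint.h>

/* next_seq is the sequence number we expect to deliver next. */
static uint32_t next_seq = 0;

/* Returns true if the message should be delivered to the application,
 * false if it should be dropped. Either way the receiver sends an ACK,
 * so a sender that missed the first ACK eventually stops retransmitting. */
bool should_deliver(uint32_t seq)
{
    if (seq == next_seq) {
        next_seq++;        /* in order: deliver and advance */
        return true;
    }
    if (seq < next_seq) {
        return false;      /* duplicate of something already delivered */
    }
    return false;          /* out of order: buffer or drop it (options a/b above) */
}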
What about corruption?
 Messages can also be corrupted
 Bits get flipped, etc.
 Especially true over wireless networks
 How to deal with this?
 Add a checksum (a little redundancy)
 Checksum usually = sum of all bits
 Drop corrupted messages
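A minimal sketch of the "little redundancy" idea: sum the payload bytes, send the sum with the message, and have the receiver recompute and compare, dropping the message on a mismatch. Real protocols use slightly different checksums (e.g. the one's-complement sum in IP/TCP); this just shows the flavor.

#include <stddef.h>
#include <stdint.h>

/* Toy checksum: sum of all payload bytes, truncated to 16 bits.
 * The sender transmits this value alongside the message; the receiver
 * recomputes it and drops the message if the two values disagree. */
uint16_t checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return (uint16_t)(sum & 0xFFFF);
}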
What about corruption?
 Dropping corrupted messages is elegant
 Transforms problem into a dropped message
 We already know how to deal with drops
 Common technique
 Solve one problem by transforming it into another
1. Corruption → drops
2. Drops → duplicates
3. Drop any duplicate messages (very simple)
Virtual/physical interfaces
[Layer diagram repeated; next up: byte streams.]
Byte streams
 Hardware interface
 Send information in discrete messages
 Application interface
 Send data in a continuous stream
 Like reading/writing from/to a file
Byte streams
 Many apps think about info in distinct messages
 What if you want to send more data than fits?
 UDP max message size is 64 KB
 What if data never ends?
 Streamed media
 TCP provides “byte streams” instead of messages
Byte streams
 Sender writes messages of arbitrary size
 TCP breaks up the stream into fragments
 Reassembles the fragments at destination
 Receiver sees a byte stream
 Fragments are not visible to either process
 Programming the receiver
 Must loop until certain number of bytes arrive
 Otherwise, might get first fragment and return
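A sketch of that receive loop, assuming a connected TCP socket and that the receiver already knows how many bytes it needs: recv() may return fewer bytes than requested, so the loop keeps calling it until everything has arrived.

#include <sys/types.h>
#include <sys/socket.h>
#include <stddef.h>

/* Read exactly len bytes from a connected TCP socket.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n <= 0) {          /* 0: connection closed, <0: error */
            return -1;
        }
        got += (size_t)n;
    }
    return 0;
}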
Byte streams
 UDP makes boundaries visible
 TCP makes boundaries invisible
 (loop until you get everything you need)
 How to know # of bytes to receive?
1. Size is contained in header
2. Read until you see a pattern (sentinel)
3. Sender closes connection
Sentinels
 Idea: message is done when special pattern arrives
 Example: C strings
 How do we know the end of a C string?
 When you reach the null-termination character (‘\0’)
 Ok, now say we are sending an arbitrary file
 Can we use ‘\0’ as a sentinel?
 No. The data payload may contain ‘\0’ chars
 What can we do then?
Course administration
 Last day for Project 2 submissions
 89, 89, 89, 89, 87, 86, 83, 79, 79, 78, 76, 75, 49, 33, 28
 Project 3 will be out next week
 Hack into a server
 Socket programming how-to posted
 Ryan will cover in discussion section this week
 Will post example client/server code
 Any questions?
Distributed systems
You use distributed systems every day.
Both on your PC and behind the scenes.
Motivation for distributed apps
1. Performance
Motivation for distributed apps
1. Performance
2. Co-location
 Can locate computer near local resources
 Examples of local resources
 People, sensors, sensitive database
Motivation for distributed apps
1. Performance
2. Co-location
3. Reliability
 Not all computers will go down at once
 Due to floods, fires, earthquakes, etc.
 Better chance of continuous service
Motivation for distributed apps
1. Performance
2. Co-location
3. Reliability
4. Already have multiple machines
 Can’t put everyone on one machine
 Try to stitch existing machines together
Building up to distributed apps
 Needed two things in multi-threaded programs
1. Atomic primitives to control thread interleavings
 (interrupt disable/enable, atomic test&set)
2. Way to share information between threads
 (shared address space)
 These won’t work for distributed applications
 No interrupt disable/enable or atomic test&set
 No shared memory
Building up to distributed apps
 Can only communicate through messages
 Send messages
 Receive messages
 If no shared data, are there race conditions?
 Sure, message might reflect an inconsistent state
 Races can cause other problems too
 Incorrect event ordering (fire missile after command)
 Mutual exclusion (two people adding last spot in class)
Send/receive primitives
 So we need sharing and synchronization
1. Obvious we can use send/receive to share
 Each message communicates information
 Messages instead of load/store to shared memory
2. Can we use send/receive for synchronization?
 Depends on the atomicity of our hardware primitives
 “Try to build large atomic regions from small ones.”
Send/receive primitives
 Atomic operation provided by hardware
 Can send a single Ethernet frame atomically
 Ethernet detects simultaneous sends
 Called a collision
 Ethernet either allows one or none
 A bigger problem in wireless networks (why?)
 Idea: build up from atomic send
Send/receive primitives
 Interleaved incoming packets
 OS separates the packets
 Forms whole messages from fragments
 Passes messages up to various apps
 How does OS know which packet goes to which app?
 Port number in L4 (TCP, UDP, etc.) header
Send/receive primitives
 Interleaved outgoing packets
 OS ensures packets are whole
 One app’s packets won’t change another’s
 How does OS control access to the NIC?
 Device driver “serializes” send requests
 Use locks inside driver code
Client-server
 Many different distributed architectures
 Client-server is the most common
 Also called request-response
 Basic interaction
“GET /images/fish.gif HTTP/1.1”
Client-server
 Clients are machines you sit in front of
 Send request to server
 Wait for a response
 Clients generally initiate communication
 Writes are similar
POST /cgi-bin/uploadfile.pl HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 49
filename=foo.txt&file+content=content+of+foo.txt
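To make the request/response shape concrete, here is a deliberately minimal sketch of a client that opens a TCP connection and sends the GET line from the earlier slide. The host name is a placeholder and error handling is abbreviated; it is not code from the course.

#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* Resolve the server (placeholder host) and connect to port 80. */
    struct addrinfo hints = {0}, *res;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("www.example.com", "80", &hints, &res) != 0)
        return 1;
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0)
        return 1;

    /* Client initiates: send the request... */
    const char *req =
        "GET /images/fish.gif HTTP/1.1\r\n"
        "Host: www.example.com\r\n"
        "Connection: close\r\n\r\n";
    send(fd, req, strlen(req), 0);

    /* ...then wait for the server's response. */
    char buf[4096];
    ssize_t n;
    while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    freeaddrinfo(res);
    return 0;
}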
Producer-consumer
[Diagram: delivery person (producer) → vending machine (buffer) → soda drinker (consumer)]
Producer-consumer
 Have a server manage the coke machine
 Clients can call two functions
 client_produce ()
 client_consume ()
 Both send a request to the server
 Both return when the request is done
Producer-consumer
client_produce () {
  send message to add coke
  wait for response
}

client_consume () {
  send message to get coke
  wait for response
}

server () {
  receive request (from any producer or consumer client)
  if (request is from a producer) {
    put coke in machine
  } else {
    take coke out of machine
  }
  send response
}

Anything missing? Need to wait for cokes/space.
Producer-consumer
client_produce () {
  send message to add coke
  wait for response
}

client_consume () {
  send message to get coke
  wait for response
}

server () {
  receive request (from any producer or consumer client)
  if (request is from a producer) {
    wait for empty spot in machine
    put coke in machine
  } else {
    wait for coke in machine
    take coke out of machine
  }
  send response
}

Does this work? No. Now we’ll deadlock.
Producer-consumer
client_produce () {
  send message to add coke
  wait for response
}

client_consume () {
  send message to get coke
  wait for response
}

server () {
  receive request (from any producer or consumer client)
  if (request is from a producer) {
    wait for empty spot in machine
    put coke in machine
  } else {
    wait for coke in machine
    take coke out of machine
  }
  send response
}

How do we solve this? Use threads at the server.
Producer-consumer
server () {
  while (1) {
    receive request (from any producer or consumer client)
    if (request is from a producer) {
      create a thread to handle the produce request
    } else {
      create a thread to handle the consumer request
    }
  }
}

server_consume () {
  lock
  while (machine is empty)
    wait
  take coke from machine
  send response to client
  unlock
}

server_produce () {
  lock
  while (machine is full)
    wait
  put coke in machine
  send response to client
  unlock
}
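A hedged sketch of what the two handler threads might look like with real synchronization primitives (a pthread mutex and condition variables). The machine capacity, the coke count, and the respond() call are stand-ins for whatever the server actually uses; like the pseudocode above, the response is sent while still holding the lock.

#include <pthread.h>

#define CAPACITY 10                 /* assumed size of the machine */

static int cokes = 0;               /* cokes currently in the machine */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

void respond(int client);           /* hypothetical: send response message to client */

void server_produce(int client)
{
    pthread_mutex_lock(&lock);
    while (cokes == CAPACITY)       /* machine is full: wait for an empty spot */
        pthread_cond_wait(&not_full, &lock);
    cokes++;                        /* put coke in machine */
    pthread_cond_signal(&not_empty);
    respond(client);                /* send response to client */
    pthread_mutex_unlock(&lock);
}

void server_consume(int client)
{
    pthread_mutex_lock(&lock);
    while (cokes == 0)              /* machine is empty: wait for a coke */
        pthread_cond_wait(&not_empty, &lock);
    cokes--;                        /* take coke out of machine */
    pthread_cond_signal(&not_full);
    respond(client);                /* send response to client */
    pthread_mutex_unlock(&lock);
}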
Producer-consumer
 Note how we used send/receive
 Used to share information
 (request/response messages)
 Provides before/after constraint
 (client waits for server)
 Receive is like wait/down, send is like signal/up
 Solution creates a thread per request
 Can we lower the per-request overhead?
Producer-consumer
 Keep a pool of worker threads
 When main loop gets a request
 Pass request to an existing worker thread
 When worker thread is done
 Wait for next request from dispatcher
 Similar to disk scheduler in P1
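One way to sketch the worker-pool idea in C: a fixed set of threads blocks on a condition variable, and the dispatcher simply appends incoming requests to a shared queue. The queue layout, the queue size, and handle_request() are assumptions for illustration, not code from the course.

#include <pthread.h>

#define QUEUE_MAX 64

static int queue[QUEUE_MAX];            /* hypothetical request handles */
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

void handle_request(int req);           /* hypothetical per-request work */

/* Dispatcher: called by the main receive loop when a request arrives. */
void dispatch(int req)
{
    pthread_mutex_lock(&qlock);
    while (count == QUEUE_MAX)          /* queue full: simple backpressure */
        pthread_cond_wait(&qcond, &qlock);
    queue[tail] = req;
    tail = (tail + 1) % QUEUE_MAX;
    count++;
    pthread_cond_broadcast(&qcond);     /* wake an idle worker */
    pthread_mutex_unlock(&qlock);
}

/* Each worker thread loops forever, pulling the next request off the queue. */
void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (count == 0)              /* no work yet: wait for the dispatcher */
            pthread_cond_wait(&qcond, &qlock);
        int req = queue[head];
        head = (head + 1) % QUEUE_MAX;
        count--;
        pthread_cond_broadcast(&qcond); /* may unblock a waiting dispatcher */
        pthread_mutex_unlock(&qlock);
        handle_request(req);            /* do the slow work outside the lock */
    }
    return NULL;
}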
Other approaches
 Don’t have to use worker threads
 Just want slow operations to run in parallel
 What are the slow operations?
 Producing/consuming a coke
 Receiving a network request
Other approaches
 Want server to work while waiting for requests
 Could poll (using the select() system call)
 Instead of blocking in recv()
 Poll to see if there is a new message, call recv() if there is (see the select() sketch below)
 Also must avoid calling wait() if there is no coke/space
 Could use signals (SIGIO)
 Incoming network messages generate a signal
 The signal interrupts the thread blocked in wait()
 Threads are a much easier solution!
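For completeness, a sketch of the select()-based polling mentioned above: instead of blocking in recv(), the server asks whether the socket has data and only then calls recv(). The 100 ms timeout is an arbitrary choice for illustration.

#include <sys/select.h>

/* Returns >0 if recv() on fd would not block, 0 on timeout, -1 on error. */
int message_ready(int fd)
{
    fd_set readfds;
    struct timeval tv = {0, 100000};    /* poll with a 100 ms timeout */

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}

/* Usage sketch: the server can do other work between polls.
 *
 *   if (message_ready(sock) > 0) {
 *       recv(sock, buf, sizeof(buf), 0);   // data is waiting; recv won't block
 *   }
 */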
Network abstractions
 We’ve been using send/receive
 Client sends a request to the server
 Server receives request
 Server sends response to client
 What else in CS is this interaction like?
 Calling a function
Remote procedure call (RPC)
 RPC makes request/response look local
 Provide a function call abstraction
 RPC isn’t really a function call
 In a normal call, the PC jumps to the function
 Function then jumps back to caller
 This is similar to request/response though
 Control flows from client to server
 And then returns back to the client
The RPC illusion
 How to make send/recv look like a function call?
 Client wants
 Send to server to look like calling a function
 Reply from server to look like function returning
 Server wants
 Receive from client to look like a function being called
 Send of response to look like the function returning
RPC stub functions
 Key to making RPC work
 Stub functions
RPC stub functions
[Figure: the client calls the client stub, which sends the request over the network; the server stub receives it and calls the server function; the return value travels back through send/recv to the client stub, which returns to the caller.]
RPC stub functions
 Client stub
1) Builds request message with server function name and parameters
2) Sends request message to server stub
 (transfer control to server stub)
8) Receives response message from server stub
9) Returns response value to client
 Server stub
3) Receives request message
4) Calls the right server function with the specified parameters
5) Waits for the server function to return
6) Builds a response message with the return value
7) Sends response message to client stub
RPC notes
 Client makes a normal function call
 Can’t tell that it’s to a remote server (sort of)
 Server function is called like a normal function
 Can’t tell that it’s returning to a remote client
 Control returns to client as from a function
 Common technique to add functionality
 Leaving existing component interfaces in-place
 Implement both sides between existing layers
RPC example
 Client calls produce(5)

Client stub:
// must be named produce!
int produce (int n) {
  int status;
  send (sock, &n, sizeof(n));
  recv (sock, &status, sizeof(status));
  return status;
}

Server stub:
// could be named anything
void produce_stub () {
  int n;
  int status;
  recv (sock, &n, sizeof(n));
  // produce func on server!
  status = produce (n);
  send (sock, &status, sizeof(status));
}

Server stub code can be generated automatically (C/C++: rpcgen, Java: rmic)
What info do you need to generate the stubs?
Input parameter types and the return value type.
Java RMI example
 Remote calculator program
Problems with RPC
 How is RPC different from local function calls?
 Hard to pass pointers (and global variables)
 What happens if server dereferences a passed-in pointer?
 Pointer will access server’s memory (not client’s)
 How do we solve this?
 Send all data reachable from pointer to the server
 Change the pointers on the server to point to the copy
 Copy data back when server function returns
Example RPC with pointers
 On client: int a[100];
 Want to send “a” (a pointer to an array)
 Copy entire array to server
 Have server’s pointer point to copy of a
 Copy array back to client on return
 What if a is more complicated? A linked list?
 Have to marshal the transitive closure of the pointer
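A hedged sketch of what "marshal the transitive closure" means for a linked list: the client stub walks the list, following every pointer, and copies each node's payload into a flat buffer that can be sent. The node layout and buffer format here are hypothetical.

#include <stddef.h>

/* Hypothetical list node on the client. */
struct node {
    int value;
    struct node *next;
};

/* Flatten the list into buf, following every next pointer.
 * Returns the number of values written; the server stub rebuilds a list
 * from this flat copy, so its pointers refer to the server's own memory. */
size_t marshal_list(const struct node *head, int *buf, size_t max)
{
    size_t n = 0;
    for (const struct node *p = head; p != NULL && n < max; p = p->next) {
        buf[n++] = p->value;
    }
    return n;
}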
Problems with RPC
 Data may be represented differently
 Machines have different “endianness”
 Byte 0 may be least or most significant
 Must agree on standard network representation (see the byte-order sketch below)
 RPC has different failure modes
 Server or client can fail during a call
 In local case, client and server fail simultaneously
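The standard fix for the byte-order problem is to convert multi-byte values to network byte order before sending and back after receiving; htonl()/ntohl() are the usual C calls.

#include <arpa/inet.h>
#include <stdint.h>

/* Sender: convert from host byte order to network byte order before send(). */
uint32_t to_wire(uint32_t host_value)
{
    return htonl(host_value);
}

/* Receiver: convert back to the local host's byte order after recv(). */
uint32_t from_wire(uint32_t wire_value)
{
    return ntohl(wire_value);
}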
Finishing RPC
 Where have you used RPC in CPS 110?
 Project 2 infrastructure
 libvm_app.a and libvm_pager.a hide send/recv
 C library interface to system calls is similar
 Makes something look like a function call
Structuring a concurrent system
 Talked about two ways to build a system
Alternative structure
 Can also give cooperating threads an address space
 Each thread is basically a separate process
 Use messages instead of shared data to communicate
 Why would you want to do this?
 Protection
 Each module runs in its own address space
 Reasoning behind micro-kernels
 Each service runs as a separate process
 Mach from CMU (influenced parts of Mac OS X)
 Vista’s handling of device drivers
Rest of the semester
 On Wednesday we’ll begin security
 Project 3 will be a new security project
 After security, we’ll cover file systems
 Probably a guest lecture along the way
 Finish up with Google